Monday, June 2, 2014

Fisher: Statistical Methods and Scientific Inference (1956), chapter 5.7

In chapter V of his book on statistical inference, Fisher compares various likelihood-based approaches to coming up with a predictive distribution. Some of the details are quite obscure, and I have problems following his argument in several places.

Section 2 is dedicated to the Bayesian solution for a series of coin flips. It contains, among other things, a strange reference to coefficients other than the binomials, suggesting that these can be replaced freely with any other suitable polynomial (p. 112). I am not quite sure what he means or how he imagines this should be justified.

Section 7 is dedicated to a particular type of frequentist prediction. The set-up is that we have observed the counts a and b (heads and tails) and that we would like to find the likelihood of a continuation having counts c and d.

In order to find this, he suggests that we compute the likelihood ratio
Pr(a, b | f) Pr(c, d | f) / Pr(a + c, b + d | f).
The logic behind this computation is that if a/b is close to c/d, then the joint probability of those two ratios is approximately the same as the probability of the pooled ratio (a + c)/(b + d). If, on the other hand, they both deviate strongly from the maximum likelihood estimate (in different directions), then the joint probability will be lower than the pooled reference value. Note that the actual value of the coin-flip parameter f cancels out of the fraction and thus plays no direct role in the computation.
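To see why the parameter cancels, write Pr(a, b | f) = C(a + b, a) f^a (1 − f)^b. The powers of f and (1 − f) in the numerator multiply to f^(a + c) (1 − f)^(b + d), which is exactly what appears in the denominator, so the ratio collapses to

C(a + b, a) · C(c + d, c) / C(a + b + c + d, a + c),

a pure ratio of binomial coefficients.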

Fisher goes through an example in which a = 3, b = 16, c = 14, d = 7. In order to compute the many factorials for this example, he uses the (very rough) approximation
x! ≈ x^x.
Or at least that's what I think he does; he comments that this x^x is "among others having the same margins," whatever that is supposed to mean (p. 129).

[Figure: N! (green) and the approximation N^N (red).]
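If you want to redraw this comparison, a minimal sketch (assuming matplotlib, and plotting on the log scale since the raw values overflow almost immediately):

    import math
    import matplotlib.pyplot as plt

    N = range(1, 41)
    # ln N! via the log-gamma function, against ln N^N = N ln N
    plt.plot(N, [math.lgamma(n + 1) for n in N], "g-", label="ln N!")
    plt.plot(N, [n * math.log(n) for n in N], "r-", label="ln N^N")
    plt.xlabel("N")
    plt.ylabel("log scale")
    plt.legend()
    plt.show()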

At any rate, we can redo his computation using both his own approximate method and the exact formula, getting a likelihood of about .004 (Fisher's result) or .001 (the exact result).
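To redo the arithmetic, here is a minimal sketch in Python; the function names are mine, not Fisher's, and the x^x variant works on the log scale to keep the numbers manageable:

    from math import comb, exp, log

    def log_xx(x):
        # log of Fisher's rough approximation x! ≈ x^x, with 0^0 taken as 1
        return x * log(x) if x > 0 else 0.0

    def approx_ratio(a, b, c, d):
        # the likelihood ratio with every factorial replaced by x^x
        num = log_xx(a + b) - log_xx(a) - log_xx(b) \
            + log_xx(c + d) - log_xx(c) - log_xx(d)
        den = log_xx(a + b + c + d) - log_xx(a + c) - log_xx(b + d)
        return exp(num - den)

    def exact_ratio(a, b, c, d):
        # the same ratio with exact binomial coefficients
        return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

    print(approx_ratio(3, 16, 14, 7))   # 0.0036..., Fisher's .004
    print(exact_ratio(3, 16, 14, 7))    # 0.0012..., the exact .001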

As he suggests on page 130, we can also compile a table of likelihoods for the other values of c and d adding to 21, one row per count of heads c:

c     Fisher   Exact   Bayes
0     .094     .098    .048
1     .499     .223    .109
2     .836     .309    .151
3     .991     .336    .164
4     .964     .311    .152
5     .817     .256    .125
6     .621     .192    .094
7     .432     .133    .065
8     .277     .085    .042
9     .164     .051    .025
10    .090     .028    .014
11    .046     .015    .007
12    .022     .007    .003
13    .009     .003    .002
14    .004     .001    .001
15    .001     .000    .000
16    .000     .000    .000
Sum   5.867    2.050   1.000

As the table shows, the exact likelihoods coincide with the Bayes estimates if we normalize them. I think this is only the case because of the way the normalizing constants in the Dirichlet distribution work out, but I don't have time to check the details now.
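For completeness, here is a sketch that recomputes all three columns; the Bayes column is my reconstruction as the posterior predictive under a uniform prior on f (a beta-binomial distribution), which reproduces the numbers above:

    from math import comb, exp, factorial, log

    a, b, n = 3, 16, 21   # observed heads and tails, length of the continuation

    def log_xx(x):
        # log of Fisher's rough approximation x! ≈ x^x, with 0^0 taken as 1
        return x * log(x) if x > 0 else 0.0

    def fisher(c):
        # the likelihood ratio with every factorial replaced by x^x
        d = n - c
        num = log_xx(a + b) - log_xx(a) - log_xx(b) \
            + log_xx(n) - log_xx(c) - log_xx(d)
        den = log_xx(a + b + n) - log_xx(a + c) - log_xx(b + d)
        return exp(num - den)

    def exact(c):
        # the same ratio with exact binomial coefficients
        return comb(a + b, a) * comb(n, c) / comb(a + b + n, a + c)

    def bayes(c):
        # posterior predictive probability of c further heads under a
        # uniform prior on f; Beta function via integer factorials
        B = lambda x, y: factorial(x - 1) * factorial(y - 1) / factorial(x + y - 1)
        return comb(n, c) * B(a + c + 1, b + n - c + 1) / B(a + 1, b + 1)

    for c in range(17):
        print(f"{c:3d}  {fisher(c):6.3f}  {exact(c):6.3f}  {bayes(c):6.3f}")

    # the normalization observation: exact sums to 2.05 over all c,
    # and exact(c) / 2.05 agrees with bayes(c)
    s = sum(exact(c) for c in range(n + 1))
    print(s, exact(3) / s, bayes(3))    # 2.05  0.1637...  0.1637...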
