- criticizes Bayesian statistics on the grounds that it is sensitive to reparametrizations of the hypothesis space (see the sketch after this list);
- emphasizes the concept of likelihood and its differences from probability;
- presents, justifies, and illustrates the concept of Fisher information.
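As a quick illustration of the first point, here is a minimal sketch (my own, not from the paper), assuming the usual coin-tossing setup: a prior that is uniform on the bias p and a prior that is uniform on p² both claim to encode complete ignorance, yet they give different posterior means for the same data.

```python
import numpy as np

# Data: 7 heads and 3 tails from a coin with unknown bias p.
heads, tails = 7, 3

p = np.linspace(1e-6, 1 - 1e-6, 100_000)
likelihood = p**heads * (1 - p)**tails

# "Ignorance" version 1: a uniform density on p itself.
prior_on_p = np.ones_like(p)

# "Ignorance" version 2: a uniform density on q = p**2, which by the
# change-of-variables formula induces the density 2p on p.
prior_on_p_squared = 2 * p

for name, prior in [("uniform on p  ", prior_on_p),
                    ("uniform on p^2", prior_on_p_squared)]:
    posterior = likelihood * prior
    posterior /= np.trapz(posterior, p)       # normalize
    print(name, np.trapz(p * posterior, p))   # posterior mean of p
```

The two means come out near 0.667 and 0.692: the same professed ignorance, two different answers, depending only on which parametrization the uniform distribution is imposed on.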
First, a centerpiece of Bayes' original paper was the postulate that the uncertainty about the bias of a coin should be represented by means of a uniform distribution. Fisher comments:
The postulate would, if true, be of great importance in bringing an immense variety of questions within the domain of probability. It is, however, evidently extremely arbitrary. Apart from evolving a vitally important piece of knowledge, that of the exact form of the distribution of values of p, out of an assumption of complete ignorance, it is not even a unique solution. (p. 325)

[Fisher in 1931; image from the National Portrait Gallery.]

The second Bayesian topic is ratio tests: that is, assigning probabilities to two exclusive and exhaustive hypotheses X and Y based on the ratio between how well they explain the data set A, that is,

Pr(A | X) / Pr(A | Y).
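For concreteness, here is a toy computation of such a ratio (my own illustration, not from the paper): let X say the coin is fair, let Y say it lands heads with probability 0.7, and let the data set A be 13 heads in 20 tosses.

```python
from math import comb

def pr_data_given_bias(p, heads=13, tosses=20):
    """Binomial probability of the data set A under a coin of bias p."""
    return comb(tosses, heads) * p**heads * (1 - p)**(tosses - heads)

pr_A_given_X = pr_data_given_bias(0.5)  # X: the coin is fair
pr_A_given_Y = pr_data_given_bias(0.7)  # Y: the coin has bias 0.7

print(pr_A_given_X / pr_A_given_Y)  # Pr(A | X) / Pr(A | Y) ≈ 0.45
```

The ratio itself is harmless; turning it into a probability for X requires the fifty-fifty prior whose arbitrariness Fisher attacks next.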
Fisher comments:
This amounts to assuming that before A was observed, it was known that our universe had been selected at random for [= from] an infinite population in which X was true in one half, and Y true in the other half. Clearly such an assumption is entirely arbitrary, nor has any method been put forward by which such assumptions can be made even with consistent uniqueness. (p. 326)

Third, the introduction of the likelihood concept:
There would be no need to emphasise the baseless character of the assumptions made under the titles of inverse probability and BAYES' Theorem in view of the decisive criticism to which they have been exposed at the hands of BOOLE, VENN, and CHRYSTAL, were it not for the fact that the older writers, such as LAPLACE and POISSON, who accepted these assumptions, also laid the foundations of the modern theory of statistics, and have introduced into their discussions of this subject ideas of a similar character. I must indeed plead guilty in my original statement of the Method of the Maximum Likelihood (9) to having based my argument upon the principle of inverse probability; in the same paper, it is true, I emphasised the fact that such inverse probabilities were relative only. That is to say, that while we might speak of one value of p as having an inverse probability three times that of another value of p, we might on no account introduce the differential element dp, so as to be able to say that it was three times as probable that p should lie in one rather than the other of two equal elements. Upon consideration, therefore, I perceive that the word probability is wrongly used in such a connection: probability is a ratio of frequencies, and about the frequencies of such values we can know nothing whatever. We must return to the actual fact that one value of p, of the frequency of which we know nothing, would yield the observed result three times as frequently as would another value of p. If we need a word to characterise this relative property of different values of p, I suggest that we may speak without confusion of the likelihood of one value of p being thrice the likelihood of another, bearing always in mind that likelihood is not here used loosely as a synonym of probability, but simply to express the relative frequencies with which such values of the hypothetical quantity p would in fact yield the observed sample. (p. 326)

In the conclusion, he says that likelihood and probability are "two radically distinct concepts, both of importance in influencing our judgment," but "confused under the single name of probability" (p. 367). Note that these concepts are "influencing our judgment" — that is, they are not just computational methods for making a decision, but rather a kind of model of a rational mind.
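To make the "thrice the likelihood" language concrete, here is a toy calculation in the same spirit (my numbers, not Fisher's), assuming a coin that shows three heads in three tosses:

```python
def likelihood(p, heads=3, tosses=3):
    """Relative frequency with which bias p would yield the observed sample."""
    return p**heads * (1 - p)**(tosses - heads)

# p = 0.75 would produce three heads in a row 3.375 times as often as p = 0.5:
print(likelihood(0.75) / likelihood(0.5))  # 3.375

# Crucially, this does NOT say that p = 0.75 is 3.375 times as *probable*:
# no differential element dp and no prior over p has been introduced.
```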