Tuesday, April 14, 2015

Chrystal: "On Some Fundamental Principles in the Theory of Probability" (1891)

I've been trying to get my hands on the following paper:
George Chrystal (University of Edinburgh): "On Some Fundamental Principles in the Theory of Probability." Transactions of the Actuarial Society of Edinburgh, Volume 2, January 1891, pages 420–439.
So far, no luck. The Cambridge Journals database has a copy, but it's behind a paywall, and my university doesn't have a subscription.

However, a number of other sources quote extensively from the paper, so I've been able to piece together an understanding of what it looks like.

Posterior Frequencies

It seems that Chrystal's main beef is with the use of Bayes' rule to update the probability of a certain set of hypotheses whose long-term frequencies are already given in advance. His reasoning seems to be that this conflates our subjective degree of belief (which may indeed change) with the objective frequency (which, by assumption, cannot).

This philosophical distinction is nicely presented in the following quote. It comes from an 1894 review (also available in book form on the Internet Archive) by somebody called G. F. Hardy, not to be confused with G. H. Hardy.
"There is," says Professor Chrystal, "in Laplace's view, a confusion between two senses of the word 'Probability', which although distinct are often more or less associated in point of fact. In common speech we say that a single event is more or less 'probable', and by this word we indicate our own mental attitude towards the event, an attitude that may be well or ill justified by facts. When an actuary says that the probability that a man of 20 will live to be 60 is $\frac{59}{97}$, he is not, strictly speaking, referring to any one event at all, but merely making an assertion to the effect that out of any considerable number of men of 20 years of age about $\frac{59}{97}$ will reach the age of 60. No one knows better than an actuary that this statement is a fact, established (under certain circumstances, and with certain limitations), by experience, and that it has nothing whatever to do with the mental attitude of anyone. Everyone will admit that we could never arrive at this result by analyzing the event of a man of 20 reaching or not reaching the age of 60 into cases regarding each of which we should be equally undecided,—mentally suspended, as it were, like Buridan's ass between the equal bundles of hay." (Hardy, p. 316)
It's not clear whether the emphases are in the original, but I'm guessing not.

Hardy does not give a page reference, but the proper reference seems to be page 423. I got that figure from the manuscript of an 1893 presentation by a certain John Govan, F.F.A. (whose name, location, and date fit the proselytizing businessman John George Govan).

Govan's rendition of the quote occurs on his page 212. He doesn't use the emphases.

The Burial of Bayes

Several sources also report that Chrystal summarizes his discussion with the following tirade:
… both from the point of view of practical common-sense, and from the point of view of logic, the so-called laws of Inverse Probability are a useless appendage to the first principles of probability, if indeed they be not a flat contradiction of those very principles.
This is cited by a number of authors, including E. T. Whittaker, F.R.S. (in a footnote to a 1920 presentation, p. 165), Andrew I. Dale (1999, p. 485) and Sharon McGrayne (2011, p. 37).

Dale reports that this quote is found on page 438. Whittaker apparently reports it as page 421 (but that would be at odds with Dale's description of the quote as occurring near the conclusion of the essay). McGrayne doesn't give a page number.

According to Whittaker, the quote continues as follows:
The laws of Inverse Probability being dead, should be decently buried out of sight, and not embalmed in text-books and examination papers.
McGrayne further reports the following conclusion:
The indiscretions of great men should be quietly allowed to be forgotten.
Checking the relevant footnote of McGrayne's book (note 9 of ch. 3, p. 260), this turns out to be a recycled quote from Anders Hald's A History of Mathematical Statistics, page 275. That book doesn't have a Google Books preview, and it's not in my library.

Sort-of-Long-Term Frequencies

Hardy's review continues to quote Chrystal's discussion of how probability is to be defined:
"The notion of probability is always attached to a class or series of events, which usually have more or less of other attributes in common, but are always distinguished by this mark, that certain phases of them, although not predicable with the smallest certainty in any individual case, are predicable with more or less uniformity in a certain proportion of cases in the long run. The fundamental features of this series are statistical uniformity combined with irregularity of every conceivable kind in the individual instance. The number of the events in the series must be large. Its extension both as to space and time is arbitrary, and in certain ideal cases infinite. It is in this last respect alone that probability has anything to do with our mental attitude; we may choose our standpoint, and this determines the probability to which our knowledge may make a better or worse approximation. As the series is varied the probability alters. . . .  We are thus led to the following abstract definition of the probability or chance or an event. If, on taking any very large number, N out of a series of cases in which an event A is in question, A happens on pN occasions, the probability of the event A is said to be p." (Hardy, p. 317)
Again, no page number is given.

Posterior Priors

The examples in Chrystal's paper all seem to be of the same kind: he describes a set-up in which certain a priori frequencies are given, and then tells us that no amount of evidence should be able to change those frequencies. The only mental operation we can perform is to exclude logically impossible cases, not to compute posterior probabilities.

Govan thus quotes him as discussing a situation in which you draw two white balls from a bag of black and white balls. Then:
"Any one," says Professor Chrystal, "who knows the definition of mathematical probability, and who considers this question apart from the Inverse Rule, will not hesitate for a moment to say that the chance is $\frac{1}{2}$; that is to say, that the third ball is just as likely to be white as black. For there are four possible constitutions of the bag . . . each of which we are told occurs equally often in the long run, and among those cases there are two . . . in which there are two white balls, and among these the case in which there are three white occurs in the long run, just as often as the case in which there are only two." (Govan, p. 208)
According to the text of Govan's discussion, this quote must be on or around page 435 of Chrystal's text.
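
To see how far apart the two answers are, here is a rough sketch of the calculation. The quote elides the details of the set-up, so I am assuming a bag of three balls (which would give the four constitutions, zero to three white) and two draws without replacement. Those assumptions, and all the variable names, are mine and not Chrystal's or Govan's.

```python
from fractions import Fraction

# Assumption (not spelled out in the quote): the bag holds 3 balls, so the
# four constitutions are 0, 1, 2 or 3 white, each with a priori frequency 1/4.
N_BALLS = 3
prior = {w: Fraction(1, 4) for w in range(N_BALLS + 1)}


def likelihood_two_white(w):
    """Chance of drawing two white balls in a row, without replacement,
    from a bag of 3 balls of which w are white (assumed drawing scheme)."""
    return Fraction(w, N_BALLS) * Fraction(max(w - 1, 0), N_BALLS - 1)


# Inverse Rule: weight each constitution by prior times likelihood, then normalize.
weights = {w: prior[w] * likelihood_two_white(w) for w in prior}
total = sum(weights.values())
posterior = {w: weights[w] / total for w in weights}

# The third ball is white exactly when all three balls are white.
inverse_rule_answer = posterior[3]

# Chrystal's count: strike out the impossible constitutions and treat the
# survivors as equally frequent, without reweighting them.
surviving = [w for w in prior if likelihood_two_white(w) > 0]
chrystal_answer = Fraction(1, len(surviving))

print(inverse_rule_answer, chrystal_answer)
```

Under these assumptions the Inverse Rule gives $\frac{3}{4}$ where Chrystal gives $\frac{1}{2}$. The whole difference is that the all-white bag produces two white draws three times as often as the bag with two white balls.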

Another very similar example is attributed to Chrystal's page 437:
"A bag contains five balls which are known to be either all black or all white—and both these are equally probable. A white ball is dropped into the bag, and then a ball is drawn out at random and found to be white. What is now the chance that the original balls were all white?" Professor Chrystal asserts that the chance is precisely what it was before, viz. $\frac{1}{2}$. (Govan, p. 208–209)
"The ball drawn out," says Professor Chrystal, "may have been the one we put in, it may not; and this is all that any one can say." (Govan, p. 209)
Note that this is quite upside-down compared to how we usually think about frequentism: Here, Chrystal tells us to ignore the likelihoods and put all our confidence in the priors. We are used to thinking about frequentists as doing the exact opposite.
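
For comparison, here is a minimal sketch of what the Inverse Rule would say about the five-ball example. The numbers come straight from the quoted set-up; the variable names are my own.

```python
from fractions import Fraction

# The two hypotheses about the original five balls, equally probable.
prior = {"all white": Fraction(1, 2), "all black": Fraction(1, 2)}

# After the white ball is dropped in, the bag holds six balls:
# six white under the first hypothesis, one white and five black under the second.
likelihood_white = {"all white": Fraction(6, 6), "all black": Fraction(1, 6)}

# Bayes' rule: posterior is prior times likelihood, divided by the total
# chance of drawing a white ball.
evidence = sum(prior[h] * likelihood_white[h] for h in prior)
posterior = {h: prior[h] * likelihood_white[h] / evidence for h in prior}

print(posterior["all white"])  # Chrystal's answer would be 1/2
```

The Inverse Rule moves the chance of "all white" from $\frac{1}{2}$ to $\frac{6}{7}$, while Chrystal leaves it where it started.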

The Essential Tension

Govan objects to Chrystal's principle of rejecting the evidence:
… let us say we have two bags before us, one containing six white balls, the other five black balls and one white. There is nothing to indicate which is which. We draw from one of the bags chosen at random a ball which proves to be white. It is difficult to believe that any man in the possession of his faculties, say if his life depended on his guessing aright from which bag the ball had come, would hesitate to guess the former. Even according to Professor Chrystal he would be right 6 times out of 7 in the long run. Yet, again according to Professor Chrystal he would be just as likely to be wrong as to be right. (Govan, p. 209)
Although they are talking past each other, this is certainly the core of the issue: The distinction between optimal, adaptive gambling behavior and fixed, objective frequencies.
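
Govan's "6 times out of 7" can be checked by plain counting, without invoking Bayes' rule at all. The sketch below lists every equally frequent (bag, ball) outcome and looks only at the ones where the ball is white. The labels and variable names are mine.

```python
from fractions import Fraction

# Govan's two bags. Both hold six balls, so picking a bag at random and then
# a ball at random makes every (bag, ball) pair equally frequent in the long run.
bags = {
    "six white": ["W"] * 6,
    "five black, one white": ["B"] * 5 + ["W"],
}

outcomes = [(name, ball) for name, balls in bags.items() for ball in balls]

# Keep only the outcomes that match what was observed: a white ball.
white_draws = [name for name, ball in outcomes if ball == "W"]

# How often is the guess "six white" right among those outcomes?
correct = sum(1 for name in white_draws if name == "six white")
success_rate = Fraction(correct, len(white_draws))

print(len(outcomes), len(white_draws), success_rate)
```

Of the twelve outcomes, seven show a white ball, and six of those seven come from the all-white bag. So the gambler's long-run success rate is itself a frequency, just one taken over a different series of cases than the one Chrystal has in mind.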