Notebooks on Language: von Mises: Probability, Statistics and Truth (1951), ch. 1

Wednesday, March 26, 2014

Richard von Mises; from Wikipedia.

Richard von Mises was an important proponent of the frequentist philosophy of probability.

In his book Probability, Statistics and Truth, he militates against the use of the word "probability" for anything other than indefinitely repeatable experiments with converging relative frequencies (pp. 10–12). He also compares probabilities to physical constants like the velocity of a molecule (p. 21) and asserts that the law of large numbers is an empirical generalization comparable to physical laws like the conservation of energy (pp. 16, 22, and 26).

Reference Class Relativism

A consequence of this frequentist notion of probability is that specific events do not have probabilities. Only infinite classes of comparable events can have probabilities.

For instance, when your coin comes up heads at 10 o'clock, that event differs in infinitely many ways from the coin coming up heads at 11 o'clock. Only by choosing which properties of the situation to attend to can you identify the two events as equivalent.

As a kind of argument for this reference class relativism, von Mises asserts that a specific person has a different probability of dying depending on the reference class (e.g., people over 40, men over 40, male smokers over 40, etc.). We thus have to explicitly select the reference class before we can talk about "the" probability.
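To make the dependence on the reference class concrete, here is a minimal Python sketch. The records are invented purely for illustration and are not taken from von Mises or from any real mortality table; the function name `death_frequency` and the three class predicates are my own hypothetical names. Note also that these are finite relative frequencies, which only stand in for the limiting frequencies von Mises has in mind.

```python
# Hypothetical records, invented purely for illustration:
# (age, sex, smoker, died_within_year)
RECORDS = [
    (45, "m", True, True),
    (52, "m", True, False),
    (61, "m", False, False),
    (48, "m", False, False),
    (43, "f", True, False),
    (57, "f", False, False),
    (66, "f", False, True),
    (35, "m", True, False),
]


def death_frequency(records, in_class):
    """Relative frequency of death among the records that satisfy in_class."""
    members = [r for r in records if in_class(r)]
    if not members:
        return None  # empty reference class: no frequency to report
    return sum(1 for r in members if r[3]) / len(members)


def over_40(r):
    return r[0] > 40


def men_over_40(r):
    return r[0] > 40 and r[1] == "m"


def male_smokers_over_40(r):
    return r[0] > 40 and r[1] == "m" and r[2]


# The same 45-year-old male smoker belongs to all three classes,
# yet each class reports a different frequency.
for name, rule in [
    ("people over 40", over_40),
    ("men over 40", men_over_40),
    ("male smokers over 40", male_smokers_over_40),
]:
    print(name, death_frequency(RECORDS, rule))
```

The first record falls into every one of the three classes, so which number counts as "his" probability depends entirely on which predicate we choose to apply.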

He comments:

One might suggest that a correct value of the probability of death for Mr. X may be obtained by restricting the collective to which he belongs as far as possible, by taking into consideration more and more of his individual characteristics. There is, however, no end to this process, and if we go further and further into the selection of the members of the collective, we shall be left finally with this individual alone. (p. 18)

Even from a frequentist perspective, I'm not sure this makes sense. The fact that we have narrowed down our reference class so much that there is only a single real person left in it should not change the fact that we still have an intensional definition of the class. Insofar as we do, we should be able to apply that definition to the outcome of any sequence of candidates, like an infinite sequence of people or experiments. In reality, it is only data sparsity that keeps us from going "further and further."

So I think von Mises has a theoretical choice to make: either he must require that reference classes be actually infinite, or he must merely require that they be potentially infinite.

"Randomness" and Insensitivity to Subsequence Selection

Von Mises spends a large part of the lecture elaborating a notion of "randomness" which is intended to capture the difference between sequences that are asymptotically i.i.d. and sequences that are not, even though they have the same limiting frequencies. He does so by adding the requirement that the limiting frequencies be independent of subsequence selection.
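A small Python sketch may help show what the extra requirement rules out. It compares a strictly alternating sequence with a pseudo-random one; both have relative frequency close to 1/2 overall, but only the alternating one changes its frequency under a place selection. The rule used here ("keep the outcome that follows a 0") and the function names are my own choices for illustration, and finite sequences from a seeded generator are of course only a stand-in for von Mises's infinite collectives.

```python
import random


def relative_frequency(bits):
    """Fraction of ones in a finite 0/1 sequence."""
    return sum(bits) / len(bits)


def select_after_zero(bits):
    """Place selection: keep bits[n] only when bits[n - 1] == 0.

    Whether place n is selected depends only on earlier outcomes,
    never on bits[n] itself.
    """
    return [bits[n] for n in range(1, len(bits)) if bits[n - 1] == 0]


n = 100_000
alternating = [i % 2 for i in range(n)]  # 0, 1, 0, 1, ...
rng = random.Random(0)  # fixed seed so the run is repeatable
pseudo_random = [1 if rng.random() < 0.5 else 0 for _ in range(n)]

alt_all = relative_frequency(alternating)
alt_selected = relative_frequency(select_after_zero(alternating))
rnd_all = relative_frequency(pseudo_random)
rnd_selected = relative_frequency(select_after_zero(pseudo_random))

print("alternating:   all =", alt_all, " selected =", alt_selected)
print("pseudo-random: all =", rnd_all, " selected =", rnd_selected)
```

In the alternating sequence every selected place holds a 1, so the frequency jumps from 1/2 to 1; in the pseudo-random sequence the selected frequency should stay near 1/2.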

A possibly more intuitive way of stating that definition would be in terms of a Topsøe-style game: A structure-finding player is tasked with picking infinitely many places in a sequence based on past data and is rewarded when the empirical frequencies fail to converge to a given distribution; a structure-hiding player is tasked with selecting the sequence and is rewarded when the frequencies do converge to the given distribution.

If the structure-hider then introduces any systematic dependence between the experiments, the structure-finder can exploit these regularities to outgamble the structure-hider. Thus, only asymptotically i.i.d. sequences are part of an equilibrium.
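As a rough illustration of how the structure-finder profits from dependence, the following Python sketch lets the structure-hider produce a "sticky" sequence in which each outcome repeats the previous one with probability 0.9. The overall frequency of ones is still about 1/2, but a finder who picks only the places following a 1 sees a frequency near 0.9. The stay probability, the seed, and the names `sticky_sequence` and `finder_picks` are assumptions of mine, not anything taken from von Mises or Topsøe.

```python
import random


def sticky_sequence(n, stay=0.9, seed=1):
    """Hider's sequence: each bit repeats the previous one with probability `stay`."""
    rng = random.Random(seed)
    bits = [rng.randrange(2)]
    for _ in range(n - 1):
        prev = bits[-1]
        bits.append(prev if rng.random() < stay else 1 - prev)
    return bits


def finder_picks(bits):
    """Finder's rule: pick place n whenever the previous outcome was a 1."""
    return [bits[n] for n in range(1, len(bits)) if bits[n - 1] == 1]


bits = sticky_sequence(200_000)
picked = finder_picks(bits)

overall = sum(bits) / len(bits)
selected = sum(picked) / len(picked)

print("overall frequency of ones: ", overall)
print("frequency on picked places:", selected)
```

If the hider instead sets `stay=0.5`, the outcomes no longer depend on each other and this particular rule gains nothing, which is the sense in which independence is the hider's only safe strategy.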

I haven't checked the details, but this game seems to be the same as that suggested by Shafer and Vovk, although, if I remember correctly, they only consider fair (that is, maximum-entropy) i.i.d. coins, not arbitrary biases. But at any rate, coin flipping is, like distributions on a finite set, one of the cases in which there is a maximum entropy distribution even in the absence of an externally given mean.
