Tuesday, March 18, 2014

Lewis: "Humean Supervenience Debugged" (1994)

In this paper, David Lewis wrings his hands at a phenomenon he calls "undermining." He counts a probabilistic model as "undermined" if it assigns positive probability to a data set that would cause a rational agent to adopt a different model.

To say this in a vocabulary closer to Lewis', suppose that C(F | E) is the posterior probability a rational believer would assign to the event F in light of the evidence E. Suppose further that we are looking at the specific case in which E reveals the actual parameters of the world (e.g., "This coin has bias 0.3"), and F is a possible future which would produce a different subjective belief in our hypothetical observer (e.g., "The empirical frequency will be 0.4").
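On the finite-sample reading, nothing stops F from having positive probability given E. As a quick sanity check, a sketch with a hypothetical sample size (n = 10 flips, so that frequency 0.4 means exactly 4 heads; the bias 0.3 and frequency 0.4 are the example values above):

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n independent flips with bias p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# A coin with true bias 0.3, flipped 10 times (hypothetical sample size).
# The "undermining" frequency 0.4 corresponds to 4 heads out of 10.
p_undermine = binom_pmf(4, 10, 0.3)
print(p_undermine)  # ≈ 0.200 — far from zero
```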

The question is then: Given E, does the possible future F have zero probability, or positive probability? Without giving any argument, Lewis asserts that
there is some present chance that events would go in such a way as to complete a chancemaking pattern that would make the present chances different from what they actually are. (p. 482)
But he contrasts that with the following "argument":
But F is inconsistent with E, so C(F/E) = 0. Contradiction. (p. 483)
The former of these quotes seems to indicate that he is thinking about a finite sample from the model (consistent with the example he gives on p. 488). The latter argument, on the other hand, seems to assume that he is talking about a limiting frequency from an ergodic process or something like that — unless he seriously believes that empirical frequencies cannot differ from parameter values.
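The two readings can be separated numerically. On the finite-sample reading, the probability of an undermining frequency (0.4 or more heads from a coin of bias 0.3) is positive for every n, but it tends to 0 as n grows — so C(F | E) = 0 is defensible only in the limit. A sketch (the sample sizes n = 10, 100, 1000 are my own choice):

```python
from math import comb, ceil

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n independent flips with bias p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def undermining_prob(n, p=0.3, freq=0.4):
    """P(empirical frequency of heads >= freq) after n flips with bias p."""
    k_min = ceil(freq * n - 1e-9)  # epsilon guards against float round-up
    return sum(binom_pmf(k, n, p) for k in range(k_min, n + 1))

p10, p100, p1000 = (undermining_prob(n) for n in (10, 100, 1000))
print(p10, p100, p1000)  # positive for every finite n, but shrinking fast
```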

The Super-Objectivist

But this way of putting the argument is of course alien to Lewis. He has no concept of a statistical model, and he thinks that the credence of a rational agent is a unique and well-defined concept that doesn't require any assumptions:
Despite appearances and the odd metaphor, this is not epistemology! You're welcome to spot an analogy, but I insist that I am not talking about how evidence determines what's reasonable to believe about laws and chances. Rather, I'm talking about how nature—the Humean arrangement of qualities—determines what's true about laws and chances. Whether there are any believers living in the lawful and chancy world has nothing to do with it. (pp. 481–82)
This is even stronger and more absurd than classical objectivism. Instead of merely discarding models that are inconsistent with the evidence, Lewis assumes that the evidence singles out one optimal model of its own accord. For no apparent reason, he also wants "nature" to do this retrospectively, even though he has expelled all subjective observers from the universe.

The Big Flip

Lewis' own solution to the "paradox" is to say that credences should be conditioned on "theories" as well as data — but "theory" doesn't quite mean what it sounds like. This is evident from the example he gives towards the end of the paper.

In this example, he assumes that a coin has exhibited a frequency of 2/3 heads in the past, and that this means our hypothetical rational agent estimates its bias to be 2/3.

The "theory" T that he wants us to consider is then that the next 10,002 coin flips exhibit a frequency of exactly 2/3 heads, i.e., 6,668 heads and 3,334 tails. Writing B(k; n, p) for the binomial probability of exactly k heads in n flips with bias p, this event has the probability
Pr(T) = B(6,668; 10,002, 2/3).
He then asks us to consider a possible future A in which the next four coin flips come up heads. Still using the parameter estimate of 2/3, this has the binomial probability
Pr(A) = B(4; 4, 2/3).
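Both numbers are easy to evaluate (a sketch computed via log-gamma, since the binomial coefficients involved are astronomically large):

```python
from math import lgamma, log, exp

def log_binom_pmf(k, n, p):
    """log of B(k; n, p), computed in log space to avoid overflow."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

pr_A = exp(log_binom_pmf(4, 4, 2/3))         # = (2/3)**4 ≈ 0.1975
pr_T = exp(log_binom_pmf(6668, 10002, 2/3))  # ≈ 0.0085; 6,668 is the modal count
print(pr_A, pr_T)
```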
What is the conditional probability Pr(A | T)? Since the "theory" T did not change the parameter estimate 2/3, one might think that it equals the unconditional probability Pr(A). But for no apparent reason, Lewis decides to take the four coin flips in A from the coin flips in T, producing an amputated event T' with 3 fewer heads and 1 fewer tails. Even more oddly, he computes Pr(A, T') as if the two events were independent even though the observation of either would clearly change the parameter estimate used to compute the conditional probability of the other.

So according to his logic, the "joint probability" of A and T' is then
Pr(A, T') = B(4; 4, 2/3) B(6,665; 9,998, 2/3).
By dividing this by Pr(T), he supposedly finds the "conditional probability" of A given T.
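Carrying out the division numerically (a log-space sketch, since the individual binomial terms underflow ordinary floats) shows that with the parameter held at 2/3, the recipe indeed returns almost exactly the unconditional Pr(A):

```python
from math import lgamma, log, exp

def log_binom_pmf(k, n, p):
    """log of B(k; n, p), computed in log space to avoid overflow."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

p = 2/3
log_joint = log_binom_pmf(4, 4, p) + log_binom_pmf(6665, 9998, p)  # "Pr(A, T')"
log_T = log_binom_pmf(6668, 10002, p)                              # Pr(T)
lewis_cond = exp(log_joint - log_T)
print(lewis_cond, p**4)  # ≈ 0.19756 vs 0.19753 — nearly the unconditional Pr(A)
```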

This computation is, of course, completely absurd. The recipe amounts to multiplying Pr(A) by the factor B(6,665; 9,998, p)/B(6,668; 10,002, p), and nothing keeps that factor below 1: if the parameter had been 1/3 instead of 2/3, the factor would have been about 4, and the recipe would happily have produced "probabilities" larger than 1. So I'm afraid the example isn't doing much good.
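The inflation is easy to exhibit numerically (a log-space sketch of the rescaling factor B(6,665; 9,998, p)/B(6,668; 10,002, p) that the recipe applies to Pr(A)):

```python
from math import lgamma, log, exp

def log_binom_pmf(k, n, p):
    """log of B(k; n, p), computed in log space to avoid overflow."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
            + k * log(p) + (n - k) * log(1 - p))

def rescaling_factor(p):
    """The factor by which the recipe multiplies Pr(A)."""
    return exp(log_binom_pmf(6665, 9998, p) - log_binom_pmf(6668, 10002, p))

factor_23 = rescaling_factor(2/3)  # ≈ 1.0002 — harmless, but only by accident
factor_13 = rescaling_factor(1/3)  # ≈ 4.0 — no conditional probability does this
print(factor_23, factor_13)
```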
