## Thursday, October 22, 2015

### Bell: "Bertlmann's Socks and the Nature of Reality" (1980)

There seems to be a lot of mystery surrounding Bell's theorems. Scholarpedia has a whole section devoted to "controversy and common misunderstandings" surrounding the argument, and a recent xkcd comic took up the topic (without mentioning any specific misunderstanding).

 Source: xkcd.com/1591

I've also been told in person that I understood the theorem wrong. So it seems about time for some studying.

This time, rather than pick up Bell's original article, I read this more popular account of the argument, which covers more or less the same ground. If I understand it correctly, it's actually simpler than what I first thought, although my hazy understanding of the physics stood in the way of extracting the purely statistical part of the argument.

### Background

Here's what I take to be the issue: We have a certain experiment in which two binary observables, $A$ and $B$, follow conditional distributions that depend on two control variables, $a$ and $b$:\begin{eqnarray}
a &\longrightarrow& A \\
b &\longrightarrow& B
\end{eqnarray}Although the experiment is designed to prevent statistical dependencies between $A$ and $B$, we still observe a marked correlation between them for many settings of $a$ and $b$. This has to be explained somehow, either by postulating
• an unobserved common cause: $\lambda\rightarrow A,B$;
• an observed common effect: $A,B \rightarrow \gamma$ (i.e., a sampling bias);
• or a direct causal link: $A \leftrightarrow B$.
The purpose of Bell's paper is to rule out the most plausible and attractive of these three options, the hidden common cause. This explanation is ruled out by showing that a certain measure of dependence would exceed a logically necessary bound under this type of explanation.

### Measurable Consequences

The measure in question is the following:
\begin{eqnarray}
C(a,b) &=& +P(A=1,B=1\,|\,a,b) \\
&   & +P(A=0,B=0\,|\,a,b) \\
&  &  -P(A=1,B=0\,|\,a,b) \\
&  &  -P(A=0,B=1\,|\,a,b).
\end{eqnarray}This statistic is related to the correlation between $A$ and $B$ but different due to the absence of marginal probabilities $P(A)$ and $P(B)$. It evaluates to $+1$ if and only if the two are perfectly correlated, and $-1$ if and only if they are perfectly anti-correlated.

 Contours of $C(a,b)$ when $A$ and $B$ are independent with $x=P(A)$ and $y=P(B)$.
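As a sanity check on the definition, here is a minimal Python sketch that computes the statistic from a 2×2 joint probability table. The function names `agreement_statistic` and `independent_joint` are my own and not from Bell's paper. It illustrates that $C$ is simply $P(A=B) - P(A\neq B)$, and that for independent variables with $x=P(A=1)$ and $y=P(B=1)$ it reduces to $(2x-1)(2y-1)$.

```python
def agreement_statistic(joint):
    """Return C = P(A=B) - P(A!=B) for a table joint[A][B] with A, B in {0, 1}."""
    total = sum(sum(row) for row in joint)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("joint probabilities must sum to 1")
    agree = joint[1][1] + joint[0][0]
    disagree = joint[1][0] + joint[0][1]
    return agree - disagree


def independent_joint(x, y):
    """Joint table for independent A and B, assuming x = P(A=1) and y = P(B=1)."""
    return [[(1 - x) * (1 - y), (1 - x) * y],
            [x * (1 - y), x * y]]


perfectly_correlated = [[0.5, 0.0], [0.0, 0.5]]
perfectly_anticorrelated = [[0.0, 0.5], [0.5, 0.0]]

print(agreement_statistic(perfectly_correlated))      # 1.0
print(agreement_statistic(perfectly_anticorrelated))  # -1.0

# For independent variables the statistic factorises as (2x - 1)(2y - 1).
x, y = 0.3, 0.8
print(agreement_statistic(independent_joint(x, y)), (2 * x - 1) * (2 * y - 1))
```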

In a certain type of experiment, where $a$ and $b$ are angles of two magnets used to reveal something about the spin of a particle, quantum mechanics predicts that
$$C(a,b) \;=\; -\cos(a-b).$$When the two settings differ only slightly, $A$ and $B$ are thus strongly anti-correlated, but when the settings point in nearly opposite directions (that is, when $a-b$ is close to $\pi$), $A$ and $B$ are strongly correlated. This is a prediction based on physical considerations.
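To get a feel for this prediction, the following small sketch tabulates $-\cos(a-b)$ for a few angle differences. The function name `quantum_prediction` is just a label I chose; the code only evaluates the formula above and does not model any physics.

```python
import math


def quantum_prediction(a, b):
    """Predicted statistic C(a, b) = -cos(a - b), with angles in radians."""
    return -math.cos(a - b)


# Aligned settings give perfect anti-correlation, opposite settings give
# perfect correlation, and perpendicular settings give no dependence.
for label, delta in [("0", 0.0), ("pi/2", math.pi / 2), ("pi", math.pi)]:
    print(f"a - b = {label:>4}: C = {quantum_prediction(delta, 0.0):+.3f}")
```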

### Bounds on Joint Correlations

However, let's stick with the pure statistics a bit longer. Suppose again $A$ depends only on $a$, and $B$ depends only on $b$, possibly given some fixed, shared background information which is independent of the control variables.

 The statistical situation when the background information is held constant.

Then $C(a,b)$ can be expanded to
\begin{eqnarray}
C(a,b) &=& +P(A=1\,|\,a) \, P(B=1\,|\,b) \\
&   & +P(A=0\,|\,a) \, P(B=0\,|\,b) \\
&   & - P(A=1\,|\,a) \, P(B=0\,|\,b) \\
&   & - P(A=0\,|\,a) \, P(B=1\,|\,b) \\
&=& [P(A=1\,|\,a) - P(A=0\,|\,a)] \times [P(B=1\,|\,b) - P(B=0\,|\,b)],
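\end{eqnarray}that is, the product of two statistics which measure how strongly $A$ and $B$ are each biased toward one outcome given the control parameter settings: each factor is $\pm 1$ when the variable is deterministic and $0$ when it is a fair coin flip. Writing $A_1 = P(A=1\,|\,a)$ and $A_0 = P(A=0\,|\,a)$, likewise $B_1$ and $B_0$ for $b$, and adding primes for the alternative settings $a^\prime$ and $b^\prime$,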
$$C(a,b) \; = \; (A_1 - A_0) (B_1 - B_0),$$and thus
\begin{eqnarray}
C(a,b) + C(a,b^\prime) &=& (A_1 - A_0) (B_1 - B_0 + B_1^\prime - B_0^\prime) \\
&\leq& |B_1 - B_0 + B_1^\prime - B_0^\prime|; \\
C(a^\prime,b) - C(a^\prime,b^\prime) &=& (A_1^\prime - A_0^\prime) (B_1 - B_0 - B_1^\prime + B_0^\prime) \\
&\leq& |B_1 - B_0 - B_1^\prime + B_0^\prime|,
\end{eqnarray}where both inequalities use $|A_1 - A_0| \leq 1$ and $|A_1^\prime - A_0^\prime| \leq 1$. Since $|x+y| + |x-y| = 2\max(|x|,|y|)$ for any real $x$ and $y$, it follows that
$$C(a,b) + C(a,b^\prime) + C(a^\prime,b) - C(a^\prime,b^\prime) \;\leq\; 2\max(|B_1 - B_0|,\, |B_1^\prime - B_0^\prime|) \;\leq\; 2.$$Applying the same argument to the negated sum gives the matching bound from below, so
$$| C(a,b) + C(a,b^\prime) + C(a^\prime,b) - C(a^\prime,b^\prime) | \;\leq\; 2.$$The same idea yields every variant of this inequality in which an odd number of the four terms carries a minus sign. With an even number of minus signs the bound fails: if $A$ and $B$ are both deterministically $1$, all four statistics equal $+1$ and their plain sum is $4$.
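The bound can also be checked numerically. In the sketch below, `alpha`, `alpha_p`, `beta`, and `beta_p` are my own shorthand for $A_1 - A_0$, $A_1^\prime - A_0^\prime$, $B_1 - B_0$, and $B_1^\prime - B_0^\prime$, each of which lies in $[-1, 1]$. Because the four-term combination is linear in each of these quantities separately, its extremes are attained at the corners $\pm 1$, so checking the 16 corners is enough; the random sampling is only an extra sanity check.

```python
import itertools
import random


def chsh(alpha, alpha_p, beta, beta_p):
    """C(a,b) + C(a,b') + C(a',b) - C(a',b') when each C factorises as alpha * beta."""
    return alpha * beta + alpha * beta_p + alpha_p * beta - alpha_p * beta_p


# Exhaustive check over the 16 deterministic corner cases.
vertex_values = [chsh(*v) for v in itertools.product((-1, 1), repeat=4)]
print(min(vertex_values), max(vertex_values))  # -2 2

# Random interior points, with a fixed seed so the run is reproducible.
rng = random.Random(0)
worst = max(
    abs(chsh(*(rng.uniform(-1, 1) for _ in range(4))))
    for _ in range(100_000)
)
print(worst <= 2)  # True
```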

### Violations of Those Bounds

But now look again at
$$C(a,b) \;=\; -\cos(a-b).$$We then have, for $(a,b,a^\prime,b^\prime)=(0,\pi/4,\pi/2,-\pi/4)$,
$$C\left(0, \frac{\pi}{4}\right) + C\left(0, -\frac{\pi}{4}\right) + C\left(\frac{\pi}{2}, \frac{\pi}{4}\right) - C\left(\frac{\pi}{2}, -\frac{\pi}{4}\right) \;=\; -2\sqrt{2},$$which is indeed outside the interval $[-2,2]$. $C$ thus cannot both have the predicted functional form and satisfy the bound on the correlation statistics. Something's gotta give.
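The arithmetic is easy to verify. The sketch below evaluates the four-term combination for the predicted form $-\cos(a-b)$ at the settings $(a,b,a^\prime,b^\prime)=(0,\pi/4,\pi/2,-\pi/4)$; the helper name `chsh_from` is hypothetical and exists only for this illustration.

```python
import math


def quantum_prediction(a, b):
    """Predicted statistic C(a, b) = -cos(a - b), with angles in radians."""
    return -math.cos(a - b)


def chsh_from(C, a, b, a_p, b_p):
    """Evaluate C(a,b) + C(a,b') + C(a',b) - C(a',b') for a given function C."""
    return C(a, b) + C(a, b_p) + C(a_p, b) - C(a_p, b_p)


S = chsh_from(quantum_prediction, 0.0, math.pi / 4, math.pi / 2, -math.pi / 4)
print(S, -2 * math.sqrt(2))  # both are about -2.828
print(abs(S) > 2)            # True: the bound of 2 is exceeded
```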

### Introducing Hidden Variables

This entire derivation relied on $A$ and $B$ depending on nothing other than their own private control variables, $a$ and $b$.

However, suppose that a clever physicist proposes to explain the dependence between $A$ and $B$ by postulating some unobserved hidden cause influencing them both. There is then some stochastic variable $\lambda$ which is independent of the control variables, yet causally influences both $A$ and $B$.

 The statistical situation when the background information varies stochastically.

However, even if this is the case, we can go through the entire derivation above, adding "given $\lambda$" to every single step of the process. As long as we condition on a fixed value of $\lambda$, each of the steps still holds. And since the inequality is then valid for every single value of $\lambda$, it is also valid in expectation: because $\lambda$ is independent of the control variables, each observed $C(a,b)$ is the average of the conditional statistic over the same distribution of $\lambda$, and we can thus integrate $\lambda$ out. The result is that even under such a "hidden variable theory," the inequality still holds.
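To make this concrete, here is a sketch of one toy hidden-variable model; it is my own illustrative choice, not a model taken from Bell's paper. The hidden variable $\lambda$ is an angle distributed uniformly on the circle, and the outcomes are deterministic given $\lambda$ and the local setting. I use a $\pm 1$ coding of the outcomes (so $A=0$ becomes $-1$), under which $C(a,b)$ is the average of the product of the two outcomes. The average over $\lambda$ is approximated on a fixed grid rather than by random sampling.

```python
import math


def sign(x):
    return 1 if x >= 0 else -1


def local_model_correlation(a, b, n=20_000):
    """C(a, b) for a toy model: A = sign(cos(a - lam)), B = -sign(cos(b - lam)),
    averaged over n grid values of the hidden angle lam in [0, 2*pi)."""
    total = 0
    for k in range(n):
        lam = 2 * math.pi * (k + 0.5) / n  # midpoint grid, independent of a and b
        total += sign(math.cos(a - lam)) * -sign(math.cos(b - lam))
    return total / n


a, b, a_p, b_p = 0.0, math.pi / 4, math.pi / 2, -math.pi / 4
S_local = (local_model_correlation(a, b) + local_model_correlation(a, b_p)
           + local_model_correlation(a_p, b) - local_model_correlation(a_p, b_p))
print(S_local)  # close to -2, the edge of the allowed interval
```

This particular model produces perfect anti-correlation at equal settings, just like the quantum prediction, but its statistic varies linearly with the angle difference, and at these settings it lands on the boundary of the interval rather than outside it.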

Hence, the statistical dependency cannot be explained by a shared cause alone, since the predicted correlation $C(a,b) = -\cos(a-b)$ violates a bound that every such explanation must satisfy. We will therefore need to postulate either direct causality between $A$ and $B$ or an observed downstream variable (sampling bias) instead.

Note that what we really need to prove this result is the assumption that, for a hidden variable $\lambda$ whose distribution does not depend on the settings, the probability $P(A,B \, | \, a,b,\lambda)$ factors into the product $P(A \, | \, a,\lambda)\, P(B \, | \, b,\lambda)$. This corresponds to the assumption that there is no direct causal connection between $A$ and $B$, and that neither outcome depends on the other side's control variable.