Dichotomies and Tests
Suppose two probability measures $P_0$ and $P_1$ on a set $\Omega$ are given, and that we are given a data set $X=x$ drawn from $\Omega$ according to one of these two distributions. Our goal is now to define a function $T: \Omega\rightarrow \{0,1\}$ which will return, for any data set, a guess at which of the two hypotheses the data came from.Such a test will thus itself be a random variable $T=T(X)$. The quality of this test can be measured in terms of two error probabilities,
$$
P_0(T=1) \qquad \textrm{and} \qquad P_1(T=0).
$$Minimizing these two sources of error will generally be conflicting goals, and can be seen by considering the constant functions $T=0$ and $T=1$.
Likelihood Ratio Tests and the Neyman-Pearson Lemma
One family of tests we always have at our disposal are the likelihood ratio tests. A likelihood ratio test at threshold $r$ is the indicator variable that returns the value 1 when the evidence supports hypothesis 1 by a factor of more than $r$:$$
R(x) \; = \; \mathbb{I}\left( \frac{P_1(x)}{P_0(x)} \geq r \right).
$$The content of the Neyman-Pearson lemma is that these likelihood ratio tests are the only optimal ones, in a certain sense.
Specifically, suppose any other test $T$ is given. The lemma then states that if
$$
P_0(T=1) \; \leq \; P_0(R=1) \qquad \Longrightarrow \qquad P_1(T=0) \; \geq \; P_1(R=0).
$$In other words, when the probability of an error of the first kind is held below a fixed level, we achieve the lowest rate of errors of the other kind by choosing a likelihood ration test. You cannot achieve a lower error rate under $P_0$ without increasing your error rate under $P_1$.
From $1\times 2$ to $2 \times 2$
The best way of proving this is to directly compare the regions of $\Omega$ on which the two tests disagree. There are two of these regions, $T=1 \wedge R=0$ and $T=0 \wedge R=1$. The difference in the measure of these two regions under the two hypotheses determine the difference in error rates for the two tests.A paraphrase of the lemma could then be that if
$$
P_0(T=1 \wedge R=0) \; \leq \; P_0(T=0 \wedge R=1),
$$then we also have
$$
P_1(T=0 \wedge R=1) \; \geq \; P_1(T=1 \wedge R=0).
$$This alternative formulation can be translated back into the original form by adding back in the corner regions $T=1 \wedge R=1$ and $T=0 \wedge R=0$ on which the two tests give the same result.
Ratio Conversions
Using this formulation, a proof strategy suggests itself: Looking at cases according to the value of $R$. Specifically, if an event $A\subseteq \Omega$ satisfies the condition$$
A \; \subseteq \; \{x:\;R(x)=0\} \; = \; \left\{ x:\; \frac{P_1(x)}{P_0(x)} \; < \; r\right\},
$$then $\frac{1}{r}P_1 < P_0 $ on $A$. We can therefore obtain a lower bound on the $P_0$-measure of $A$ by performing the integration using the smaller measure $\frac{1}{r}P_1$ instead:
$$
\frac{1}{r} P_1(A) \; < \; P_0(A).
$$Similarly, if
$$
A \; \subseteq \; \{ R=1 \} \; = \; \left\{ \frac{P_1}{P_0} \; \geq \; r\right\},
$$then $\frac{1}{r}P_1 \geq P_0$ on $A$, and
$$
P_0(A) \; \leq \; \frac{1}{r} P_1(A).
$$Since the two sets $R=0$ and $R=1$ form a partition of $\Omega$, we can split any set $A$ up according to its overlap with these cells and thus translate bounds on $P_0$ into bounds on $P_1$.
The Extended Sandwich
Now, by applying these considerations to the condition$$
P_0(T=1 \wedge R=0) \; \leq \; P_0(T=0 \wedge R=1),
$$we obtain the result that
$$
\frac{1}{r} P_1(T=1 \wedge R=0) \; \leq \; \frac{1}{r} P_1(T=0 \wedge R=1).
$$By cancelling the $\frac{1}{r}$, we thus get the result.
I like this formulation of the lemma, because it makes it vivid what it is that's going on: When somebody splits up $\Omega$ into $\{T=0\}$ and $\{T=1\}$, we use our ratio test $R$ to further split these up into two subregions each. This allows us to use the translation between the two measures, and thus to compare the $P_0$ error rates with the $P_1$ error rates.
It also gives better intuitions about the potential differences between the tests $R$ and $T$: The more the set $T=1$ overlaps with the set $R=1$, the more the two tests are alike, and the more squeezed will the sandwich of inequalities be.
No comments :
Post a Comment