Friday, April 10, 2015

A note on the Kolmogorov-Dynkin extension theorem

I've been somewhat confused about Kolmogorov's extension theorem, the uniqueness theorem for stochastic processes, but this note by Cosma Shalizi cleared things up a little bit.

Here's the set-up: We have a universe of values, $\Omega$, and this gives rise to a universe of sample paths, $\Omega^\mathbb{N}$. This universe of sample paths can be equipped with a probability measure. Such a measure maps bundles of sample paths, $B \subseteq \Omega^\mathbb{N}$, onto probabilities.

The measure has to respect the usual restrictions, including countable additivity. However, it doesn't have to (and in fact can't) accept all bundles as inputs.

The set of measurable bundles is defined inductively on the basis of one-dimensional base cases and some recursive operations. Specifically, a bundle is measurable if it is
1. the preimage under a coordinate projection of some measurable set $A \subseteq \Omega$;
2. constructed out of other measurable bundles by means of complementation, finite intersection, or countable union.
These rules define a system $S$.

We could also define a smaller system, $S^{-}$, by only allowing finite unions. A bundle in this smaller system corresponds to a proposition about a finite-dimensional projection of the stochastic process. If we take the $\sigma$-algebra generated by the smaller system, we recover the full system, $\sigma(S^{-})=S$.
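As a rough illustration of what a bundle in $S^{-}$ looks like, here is a small sketch. It assumes binary values, $\Omega = \{0, 1\}$, and covers only the basic "rectangle" cylinders (constraints on finitely many coordinates), not their complements and finite unions. The names `Cylinder`, `contains`, and `intersect` are made up for this example.

```python
from typing import Dict, FrozenSet, Sequence

# Hypothetical representation: a basic cylinder constrains finitely many
# coordinates, stored as {coordinate index: allowed values}.
# Coordinates that are not listed are left unconstrained.
Cylinder = Dict[int, FrozenSet[int]]

def contains(cyl: Cylinder, path_prefix: Sequence[int]) -> bool:
    """Membership only depends on the finitely many constrained coordinates."""
    return all(path_prefix[i] in allowed for i, allowed in cyl.items())

def intersect(a: Cylinder, b: Cylinder) -> Cylinder:
    """The intersection of two basic cylinders is again a basic cylinder."""
    out = dict(a)
    for i, allowed in b.items():
        out[i] = out[i] & allowed if i in out else allowed
    return out

first_is_one = {0: frozenset({1})}    # "the first value is 1"
third_is_zero = {2: frozenset({0})}   # "the third value is 0"
both = intersect(first_is_one, third_is_zero)
```

Deciding whether a sample path lies in `both` only requires looking at its first three values, which is the sense in which such a bundle is a proposition about a finite-dimensional projection.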

The content of the Daniell-Kolmogorov extension theorem can now be stated as follows: Suppose that two measures $P$ and $Q$ assign the same values to all bundles in $S^{-}$; then they assign the same values to all bundles in $S$. This is equivalent to saying that if two stochastic processes have the same distribution under every finite-dimensional projection, then they have the same distribution.
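The word "every" matters in that statement. The sketch below compares two toy processes on $\Omega = \{0, 1\}$ (both are my own choice of example): fair i.i.d. coin flips, and a process whose first value is a fair coin flip that is then repeated forever. They agree on all one-dimensional projections but differ on a two-dimensional one, so they are different processes. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def iid_prefix_prob(prefix):
    """Fair i.i.d. flips: every length-n prefix has probability 2^-n."""
    return Fraction(1, 2 ** len(prefix))

def constant_prefix_prob(prefix):
    """The first value is a fair flip; every later value copies it."""
    if not prefix:
        return Fraction(1)
    return Fraction(1, 2) if len(set(prefix)) == 1 else Fraction(0)

def cylinder_prob(prefix_prob, constraints):
    """P(X_i = v for every (i, v) in constraints), by summing over prefixes."""
    n = max(constraints) + 1
    return sum(
        (prefix_prob(p)
         for p in product((0, 1), repeat=n)
         if all(p[i] == v for i, v in constraints.items())),
        Fraction(0),
    )

# One-dimensional projection: "the second value is 1".
one_dim = [cylinder_prob(f, {1: 1}) for f in (iid_prefix_prob, constant_prefix_prob)]
# Two-dimensional projection: "the first value is 1 and the second is 0".
two_dim = [cylinder_prob(f, {0: 1, 1: 0}) for f in (iid_prefix_prob, constant_prefix_prob)]
```

Here `one_dim` holds two equal probabilities, while the two entries of `two_dim` differ.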

Not surprisingly, the proof involves some facts about the limit behavior of measures on increasing sets. Specifically, when a sequence of sets $B_1 \subseteq B_2 \subseteq B_3 \subseteq \ldots$ converges to some limit set $B = \bigcup_n B_n$ from below, then the measures $\mu(B_1), \mu(B_2), \mu(B_3), \ldots$ must converge to the limit $\mu(B)$. This implies the uniqueness of the infinite-dimensional extension of a set of finite-dimensional projections.
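A tiny numerical sketch of this limit behavior, assuming fair i.i.d. coin flips (an example of my own, not taken from Shalizi's note): let $B_n$ be the bundle of sample paths with at least one 1 among the first $n$ values. These bundles increase towards the bundle $B$ of paths that contain a 1 somewhere, and the probabilities $1 - 2^{-n}$ increase towards $P(B) = 1$. The function name is hypothetical.

```python
from fractions import Fraction

def prob_one_by(n: int) -> Fraction:
    """P(B_n), where B_n = 'at least one 1 among the first n fair flips'."""
    return 1 - Fraction(1, 2 ** n)

# Probabilities of the increasing bundles B_1, ..., B_10.
probs = [prob_one_by(n) for n in range(1, 11)]
# Distance from the limit value P(B) = 1; it halves at every step.
gaps = [1 - p for p in probs]
```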

Here is the proof as given by Shalizi: Let $W$ be the set of bundles on which $P$ and $Q$ agree. Then by assumption, $S^{-} \subseteq W \subseteq S$. We want to show that $\sigma(S^{-}) = S \subseteq W$. We do this by showing that $W$ is a Dynkin system, and that $S^{-}$ is closed under finite intersection (which holds by construction), since this will allow us to use Dynkin's $\lambda$-$\pi$ theorem.

A Dynkin system is a system of sets which contains the whole universe, is closed under complementation, and is closed under countable disjoint unions (which, given the other two properties, also yields closure under countable increasing unions). The agreement system $W$ satisfies these three properties, since
1. $P(\Omega^{\mathbb{N}})=Q(\Omega^{\mathbb{N}})=1$, so $P$ and $Q$ agree on the bundle consisting of all sample paths;
2. For any bundle $B\in W$, $$P(B^c)=1-P(B)=1-Q(B)=Q(B^c),$$ so if $P$ and $Q$ agree on a bundle, they also agree on its complement;
3. Finally, if $P$ and $Q$ agree on every bundle in a sequence of pairwise disjoint bundles $B_1, B_2, B_3, \ldots$, then by countable additivity $$P\Big(\bigcup_i B_i\Big)=\sum_i P(B_i)=\sum_i Q(B_i)=Q\Big(\bigcup_i B_i\Big).$$ Likewise, if they agree on every bundle in an increasing sequence $B_1, B_2, B_3, \ldots$, then the two sequences $$P(B_1), P(B_2), P(B_3), \ldots$$ $$Q(B_1), Q(B_2), Q(B_3), \ldots$$ must have the same limit, so $P$ and $Q$ also agree on the set-theoretic limit of that sequence.
Thus, $W$ is a Dynkin system. Since it contains $S^{-}$, which is closed under finite intersection, Dynkin's $\lambda$-$\pi$ theorem tells us that it is large enough to contain all the additional sets we add to the system $S^{-}$ by closing it under $\sigma$-operations. In other words, $\sigma(S^{-}) \subseteq W$. We thus have both $S \subseteq W$ and $W \subseteq S$ (which is true by definition), and thus $W=S$.
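The requirement that $S^{-}$ be closed under finite intersection is not decorative. Here is a brute-force sketch on a four-point universe (a standard toy example, with names of my own choosing) in which the agreement system of two measures is a Dynkin system but is not closed under intersection. Agreement on an arbitrary generating family is therefore not enough to force agreement everywhere.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = frozenset({1, 2, 3, 4})
# Two probability measures on OMEGA, given by their point masses.
P = {w: Fraction(1, 4) for w in OMEGA}  # uniform
Q = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(0), 4: Fraction(1, 2)}

def measure(weights, event):
    """Probability of an event under the given point masses."""
    return sum((weights[w] for w in event), Fraction(0))

def all_subsets(universe):
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

# The agreement system: all events on which P and Q coincide.
W = {A for A in all_subsets(OMEGA) if measure(P, A) == measure(Q, A)}

contains_universe = OMEGA in W
closed_under_complement = all(OMEGA - A in W for A in W)
# In a finite universe, pairwise disjoint unions are all we need to check.
closed_under_disjoint_union = all(
    A | B in W for A in W for B in W if not (A & B)
)
closed_under_intersection = all(A & B in W for A in W for B in W)
```

The first three flags come out true and the last one false: $P$ and $Q$ agree on $\{1,2\}$ and on $\{1,3\}$, but not on their intersection $\{1\}$.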

Here's an intuitive version of that proof:

The system $S$ on which $P$ and $Q$ are defined must itself be defined by certain recursive operations: Every set in $S$ is constructed from a finite-dimensional base case using only well-behaved operations.

The properties of probability measures guarantee that we can match each of these constructions, operation for operation and step by step, by a corresponding axiom which enforces a certain unique result. Because of this parallelism between the construction of the sample space and the propagation of necessity, no measurable set in $S$ actually has any wiggle room when the base cases are identical.