Tuesday, August 27, 2013

Ardeshir and Ramezanian: "A solution to the surprise exam paradox in constructive mathematics" (2012)

The surprise examination problem concerns a teacher who announces the following:
On one of the next five days, you will have an exam; and I will pick the day of the exam such that when it comes, it will come as a surprise.
The problem with this announcement, according to one line of reasoning, is that it is in fact impossible to satisfy:

The teacher can't rationally place the exam on Friday, because it would then be completely predictable given that it hadn't occurred Monday through Thursday; but the teacher also can't place it on Thursday, because it would then be completely predictable given that it hadn't occurred Monday through Wednesday, and that the teacher will be rational on Friday; and similarly, it can't take place on Wednesday, etc. In this way, we end up concluding that the exam can't take place at all.

This impossibility argument rests on a number of assumptions: For instance, if the teacher is not guaranteed to be rational in the future, the argument collapses, just as in the case of the centipede game. Another assumption is that the teacher can only use pure (deterministic) strategies, a restriction that rules out the solutions to many games, including rock-paper-scissors.

Backwards Induction in Constructive Mathematics

However, the approach taken in the paper by Mohammad Ardeshir and Rasoul Ramezanian is to deny the constructive validity of the argument above. They do so by translating the teacher's choice of examination day into a bounded, increasing sequence whose maximum identifies the day of the exam.

For instance, if the exam takes place on Thursday, the students might observe the following sequence:
1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
Since this sequence is increasing and only takes values among the numbers {1, 2, 3, 4, 5}, it has a maximum according to classical reasoning. In constructive mathematics, however, we cannot assert that the maximum exists, because we can't explicitly identify which of the five numbers it is.
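Such a sequence can be sketched as a generator with explicit dwell times at each value; this is my own illustration, not the paper's construction, and the function name and lag parameters are assumptions:

```python
from itertools import islice

def exam_sequence(exam_day, lags):
    """Yield a bounded, non-decreasing sequence whose maximum is exam_day.

    lags[i] is how many extra steps the sequence dwells at value i + 1
    before climbing; once exam_day is reached, it stays there forever,
    so the maximum of the sequence encodes the day of the exam.
    """
    for value in range(1, exam_day):
        for _ in range(1 + lags[value - 1]):
            yield value
    while True:
        yield exam_day

# Reproduces the prefix shown above (exam on Thursday = day 4):
prefix = list(islice(exam_sequence(4, lags=[3, 3, 2]), 17))
# prefix == [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4, 4]
```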

Given this translation, the teacher can now truthfully say:
I can construct a bounded and increasing sequence with values in {1, 2, 3, 4, 5} such that you cannot constructively infer its maximum from any finite segment.
That's doable; the standards of existence imposed by constructive mathematics are so high that as long as you haven't observed a 5 anywhere in the sequence, you have no constructively valid conclusion about the maximum. So even if the exam is placed on Friday, there is no specific natural number k such that you can infer the maximum of the sequence with certainty after having observed k items.

In other words, the students have no reliable algorithm for inferring what day the exam will be held. This conclusion does, however, rest crucially on the assumption that there can be arbitrarily large time lags between the jumps in the sequence.
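One way to see this concretely (my own illustration, not from the paper): for any cutoff k, there are two admissible sequences that agree on their first k values but have different maxima, so no algorithm that only sees k items can reliably announce the exam day.

```python
def two_indistinguishable(k):
    """Return two bounded, non-decreasing sequences of length k + 1 that
    agree on their first k values but have different maxima: one never
    exceeds 4, the other jumps to 5 only at position k."""
    common = [min(4, 1 + i // 2) for i in range(k)]  # a slow climb capped at 4
    stays = common + [4]   # maximum 4: exam on Thursday
    jumps = common + [5]   # maximum 5: exam on Friday
    return stays, jumps

a, b = two_indistinguishable(10)
assert a[:10] == b[:10] and max(a) != max(b)
```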

Infinite Input Turing Machines

Towards the end of the paper, the authors introduce something called "Type 2 Turing machines." These are machines that can take infinite inputs and produce infinite outputs.

The reference they give for this concept is a textbook by Klaus Weihrauch. I will have to take a look at that at some point.

Monday, August 19, 2013

Goldin and Wegner: "The Interactive Nature of Computing" (2008)

By definition, Turing machines either halt after writing a finite amount of output, or do not halt at all. This means that if we feed them an infinite input tape, they will either ignore an infinite tail of it, or never halt. From an algorithmic perspective, all the non-halting computations will be categorized as equivalent.

This is, however, not a very good model with respect to many important types of computation. For instance, operating systems are non-halting computations, but it seems weird to consider them all equivalent on the grounds that they "have no output." So it would make sense to use a more fine-grained theory to handle such situations.

New Concepts, New Problems

Once we take this different perspective, there are a lot of conceptual pitfalls to beware of. For instance, the 1-tape Turing machine will no longer be equivalent to the 2-tape machine when the inputs are allowed to be infinite: the usual simulation tricks, such as using every second square to simulate a second tape, don't work because they require infinite time for preprocessing. When the input is infinite, the 1-tape machine is thus effectively finite-state.
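The interleaving trick can be made concrete for finite tapes; this sketch (function names mine) shows why the preprocessing is the problem: it has to touch every square of both tapes before the simulation can start, which is impossible when the input is infinite.

```python
def interleave(tape_a, tape_b, blank=" "):
    """Encode two finite tapes on one: even squares hold tape_a,
    odd squares hold tape_b. The loop visits every square of both
    tapes up front, which is exactly the preprocessing step that
    an infinite input rules out."""
    n = max(len(tape_a), len(tape_b))
    merged = []
    for i in range(n):
        merged.append(tape_a[i] if i < len(tape_a) else blank)
        merged.append(tape_b[i] if i < len(tape_b) else blank)
    return merged

merged = interleave(list("abc"), list("xy"))
# merged == ["a", "x", "b", "y", "c", " "]
```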

This means that we should be careful about what we mean when we refer to "the Church–Turing hypothesis." There are several different notions of equivalence on the table, and many of them introduce new distinctions. Machines that we are used to thinking of as "equivalent to the Turing machine" are no longer guaranteed to be equivalent to Turing's original 1-tape machine. We should thus be more careful than usual in stating exactly what we mean by "the" Turing machine, and how we compare it to other architectures.

Equivalence Relations on Sets of Machines

If we want to determine whether two non-halting computations are equivalent, we can't simply check their outputs for equality, since the computations never terminate with a final output. There are several alternative options; one of them is to introduce a limit concept for computations and then define equivalence as equality in the limit.

Such a limit concept could for instance be the following:
  • A tape t converges to a state w if every square on t eventually contains the same symbol as w: for every position i there is a time k such that for all times j ≥ k, t_j(i) = w(i), where t_j denotes the contents of the tape at time j. A computation converges to a state w if its output tape does.
When a computation converges, any finite segment of the output tape will thus remain static after a finite amount of time. If we further require the output tape to be unidirectional and write-only, this definition corresponds to an ongoing computation occasionally invoking a print command.
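A converging computation of this kind can be sketched as a generator whose emitted symbols are never revised; the example below (my own choice, streaming the decimal digits of a fraction by long division) never halts, yet every finite prefix of its output stabilizes after finitely many steps.

```python
from itertools import islice

def decimal_digits(numerator, denominator):
    """Stream the decimal digits of numerator/denominator (0 < n < d).

    Each digit is "printed" once and never revised, so the output tape
    converges in the sense above: every finite prefix is eventually static,
    even though the computation itself never halts.
    """
    remainder = numerator
    while True:
        remainder *= 10
        yield remainder // denominator
        remainder %= denominator

prefix = list(islice(decimal_digits(1, 7), 6))
# prefix == [1, 4, 2, 8, 5, 7], the start of 1/7 = 0.142857...
```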

Machine Components

In addition to stating explicitly what notion of equivalence we are using, we should further specify what type of machine we have in mind when we talk about "the" Turing machine. This means that we have to make the following decisions:
  • How many storage devices (tapes) does the machine have, and which of these are infinite?
  • In the initial configuration, do the infinite tapes contain only blanks from some point on? And if they do, can these tapes also contain arbitrarily long sequences of blanks elsewhere on the tape? (If they can, they will essentially have to be treated as infinite.)
  • What commands do the tapes accept? For instance, are they unidirectional or bidirectional, and are they read-only, write-only, or read/write? (A stack defines a slightly more involved combination of these options.)
  • Which tape is the designated output tape?
  • Which tape is the designated input tape?
  • What function defines the control flow?
I have written that the control flow is defined as a function because I think this makes the memory demands of the machine clearer. For instance, the processing unit of a finite state automaton can be conceptualized as a function with access to an external tape of finite length. By externalizing the memory components like this, we have to be more explicit about their size.
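The checklist above could be recorded as a small specification object, with the control flow as an explicit function; this is a sketch of my own, and all field and class names here are illustrative assumptions rather than standard terminology.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class TapeSpec:
    infinite: bool = True        # infinite tape vs. finite scratch storage
    bidirectional: bool = True   # can the head move both ways?
    readable: bool = True
    writable: bool = True

@dataclass
class MachineSpec:
    tapes: dict[str, TapeSpec]
    input_tape: str
    output_tape: str
    # Control flow as an explicit function: (state, read symbols) -> (state, actions).
    control: Callable

# A classical 1-tape machine, and one with a unidirectional,
# write-only output tape as in the convergence definition above:
classic = MachineSpec(
    tapes={"work": TapeSpec()},
    input_tape="work",
    output_tape="work",
    control=lambda state, symbols: (state, {}),
)
streaming = MachineSpec(
    tapes={"work": TapeSpec(),
           "out": TapeSpec(bidirectional=False, readable=False)},
    input_tape="work",
    output_tape="out",
    control=lambda state, symbols: (state, {}),
)
```

Externalizing the storage like this also makes the modular combinations discussed below easy to enumerate.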

It should also be clear that once we have defined the different kinds of storage devices we are interested in, we can combine them modularly in all sorts of ways. Just as in the classical theory, many of these combinations will yield equivalent machines, though equivalence will hold less often than in the classical case.

What About the Paper?

In the paper "The Interactive Nature of Computing," Dina Goldin and Peter Wegner make many of the above observations. Their starting point is not the difference between machines with finite and infinite inputs, but rather that of ongoing computations as opposed to closed input-output operations.

However, the two perspectives are equivalent as far as I can see, although it is difficult to assess because they refer to a different paper for technical details, and I haven't read that paper yet.

Along the way, they also refer to work by Jack Copeland, Peter Kugel, Jan van Leeuwen and Jiri Wiedermann for (what the text suggests is) additional mathematical detail.