The Wikipedia page on Turing Machines shows the following picture as an implementation of a Turing Machine:
I didn't see any problem with that picture until I got into an intricate discussion with my colleague and friend Paul Levy about 'realistic' models for costs of computation. One of the basic premises for such a model is that individual atomic operations have fixed and constant costs in time, space, energy, etc.
This is why the machine above cannot be a proper TM! As the machine is supplied with more and more tape, the spools will grow larger and larger, and their mass will make the (fixed) motor move the tape gradually slower and slower. The cost of computation (time and energy) in the early stages of computation would be lower than in the later stages of computation when the spools of tape are more massive.
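To make the point concrete, here is a toy cost model of my own, purely for illustration (the linear growth rate of the spool mass is an assumption, not physics): if each step must also move the tape consumed so far, the total cost of a run grows quadratically rather than linearly in the number of steps.

```python
# Toy illustration (my own assumption): each cell of consumed tape adds a fixed
# mass to the spools, so the per-step cost grows linearly with the step count.

def ideal_cost(steps):
    # Idealised TM: every atomic step has the same unit cost.
    return steps

def spool_cost(steps, mass_per_cell=0.001):
    # Moving-tape machine: step t must also move spools that have accumulated
    # roughly t cells of tape, so its cost is 1 + mass_per_cell * t.
    return sum(1 + mass_per_cell * t for t in range(steps))

for steps in (10_000, 1_000_000):
    # The second column grows linearly, the third quadratically.
    print(steps, ideal_cost(steps), round(spool_cost(steps)))
```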
This is why the common descriptions of TMs (see for example the one in the SEP) present the tape as fixed, more like a track, with the head gliding left and right along it. With this design it no longer matters how long the tape is, since it is the fixed-size head that is moving.
It is amusing to notice that Turing himself seems to have made this mistake, implying that the head of what he called an a-machine is fixed and the tape is moving:

"The machine is supplied with a 'tape' (the analogue of paper) running through it, and divided into sections (called 'squares') each capable of bearing a 'symbol'."
I have little doubt that in the current reviewing climate such an oversight would have earned his "On Computable Numbers, with an Application to the Entscheidungsproblem" a strong reject.
Joking aside, I think such details are important when it comes to analysing the cost of computation. For example, the moving-head-style TM makes it much more complicated to reduce a two-tape TM to a one-tape TM. With mobile tapes the task is trivial -- but mobile-tape TMs are not right. With a mobile head the task is more subtle, since the communication overhead between the two heads and wherever the machine state is stored is no longer fixed! And this problem does not seem easy to solve in a straightforward or clean way.
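As a rough sketch of why this overhead matters, consider the textbook simulation that stores both tapes as interleaved tracks on a single tape: each simulated step must then sweep the used region to find both head markers. The accounting below is my own back-of-the-envelope estimate under that assumption, not a full construction.

```python
# Back-of-the-envelope accounting for simulating a two-tape TM on one tape by
# interleaving the tapes as tracks. Assumption (mine): simulating step t
# requires one sweep over the used region, which spans O(t) cells by then.

def simulation_cost(steps):
    # Sweep cost at step t is at most 2*t + 1 cells; sum over all steps.
    return sum(2 * t + 1 for t in range(1, steps + 1))

for steps in (100, 1_000, 10_000):
    # The total grows like steps**2: a quadratic, not constant-factor, overhead.
    print(steps, simulation_cost(steps))
```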
Before I conclude this pedantic rant: this is not the only thing about computability theory that puzzles me. Ever since I was an undergrad learning about the asymptotic complexity of sorting algorithms, I couldn't understand how an array element access can be considered performable in constant time. To do that we need to read the index in constant time. But how can you read an arbitrarily large natural number, in whatever representation, in constant time? I still can't pretend I understand this.
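To spell out the worry, here is a small illustration of my own: the index used to access an arbitrary cell of an n-element array has Θ(log n) bits, so even reading it is not an operation of fixed size independent of n.

```python
# How many bits does it take just to write down an index into an n-element
# array? Theta(log n) -- so reading an arbitrary index cannot be an operation
# whose size is independent of n.

def index_bits(n):
    # Bits needed for the largest valid index, n - 1.
    return max(1, (n - 1).bit_length())

for n in (10, 1_000, 1_000_000, 10**9):
    print(n, index_bits(n))
```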
Hi Dan,
The answer to your puzzlement about sorting algorithms is actually quite subtle!
1. First, note that if an array to be sorted can be stored in RAM, then the word size must be at least logarithmic in the input length -- otherwise you couldn't address enough memory to see the whole input.
2. The usual assembly-like operations in the RAM model are assumed to take unit time. This assumption is taken to be reasonable for two reasons.
2.1. First, the unit operations in the RAM model (e.g., addition, comparison, multiplication, and so on) are all in the TC0 complexity class, which can be computed by circuits of constant depth and polynomial size in the length of the input (which in this case is logarithmic in the array size). So it is reasonable to assume that there is enough bit-level parallelism in real hardware to make that log factor invisible.
2.2. Next, if you do assume that the unit operations are actually log-time, then it turns out that the problem of copying an array of n elements ends up having super-linear time. This is regarded as weird, since the design intent of the RAM machine was that you could prove this operation was linear! See Arnold Schoenhage's 1988 JACM paper, "A Nonlinear Lower Bound for Random-Access Machines under Logarithmic Cost".
3. Moreover, if you take the logarithmic word size into account, then an n-element array takes at least n log(n) bits of space. With this more accurate accounting, the cost of sorting is still O(m log(m)), where m = n log(n)! (A quick numerical check of points 1 and 3 follows below.)
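Here is a quick numerical check of points 1 and 3; this is my own sketch, and the function names are made up for illustration.

```python
import math

def min_word_bits(n):
    # Point 1: addressing n memory cells needs at least ceil(log2(n)) bits.
    return max(1, math.ceil(math.log2(n)))

def sorting_cost_bits(n):
    # Point 3: the input really occupies m = n * log(n) bits, and sorting it
    # costs O(m log m) bit operations under the logarithmic-cost accounting.
    m = n * min_word_bits(n)
    return m * math.log2(m)

for n in (2**10, 2**20):
    print(n, min_word_bits(n), round(sorting_cost_bits(n)))
```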
In general, small changes to the machine model can make log factors come and go, so complexity theorists tend to divide into two tribes. One tribe tries to shave log factors to find hard lower bounds for particular models, and the other regards this as a useless waste of time. Depending on what you are interested in, either side can be correct, so this dispute will never be resolved.
I totally agree! So if my post gave the impression to the contrary then I didn't express myself well. I would be happy enough if all assumptions were spelled out more clearly, including the fact that some of the assumptions are over-idealisations resulting in Mickey Mouse models of computation.