(This lecture is going to appear out of order, unfortunately, but I was incredibly bored of PDAs and I really wanted to still get a post and some writing done to hit my word count goals. I’m serious, y’all, if you think pushdown automata and constructions on them are kinda boring as students just imagine trying to get up your enthusiasm about lecturing on it!)

Finally we come to Turing machines, which are the main construction we’ve been building up to this entire time. Unlike the machines we’ve been dealing with previously in this course, these will encompass the *entirety* of the computable functions. In a sense, honestly, we *define* computability based off of what can be done by a Turing machine since it’s the most general model of computation we have.

Now this might seem circular to you, since up til this point we’ve been trying to define exactly what languages can be described by different forms of machine and *now* we just throw in the towel to say that “golly gee whiz, this must be as strong as it gets”? Well not exactly, because as we’ll talk about briefly while it’s only a *hypothesis* that the entirety of the computable functions are described by the Turing machines, it’s a hypothesis that has a lot of evidence going for it. Namely, that every other notion of computation humanity has ever been able to devise is either *equivalent* to Turing machines or, in fact, is a subset of what the Turing machines can do. The lambda calculus, the partial recursive functions, etc. are all equivalent to Turing machines. We know this since we can write implementations of Turing machines in these other computational models **and** we can simulate these other models in Turing machines as well. Just as with our DFA/NFA equivalence or our PDA/CFG equivalent, we know that if we can provide constructions going “both ways” between two different models of computation then we know the models of computation are equivalent in power. The extension of this observed fact to the conjecture that all computational models that attempt to capture the set of all computable functions will be equivalent to Turing machines, and hence to each other, is called the “Church-Turing Thesis”.

One might question, though, “what about quantum computers?” and that would be a very good question given what I’m currently asserting. The reality is that quantum computers *can’t do anything more than a Turing machine*. We can see this by the fact that we can simulate quantum computers on ordinary classical-mechanics inspired computers we know and love. It would seem, honestly, since we can perform simulations of physics on computers to any observable accuracy that, maybe, all physical processes are in a sense computable. This has actually been hypothesized before, but again there’s no real evidence beyond coincidences and gut instincts for any of these things. Who knows? Maybe there will be a discovery some day that will show the Church-Turing Thesis wrong. Your humble lecturer doesn’t find this *terribly* likely though.

Given that lengthy preamble, we now come to what Turing Machines *actually are*. As usual, we’ll describe it informally first. To start, let’s imagine having a machine that’s like a DFA-with-scratch-paper. We only have a finite number of states, as usual, but we have a mechanism for looking at the scratch paper, moving across the scratch paper, and writing on the scratch paper. Imagine that the scratch paper is graph paper: made up of cells into which the data is neatly written, which makes it different than every piece of graph paper I’ve used in my life. When we take a computational step, we’re allowed to look at both our internal state and at the particular cell of graph paper that our machine was pointed at at the beginning of the step. When the machine has decided what to do next, it can make any needed notes in that cell of the graph paper and then move the reader to an adjacent cell of the graph paper. Configuration of input to the machine will be done, rather than with some magical input like for DFAs and PDAs, by giving the machine a piece of scratch paper that already has some data on it. For example, if we have a machine that will do arithmetic problems, then the initial state of the scratch paper will be a problem such as “3*6 + 5 = ” and then we’ll use the scratch paper below the equation to actually figure out *what* “3*6 + 5” reduces to and then at the end of the process write in our answer on the right hand side of the equation.

Now, how much scratch paper do we actually have to work with? This is a slightly delicate question when it comes to making sure we’re being “computable” by our informal definition. Let’s assume, for the moment, that our machine will always stop in a finite amount of time. Then, since it can only move one step across the scratch paper per step then we have a bounds on the amount of graph paper we need: it will use a number of cells less than or equal to the amount of steps the machine runs. If we know, then, that the machine will halt no matter on what input it’s given we know that it will always take a finite amount of paper even if that amount of paper is arbitrarily large. So, we’ll assume an “infinite” supply of paper because if all’s going well we’ll only use a finite amount of it anyway. The supply of paper is infinite in the way natural numbers are infinite, not the way the real numbers are infinite. There may be an infinite *quantity* of natural numbers, but the process of building a single one of them is finite. This discussion might seem nitpicky, but given that we’ve been insisting on finite data and finite rules this entire time I think it’s important to argue that we’re not violating those principles that have gotten us this far.

What we’ve described is, essentially, a Turing machine albeit perhaps one a bit more flexible than we really need. Just to make things even simpler when describing our state machine and what it does, we’ll assume a 1-dimensional scratch paper, which by convention is always called tape. This tape will have cells on it just like our graph paper did and, instead of being able to move everywhere on the graph paper, we’ll be restricted to just moving left and right on the tape.

Where does this idea come from? Fundamentally, it’s inspired by the good-old-days when “computer” was a job title, not an inanimate object, and a job mostly done by women I should add. A computer was someone who did tedious but important calculations for a living, more or less, often employed by the military. Turing’s idea was inspired by the fact that when computers were doing their work they always used a finite amount of scratch paper and a computer could take a break and then eventually come back to her work and continue it. The fact that you could take a break and come back to your work without error, in a sense, means that you must be relying somewhat on your own internal memory but also that you’re looking at where you were in the calculation. These people were doing calculations that they all understood how to do, and that there was some *set of rules* for how they proceed.

We’ll continue next post with the *formal* definition of Turing machines. We’ll walk through examples of Turing machines, talk about different levels of descriptions for Turing machines, and maybe even talk a little bit about the rather depressing life of Alan Turing himself.