# Lecture 1: What is computation?

(So I’ve decided to post my rough lecture notes for the class I’m teaching next term to this blog before I actually give the lectures. I may go back and edit them after the fact, but given time constraints I won’t commit to that at present.)

What is computation? That’s a question I’d like to discuss a bit so we can set up the trajectory of this class. Now, Wikipedia defines computation as “any type of calculation or use of computing technology in information processing”, which is fine but rather dry and not illustrative. A loose, but perhaps evocative, definition of computation would be “anything you can do with a finite amount of data, using a finite number of rules, in a finite length of time”. This definition encompasses a variety of things, such as

• doing arithmetic problems by hand with a pen and paper
• counting on your fingers or using physical objects
• running a computer program

If it’s not obvious how all three of those fit my definition of computation, perhaps take a moment to think about how all of them involve finite data, finite rules, and finite time. Here’s another interesting example to ponder: is writing a mathematical proof a form of computation?

In this class, we’ll be examining various models of computation by trying to answer the question “how can you tell whether a given string belongs to a given language?” Here, a “language” is a (possibly infinite) set of strings, and a “string” is a finite sequence of characters drawn from an alphabet. Bear with me for a second as we examine what “alphabet” means: an alphabet is just a finite set of symbols that we use as the characters. Why finite? It goes back to our definition of computation: we may only use finite data in our computations.
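To pin these definitions down, here’s a small sketch in Python (the names are mine, not standard notation):

```python
# An alphabet: a finite set of symbols.
SIGMA = {"a", "b"}

def is_string_over(w, alphabet):
    """A string over an alphabet is a finite sequence of its symbols;
    here we model a string as an ordinary Python str."""
    return all(c in alphabet for c in w)

# A language: a (possibly infinite) set of strings over the alphabet.
# A finite language can just be a literal set:
L = {"ab", "ba", "abab"}

print(is_string_over("abab", SIGMA))  # True: every character is in SIGMA
print(is_string_over("abc", SIGMA))   # False: "c" is not in SIGMA
```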

Now, if our language $L$ is finite, then deciding whether a string $w$ is in $L$ is obviously computable. We can hold the entire language $L$ “in memory” and manually check whether $w$ is equal to some string in $L$. That’s finite time, finite rules, and finite data, so it is computable.
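The brute-force check for a finite language can be sketched directly (a minimal illustration, not part of the course’s formal machinery):

```python
def member_of_finite(w, L):
    """Decide whether w is in the finite language L by comparing w
    against each string in turn: finite data (w and L), finite rules
    (string equality), and finite time (at most len(L) comparisons)."""
    return any(w == x for x in L)  # equivalent to: w in L

L = {"0", "01", "0110"}
print(member_of_finite("01", L))   # True
print(member_of_finite("10", L))   # False
```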

What if our language $L$ is infinite, though? Without knowing something about $L$ that could give us some hints, it’s not obviously something we can compute. We can’t hold an infinite “lookup table” for a brute-force search, and even if we could, the search isn’t guaranteed to take finite time. That said, some infinite languages are easy: if our language is “all finite strings”, then deciding membership is simple and computable: take a finite string as input, then return “true”. This is because the language has a “simple” description, and thus it is computationally “simple” to test whether a string is in the language. At the other extreme, suppose $L$ is an infinite set of completely randomly chosen strings. Then there is no simpler description of the language than listing its infinitely many strings, and checking that a string is in $L$ is not computable.
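The easy extreme fits in a couple of lines: a decider for “all finite strings” never has to inspect its input, which is exactly what makes it computable (a toy sketch):

```python
def member_of_all_strings(w):
    """Decide membership in the language of all finite strings.
    Any input we can actually be handed is already a finite string,
    so the decider answers True without even looking at w.
    (A random infinite language has no such shortcut: with no finite
    description, no finite program can decide membership.)"""
    return True

print(member_of_all_strings("anything at all"))  # True
```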

Most languages sit somewhere between these two extremes. We’ll be looking at a sequence of models of computation, starting with DFAs and building up to Turing machines, which are a model of computation that makes our “finite time, rules, and data” definition rigorous. By the end of this course, we will have discussed just exactly what things are computable, and you’ll have at least some of the tools for showing that something isn’t computable. We will also tie the “is string $w$ in language $L$” problem back into a more general notion of computation, revealing that we haven’t lost anything by considering such a specific class of problems.

Now, all of that being said, there are some necessary mathematical preliminaries that need to be addressed. First off, this will be a very proof-heavy course. Sometimes when this course is taught, the assignments tend to be “5 million ways to build a DFA” (apologies to The Coup), but the three of us who are teaching this term are pushing for a more mathematically mature approach.

As such, we should discuss proof techniques. There are three major techniques that are useful in this class: proof by construction, proof by contradiction, and proof by induction. We’ll tackle each of these in turn.

Proof by construction is when you show that a theorem is true by constructing some artifact that manifestly demonstrates the property. For example, if we had the theorem “there exists a number greater than 10”, a very easy way to prove it is to say “$10+1 > 10$, therefore there is a number greater than 10”. An example of proof by construction that we’ll see repeatedly later in the course is showing that some class of languages is closed under operations such as union or intersection by constructing a machine that recognizes the new language.
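As a toy preview of that closure-style argument (the real version, later in the course, will construct an actual machine such as a product automaton), here’s a sketch in Python where “machines” are just membership-testing functions:

```python
def union_decider(in_L1, in_L2):
    """Given deciders for L1 and L2, construct a decider for L1 ∪ L2.
    The constructed function is the proof artifact: its existence
    shows that if both languages are decidable, so is their union."""
    return lambda w: in_L1(w) or in_L2(w)

# Two toy languages: nonempty strings of all a's, and all b's.
in_L1 = lambda w: len(w) > 0 and w == "a" * len(w)
in_L2 = lambda w: len(w) > 0 and w == "b" * len(w)

in_union = union_decider(in_L1, in_L2)
print(in_union("aaa"))  # True  (in L1)
print(in_union("ab"))   # False (in neither)
```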

Proof by contradiction is when you prove that a theorem is true by assuming that it is not true, then showing that this assumption contradicts some other fact that we already know is true. For example, let’s prove that there are infinitely many natural numbers. Assume this is not true; then there must be a largest natural number, which we’ll call $n$. Since the natural numbers are closed under addition, we know that $n+1$ is a natural number bigger than $n$, contradicting the assertion that $n$ is the largest, and thus contradicting the assumption that the natural numbers are finite.
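The same argument, laid out step by step (just a restatement of the proof above in display form):

```latex
% Claim: \mathbb{N} is infinite. Proof by contradiction:
\text{Assume } \mathbb{N} \text{ is finite. Then it has a largest element } n. \\
\text{The naturals are closed under addition, so } n + 1 \in \mathbb{N}. \\
n + 1 > n \text{, contradicting the maximality of } n. \\
\text{Hence } \mathbb{N} \text{ is infinite.} \qquad \blacksquare
```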

Finally, we come to proof by induction. You have likely seen natural induction in some class prior to this. Natural induction states that if we have a property we’re trying to prove, $P(n)$, indexed by natural numbers, then if we can prove $P(0)$ and $P(n) \to P(n+1)$, we’ve proved $P(n)$ for all $n$. In other words, if we can show that the theorem is true for 0, and that if it’s true for one number then it’s true for the next, then we can combine these two facts to show that the property is true for any number. Proof by induction is a far more general principle, though, and works for many kinds of data, such as binary trees or lists. In every instance, the proof has the same structure: there will be “base cases” (like $P(0)$) and there will be “inductive cases” (like $P(n) \to P(n+1)$), the inductive cases being the ones where you’re allowed to assume the property holds for some “smaller” piece of data that you use to build a “bigger” piece of data. As we come to different examples of inductive data, we will make the rules for an inductive proof over that data very explicit.
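Induction has a computational mirror: recursion over the same data. In the Python sketch below (a toy illustration, not course material), the base case of the function lines up with the base case $P(0)$ of the proof, and the recursive call plays the role of the inductive hypothesis:

```python
def sum_to(n):
    """Compute 0 + 1 + ... + n by recursion on the natural number n."""
    if n == 0:                # base case, mirrors proving P(0)
        return 0
    return sum_to(n - 1) + n  # inductive case, mirrors P(n-1) -> P(n)

# The property P(n): sum_to(n) == n * (n + 1) // 2.
# We can't check all n, but we can spot-check instances of the theorem:
for n in range(10):
    assert sum_to(n) == n * (n + 1) // 2
print("P(n) holds for n = 0..9")
```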