Having seen Turing machines and talked a bit about what they are, what it means for a language to be Turing recognizable or decidable, and even spending a little bit of time talking about time complexity for Turing machines, let’s finish out the course with a bit on a different notion of computation that is equivalent to the Turing machines: the lambda calculus.

The lambda calculus was, essentially, invented by Alonzo Church for investigations into constructive logic. While it turned out that the untyped lambda calculus, as a proof theory for a logic, is entirely inconsistent and unsuited to that task, it *did* turn out that the lambda calculus describes all computable functions in the same way that Turing machines do. We know this for certain because we can encode any Turing machine as a program in the lambda calculus and vice versa, so they have equivalent power. The lambda calculus, though, has a very different character than Turing machines. While Turing machines are obviously finitary machines with a nice, physical interpretation that intuitively makes sense as a computable process, the lambda calculus is more akin to a real programming language whose rules, while simple, aren’t obviously realizable in a computable way. (When I say “obviously realizable” I mean that it’s not clear just by glancing at the rules that there isn’t anything non-computable going on; we know quite well how to interpret the rules computably, as there are many, many ways to write a program that is an interpreter for the lambda calculus. I’ve written at least a dozen such interpreters so far.)

Conceptually, in the lambda calculus everything is a function. Well, almost. Everything is a function or a variable. There are only two things you can actually *do* in the lambda calculus: you can make a new term out of another term with *lambda abstraction*, which is how you make functions, or you can apply a term to another term. If you’re used to most programming languages, the “apply any term to any other term” might seem kinda weird, but in this case it’s okay because Everything is a Function and thus every term is capable of accepting a term as an argument.

We’ll describe the lambda calculus as an inductive structure, much like we did with RegExps so long ago. First, though, we fix a countably infinite set that we’ll call $V$, the variables of the calculus. The inductive type of syntax for the lambda calculus, $\Lambda$, is thus built up by the following rules

- $\lambda x.\, M$, for $x \in V$ and $M \in \Lambda$
- $M\, N$, where $M, N \in \Lambda$
- $x$, for $x \in V$

and that’s it, we can all go home. Oh, sorry, right, we need to make sure these rules actually mean something. First, let’s say in words what these syntactic forms are actually going to mean. The very first one says that if we have a term in the lambda calculus, $M$, which we can think of as the body of the function, then we can wrap it up in the *lambda abstraction* $\lambda x.\, M$ and bind the variable $x$ to be the argument of the function. This means that whenever we pass an argument to the function then the argument will be *substituted* in for $x$ everywhere in $M$. We’ll make the notion of substitution more rigorous soon. The second syntactic form, $M\, N$, means that a lambda calculus term can be made up of applying one term to another. The final syntactic form means that a variable $x$ is, on its own, a term in the lambda calculus.
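If it helps to see this inductive definition concretely, here’s a minimal sketch of the syntax as a Python datatype. The class names `Var`, `Lam`, and `App` are my own choices for the three syntactic forms, not anything standard:

```python
# The three syntactic forms of the lambda calculus as a Python AST:
# a term is a variable, a lambda abstraction, or an application.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str          # a variable drawn from V

@dataclass(frozen=True)
class Lam:
    param: str         # the bound variable x
    body: "object"     # the body M

@dataclass(frozen=True)
class App:
    func: "object"     # the term being applied
    arg: "object"      # the term it is applied to

# The identity function (lambda x. x) applied to the variable y:
example = App(Lam("x", Var("x")), Var("y"))
```

Nothing here *does* anything yet; it’s just the inductive structure written down as data.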

Perhaps some examples are in order. Recall how we define functions in, say, high school mathematics, where we have things like $f(x) = x^2$. Well, what exactly *is* the act of saying $f(x) = x^2$? We’ve been using this notation so much over our lives I’m not sure how often it occurs to us that there *should be* a formal meaning to the syntax. The lambda calculus gives us a way of describing what $f(x) = x^2$ means. It’s really a shorthand for saying that $f = \lambda x.\, x^2$, i.e. it’s giving a name to an act of lambda abstraction. Now, the bare untyped lambda calculus doesn’t *have* a way to give names to terms like that, but we can fake any use of names just by using *more lambda abstractions*. In this case, even just saying $f = \lambda x.\, x^2$ could be considered shorthand for $(\lambda f.\, \text{rest of program})\, (\lambda x.\, x^2)$.

The act of substituting a value into a function, like when we say that $f(2) = 4$ given our $f$ above, is how we actually *evaluate* the act of applying one lambda calculus term to another. Again, this is something we’ve been doing for so long in math classes that we might have learned not to question what exactly it means to substitute the arguments into the body of a function. In the lambda calculus, we’ll represent substitution syntactically as $M[N/x]$, which in words means that we substitute $N$ in for $x$ in the term $M$.

Defining substitution over the syntax of lambda calculus terms might seem pretty simple. Let’s give it a shot!

- $x[N/x] = N$
- $y[N/x] = y$, where $y \neq x$
- $(M_1\, M_2)[N/x] = (M_1[N/x])\, (M_2[N/x])$
- $(\lambda y.\, M)[N/x] = \lambda y.\, (M[N/x])$

Okay, well this seems really neat and simple. Now we can also inductively define how evaluation of lambda terms works which goes a lil’ something like this:

- $(\lambda x.\, M)\, N \to M[N/x]$
- $M\, N \to M'\, N$ if $M \to M'$
- $M\, N \to M\, N'$ if $N \to N'$
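These rules translate almost line-for-line into an interpreter. Here’s a sketch in Python implementing the simple substitution defined above together with a single step of evaluation; all names are my own, and the substitution is deliberately kept naive, exactly as written:

```python
# One step of lambda calculus evaluation with naive substitution.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "object"

@dataclass(frozen=True)
class App:
    func: "object"
    arg: "object"

def subst(term, x, value):
    """Naively substitute value for the variable x in term."""
    if isinstance(term, Var):
        return value if term.name == x else term
    if isinstance(term, App):
        return App(subst(term.func, x, value), subst(term.arg, x, value))
    # Lam: substitute straight under the binder, no questions asked
    # (this is the part that will shortly turn out to be broken)
    return Lam(term.param, subst(term.body, x, value))

def step(term):
    """Apply one evaluation rule, or return None if no rule applies."""
    if isinstance(term, App):
        if isinstance(term.func, Lam):        # the beta rule
            return subst(term.func.body, term.func.param, term.arg)
        reduced = step(term.func)             # reduce the function side
        if reduced is not None:
            return App(reduced, term.arg)
        reduced = step(term.arg)              # reduce the argument side
        if reduced is not None:
            return App(term.func, reduced)
    return None

# (lambda x. x) y  steps to  y
assert step(App(Lam("x", Var("x")), Var("y"))) == Var("y")
```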

That’s really all we need to evaluate the lambda calculus. Since this all looks pretty simple let’s just try a few examples:

$$(\lambda x.\, x)\, (\lambda y.\, y) \to x[(\lambda y.\, y)/x] = \lambda y.\, y$$

Okay, looking good so far, so let’s try one more example that should *totally work* because I haven’t been setting this up at all for a weird surprise:

$$(\lambda x.\, \lambda y.\, x)\, y \to (\lambda y.\, x)[y/x] = \lambda y.\, y$$

But this is clearly *wrong* because if we just used a different variable for the binding in the lambda abstraction then it would have been

$$(\lambda x.\, \lambda z.\, x)\, y \to (\lambda z.\, x)[y/x] = \lambda z.\, y$$

which doesn’t behave even remotely the same way. So we’re left in the rubble of a fallen calculus, clutching it to our collective chest and crying out “whhhhyyyyyyyyy”.

Wait, no, sorry, there’s a way to fix this. We make use of the fact that $\lambda x.\, M$ is the same term as $\lambda y.\, (M[y/x])$ whenever $y$ doesn’t already occur in $M$, basically because the particular variable chosen for a binding site doesn’t actually matter. In other words, $\lambda x.\, x$ is the exact same function as $\lambda y.\, y$.

To this end, we redefine our notion of substitution to be

- $(\lambda y.\, M)[N/x] = \lambda y.\, (M[N/x])$, where we ensure that $y$ is first renamed to some other variable that is different from any variable in $M$, $N$, or $x$, if this condition isn’t already true. This is what we technically call capture-avoiding substitution.
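Here’s one way the fixed substitution might look in an interpreter, again as a Python sketch with my own names. The renaming scheme (appending primes until the name is fresh) is one arbitrary choice among many:

```python
# Capture-avoiding substitution: rename a binder before substituting under it
# whenever the naive version would capture a free variable.
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    param: str
    body: "object"

@dataclass(frozen=True)
class App:
    func: "object"
    arg: "object"

def free_vars(term):
    """The set of variables occurring free in term."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, App):
        return free_vars(term.func) | free_vars(term.arg)
    return free_vars(term.body) - {term.param}

def all_vars(term):
    """Every variable occurring in term, free or bound."""
    if isinstance(term, Var):
        return {term.name}
    if isinstance(term, App):
        return all_vars(term.func) | all_vars(term.arg)
    return all_vars(term.body) | {term.param}

def subst(term, x, value):
    """Substitute value for x in term, renaming binders to avoid capture."""
    if isinstance(term, Var):
        return value if term.name == x else term
    if isinstance(term, App):
        return App(subst(term.func, x, value), subst(term.arg, x, value))
    if term.param == x:
        return term                        # x is not free under its own binder
    if term.param in free_vars(value):     # would capture: rename the binder
        fresh = term.param
        while fresh in all_vars(term.body) | free_vars(value) | {x}:
            fresh += "'"
        renamed = subst(term.body, term.param, Var(fresh))
        return Lam(fresh, subst(renamed, x, value))
    return Lam(term.param, subst(term.body, x, value))

# (lambda y. x)[y/x] now yields lambda y'. y, not the broken lambda y. y
assert subst(Lam("y", Var("x")), "x", Var("y")) == Lam("y'", Var("y"))
```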

With substitution fixed, we now have our evaluation of the untyped lambda calculus defined. What’s really the point, though? We don’t have any data types, just a way to build functions that can return other functions. This is where the magic happens: it turns out that with only functions we can actually fake any other kind of datatype that we’d want.

Let’s take the simplest possible example: booleans. We’re going to play a little bit of a trick and think of booleans by what they *do*, not what they *contain*. At its heart, what is a boolean value but a way to do branching? If you have true, then you take one branch, and if you have false you take the other branch. How would we encode this in the lambda calculus? All we have is functions, so let’s have functions that take two arguments, the branches, and return one or the other

$$\text{true} = \lambda t.\, \lambda f.\, t \qquad \text{false} = \lambda t.\, \lambda f.\, f$$

which means that really an if-statement is just running the boolean

- if-then-else: $\text{if } b \text{ then } M \text{ else } N \;=\; b\, M\, N$
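Since Python has first-class lambdas, we can host this encoding directly and watch it branch. A small sketch, with the uppercase names being my own convention:

```python
# Church booleans: a boolean *is* its branching behavior.
TRUE = lambda t: lambda f: t     # always pick the first branch
FALSE = lambda t: lambda f: f    # always pick the second branch

def if_then_else(b, then_branch, else_branch):
    # "Running the boolean" performs the branching for us.
    return b(then_branch)(else_branch)

assert if_then_else(TRUE, "yes", "no") == "yes"
assert if_then_else(FALSE, "yes", "no") == "no"
```

(One caveat of the host language: Python evaluates both branch arguments before the boolean picks one, so this only behaves like a true if-statement when the branches are already values.)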

Now, in order to test that these really do represent booleans we need to be able to define boolean operations on them such as “and” and “not”. Defining “not” is easiest so let’s start with that. Not should take a boolean and return the opposite value, or in terms of booleans-as-branching it should flip which branch is taken

$$\text{not} = \lambda b.\, b\, \text{false}\, \text{true}$$

Similarly, we know that “and” should return true if both arguments are true and false otherwise, which, if we think of how we’d implement it as branching with nested if-statements, will look something like

$$\text{and} = \lambda b_1.\, \lambda b_2.\, b_1\, b_2\, \text{false}$$

and “or” follows very similar logic

$$\text{or} = \lambda b_1.\, \lambda b_2.\, b_1\, \text{true}\, b_2$$
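We can check all three operations at once in the Python encoding from before; `as_bool` is a helper I’m adding just to decode a Church boolean back into a native one:

```python
# Boolean operations on Church booleans, mirroring the definitions above.
TRUE = lambda t: lambda f: t
FALSE = lambda t: lambda f: f

NOT = lambda b: b(FALSE)(TRUE)             # flip which branch is taken
AND = lambda b1: lambda b2: b1(b2)(FALSE)  # if b1 then b2 else false
OR = lambda b1: lambda b2: b1(TRUE)(b2)    # if b1 then true else b2

# decode back to a native Python bool for testing
as_bool = lambda b: b(True)(False)

assert as_bool(NOT(TRUE)) is False
assert as_bool(AND(TRUE)(TRUE)) is True
assert as_bool(AND(TRUE)(FALSE)) is False
assert as_bool(OR(FALSE)(TRUE)) is True
```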

So we’ve demonstrated that we have a notion of booleans with only functions. What else can we build? Well, let’s try natural numbers. First, though, we need to think about what natural numbers are in terms of behavior. That might seem a bit odd: numbers are just things we add and subtract etc., right? Well, no, at least not natural numbers. Natural numbers are a way to count things, a way to repeat behaviors a certain number of times; in other words, natural numbers are a kind of for-loop. A natural number in mathematics is either zero or one plus another natural number. A natural number “behaviorally” is either just returning the base case of the recursion, or performing an operation on the recursive argument. To this end we define

$$0 = \lambda s.\, \lambda z.\, z \qquad 1 = \lambda s.\, \lambda z.\, s\, z \qquad 2 = \lambda s.\, \lambda z.\, s\, (s\, z)$$

etc. or in other words we just have

- $0 = \lambda s.\, \lambda z.\, z$ <- base case
- $\text{succ} = \lambda n.\, \lambda s.\, \lambda z.\, s\, (n\, s\, z)$ <- recursive case

but let’s make a quick check to ensure we’ve done this right

$$\text{succ}\ 1 \to \lambda s.\, \lambda z.\, s\, (1\, s\, z) \to \lambda s.\, \lambda z.\, s\, (s\, z) = 2$$
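The same check can be run mechanically in Python lambdas. The `to_int` decoder is my own addition: it counts how many times the numeral applies its first argument by handing it native `+ 1`:

```python
# Church numerals: the numeral n applies s to z exactly n times.
ZERO = lambda s: lambda z: z
SUCC = lambda n: lambda s: lambda z: s(n(s)(z))

# decode a Church numeral by counting with Python's + 1
to_int = lambda n: n(lambda k: k + 1)(0)

ONE = SUCC(ZERO)
TWO = SUCC(ONE)

assert to_int(ZERO) == 0
assert to_int(TWO) == 2
```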

We can even define operations such as $+$, $\times$, or $-$ on these numbers that will do the right thing. For example, $+$ can be defined as

- $\text{plus} = \lambda n.\, \lambda m.\, n\, \text{succ}\, m$, or in other words adding $n$ and $m$ is given by applying $\text{succ}$, adding 1, to $m$ a total of $n$ times.
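In the Python encoding this “apply succ $n$ times to $m$” reading is almost literal:

```python
# Addition on Church numerals: n applications of SUCC, starting from m.
ZERO = lambda s: lambda z: z
SUCC = lambda n: lambda s: lambda z: s(n(s)(z))
to_int = lambda n: n(lambda k: k + 1)(0)

PLUS = lambda n: lambda m: n(SUCC)(m)

TWO = SUCC(SUCC(ZERO))
THREE = SUCC(TWO)
assert to_int(PLUS(TWO)(THREE)) == 5
```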

The one last topic before we wrap up this somewhat informal treatment of the untyped lambda calculus is how we do recursion in general. We know that we need the ability to define arbitrary recursion because this calculus is equivalent to Turing machines. We do this using what’s called the $Y$ combinator, which is defined as

$$Y = \lambda f.\, (\lambda x.\, f\, (x\, x))\, (\lambda x.\, f\, (x\, x))$$

This might be a very intimidating and confusing term when we first look at it, but in the end it satisfies a very important property, namely that $Y\, f = f\, (Y\, f)$, which we can prove directly

$$Y\, f \to (\lambda x.\, f\, (x\, x))\, (\lambda x.\, f\, (x\, x)) \to f\, ((\lambda x.\, f\, (x\, x))\, (\lambda x.\, f\, (x\, x))) = f\, (Y\, f)$$

In other words, the trick of applying terms to themselves gives us this weird self-replicating feature of the $Y$ combinator. Of course, the $Y$ combinator will just be an infinite loop unless the function $f$ has some kind of base case that short-circuits out.
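One wrinkle if you try this in a host language: Python evaluates arguments eagerly, so the literal $Y$ above loops forever. The standard workaround is the eta-expanded variant, often called $Z$, which wraps the self-application in a lambda to delay it. A sketch, with `count_down` as my own toy example of recursion with no named self-reference:

```python
# Z: the strict-evaluation-friendly fixed-point combinator.
# Same shape as Y, but each x(x) is delayed behind "lambda v: ... (v)".
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Recursion without any def-level self-reference: the base case n == 0
# is the short circuit that stops the otherwise-infinite unfolding.
count_down = Z(lambda self: lambda n: [] if n == 0 else [n] + self(n - 1))

assert count_down(3) == [3, 2, 1]
```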

We’ll try a basic example of using the Y-combinator after we introduce a few more pieces

- $\text{isZero} = \lambda n.\, n\, (\lambda x.\, \text{false})\, \text{true}$ <- the trick here is that if $n$ is 0 then it’ll just return its second argument, which will be $\text{true}$

Now we can define the factorial function with the $Y$ combinator

$$\text{fact} = Y\, (\lambda f.\, \lambda n.\, (\text{isZero}\ n)\, 1\, (\text{times}\ n\, (f\, (\text{pred}\ n))))$$

where $\text{times}$ and $\text{pred}$ are multiplication and predecessor on our numerals, which can be defined in a similar style (predecessor takes a bit more ingenuity).
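As a sanity check of the shape of this definition, here’s the same factorial trick in Python, using the strict-friendly $Z$ variant and with plain Python integers standing in for Church numerals (an assumption purely to keep the example short):

```python
# Factorial via a fixed-point combinator: no named recursion anywhere.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# is-zero is the base-case short circuit; native ints stand in for numerals.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

assert fact(0) == 1
assert fact(5) == 120
```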

So I think we’ll go ahead and leave this here because I’m not sure if we’ll even manage to cover all of this in a single lecture!

Your notion of capture-avoiding substitution does not seem right.

The case of M[v/x] where M = \x.N is simple: x is not free in M hence M[v/x] = M.

OTOH, the dangerous case is (\y.N)[P/x] where y is free in P. Then the naive substitution would capture these free occurrences. Hence you must rename y to something not free in P.

Yup, I forgot to say not in v either up above. Thanks for catching that.