# Lecture: Equivalence of PDAs and CFGs, CFL pumping lemma

So we’ve introduced PDAs and gone through a few simple examples of them. We’ve also asserted, repeatedly, that PDAs are equivalent to CFGs in describing the context-free languages. Now we need to make good on that assertion. We’ll really only cover one side of the equation in detail, since it’s the more mechanically interesting side: it tells us how to convert a CFG, where matching can be seen as a proof-search problem, into a straightforward machine where the computational time is going to be proportional to the size of the input.

So we’ll show how to convert a CFG into a PDA. Conceptually, we want to “simulate” the CFG’s rules as part of the rules of the PDA. What we’ll do is let both variables and terminals be part of our stack alphabet $\Gamma$, while our input alphabet $\Sigma$ will simply be the set of terminals. When we have a variable on the top of the stack, we’ll pop it and push back on, non-deterministically, the right-hand side of one of that variable’s expansion rules. Whenever we see a terminal on the top of the stack, we consume it against the input. Finally, when the stack is empty we move to the accept state. Gosh, if we need the stack to be empty when we accept, that means we should push on a special bottom-of-stack symbol before we begin our computation. Let’s call it $!$. We also need to push onto the stack, before doing anything else, the start symbol of the grammar in order to get the whole simulation primed. This means that we’ll have three “main” states, plus a few extra states to handle the multi-symbol pushes involved.
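To make the construction concrete, here’s a small Python sketch (my own illustration, not part of the formal construction) that simulates this PDA directly: the stack holds variables and terminals, a variable on top is nondeterministically replaced by one of its right-hand sides, a terminal on top is matched against the input, and we accept when the $!$ marker is exposed with the input exhausted. The name `pda_accepts` and the crude stack-size cap that keeps the nondeterministic search finite are both my own choices.

```python
def pda_accepts(grammar, start, w, max_stack=None):
    """Return True iff w is derivable from `start`, by simulating the PDA's stack."""
    if max_stack is None:
        max_stack = 2 * len(w) + 4  # crude cap to keep the search finite
    seen = set()

    def search(i, stack):
        # a configuration is (input position, stack contents, top-of-stack first)
        if (i, stack) in seen or len(stack) > max_stack:
            return False
        seen.add((i, stack))
        top, rest = stack[0], stack[1:]
        if top == '!':                       # bottom marker exposed: accept
            return i == len(w) and not rest
        if top in grammar:                   # variable: try each expansion rule
            return any(search(i, tuple(rhs) + rest) for rhs in grammar[top])
        # terminal on top: must match the next input symbol
        return i < len(w) and w[i] == top and search(i + 1, rest)

    # prime the simulation: start symbol on top of the bottom marker
    return search(0, (start, '!'))

grammar = {'A': [('0', 'A', '1'), ()]}     # A -> 0A1 | epsilon
print(pda_accepts(grammar, 'A', '0011'))   # True
print(pda_accepts(grammar, 'A', '0101'))   # False
```

The `seen` set and the stack cap are just there so the nondeterministic search terminates; a real PDA has no such bound, but for this grammar the bound is never the deciding factor.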

An example might help things make more sense. Let’s consider, again, our language of matched 0s and 1s. We already know how to make this as a PDA, but let’s do the conversion and see how it matches up with the direct construction. As a reminder, our grammar is

• $A \to 0A1$
• $A \to \epsilon$

We’ll allow ourselves a little bit of a cheat at first, and push multiple symbols at a time, and then we’ll backtrack and show what it looks like if you take the cheat back out again. Consider it notational shorthand for the real graph!

You can see how we pushed multiple symbols at once and had a transition for $A$ every time we saw it on the stack. Now let’s do a run-through in the style of the last lecture, where we track the input buffer, stack, and state:

• $(0011,\epsilon,M_1)$
• $(0011,A!, M_2)$
• $(0011,0A1!, M_2)$
• $(011,A1!, M_2)$
• $(011,0A11!, M_2)$
• $(11, A11!, M_2)$
• $(11, 11!, M_2)$
• $(1, 1!, M_2)$
• $(\epsilon, !, M_2)$
• $(\epsilon, \epsilon, M_3)$
• accept

Since what we’re doing is a straightforward simulation of the CFG on the stack of the PDA, hopefully it’s pretty clear that this will decide the same language as the CFG did. For completeness, let’s include here what the PDA looks like without our cheat for pushing multiple symbols.

As hopefully is clear, this is just expanding the multi-symbol push into a chain of states that consume no input and simply add symbols onto the stack, one at a time.
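If you want to see that expansion done mechanically, here’s a little Python sketch (my own; the transition-tuple format and the `chain…` state names are hypothetical) that rewrites any transition pushing several symbols into a chain of single-symbol pushes. Note the symbols go on right-to-left, so the leftmost symbol of the right-hand side ends up on top of the stack.

```python
def expand_pushes(transitions):
    """Rewrite (state, input, pop, next_state, push) transitions so that
    every transition pushes at most one symbol, using fresh chain states."""
    out, fresh = [], 0
    for (q, a, pop, r, push) in transitions:
        if len(push) <= 1:
            out.append((q, a, pop, r, push))
            continue
        prev, inp, popped = q, a, pop
        # push right-to-left so the leftmost symbol ends up on top
        for k, sym in enumerate(reversed(push)):
            if k == len(push) - 1:
                nxt = r                 # final push lands in the real target state
            else:
                nxt = f"chain{fresh}"   # hypothetical fresh intermediate state
                fresh += 1
            out.append((prev, inp, popped, nxt, sym))
            prev, inp, popped = nxt, None, None  # the rest are epsilon moves

    return out

# pop A, push 0A1 (with 0 ending up on top), as in our grammar's rule
for t in expand_pushes([("M2", None, "A", "M2", "0A1")]):
    print(t)
```

Running this turns the single “pop $A$, push $0A1$” transition into three transitions threaded through two fresh states, exactly the expansion drawn above.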

Now, I won’t really cover the reverse direction of PDA to context-free grammar. It’s not super interesting, and spiritually it reminds me a lot of the conversion of NFAs into RegExps: we first massage the automaton into a particular format that’s nice and then build up the rules of the CFG from the transitions of the PDA. You can look it up in Sipser if you particularly care about it. The important point is knowing that it exists and thus PDAs and CFGs are equivalent. The CFG to PDA direction, on the other hand, is interesting because it tells us how to implement CFGs easily as a program.

Finally our last topic on context-free languages: the context free version of the pumping lemma. As before, we’ll state it then prove it, then do some simple examples with it.

So the pumping lemma for context free languages states that if a language is context free then

• there exists some number $p$, called the pumping constant such that
• for all strings $w$ in the language such that $|w| \ge p$, then
• there exists $u,v,x,y,z$ such that
• $w = uvxyz$ and
• $|vxy| \le p$ and
• $|vy| > 0$ and
• forall $i \ge 0$, $uv^ixy^iz$ is in the language

Okay, so this looks an awful lot like the pumping lemma for regular languages, except that we now break things up into 5 pieces instead and the “looping” parts occur in two places in the string, $v$ and $y$, rather than just one. Why is that? Well, the more flexible kind of recursion we can do with CFGs, the very thing that lets them do more than the regular languages, explains it pretty neatly. You don’t just have simple loops in the CFLs, which would correspond to productions such as

• $A \to BA$
• $A \to \epsilon$

which would give us the simple kind of $xy^iz$ pumping, kinda like with the regular languages. However, we can also have recursion that does something like

• $A \to BAC$ or
• $A \to AB$ etc.

and a grammar can mix all of these together. That means that the part of the string that’s the “loop” can come before the base case of the recursion, after it, or on both sides of it. That’s why we have the restriction that $|vxy| \le p$ but we can “pump” $v^i$ and $y^i$ simultaneously.

The basic idea of the proof is similar to the regular language version: we take the pumping constant to be some size that forces there to be a repetition by the pigeonhole principle, and then we mercilessly exploit that repetition. What’s the limited resource we can exploit? Well, we don’t have states like in the DFA case, but we do have a limited number of variables. If we can show that some chain of expansions is longer than the number of variables, then we know that there must be a repeated variable in there somewhere.

First, let’s look at a property of parse trees for context-free grammars: the height of a parse tree is the length of the longest path from the root to a leaf, or in terms of strings, the largest number of expansions from the start symbol down to one of the terminals in the resulting string. If we choose our pumping constant to be $b^{|V|} + 1$, where $V$ is the set of variables and $b$ is the largest fan-out of any expansion in the grammar (i.e. the length of the longest right-hand side), then any string of at least that length must have a parse tree of height greater than $|V|$, and if the height is greater than the number of variables then some path must contain a repeated variable. Let’s call that repeated variable $R$. Then there is some path in terms of recursion from $R$ back to itself, and we can either cut out that subtree entirely, leaving only the base case of the recursion ($x$ above in our breakup of the string), or we can arbitrarily repeat the subtree “under itself” to pump up the repeated part of the string on either side of $x$, i.e. the $v^i$ and $y^i$ components.
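In our matched-0s-and-1s grammar the repeated variable is $A$ itself, the base case is $x = \epsilon$, and every extra trip through the recursive rule $A \to 0A1$ wraps the string in one more $0 \ldots 1$ pair, i.e. it pumps $v = 0$ and $y = 1$. A tiny Python sketch (my own) of “repeating the subtree under itself”:

```python
def derive(i):
    """Apply the recursive rule A -> 0A1 exactly i times, then A -> epsilon."""
    s = "A"
    for _ in range(i):
        s = s.replace("A", "0A1")   # repeat the subtree under itself
    return s.replace("A", "")       # finish with the base case A -> epsilon

for i in range(4):
    print(i, repr(derive(i)))
```

Each increment of `i` adds one `0` on the left and one `1` on the right, which is exactly the $uv^ixy^iz$ shape with $u = z = x = \epsilon$.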

Let’s consider an example before we close the book on this topic. Let $L = \{ w \mid w \text{ is a palindrome and the numbers of 0s and 1s are equal}\}$.

Assume our pumping length is $p$; then we pick our string to be $0^p1^{2p}0^p$. Now, since this string is longer than the pumping length, we know that there must be some way to break up the string into $u,v,x,y,z$ such that $0^p1^{2p}0^p = uvxyz$, $|vxy| \le p$, $|vy| > 0$, and for all natural numbers $i$, $uv^ixy^iz$ is in $L$. Let’s consider all the ways we could break up our string into these pieces. This is a little more complicated than the regular case, because we have the freedom to pick $u$ to be as long as we want rather than having the loop be forced to occur in the first $p$ characters of the string. There are three proper cases:

• $vxy$ occurs entirely in the first or last $0^p$, but then pumping changes the number of 0s on one side only, breaking the invariant that the string is a palindrome
• $vxy$ is a mixture of 0s and 1s, but it can only be wide enough (at most $p$ characters) to grab 0s from one side, so pumping will make the string no longer a palindrome
• $vxy$ is made up entirely of 1s, but then pumping can keep the string a palindrome yet can’t keep the number of 0s and 1s equal, since it changes only the number of 1s

and thus we’ve shown that the language is not context free.
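For a small stand-in pumping length we can even brute-force this case analysis in Python (my own check, with $p = 3$ and pumping exponents $i \in \{0, 1, 2\}$ only): no legal decomposition of $0^p1^{2p}0^p$ survives pumping.

```python
from itertools import combinations_with_replacement

def in_L(w):
    """Palindromes with equally many 0s and 1s."""
    return w == w[::-1] and w.count("0") == w.count("1")

p = 3                                   # small stand-in pumping length
w = "0" * p + "1" * (2 * p) + "0" * p
assert in_L(w)

survivors = []
n = len(w)
# cut points 0 <= a <= b <= c <= d <= n give u=w[:a], v=w[a:b], x=w[b:c], y=w[c:d], z=w[d:]
for a, b, c, d in combinations_with_replacement(range(n + 1), 4):
    if d - a > p or (b - a) + (d - c) == 0:
        continue                        # need |vxy| <= p and |vy| > 0
    u, v, x, y, z = w[:a], w[a:b], w[b:c], w[c:d], w[d:]
    if all(in_L(u + v * i + x + y * i + z) for i in (0, 1, 2)):
        survivors.append((u, v, x, y, z))

print(survivors)   # [] -- every legal split fails to pump
```

Of course this only checks one small $p$, so it’s a sanity check on the argument, not a proof; the case analysis above is what covers every $p$.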

Well, that pretty much wraps it up for everything we’re intending to cover about context-free languages in this course. There’s plenty more to say, really, but it’s mostly in the context of parsing or how linguists use them, which is all well outside the scope of this course, where we just want to treat them as an intermediate model of computation. Onward to Turing machines! (which are chronologically before these notes, but wevs)