Lecture: Time Complexity

So in this post we’re going to try and give a smattering of the important parts about time complexity that will actually be covered in lecture. This will be fairly high-level and overview-y because, honestly, most of the time this course never gets to this material anyway so I figure even just spending an entire two-hour lecture on it is doing better than typical. This also leaves room for a solid two-hour introduction to the lambda calculus as a final treat for those interested and a lecture off for those who ain’t.

All that being said, let’s begin.

So until this point we’ve been splitting languages into three categories (decidable, recognizable, and unrecognizable) based upon the existence of a Turing machine that can decide or recognize the language. We’ve also shown that, with respect to this stratification of languages, there is no real difference between deterministic and non-deterministic Turing machines. This categorization is useful from the perspective of figuring out what a computer can do, but there’s another important set of questions in Computer Science: what can a computer do feasibly?

Of course, it depends on what we mean by “feasible”. Does “feasible” mean “finishes faster than it takes to drink a cup of coffee”? Does it mean “takes less than a gig of RAM”? When we talk about feasible and unfeasible problems in Computer Science what we really mean is “scalable”, because in general even the most complex problems can be handled in a short time if the specific instance of the problem is “small”. Of course, what do we mean by small? When we’re talking about something like sorting a list, it’s fairly obvious that a small instance of the problem is one with a short list. For other problems the size of the input may not be as intuitively obvious. Since we’re dealing with Turing machines, though, there’s a natural way to define size: the size of the input to a Turing machine is simply the length of the input string on the tape.

Also, for the remainder of this discussion we’re only going to talk about decidable languages. If a language is not decidable, then we’re allowing for the possibility that the machine could run forever on an input, with us, the user, having no way of knowing whether it’s still working but taking a long time or whether it’s stuck. We rule that out for convenient computational use and stick to things that are decidable and compute in a finite time.

Now there are two different measures we could be using for scalability: time and space. We’re really only going to discuss time, not because space usage isn’t also important but because, in a sense, time usage is the more important one. We’re getting better every year at having more memory to work with on our computers, but we haven’t yet figured out how to cram more seconds into a day. We can use swap space and give ourselves massive amounts of memory (to a point, before we slow ourselves down to a crawl), but better and better processors don’t buy you much when your algorithms scale poorly. Of course, what do we mean by “scale poorly”? To answer that question, we’ll introduce the notion of time complexity and the “big O” notation that you’re probably familiar with but that we should review anyway.

The “time” that a Turing machine takes on an input is the number of steps it runs on that input before it terminates. Again, we’re dealing only with decidable languages, so it will always terminate in finite time; we’re just trying to describe how finite is finite. We say that a Turing machine M’s running time is described by a function f if the maximum number of steps that M takes on an input of length n is f(n). In other words, for every length n, f(n) gives us the maximum number of steps that the machine takes for an input of that length. Now, ultimately, if we’re talking about scalability we don’t really care about constant factors, so we want to treat f(n) = 2*n and f(n) = 3*n the same, as well as f(n) = n + 10. This might seem silly: maybe you would care about a constant difference of 10 or 100, and maybe that feels like it should change a problem from feasible to unfeasible. But look at it this way: we’re trying to distinguish problems that are in principle computable in a reasonable time as the input grows, where a factor of 10 or even 100 could be overcome at some point by improved hardware, from problems whose number of steps grows exponentially, from small to longer-than-the-lifetime-of-the-universe just by making the input 10 times longer.

To the end of making this distinction, we define, for functions f and g, that f(n) = O(g(n)) when there exist positive integers c and n_0 such that f(n) \le c*g(n) for all n > n_0. In other words, for sufficiently large n, f(n) is always bounded by g(n) up to some constant factor.
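As a quick worked example (with the numbers chosen purely for illustration): to see that f(n) = 2*n + 10 is O(n), take c = 3 and n_0 = 10. For every n > 10 we have

    f(n) = 2*n + 10 \le 2*n + n = 3*n = c*n,

so the definition is satisfied. The same game shows that 3*n and n + 10 are O(n) as well, which is exactly why we felt comfortable lumping them all together above.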

This means that we can finally define the time complexity class of a function t, \text{TIME}(t(n)), to be the class of all languages that can be decided by a Turing machine that runs in O(t(n)) time. Note that this is cumulative in the sense that if a TM runs in O(t(n)) time and t(n) < t'(n) for all n, then the TM runs in O(t'(n)) time as well. This little detail might be important to those of you paying close attention to how we define the complexity classes P and NP shortly.
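To see why the cumulativity works, just chase the definition: if the machine’s running time f satisfies f(n) \le c*t(n) for all n > n_0, and t(n) < t'(n) for all n, then the very same constants witness

    f(n) \le c*t(n) \le c*t'(n) for all n > n_0,

so the machine runs in O(t'(n)) time too. In particular \text{TIME}(n) \subseteq \text{TIME}(n^2) \subseteq \text{TIME}(n^3), and so on up the chain.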

The problems we’ll call “feasible” in time complexity are the ones that run in polynomial time, i.e. \text{TIME}(n^k) for some k, since those are the ones we can reasonably expect to scale well. More formally, we have that P = \bigcup_k \text{TIME}(n^k). Now, there are two ways to prove that a language A belongs to P. We can either find a Turing machine that decides A and then check its complexity to show that it’s O(n^k) for some k, or we can give a reduction to a language we already know is in P. Why? Because if we have a language A, a reduction A \le_m B, and know that B \in P, then we can build a decider for A automatically that will have time complexity O(f_R(n) + f_R(n)^b), where f_R is the time complexity of the reduction and b is the degree of the polynomial for the decider of B. (The reduced instance has length at most f_R(n), since the reduction can write at most one symbol per step, which is why f_R(n) rather than n appears in the second term.) Ah, but wait, there’s a slight problem here! What about the complexity f_R? If that’s not polynomial as well then we’ve blown the whole thing out of the water. To this end, we insist that the reduction A \le_m B run in polynomial time as well, and we’ll use the notation A \le_p B to mean that there exists a polynomial time reduction of A to B.
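To make the shape of that composition concrete, here’s a minimal sketch in Python, treating deciders and reductions as ordinary functions. The names reduce_A_to_B and decide_B are hypothetical placeholders, not anything we’ve actually constructed:

    def decide_A(w, reduce_A_to_B, decide_B):
        # A decider for A built out of a reduction A <=_p B and a decider for B.
        # reduce_A_to_B is assumed to map an instance of A to an equivalent
        # instance of B in time f_R(|w|), and decide_B is assumed to run in
        # time O(m^b) on an instance of size m.  The reduced instance has size
        # at most f_R(|w|), so the whole thing is polynomial whenever f_R is.
        return decide_B(reduce_A_to_B(w))

The code itself is almost nothing; all of the content is in the bookkeeping about how the sizes and running times compose.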

This is all well and good, but now we come back to the question of the difference between deterministic and non-deterministic Turing machines. To this end, we define the complexity class NP, which consists of the problems that can be solved in polynomial time by a non-deterministic TM. How do we define the time taken by a non-deterministic machine, though? Well, we’re dealing only with deciders, which for non-deterministic machines means that every single possible computational path terminates in finite time, so we can just take the “number of steps” used by the machine on an input to be the number of steps on its longest branch. This encapsulates the intuition we’ve been building for what a non-deterministic machine does in the first place: it tries all possible computational paths simultaneously, and when they’re all done computing, the machine is done with the input. Now, every non-deterministic machine can be turned into a deterministic machine, which we know because we’ve seen the construction before. What does this do to the running time? Well, if a non-deterministic machine runs in time O(t(n)), then the naive conversion to a deterministic machine that simulates the non-determinism is going to run in 2^{O(t(n))} time, which is exponentially longer.
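To see roughly where that exponential comes from: if the machine has at most b non-deterministic choices available at each step and every branch halts within t(n) steps, then the computation tree has at most b^{t(n)} branches, and a deterministic simulation that explores them one at a time does on the order of

    t(n) * b^{t(n)} = 2^{O(t(n))}

steps in the worst case. (That’s only a back-of-the-envelope count, since the careful construction also has to keep track of where it is in the tree, but the bound comes out the same.)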

Does this mean, then, that polynomial-time non-deterministic machines necessarily have no polynomial-time deterministic counterparts? Certainly not! A long-standing problem in computer science is whether P = NP, or in other words, whether for every language that can be decided by a polynomial-time non-deterministic machine there exists a polynomial-time deterministic machine that decides the same language. This is an open question, but informally I think most people would be surprised if the answer were yes. Non-determinism buys you so much when it comes to Turing machines that, honestly, it would be counterintuitive if it were never necessary for efficiency.

Now, the kinds of things that are in P might be obvious: sorting algorithms, binary search, and the other things you’ve dealt with in intro programming courses. But what kinds of problems are in NP? We’ll consider a particular, useful example: the clique problem.

Formally the language CLIQUE is defined as \text{CLIQUE} = \{(G,k) | G \text{ is a graph with a clique of size } k\}, but what exactly is a clique? A clique is a subgraph in which every node has an edge to every other node of the subgraph. Now, there is no known way to determine, in polynomial time on a deterministic machine, whether a given graph has a clique of a given size. On the other hand, it’s very easy to show that \text{CLIQUE} \in NP. Our machine simply checks all possible subsets of k nodes simultaneously; on each branch it only has to confirm that every pair of chosen nodes is joined by an edge, which is at most n^2 checks, so each branch runs in O(n^2) time on a non-deterministic machine. Fantastic!
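For contrast, here’s what the obvious deterministic approach looks like: a minimal brute-force sketch in Python (assuming the graph is handed to us as a dictionary mapping each node to the set of its neighbours). Nothing about it is clever, which is rather the point.

    from itertools import combinations

    def has_clique(adj, k):
        # adj: dict mapping each node to the set of its neighbours.
        # Deterministic brute force: try every size-k subset of the nodes and
        # check that every pair inside it is joined by an edge.  There are
        # (n choose k) subsets to try, so this is exponential in general --
        # exactly the blow-up the non-deterministic machine sidesteps by
        # guessing the right subset.
        nodes = list(adj)
        for subset in combinations(nodes, k):
            if all(v in adj[u] for u, v in combinations(subset, 2)):
                return True
        return False

The non-deterministic machine, by contrast, gets to guess the right subset for free and only pays for the pair-checking.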

There’s another way we can think of NP, though, that’s a bit more useful in terms of proving that languages are in NP by construction. We equivalently say that a language is in NP when there exists a polynomial time verifier for the language, which is a machine that takes

  • a proposed element of the language
  • a piece of data, called a certificate, meant to witness that the element really is in the language

and is allowed to use both of them in determining if the object is in the language. For example, a polynomial time verifier for CLIQUE is a machine that takes a pair of a graph and a set of nodes in the graph that should correspond to the required clique. The verifier simply checks that those nodes form a clique of size k, which takes time polynomial in the size of the input.
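Here’s a minimal sketch of such a verifier in Python, again assuming the graph is a dictionary from nodes to neighbour sets and taking the certificate to be the proposed set of nodes (the function name is mine, purely for illustration):

    from itertools import combinations

    def verify_clique(adj, k, certificate):
        # Polynomial-time verifier sketch for CLIQUE: the certificate is the
        # proposed set of nodes that supposedly form a clique of size k.
        nodes = set(certificate)
        if len(nodes) != k:                    # must name exactly k distinct nodes
            return False
        if not all(u in adj for u in nodes):   # they must all be nodes of G
            return False
        # every pair of named nodes must be joined by an edge
        return all(v in adj[u] for u, v in combinations(nodes, 2))

Each check touches at most n^2 pairs of nodes, so the whole thing is comfortably polynomial in the size of the input.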

How is this equivalent to our previous definition? Well, we need to construct a polynomial-time non-deterministic Turing machine from a polynomial-time verifier and then vice versa. If we have a verifier, we can construct a non-deterministic machine that simultaneously guesses all possible certificates and runs the verifier on each. If we have a non-deterministic machine, then we can make a verifier by using the correct path through the computational tree of the machine as the certificate. (I’m eliding some details, obviously.)

The final topic I want to hit briefly is the idea of NP-hard and NP-complete problems. Essentially, an NP-hard problem is one that every NP problem can be reduced to in polynomial time, and an NP-complete problem is an NP-hard problem that is also in NP. This seems like an unusual property to have, but it turns out that there really are NP-hard and NP-complete problems. CLIQUE, as discussed above, is actually an NP-complete problem. This means two important things: first, that if there is a way to solve CLIQUE with a polynomial deterministic machine then P = NP, and second, that an easy way to show a problem is in NP is to give a polynomial time reduction from it to CLIQUE; the fact that CLIQUE is NP-hard guarantees that such a reduction exists whenever the problem really is in NP.

I think that’s pretty much all I’m hoping to cover in this very very sketchy overview of time complexity. The one lecture after this will be the untyped lambda calculus, just for fun.
