Today, I’m going to give you a short, simple proof of Gödel’s First Incompleteness Theorem — the one that says there are true statements in arithmetic that can’t be proven. The proof is due to Gregory Chaitin, and it is far, far simpler than Gödel’s original proof. A bright high-schooler can grasp it instantly. And it’s wonderfully concrete. At the end, we’ll have an infinite list of statements, all easy to understand, and none of them provable — but almost all of them true (though we won’t know which ones).

Chaitin’s proof, like Gödel’s, is inspired by a classical paradox — in this case, Berry’s Paradox as opposed to the Liar Paradox (both of which I described yesterday).

We start by observing that some numbers are more complicated than others. The 10,000-digit number that starts off 10101010101010101… and continues the same way is somehow less complicated than a random 10,000-digit number.

Kolmogorov formalized this notion by defining the “complexity” of a number as the length of the shortest prescription (in some fixed language) for writing it down. The number above has a 48-character prescription: “Write down a 1, then a 0, then repeat 5000 times”. That makes it pretty simple.
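To make the idea concrete, here is a toy version with Python itself playing the role of the fixed language (the choice of language is my assumption; any fixed language works the same way):

```python
# Kolmogorov complexity, informally: the length of the shortest program
# (in some fixed language -- here, Python) that writes the number down.

patterned = "10" * 5000   # the 10,000-digit number 101010...10
assert len(patterned) == 10000

# The expression '"10" * 5000' is only 11 characters long, so in this
# language the patterned number has complexity at most 11 -- far below
# the 10,000 characters needed to write its digits out one by one.
prescription = '"10" * 5000'
print(len(prescription))  # 11
```

A random 10,000-digit number, by contrast, seems to admit no prescription much shorter than the digits themselves.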

Now suppose we want to find a more complicated number — one that requires at least, say, 60 characters to prescribe. We know there must be many such numbers, but we’d like to find a specific example — together with a **proof** that our example can’t be prescribed in less than 60 characters. So we write a computer program, called Finder60, which searches for examples-with-proofs. Ideally, the program outputs something like: “I have found a proof that the number 2834932709472398472328923478902342903848927189374901742309842398742 cannot be prescribed with fewer than 60 characters.” Finder60 searches systematically, so if there’s such a number/proof combination to be found, Finder60 will find it. Otherwise, it keeps on running forever.
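The shape of Finder60 can be sketched in a few lines of Python. This is only a sketch: `checks_as_proof` and `extract_number` are hypothetical parameters standing in for the mechanical proof-checking machinery, since that is where all the real work lives.

```python
from itertools import count, product

ALPHABET = "01"  # stand-in for the formal language's full symbol set

def all_strings():
    """Yield every finite string over ALPHABET, shortest first."""
    for n in count(1):
        for chars in product(ALPHABET, repeat=n):
            yield "".join(chars)

def finder(bound, checks_as_proof, extract_number):
    """FinderN: scan every string; when one checks out as a proof that
    some number needs more than `bound` characters to prescribe, return
    that number.  If no such proof exists, the loop runs forever."""
    for s in all_strings():
        if checks_as_proof(s, bound):
            return extract_number(s)
```

The search loop is the whole point: it visits every candidate string eventually, so it terminates exactly when a suitable proof is out there to be found.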

We can also, of course, write programs called Finder90, Finder120, Finder10000 and so on, to find ever-more-complicated numbers together with proofs that they are complicated.

Once you’ve written the code for Finder60, you only have to tweak it slightly to get the code for Finder10000 — and the code gets only slightly longer in the process. Basically, you just change all the 60s to 10000s, replacing two-digit strings with five-digit strings. Not much difference.

So if the code for Finder60 is, say, 5000 characters long, then the code for Finder10000 is just a bit more than that — say 5200 characters.
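The length bookkeeping can be sketched directly, under the assumption that the bound appears as a literal numeral in the source. `FINDER_TEMPLATE` below is a made-up stand-in for the real 5000-character program, not actual searcher code:

```python
# Stand-in for Finder's source code, with the bound spliced in as a numeral.
FINDER_TEMPLATE = (
    "search all strings in order; when one is a valid proof that some "
    "number needs more than {n} characters to prescribe, print that "
    "number and halt"
)

finder60 = FINDER_TEMPLATE.format(n=60)
finder10000 = FINDER_TEMPLATE.format(n=10000)

# Replacing the two-digit '60' with the five-digit '10000' adds
# exactly three characters to the program's length.
print(len(finder10000) - len(finder60))  # 3
```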

Now suppose M is any number that provably requires more than 10000 characters in its prescription. Then Finder10000 will find it and print it out. Which means we have the following less-than-10000-character prescription for M:

**Run the following program: [insert the 5200-character code for Finder10000 here]**

Uh oh! If prescribing M provably requires more than 10000 characters, then M can be prescribed in fewer than 10000 characters. Contradiction!

Conclusion: There must be no such M. That is, **no number can be proved to require more than 10000 characters in its prescription**.

But surely some numbers **do** require more than 10000 characters; after all there are only finitely many ways to prescribe a number with fewer than 10000 characters, which leaves infinitely many numbers left over.

So what’s a true statement that can’t be proven? Answer: Any true statement of the form “The number M cannot be prescribed in fewer than 10000 characters”. No matter what M you plug in, this statement must be unprovable. If you plug in M=1 or 2, you’ll get a false statement. But for most values of M (though we don’t know **which** values!) you’ll get a statement that’s true, but unprovable.

Isn’t that about the coolest trick ever? Well, it turns out that you can add another twist to make it even cooler. That’s where the surprise examination paradox comes in. I’ll explain it all one day later this week, though you won’t know which day till you log in that morning.

Something escapes me: The phrase you used yesterday when describing the paradox, “Hence there must be some natural numbers that can’t be described by any short phrase”, seems just wrong to me if you are going to allow phrases of the kind “the smallest natural number that can’t be described by any short phrase”; i.e., phrases that do not describe a pattern in the digits allowing one to write the number down, but that just give a property of the number. Allowing this, I can describe ANY natural number using fewer than 9 words (if I’m counting right):

Think of any number x. I can say of x:

“x is the nth number of the m-times table” (for whichever m and n work)

If x is a prime number, just say:

“x is the nth prime number”

Not directly to the point, but related: Algorithmic complexity was invented a few years before Kolmogorov by Ray Solomonoff. Unfortunately, Ray died recently, and a conference was held in his honor at Monash just a couple of weeks ago.

Ray’s work was done from the point of view of AI: he was trying to understand how the ultimate learning machine would work. His approach was to have a machine search through all Turing machines (each of which is of course a bit string placed on the tape of a universal Turing machine) for one which reproduced the string of bits that the robot’s sensors had produced to date. The idea is that whatever that shortest machine produced next was the best bet as to what the future would hold. (It’s actually a weighted combination of machine outputs; look it up for details if interested.)

Gödel’s actual proof is worth studying for other reasons than metamathematics. He essentially invented LISP (the Gödel numbering that lets you represent arbitrarily complex formulae with a single value, and compose them), and wrote the first algorithmic theorem prover.

Unlike Chaitin’s scheme as described here (or Solomonoff’s AI machine), Gödel’s proof-search functions were computable (although intractable to non-quantum physical computers). That’s because the proof search functions had to be given a bound up to which they would search; they couldn’t simply run forever, looking.

Since 10,000 characters can only describe a finite set of numbers, why can’t the finder program just list the numbers described by every combination of 10,000 characters and then spit out one not on that list? Is it that given certain languages, we can’t know if some combinations of 10,000 characters describe numbers because they describe programs that may or may not run forever?

Just wondering if the assumption that you can actually write Finder60 isn’t questionable?

Thomas Purzycki: I believe the reason is that there is no way to reliably determine, for a given number, how many characters are needed to describe it.

Also, the definition of Kolmogorov complexity must be slightly subtler than described by Steve. Suppose there was a tribe of people who believed that the founder of their tribe, named “Abe,” had lived to be 2834932709472398472328923478902342903848927189374901742309842398742 years old. Then, that number could be described (assuming their “fixed language” is English) simply as “The age at which Abe passed.”

“… though you won’t know which day till you log in that morning.” A meta paradox! Thanks for another interesting post.

As Thomas P notes, in Steve’s post what counts as a ‘proof’ is left a bit vague.

Here’s a sharper definition. Define a formal language that allows you to write computer programs [Turing machines].

For a number N define K(N) — K for Kolmogorov — as the length of the shortest program that, run with an empty input string, produces N.

This is perfectly well defined.

Call a TM “compact” if 1) it halts on empty input

AND 2) it is the shortest program producing its output.

So if we have a compact program TML then it produces some integer L and K(L) = length(TML).

Now imagine a program CHAITIN that does the following.

You feed it a program TM and it tells you if TM is compact.

That means CHAITIN does two things: it always produces an answer (always halts),

and it produces 1 if the TM is compact and produces 0 if it is not.

This is what it means to ‘prove’ formally a program is compact.

Chaitin the man proved CHAITIN the program does not exist.

(If you know about the Halting Problem this isn’t really a big surprise.

It requires one program to outthink the other.)

The proof is what Steve outlined, by contradiction.

Assume that CHAITIN exists.

Consider program KRUGMAN. It takes a number N and lists all Turing Machines longer than N.

It then (aside: using a diagonal construct to avoid halting concerns) grinds away

applying CHAITIN time after time to these machines until it finds one that CHAITIN says is compact.

Call this OBAMA. Then KRUGMAN runs OBAMA and prints OBAMA’s output, which is K(OBAMA).

You should convince yourself KRUGMAN will actually produce something and not just run forever:

KRUGMAN will eventually end up saying what OBAMA says.

This is true because CHAITIN always halts.

Now run KRUGMAN with a number somewhat larger than length(KRUGMAN). What will happen?

Well KRUGMAN will find some program KEYNES that CHAITIN says is compact and will print K(KEYNES).

But (as anyone who has seen their pictures knows) KRUGMAN is shorter than KEYNES.

K(KRUGMAN) = K(KEYNES) and length(KRUGMAN) < length(KEYNES).

So KEYNES is not compact after all. CHAITIN lied.

So CHAITIN does not exist.

So there are compact programs that cannot be proven to be compact.
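As a sketch only (the whole point being that no real CHAITIN can exist), KRUGMAN’s structure can be written down with the oracle passed in as a parameter. All names mirror the comment above; `programs`, `chaitin`, and `run` are hypothetical stand-ins, and this simplified version ignores the dovetailing aside.

```python
def krugman(n, programs, chaitin, run):
    """KRUGMAN: scan program texts longer than n characters and, the first
    time the (impossible) oracle `chaitin` declares one compact, run that
    program -- the proof calls it OBAMA -- and return its output."""
    for tm in programs:
        if len(tm) > n and chaitin(tm):  # CHAITIN says: tm is compact
            return run(tm)               # OBAMA's output
```

Since CHAITIN always halts, the scan never gets stuck; it only fails to return if the oracle never says yes.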

Whew. I re-read it and it all looks right.

Oops. So much for proof-reading.

I wrote

K(KRUGMAN) = K(KEYNES)

I should have written

output(KRUGMAN) = output(KEYNES) = K(KEYNES)

My only excuse is that KRUGMAN’s output is always confusing.

Andy:

Just wondering if the assumption that you can actually write Finder60 isn’t questionable?

It’s easy to write Finder60. First list, in an infinitely long row, all the symbols that exist in whatever language you’re using. Below that, list, in another infinitely long row, all the two-symbol strings in that language. Then all the three-symbol strings and so on. Most of these strings will be nonsense. A few will be meaningful. A very few will be proofs of theorems. A very very very few will be proofs that some number requires more than 60 characters in its prescription.

Now work your way systematically through all these strings. First check the (one-character) string in the upper left. Then check the string next to it, followed by the first string in the second row. Then the third string in the first row, the second string in the second row, and the first string in the third row (so you’re going along a diagonal). Then the next diagonal, and so forth. When you get to a string, check it to see whether it’s a proof that some number requires more than 60 characters. If so, print out that number; otherwise, move along. By working systematically along the diagonals, you’ll eventually check every possible string and hence every possible proof.
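The diagonal bookkeeping can be sketched in Python. A tiny two-symbol alphabet stands in for the real one; with a finite alphabet each row is finite, but the diagonal walk would work even if the rows were infinite.

```python
from itertools import product

ALPHABET = "ab"  # stand-in for the language's symbol set

def row(k):
    """Row k of the grid: every k-symbol string."""
    return ["".join(t) for t in product(ALPHABET, repeat=k)]

def diagonal_walk(num_diagonals):
    """Visit the grid along diagonals: diagonal d takes the d-th string
    of row 1, the (d-1)-th string of row 2, ..., the 1st string of row d."""
    visited = []
    for d in range(1, num_diagonals + 1):
        for k in range(1, d + 1):
            i = d - k                # 0-based position within row k
            r = row(k)
            if i < len(r):           # finite rows may run out of strings
                visited.append(r[i])
    return visited
```

Every string sits on some diagonal, so every candidate proof is eventually reached.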

“When you get to a string, check it to see whether it’s a proof that some number requires more than 60 characters. ”

How are you going to do that? Checking to see whether a string is a proof of something can be undecidable.

@Ken Arromdee: Yeah, Steve’s proof seems a little unclear to me largely for the reason you finger. Which is why I gave another version. Consider the M Steve produces. Does FinderM terminate and give an answer? Well, no it does not. So is FinderX really so easy? [What about FinderM+1 -- Steve's construction *as given* won't work as it runs forever when it hits M. Need another diagonalization of the computations.] Does the contradiction Steve demonstrates arise from his assumptions about the Finder or the existence of M …

I think the proof I gave — cribbed in large measure from Wikipedia — is clearer in that regard.

Ken Arromdee:

How are you going to do that? Checking to see whether a string is a proof of something can be undecidable.

I check each line to see whether it’s an axiom and/or follows by standard rules of logic from previous lines. If each line passes this test, then we have a proof of something, namely the last line. (This assumes we have a “newline” symbol, which does not seem problematic.)

That is an excellent proof outline of Gödel’s theorem, much simpler to understand and prove than the one I knew from Hofstadter and from university. Thanks for posting!

Now hang on…..

You’ve implicitly assumed that, say, Finder60 contains the number ‘60’ in its code. You’ve explicitly assumed it’s trivial to transform Finder60 into Finder10000, and this doesn’t significantly increase the length of the program.

However, as far as I can see, this proof doesn’t prove that Finder10000 doesn’t exist, only that it has more than 10000 characters.

I can easily imagine that FinderX might exist for any X, but each FinderX has a completely different structure from the others, and the shortest one has O(X^2) length. Then your purported unprovable statements “The number N cannot be prescribed in fewer than X characters” are not in fact unprovable unless actually false.

So, what’s missing from the proof or my understanding of it?

KenB, doesn’t that also prove that KRUGMAN doesn’t exist?

@david – unless all numbers require less than 5 words, the description

“n times m”

might take more than 9 words. But if they take less than 5 words, you could just say

“x”

and that’s less than 5 words already.

I think you’ll find there are numbers that need more than 5 words to describe. Since English has only about 10^6 words, there are fewer than 10^54 + 10^48 + … + 1 numbers that can be described with 9 words or fewer. There must be some that need more words than that.
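The count is easy to check (one million is of course only a rough figure for the size of the English lexicon):

```python
WORDS = 10**6  # rough count of English words, as in the comment above

# Phrases of at most 9 words (counting the empty phrase as the "+ 1"):
phrases = sum(WORDS**k for k in range(10))  # 1 + 10^6 + 10^12 + ... + 10^54

# Finitely many phrases, so all but finitely many natural numbers
# are left with no description of 9 words or fewer.
print(phrases < 10**55)  # True
```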

Ok, I should have read all the comments before expressing doubt. I see you’ve given an explicit Finder60 program, which can indeed be easily adapted to FinderX for any X.

@MikeH: Yes it does, since KRUGMAN relies on CHAITIN.

I think Steve’s Finder is confusing some people because they are implicitly assuming FinderX gives an answer. But it might not. If there is no proof for Q then FinderQ will run forever and you will never know if it is about to find a proof or not. So if there is no proof for Q you will never know it from running FinderQ.

“I check each line to see whether it’s an axiom and/or follows by standard rules of logic from previous lines.”

How are you going to figure out if the line follows by standard rules of logic from previous lines? Determining if one statement follows by rules of logic from other statements can still be undecidable.

Ken Arromdee:

What I should have said is this:

Finder60 can systematically generate every possible proof of length 1, then every possible proof of length 2, and so on, until it finds a proof that some number has complexity greater than 60. If there is such a number, Finder60 will eventually discover it. Otherwise it runs forever.

As a layperson here (hoping to be a bright HS student), it strikes me that the “trick” is simply a willingness to substitute program code for a ‘literal’ direct description of a number = a leap of faith in one’s programming abilities. Maybe this is what is meant by the idea that we can only assume our rules of logic (arithmetic) are consistent. (Maybe some would say this applies to our ability to directly describe a number as well?)

“Now suppose M is any number that provably requires more than 10000 characters in its prescription. Then Finder10000 will find it and print it out.”

Finder10000 will not print out M, it will print out M and a whole load of unrelated crap too:

-the number A cannot be described with 10000 characters

-the number B cannot be described with 10000 characters

… and finally,

-the number M cannot be described with 10000 characters

I would not consider the above output to be equivalent to a prescription for the number M, because to pick M out of that list you would already need to know what M is.

“Finder60 can systematically generate every possible proof of length 1, then every possible proof of length 2, and so on, until it finds a proof that some number has complexity greater than 60.”

Even that doesn’t help. How are you going to determine that a particular proof “is a proof that some number has complexity greater than 60”? Determining whether a proof proves X can be an undecidable question.

In order to get this to work you would have to redefine complexity. For instance, the complexity of a proof of X is equal to the length of the proof plus the number of steps of the proof that have to be executed to prove X. That way, although determining whether a proof proves X can be undecidable, determining whether a proof proves X with less than C complexity is not undecidable.

@Ken Arromdee and Steve:

Here is the source of the unclarity I mentioned earlier. Steve wrote

So if the code for Finder60 is, say, 5000 characters long, then the code for Finder10000 is

just a bit more than that — say 5200 characters.

Now suppose M is any number that provably requires more than 10000 characters in its prescription. Then Finder10000 will find it and print it out.

Which means we have the following less-than-10000-character prescription for M:

Run the following program: [insert the 5200-character code for Finder10000 here]

From this Steve derives a contradiction. BUT there are two assumptions here either of which could be the problem.

1) M and its proof exist

OR 2) Finder10000 exists and halts on M.

Note that it must halt for Steve to be able to “run” it and see the answer.

But if Finder10000 does not halt then that *also* establishes the result since it means no proof exists.

Any proof has to be of finite length so it will be found eventually.

This still leaves another possibility. The contradiction could arise from 1 and 2 together;

either alone might leave a consistent theory. That remains unproven.

This is why I think the proof based on CHAITIN above is cleaner.

Ken Arromdee:

Determining whether a proof proves X can be an undecidable question.

Not at all! A proof proves X if and only if X is the last line of the proof. It’s easy to check the last line, compare it to X, and see if the two strings are identical.

“Which means we have the following less-than-10000-character prescription for M:

Run the following program: [insert the 5200-character code for Finder10000 here]”

Surely this doesn’t uniquely describe M?

@jj and Lib: To solve this, you can program it to spit out the first number it finds with this property. Then, if it finds M, it spits out either M or some other number A. Either way, a contradiction occurs.

@Ken B: It is pretty clear from Steve Landsburg’s description that the program exists and, if M and its proof exist, halts on M (or some other number). But perhaps you find the chain of reasoning involved there unclean?

@Will:

My post from 3:12 identifies where I think Steve’s proof could be clearer. Not ‘correcter’, as I note the proof works, just clearer. A lot of people are having trouble with just the point about Finder halting.

Steve explained Finder. Then he *changed* the explanation to quite a different approach. This was in response to Ken Arromdee’s objection, which was about halting even if Ken did not use that word.

“Cleaner proof” is quite meaningful to a mathematician. It is a matter of opinion. In my opinion the proof I gave, which explicitly defines ‘description’ and explicitly deals with halting, and avoids the two cases I identified earlier, is cleaner. Your mileage may vary.

“A proof proves X if and only if X is the last line of the proof. It’s easy to check the last line, compare it to X, and see if the two strings are identical.”

This doesn’t rule out proofs based on inconsistent premises (which of course can prove anything). Assuming you want to rule those out, then figuring out whether the proof has inconsistent premises is still undecideable (even if those premises are just ordinary mathematics).

@Ken Arromdee:

I think Steve’s code works like this. You supply me with a putative proof. This is a numbered sequence of statements. In addition we specify that each statement has an annotation which is either

1) a reference to the axiom list showing the statement is an axiom

2) a reference to the axiom list showing the statement is an instantiation of an axiom schema

3) a reference to prior statements in the proof, together with a citation of the rule of inference by which the current statement follows.

This is all quite determinate, and finite. A halting subroutine can be written to do it. The subroutine can check your annotations, compare your statements to the axioms and rules, and reject malformed statements. [This assumes the theory has a finite number of axioms and axiom schemas (more or less) -- one of the restrictions I alluded to in an earlier comment, and true of ZFC.]

Any proof can be so annotated, and for every valid proof there *is* a correct annotation, so it suffices to consider just these proofs.

If you give me a correct proof with a correct annotation the sub-routine will accept it. If you give me a bogus proof or a wrongly annotated one the sub-routine will reject it.

Thus if you actually hand me a correctly annotated proof for M I can give it a thumbs up or thumbs down definitively.

This is what Steve needs.

But if there is no proof, I will never know when I can stop looking; so I can never identify unprovable statements this way.
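A toy version of such a checking subroutine, with modus ponens as the only rule of inference. Everything here — the statement encoding, the annotation format — is an illustrative assumption of mine, not Steve’s or Ken B’s actual scheme:

```python
def check_proof(lines, axioms):
    """Each proof line is (statement, annotation).  An annotation is
    either the string "axiom" or ("mp", i, j), meaning the statement
    follows by modus ponens from earlier lines i and j (0-based), where
    line j must be the implication "line i implies this statement".
    Implications are encoded as tuples ("->", p, q).  The check is
    mechanical and always halts."""
    proved = []
    for statement, note in lines:
        if note == "axiom":
            ok = statement in axioms
        else:
            _, i, j = note
            ok = (i < len(proved) and j < len(proved)
                  and proved[j] == ("->", proved[i], statement))
        if not ok:
            return False     # bogus or wrongly annotated: reject
        proved.append(statement)
    return True              # every line checks out: thumbs up
```

Handed a correctly annotated proof it says yes; handed anything else it says no; and either way it always answers.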

I can describe an infinitely long number in just two characters. It’s called “pi”.

* which is of course just one character in the Greek alphabet

Charles Pye:

I can describe an infinitely long number in just two characters. It’s called “pi”.

First, I’m not sure what an “infinitely long number” is. But putting that aside, your ability to describe any particular number in a small number of characters of course depends on what symbols are available (and their meanings) in whatever formal language we all agree upon at the start.

by infinitely long I meant an infinite number of decimals that never repeat. Most people would intuitively think of that as a “complicated” number, but we could also just give a name to the super long number given here as an example. My point is that the beginning premise of this proof doesn’t seem to be rigorously defined.

@Charles Pye:

The details are not and cannot be fully spelled out in a blog post. Since you want perfect clarity, the best approach is to define everything in terms of Turing machines, but Gödel didn’t and you don’t need to. We have a finite alphabet. Look at your keyboard; that’s a good example. We define rules for combining “letters” (members of the alphabet, so ‘&’ counts as a letter) and a way to interpret them. “English” is an informal example, but if you want full formality get hold of Principia Mathematica by Russell & Whitehead. In any event everything here can be done quite precisely. Using English as an example makes the argument clearer, but as you note “English” is not a formal system — and Gödel’s theorem applies to formal systems.