The Big Surprise

Back in the 1930′s, Kurt Godel proved two amazing facts about arithmetic: First, there are true statements in arithmetic that can’t be proven. Second, the consistency of arithmetic can’t be proven (at least not without recourse to logical methods that are on shakier ground than arithmetic itself).

Yesterday, I showed you Gregory Chaitin’s remarkably simple proof, of Godel’s first theorem. Today, I’ll show you Shira Kritchman and Ron Raz’s remarkably simple (and very recent) proof of Godel’s second theorem. If you work through this argument, you will, I think, have no trouble seeing how it was inspired by the paradox of the surprise examination.

Start with this list of statements:

  1. It takes more than 10000 characters to specify the number 1.
  2. It takes more than 10000 characters to specify the number 2.
  3. It takes more than 10000 characters to specify the number 3.

and so forth.

Obviously, statements 1, 2 and 3 are all false, since it only takes a single character (namely “1″) to specify the number 1, and a different single character (namely “2″) to specify the number 2. But if you continue this list long enough, you’ll eventually get to some true statements. Yesterday we saw a proof that none of those true statements is provable.

But suppose that, undeterred by the proof, we are determined to identify a specific true statement on the list. Here’s our strategy:

  • First we write down the first gazillion statements on the list, where a gazillion is some number so big that we’re sure to have included some truths.
  • Next we go down the list, eliminating false statements. Notice that if we’re willing to work long enough, every false statement eventually gets eliminated — we just keep trying shorter-than-10000-character prescriptions and crossing off the numbers they describe. This leaves us a list of candidates.
  • Suppose there’s just one true statement on the list. Then eventually we cross all the others off, and learn that the one remaining candidate is true (because the list was so long there had to be at least one true statement). In fact, we’ve just proved (by process of elimination) that this statement is true. But we learned yesterday that no such proof is possible. Conclusion: There can’t be just one true statement on this list.
  • Suppose there are just two true statements on the list. Then eventually we’ll cross all the others off, leaving two candidates, at least one of which is true. But we’ve already proved that there can’t be just one true statement, so these must both be true. In fact, we’ve just proved they’re true. Which is impossible. Conclusion: There can’t be just two true statements on the list.
  • Do you see where this is going? There also can’t be just three true statements on the list, or four, or five, or any other number. Yet we know there are true statements on the list! Something’s wrong here!!!!!

Okay, what went wrong? Answer:

  1. Our argument relies on the fact that every statement on our list must be unprovable. Why should we believe that? Well, we proved it yesterday; that’s why!
  2. Actually, our argument relies on a bit more — it relies not just on the fact that every statement on our list is unprovable, but that we can prove their unprovability. But again, we did that yesterday.
  3. The only way out of this is to conclude that there’s some gap in yesterday’s proof.
  4. But there’s only one thing we used yesterday without proving it: We began from the assumption that our arithmetical reasoning is consistent.
  5. So that must be where the gap is — and it must be impossible to fill that gap! In other words, it must be impossible to prove that arithmetic is consistent.
  6. Tada!
  7. There’s just one other way out: If arithmetic actually is inconsistent, then all bets are off and we can prove its consistency — but in that case, of course, our conclusion will be wrong. In any event, there are very few people who think this is a contingency worth worrying about.

Share/Save

26 Responses to “The Big Surprise”


  1. 1 1 J Storrs Hall

    In other words, no theorem is true, but some are useful…

    The key useful thing to Godel, in my humble opinion, is that he showed that you can make statements of arithmetic (more specifically R&W’s PM axiomatization) self referential. I.e. that you can have a system of representation that says things like “this number can be represented …” or “this statement is provable.” There is a HUGE difference in power between systems which can do that and those that can’t.

    Von Neumann claimed that there was similarly a huge watershed in power between physical systems that can reproduce themselves and those that can’t. And surprise: Godel wrote the first self-reproducing program.

  2. 2 2 Jonathan Campbell

    Apparent typo: “If arithmetic actually is inconsistent, then all bets are off and we can prove its consistency”

    Big difference between this and surprise exam is that in this case, we know that there can’t actually be only 1 true statement, whereas with exam, the problem is that we can’t know exactly which day exam is (if it is to be a surprise). This one seems to me to be a paradox about actual facts, whereas the exam one seems more like a paradox about statements about facts.

  3. 3 3 Jonathan Campbell

    Actually, that doesn’t seem to be a typo! Sorry.

  4. 4 4 Ken B

    Remarkable.

  5. 5 5 Ken B

    Remarkable.

    @J Storrs Hall: This proof deos rely on that ability. You need to translate references about ‘we proved …’ into arithmetic to work the formal apparatus behind this proof. And if you glance at the linked pdf you see some Godel numbering.

    @Thomas Bayes: so the true masochist still needs to learn Godel numbering!

  6. 6 6 Harold

    This is all very elegant. I will have to read it all at least one more time, probably lots. The one thing that strikes me at the moment is step 4 and 5. We assumed our arithmetic reasoning was consistent, not that we could prove it was consistent. If this is where the gap is, why do we conclude that we can’t prove consistency, rather than that the arithmetic reasoning WAS inconsistent?

  7. 7 7 Steve Landsburg

    Harold: Suppose we eliminate all but one candidate from our list of possibly-true statements. I said that we now have a proof that the remaining statement is true, which is impossible. But on closer examination, all we really have is a proof that if arithmetic is consistent, then the remaining statement is true. We’re allowed to have a proof of that, as long as we can’t deduce from it that the remaining statement actually is true. In other words, we must not be able to prove that arithmetic is consistent.

  8. 8 8 Ken B

    Remarakble.

    J Storrs Hall: “In other words, no theorem is true, but some are useful…”. No. If the Peano axioms are consistent the result is true. The Peano axioms ARE consistent. The point of Godel’s work is not that there is not truth, quite the reverse. It is that there is truth and it’s richer than our ability to formalize.

  9. 9 9 BTR

    The consistency part is more explicit in the paper

    “T |- Con(T) -> etc”
    That is “If T can prove T is consistent then …” This result is used in the proof

  10. 10 10 Ken B

    J Storrs Hall:
    “In other words, no theorem is true, but some are useful…”

    No, not at all. The point is that truth is much ‘richer’ than what we can formalize.

  11. 11 11 Ken B

    A beautiful proof.

    One nit needs picking. Steve wrote

    5.So that must be where the gap is — and it must be impossible to fill that gap! In other words, it must be impossible to prove that arithmetic is consistent.

    Not quite I think. It must be impossible *using the same axioms and rules of inference we used in the rest of the proof* to prove that the *whole system we used* is consistent. As discussed at great length in other posts, you can prove the peano axioms are consistent.
    I suppose casually speaking Steve’s comment is fine but Steve and I disagreed on whether the theorem was hard or simple. I count this nit as evidence it is hard!

  12. 12 12 Ken B

    I expressed the view that Godel’s theorem was hard. Steve demurred and said only the proofs were hard.
    He was polite enough to ask me to let him what I thought after he presented these two lovely proofs.
    so here goes.

    1. The proofs are wonderful, much simpler than Godel’s proofs and simpler even than some of the other earlier proofs.
    They are also more *instructive* than Godel’s proofs.
    2. I must point out that that Steve has actually not stated either theorem correctly!
    The point is that the theorems do not talk about the consistency of arithmetic — which Steve has in the past identified with the consistency of the Peano Axioms.
    They talk about the entire formal theory T in which the proofs are conducted, under the assumption T can prove the PA.
    This is true for any such formal including any theory with the axiom “PA is consistent”.
    Let’s say we have such a theory, which for clarity I will call B. Then in B we *can* prove PA is consistent.
    What we cannot prove in B is that *B* is consistent.
    This seems like a pretty good argument for ‘hard’.

    3. Even here I have simplified leaving out some other restrictions on T. To really understand the theorem you need to know them.
    4. I believe some posters here — who are genrally intelligent folk — have in comments this week shown they still don’t quite get what the theorems say.

    Conclusion: Godel’s theorem is hard! Not as hard as I believed but harder than Steve thinks. It is for instance a lot harder to understand than the Ham Sandwich Theorem!

    Apologies for some of the earlier redundant comments; there was a glitch in the comment filter.

  13. 13 13 CC

    Sorry if I’m being dense, but…
    “Next we go down the list, eliminating false statements. Notice that if we’re willing to work long enough, every false statement eventually gets eliminated — we just keep trying shorter-than-10000-character prescriptions and crossing off the numbers they describe.”

    Is it really this easy? If you can “evaluate” each shorter-than-10000-character prescription, then you’ve proved that all the remaining numbers require more than 10000 characters, right?

    Unless by “description” you mean computer programs, which may or may not terminate. (That’s what I thought you meant in the previous post.) But in that case, you can’t just “evaluate” each shorter-than-10000-character prescription. What did I miss?

  14. 14 14 Steve Landsburg

    CC: What you’ve missed is that you never know how long it will take to eliminate any given false statement, so there never comes a time when you can be sure that all the false ones have been eliminated.

    It *is* true that the false ones are all eventually eliminated, but there’s never a time when you can say “okay, the evaluations are done now, so the remaining statements are all true”.

  15. 15 15 Ken B

    @CC & Steve: I don’t think Steve addressed CC’s worry. CC seems to point out that if your mehtod of testing an N-character program is to *run it* then Steve’s scheme is in trouble since the program might never stop. JHowever there is a way to get around this, using another “daigonalization”. We have the programs ordered 1,2,3 4 etc in some order as Steve outlined. Execute one setp of program 1. The go back to the beginning, execute one step of program 1, one step of program 2. The go back to the beginning and execute one step of program 1, one step of program 2, one step of program 3. Then go back to the beginning …. If a program is halted then executing is do nothing.

    This way you eventully finish any program you name which actually does halt. Like Finder60. The non-halting of an earlier program does not stop you from finally running all the steps of Finder60.

  16. 16 16 CC

    Got it. I was better off just thinking of “descriptions” as computer programs (as I originally understood Chaitin’s argument anyway). I guess your “descriptions” can include iterative procedures that may never halt, and the ones the do halt could take arbitrarily long.

    Very cool stuff.

  17. 17 17 Jonathan Campbell

    “so there never comes a time when you can be sure that all the false ones have been eliminated.”

    I thought CC was saying that that time has come once you’ve evaluated every possible shorter-than-10000 character description. I also don’t see how it can be impossible to have done that.

  18. 18 18 Jonathan Campbell

    CC thanks for the explanation. So does that mean that (a lower bound on) the lowest value of M (in the current case M=10000) for which this line of reasoning starts to work is given by the length of the shortest program for which we cannot determine whether it halts without running it?

  19. 19 19 Dave Johnson

    Blimey. I’ve had to reread this a few times. The devil really is in the detail. I can now understand why when I’ve heard the Incompleteness Theorem described before it’s always conditional on the fact that the axiom system must be sufficiently complex. It seems it needs to be complex enough to support these sorts of statements.

    Unfortunately it’s not helped me to resolve the surprise exam problem…

  20. 20 20 Rick

    I read here frequently. I hate finding people who are so much smarter than I, that I don’t even know what they’re talking about. Oh well.

  21. 21 21 Steve Landsburg

    Rick: What one understands is usually less a matter of intelligence than of what one has chosen to spend a lot of time thinking about.

  22. 22 22 J Storrs Hall

    Ken B: look up George Box. I was repurposing a fairly famous quip. If we can’t prove arithmetic consistent, we can’t prove that our proofs in arithmetic are not fallacious because we could have proved anything from a contradiction.

  23. 23 23 Ken B

    @J Storrs Hall: I recognize the allusion though I did not Box was the originator. But what he said is right, and what you said is wrong. If you had said “All our proofs are unreliable but some are useful” you might have a point.

  24. 24 24 J Storrs Hall

    Ken B: ah, but I have a poetic license. The license number is the smallest number that cannot be specified in a WordPress comment box.

  25. 25 25 Ken B

    J Storrs Hall: Anyone who likes LISP is entitled to some poetic licence!

  26. 26 26 Rick

    Ahh, you are kind as well as wise. BTW, I think there is more to learn and ponder over in Armchair Economist than in any book its size I’ve ever read. I’ve been reading it off and on, checking with it, refreshing my ideas, etc. for about 15 years now.

Comments are currently closed.