Here are some thoughts on last week’s absent-minded driver problem.
First a recap of the problem, with a bit more detail than last week:
Each day, Albert leaves his office (at the bottom of the map), gets on the Main Highway and attempts to drive home to his house on Second Street. If he turns too soon (onto First Street) or if he overshoots (going all the way to the north end of the Main Highway), he is mauled by dinosaurs.
Obviously, Albert’s best strategy is to go straight at the first intersection and turn right at the second. Unfortunately, both intersections look identical. Doubly unfortunately, Albert can never remember whether he’s already passed the first intersection.
Since Albert can’t tell the intersections apart, he needs a single strategy for both of them. Strategy A is to turn at every intersection. This delivers him directly to the First Street dinosaur mob. Strategy B is to go straight at every intersection, putting him on a direct route to the North Side crew. Neither of these strategies has any chance of getting him home.
Therefore, Albert adopts Strategy C, which is to turn right with some probability p at every intersection. He wins if he goes straight at First Street (which happens with probability 1-p) and right at Second Street (which happens with probability p). His overall chance of making it home is (1-p)p, which is maximized when p=1/2. So he chooses p=1/2 and expects to make it home with probability 1/4.
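For readers who like to check such things by brute force, here is a minimal sketch that scans a grid of values for p and confirms where (1-p)p peaks. The function name and the grid size are illustrative choices, not part of the original problem.

```python
from fractions import Fraction

def commit_home_probability(p):
    """Chance of getting home when turning right with probability p
    at every intersection: straight at First, then right at Second."""
    return (1 - p) * p

# Scan p over the exact fractions 0, 1/100, ..., 1 (grid size is arbitrary).
p_grid = [Fraction(k, 100) for k in range(101)]
best_p = max(p_grid, key=commit_home_probability)
best_value = commit_home_probability(best_p)

print(best_p, best_value)
```

Exact fractions are used so that the maximum is not blurred by floating-point rounding.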
Given this strategy, Albert is certain to make it as far as First Street, but has only a 1/2 chance of making it to Second Street. Thus when he pulls up at an intersection, the odds are 2 to 1 (i.e. the chances are 2 out of 3) that he’s at First Street. If he now revises his plan, turning right with some new probability q, then he reasons as follows:
- If I’m at First Street, I need to go straight and then right, which I’ll do with probability (1-q)q.
- If I’m at Second Street, I need to go right, which I’ll do with probability q.
- There’s a 2/3 chance I’m at First and a 1/3 chance I’m at Second.
- Therefore my chance of getting home is (2/3) x (1-q)q + (1/3) q.
- I can maximize this by choosing q=3/4.
Albert therefore switches to a strategy of turning right with probability 3/4. But we’ve already computed that he can do no better than turning right with probability 1/2. Should he switch strategies or shouldn’t he?
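Albert's revised calculation can be reproduced with a short sketch. It takes the 2/3 and 1/3 weights from the reasoning above as given and scans a grid of values for q; the function name and grid are illustrative assumptions.

```python
from fractions import Fraction

def revised_home_probability(q):
    """Albert's estimate at an intersection, assuming he used p = 1/2 so far
    and will turn right with probability q from now on."""
    at_first = Fraction(2, 3)   # chance this is First Street
    at_second = Fraction(1, 3)  # chance this is Second Street
    return at_first * (1 - q) * q + at_second * q

q_grid = [Fraction(k, 100) for k in range(101)]
best_q = max(q_grid, key=revised_home_probability)

print(best_q, revised_home_probability(best_q))
print(revised_home_probability(Fraction(1, 2)))
```

The second print shows what the same formula gives if Albert sticks with 1/2, which is the comparison that tempts him to switch.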
1) Suppose first that Albert can commit himself to turning at each intersection with some probability p. To get home, he must go straight at First Street (he gets this right with probability 1-p) and then turn at Second Street (with probability p). So his overall chance of making it home is (1-p)p, which is maximized when p=1/2. His chance of getting home is therefore 1/4.
However, when Albert pulls up to an intersection, he calculates his chance of getting home as 1/3, not 1/4. There are two ways to see this.
First way to see it: Albert’s strategy gets him as far as First Street every time but as far as Second Street only half the time. So the intersection he’s currently approaching has 2-to-1 odds (i.e. a 2/3 probability) of being First Street. If it’s First Street, his chance of getting home is 1/4, but if it’s Second Street, his chance of getting home is 1/2 (since now he only has to get one decision right, not two). His overall probability is therefore (2/3) x (1/4) + (1/3) x (1/2) = 1/3.
Second way to see it: There are eight states of the world. State One is [First coin says "straight", second says "straight", this is the first intersection] (call this SS1). State Two is [First coin says "straight", second says "straight", this is the second intersection]. Et cetera. Initially, all eight states are equally likely and only two (ST1 and ST2) get him home eventually, which is why Albert initially calculated his chances at 2/8 = 1/4. But now, the fact that Albert is approaching an intersection rather than cooking on a dinosaur stove rules out two of the bad possibilities (TS2 and TT2). So there are now six equally likely possibilities, two of which are good, and his chances are therefore 2/6 = 1/3.
In other words: The unconditional probability Albert will get home is 1/4. The probability conditional on the fact that the dinosaurs haven’t gotten him yet is 1/3. So far, no paradox.
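The eight-state count can be verified by listing the states explicitly. The sketch below uses the same S/T labels as above; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each state is (first coin, second coin, intersection), with
# S = "straight", T = "turn", and intersection 1 or 2.
states = list(product("ST", "ST", (1, 2)))

def gets_home(first, second, position):
    # Albert gets home only by going straight at First and turning at Second.
    return first == "S" and second == "T"

def is_possible(first, second, position):
    # He can only be at the second intersection if the first coin said "straight".
    return position == 1 or first == "S"

unconditional = Fraction(sum(gets_home(*s) for s in states), len(states))

surviving = [s for s in states if is_possible(*s)]
conditional = Fraction(sum(gets_home(*s) for s in surviving), len(surviving))

print(len(states), unconditional)
print(len(surviving), conditional)
```

The two states that drop out are exactly TS2 and TT2, leaving six states of which two are good.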
2) Now we want to allow Albert to change strategies midstream. At the moment he’s approaching an intersection, there are three relevant probabilities:
- p is the probability he’s used in the past
- q is the probability he’s using now
- r is the probability he’ll use in the future
You are pretty sure to go astray unless you maintain a rigorous distinction between these three variables.
3) When Albert reaches an intersection, he can compute as follows:
- The probability that Albert is at First Street is 1/(2-p).
- The probability he’ll get home, conditional on being at First Street, is (1-q)r.
- The probability that Albert is at Second Street is (1-p)/(2-p).
- The probability he’ll get home, conditional on being at Second Street, is q.
Therefore the overall probability he’ll get home is [(1-q)r + (1-p)q]/(2-p).
Albert seeks to maximize this probability—or the expected value of this probability—by choosing q, taking as given both the value of p and the method he’ll use to choose r.
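To keep p, q and r straight, it can help to write the formula as a function of all three. This is only a sketch of the expression above; the function name is an illustrative choice.

```python
from fractions import Fraction

def prob_home(p, q, r):
    """Chance of getting home as seen from an intersection.

    p: turn probability used in the past
    q: turn probability used now
    r: turn probability to be used in the future
    """
    at_first = 1 / (2 - p)          # chance this is First Street
    at_second = (1 - p) / (2 - p)   # chance this is Second Street
    return at_first * (1 - q) * r + at_second * q

half = Fraction(1, 2)
print(prob_home(half, half, half))
print(prob_home(half, Fraction(3, 4), Fraction(3, 4)))
```

Setting all three to 1/2 recovers the 1/3 computed in part 1, and holding p at 1/2 while setting q and r to 3/4 recovers the calculation from the recap.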
The solution to this problem depends both on Albert’s beliefs about p and Albert’s beliefs about r. Not surprisingly, different assumptions lead to different conclusions.
So we have several models to consider.
Mark I. Albert remembers p, and can commit himself to choosing r=q. (That is, he updates once and forever.) This leads him to maximize [(1-q)q + (1-p)q]/(2-p), which occurs at q = 1 – p/2. In particular, if p = 1/2, then q = 3/4. In the language of last week’s post, this is “Strategy D”. If Albert is to behave consistently, he must choose q=p, which requires p=2/3, yielding a 2/9 chance of getting home.
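A quick numerical check of Mark I, under the stated assumption r=q: the sketch below finds Albert's best q for a given p by grid search and confirms the consistent choice. The function names and the grid size are illustrative.

```python
from fractions import Fraction

def mark_one_objective(p, q):
    """Mark I objective: the general formula with r set equal to q."""
    return ((1 - q) * q + (1 - p) * q) / (2 - p)

def mark_one_best_q(p, steps=120):
    """Best q on the grid 0, 1/steps, ..., 1 for a remembered p."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    return max(grid, key=lambda q: mark_one_objective(p, q))

print(mark_one_best_q(Fraction(1, 2)))   # best response to p = 1/2
print(mark_one_best_q(Fraction(2, 3)))   # best response to p = 2/3

# Unconditional chance of getting home at the consistent value p = q = 2/3.
consistent_p = Fraction(2, 3)
print((1 - consistent_p) * consistent_p)
```

The grid of 120 steps is chosen only so that both 3/4 and 2/3 land exactly on it.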
Criticism. It seems inconsistent to assume that Albert can lock himself into choosing r=q but was not able at the outset to lock himself into choosing q=p. [On second thought, what's really going on is that Albert believes at the outset that he's locked himself into choosing p, so it's not entirely inconsistent for him to believe at the intersection that he's locking himself into r=q.]
Mark IA. Mark I applies, but in order to trick his future self into choosing q=1/2, Albert “commits” at the outset to p=1. That is, he promises to always turn. Then when he arrives at an intersection, the formula 1/(2-p) tells him he is certainly at First Street (with p=1 he would never reach Second Street), leading him to choose q = 1 – p/2 = 1/2 and maximize his chance of getting home. This was nicely explained by Joeythepea in comments on the original post. Joeythepea, incidentally, is the same person as the standup comedian Joe Podwol whose YouTube clips are well worth a few minutes of your time.
Mark II. Albert always believes, perhaps incorrectly, that p=1/2, and can accurately predict that he’ll believe this in the future. This leads him to maximize (q + 2r – 2qr)/3, taking r as given. If he believes r is less than 1/2, this is maximized by taking q=1; if he believes r is greater than 1/2, this is maximized by taking q=0.
In either of these cases, Albert must believe that r differs from q, even though they are both solutions to the same problem. This seems implausible, so we can assume we’re in the only remaining case, namely r=1/2. Then Albert gets home with probability 1/3 independent of q, and so might as well choose q=1/2 also.
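The corner solutions in Mark II follow from the objective being linear in q. A small sketch, using the expression (q + 2r – 2qr)/3 from above with illustrative function names, lists the maximizing values of q for a few beliefs about r.

```python
from fractions import Fraction

def mark_two_objective(q, r):
    """Mark II objective: the general formula with p fixed at 1/2."""
    return (q + 2 * r - 2 * q * r) / 3

def mark_two_best_qs(r, steps=10):
    """All grid values of q that attain the maximum for a believed r."""
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    top = max(mark_two_objective(q, r) for q in grid)
    return [q for q in grid if mark_two_objective(q, r) == top]

print(mark_two_best_qs(Fraction(1, 4)))        # r below 1/2
print(mark_two_best_qs(Fraction(3, 4)))        # r above 1/2
print(len(mark_two_best_qs(Fraction(1, 2))))   # r equal to 1/2: every q ties
```

When r is exactly 1/2 the coefficient on q vanishes, so every grid point ties and the objective equals 1/3 regardless of q.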
Comment. The Mark II model predicts that Albert never deviates from 1/2 and gets home 1/4 of the time. Thus according to this model there is no paradox.
Mark III. Albert remembers p, and expects to remember q in the future. He expects to choose r according to some rule r=f(q). This leads him to maximize
[(1-q)f(q) + (1-p)q]/(2-p),
so he chooses q to satisfy f(q) – (1–q)f'(q) = 1 – p (assuming the rule f is differentiable). Consistency requires q=f(p) so f must be a function satisfying
f(f(p)) – (1 – f(p))f'(f(p)) = 1 – p.
Offhand, I’m not sure what the general solution to this equation might look like.
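One modest sanity check on the Mark III first-order condition f(q) – (1–q)f'(q) = 1 – p: if we plug in the rule f(q)=q, which is just Mark I's assumption r=q, the condition should be satisfied by q = 1 – p/2. The sketch below verifies this; the choice of f and all function names are illustrative assumptions, and this says nothing about the general solution.

```python
from fractions import Fraction

def foc_residual(f, f_prime, p, q):
    """Left side minus right side of f(q) - (1-q) f'(q) = 1 - p."""
    return f(q) - (1 - q) * f_prime(q) - (1 - p)

# Illustrative rule: r = f(q) = q, i.e. the Mark I assumption.
def identity_rule(q):
    return q

def identity_rule_slope(q):
    return 1

for p in (Fraction(0), Fraction(1, 2), Fraction(2, 3), Fraction(1)):
    q = 1 - p / 2  # Mark I's solution
    print(p, q, foc_residual(identity_rule, identity_rule_slope, p, q))
```

A residual of zero in every row means the Mark I answer is recovered as a special case of the Mark III condition.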
There are also Marks IV, V, VI and onward. I’ve decided to omit them in a futile attempt at brevity.
Some Conclusions. First, to avoid confusion, I think it is quite important to make careful distinctions among the variables p, q and r. A quick web search turns up several “solutions” that ignore these distinctions. Second, it’s quite important to be clear about one’s assumptions (which in turn requires maintaining careful distinctions among p, q and r). Third, it’s not surprising that different assumptions can yield different conclusions.
Fourth, the case that really seems to be paradoxical at its core is the case where Albert commits at the outset to p=1/2, arrives at the first intersection remembering this commitment, and then abandons it. At first I thought that the assumptions in this case were self-contradictory, because Albert is initially 100% sure he can commit to p=1/2 and then gets a chance to de-commit. Nobody—at least nobody in a good economic model—is 100% sure of anything unless that thing *is* 100% certain. But I no longer believe we can dismiss this case so simply. Even if we hold Albert to his original commitment (thus justifying his 100% certainty) there’s still an apparent paradox in his desire to de-commit.
Fifth: Several very smart economists have written about this problem without reaching any conclusions that are widely recognized to be definitive. (For example, here, here and here.) This should give pause to those who think the whole problem is based on some kind of cheap trick.
And sixth, I suppose that at some level it’s not terribly surprising that absent-minded people, doing the best they can with their limited cognitive skills, might get mauled by dinosaurs a lot.