Two days ago, we asked whether lawyers (or anyone else) can recognize reasonable doubt (or its absence) when they see it. Yesterday we stepped back to ask the question: What is the right standard for “reasonable doubt”? Should you convict when the probability of guilt is 70%? 80%? 90%? Of course the answer might vary with the crime and the prospective punishment. To keep this discussion concrete, I’ll focus on capital punishment for murder, and to keep it simple, I’ll pretend that no other punishment is available (so that the only choices are “acquit” and “execute”).

Today I want to take yet another step back and ask: What is the right standard for deciding the right standard? Here are two possible meta-standards:

1. Minimize the suffering of innocents.

2. Minimize suffering.

The difference is in whether we choose to care about the suffering of murderers.

Let’s be clear about what it means to apply the first of these meta-standards:

Lowering the reasonable doubt standard from, say, 95% to 90% means that more innocents will suffer from false convictions while fewer innocents will suffer from being murdered (either by killers freed to kill again or through the weakening of deterrence). To assess the trade-off, suppose for illustration that in a land of 300 million people, there are 300 cases a year where the defendant just clears that 90% hurdle. Of these, 30 (that is, 10 percent) will be innocent. Any given citizen has a 30/300,000,000 = .0000001 chance of being among these innocents. (That’s six zeroes.) On the other hand, freeing all 300 of these defendants means there will be (say) 3 additional murders. Any given citizen has a 3/300,000,000 = .00000001 chance of being among those victims. (That’s seven zeroes.) If you’re out to minimize the suffering of innocents, then the question is: Which do the innocents prefer, a tiny chance of being murdered or a ten-times-greater chance of being falsely convicted? If people generally prefer the first, we should lower the reasonable doubt standard to 90% (or possibly lower). If people generally prefer the second, we should maintain a higher cutoff.

To turn this into a concrete answer, you’d of course want to use more realistic numbers, which would include a realistic estimate of the deterrent effect, and a realistic estimate of which risk is preferable.
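As a rough check on the arithmetic, here is a minimal Python sketch that uses only the illustrative figures from the post (300 million people, 300 marginal defendants, 10 percent of them innocent, 3 extra murders if all are freed). The helper name `per_citizen_risk` is invented for this illustration, and none of the numbers are estimates of real-world rates.

```python
POPULATION = 300_000_000      # illustrative population from the post
MARGINAL_DEFENDANTS = 300     # defendants who just clear the 90% hurdle
INNOCENT_SHARE = 0.10         # 10 percent of those are innocent
EXTRA_MURDERS = 3             # the post's "(say) 3" if all 300 go free


def per_citizen_risk(cases: float, population: int = POPULATION) -> float:
    """Chance that a randomly chosen citizen is one of `cases` people."""
    return cases / population


false_conviction_risk = per_citizen_risk(MARGINAL_DEFENDANTS * INNOCENT_SHARE)
murder_risk = per_citizen_risk(EXTRA_MURDERS)

print(f"false conviction: {false_conviction_risk:.8f}")
print(f"murdered:         {murder_risk:.8f}")
print(f"ratio:            {false_conviction_risk / murder_risk:.1f}")
```

Swapping in more realistic inputs only requires changing the four constants at the top.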

The second meta-standard is similar, but it also accounts for the fact that murderers, like the rest of us, prefer not to be convicted of murder. And, of course, any one of us might turn out to be a murderer someday (or at least might have been born into circumstances that would have led us to become murderers). So the tradeoff becomes: A 300/300,000,000 = .000001 chance of a murder conviction (possibly deserved, possibly undeserved), versus a .00000001 chance of being murdered. (That’s five zeroes versus seven zeroes.) Would you prefer a tiny chance of being murdered or a hundred-times-greater chance of being convicted, possibly falsely, possibly not? Based on this calculation, you’d be a lot less likely to lower the reasonable doubt cutoff.

Which is the right meta-standard? Blackstone said it was better for ten guilty men to go free than for one innocent to suffer, suggesting that there’s something special about the suffering of innocents. (And also, though it’s not relevant here, apparently forgetting that the entire tradeoff he was addressing is between the suffering of innocents on the one hand, and the suffering of other innocents on the other hand.) But it’s not completely obvious how you’d justify this distinction. Our personalities and our life circumstances are assigned to us randomly. If John happens to be born with a more murderous personality than Jane, it does not clearly follow, as a matter of public policy, that we should put less weight on his interests.

No matter which meta-standard you adopt, you’ll probably also want to adjust it for the fact that when the reasonable-doubt threshold is set too low, it invites abuse by manipulative prosecutors. It’s easier to fake evidence that someone is 70%-likely to be a murderer than to fake evidence that someone is 90%-likely.

Which meta-standard seems right to you? Or do you have a better one?


#### 50 Responses to “Reasoning About Reasonableness”

1. Bennett Haselton

Is this a typo?

“Lowering the reasonable doubt standard from, say, 95% to 90% means that fewer innocents will suffer from false convictions while more innocents will suffer from being murdered (either by killers freed to kill again or through the weakening of deterrence).”

If you lower the reasonable doubt standard, doesn’t that *increase* the rate of convictions, which means *more* innocents will suffer from false convictions while *fewer* innocents will suffer from being murdered?

2. Tony Cohen

I have no way to even vaguely attempt to calculate this, but the trade-off may be psychological.

Since both of these situations have A MINUTE chance of happening, perhaps the mental cost makes it better for an innocent to go free. (Purely speculating here)

Would it be more psychically damaging to know about an innocent man rotting in jail, or to hear about a murdered man who was killed by someone who had been previously arrested but released due to the stringent aforementioned standard?

The numbers say we should be convicting more people for the numerical benefit of all. However, the ‘psychic’ cost may be greater if we do that…

3. improbable

I think that something of the spirit of Blackstone’s remark would be captured by a formula like this:

Minimize (the suffering of innocents due to state action) x 10 + (the suffering of innocents due to “natural” causes)

In other words, we should care not only about some total suffering, but also _who_ inflicts it. And I think we do care about this. To have someone in your family die in a car crash, or be murdered, is a tragedy. To have them falsely executed for murder is something much darker.

If I were more awake I would think harder about your case of total suffering of murderers…

4. John Faben

Steve, is there any reason to assume that the false positive and the false negative rate are strongly related? Are you assuming that we get a fixed number of convictions per murder?

5. Ryan

I don’t like the way these ideas have been put forth. If 1 person in 10,000 is a murderer, it does not follow that “every newborn has a 1/10,000 chance of being a murderer.”

You can’t assign prevalence rates to matters of human will.

6. Clifford Nelson

Although more people would be convicted if the standard was lowered, your assumption that the threat to innocents is reduced is not true. It just means that more innocent people are in jail and more guilty people are on the street. More seriously, I think the effect of lowering the standard is to increase the threat to innocents because prosecutors are concerned with convictions … little else. Lowering the standards allows them to prosecute a wider range of individuals for a crime. This is very disruptive to society. Not only does it put innocents in jail and leave criminals on the street, it also increases the sense of unfairness which in turn can result in violent conduct and harm to more innocents and also chills human behavior (as people become aware of the risk they choose to stay home).

7. Thomas Bayes

Four things can happen once we’ve acquired evidence about a person and must decide whether or not to convict:
(convict, guilty)
(convict, innocent)
(acquit, guilty)
(acquit, innocent)

For each of these events, we can assign a cost to society: Ccg, Cci, Cag, Cai. I’m going to assume that Cci is greater than Cai, and that Cag is greater than Ccg. That is, it is worse to have someone convicted and innocent than it is to have them acquitted and innocent; and it is worse to have someone acquitted and guilty than it is to have them convicted and guilty.

My criterion — the Bayes’ Criterion — for making a decision, then, is to minimize the expected cost. Based on this criterion, the test for guilt is simple. We will declare guilt when:

Pr[guilt | evidence]/Pr[innocence | evidence] > (Cci-Cai)/(Cag-Ccg)

The threshold for conviction to minimize the expected cost, then, is

Pr[guilt | evidence] > (Cci - Cai) / (Cag - Ccg + Cci - Cai).

Set your costs and see where the threshold turns out for you. Or, set your threshold and see what that implies about your costs.

Cag will reflect the potential cost to innocents when we release guilty people.
Cci will reflect the cost to innocents when they are convicted.
Cci will probably be much larger than Cag.

Because Ccg and Cai are the costs associated with getting it right, maybe you’ll set those to zero?
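To make the "set your costs and see where the threshold turns out" exercise concrete, here is a minimal Python sketch of the threshold formula from this comment. The function names and the example cost values are assumptions made up for illustration; only the formula itself comes from the comment.

```python
def conviction_threshold(c_ci: float, c_ai: float, c_ag: float, c_cg: float) -> float:
    """Smallest Pr[guilt | evidence] at which convicting has lower expected cost.

    Mirrors the comment's formula (Cci - Cai) / (Cag - Ccg + Cci - Cai).
    """
    extra_cost_false_conviction = c_ci - c_ai   # assumed > 0 in the comment
    extra_cost_false_acquittal = c_ag - c_cg    # assumed > 0 in the comment
    if extra_cost_false_conviction <= 0 or extra_cost_false_acquittal <= 0:
        raise ValueError("expects Cci > Cai and Cag > Ccg")
    return extra_cost_false_conviction / (
        extra_cost_false_acquittal + extra_cost_false_conviction
    )


def implied_cost_ratio(threshold: float) -> float:
    """(Cci - Cai) / (Cag - Ccg) implied by a chosen threshold in (0, 1)."""
    return threshold / (1.0 - threshold)


# Hypothetical costs: correct decisions cost nothing, and a false
# conviction is taken to be ten times as costly as a false acquittal.
print(conviction_threshold(c_ci=10.0, c_ai=0.0, c_ag=1.0, c_cg=0.0))
print(implied_cost_ratio(0.90))
```

Under those made-up costs the threshold works out to 10/11, a little under 91%. Going the other way, a 90% threshold corresponds to treating a false conviction as about nine times as costly as a false acquittal.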

8. Harold

As I look at it, at 90% we convict all the 300. At 95% we release them all. At 90% we get the greater chance of conviction (including false conviction). At 95% we get the extra murders. Going from 95% to 90% means we get more innocents convicted and fewer murders. Using your numbers, if we prefer a tiny chance of being murdered we should opt for the 95%. If we prefer the 10 x chance of false conviction, we should opt for 90%. This is the opposite of your statement. What is going on – have I mis-understood again?

The outcome of both being murdered and convicted is death, so we should care little between them. With the numbers given we should prefer the one with the smallest probability, which is 95% standard. If we change the numbers to say that there will be 40 extra murders at 95%, then the opposite is the case.

Except for a few things. The chance of being murdered is not uniform. If you are a young, poor black male or a convenience store owner the chances are much higher. Also the chances of being falsely accused are not uniform. If you are a young, poor black male the chances are much higher. So convenience store owners probably would prefer 90%. For young poor black males the effects may cancel out.

There is a point about what the 90% etc. means. I was groping toward this yesterday and think I have a better grasp now. A few people have alluded to it. I think a better way to put it is as follows:

If you tell the jury to convict if they are 90% certain, you may falsely convict more than 10%, i.e. you may get a lower than 90% standard. This is because of the way investigations work, and does not necessarily require malice or fraud on behalf of the investigators. What we don’t know is what level of “doubt” will lead to the 90% standard.

Haven’t even thought about the other meta-standard yet.

9. Al V.

Re. “… whether we choose to care about the suffering of murderers”, in the U.S. we care – at least somewhat – about the suffering of murderers, as the U.S. Constitution forbids cruel and unusual punishment.

Also, the deterrent effect is minimal, I think. Most murders are committed in the heat of passion or in the course of another crime. When someone kills in the heat of passion, they are not likely to consider the consequences. And in fact, at least anecdotal evidence is that criminals aren’t deterred by punishment anyway. I was watching a television program some years ago where a criminal was sentenced to life in prison under California’s “three strikes” law. He was asked why he committed the crime he did knowing the risk under the law. His response was “Obviously, I didn’t think I would get caught.” I suspect that is true in most cases – if someone thinks at all, he thinks he won’t get caught.

10. Ken B

The point about prosecutors being able to more easily fake a 70% case misses an important point. If John happens to be born with a more duplicitous personality than Jane, it does not clearly follow, as a matter of public policy, that we should put less weight on his desire to frame her for a crime she did not commit.

11. Harold

Have had a quick look at the other meta-standard. It is very counter-intuitive, as are so many of these things. It suggests we should prefer a high standard of proof because we may one day need to be falsely acquitted. We can simplify the calculation by ignoring the number of people in the country and counting the number who suffer.
In meta-standard 2, 300 people suffer at the lower standard (by being convicted), and only 3 suffer at the higher standard (by being murdered). On this basis we should prefer the higher standard.

If we extend this to its logical conclusion, then we should only convict if releasing 1 accused will cause more than 1 extra murder, even if we were 100% certain of guilt. This seems to be obviously wrong, if only because it ignores the “retributive” aspect of the justice system, the suffering of the survivors. The vast majority would prefer a system that correctly convicted the guilty, because the vast majority are not guilty.

In meta-standard 1, then 30 suffer at the lower standard (by being falsely convicted) and 3 suffer at the higher standard (by being murdered by one of the released). We should prefer the higher standard again. However, in this case, this changes if the released 300 murder more than 30 people.

12. Ken B

SL ignores a serious source of suffering: the families and friends of victims watching a guilty man (ex hypothesi) going free. Or of a wrongly convicted innocent (although by hypothesis he will seem guilty so this effect might be less). This is not a small matter because as this effect rises you will find more extra-legal action taken.

13. Dave W.

The question is a bit silly because most people place a positive value on the suffering of murderers.

I have a different metastandard to suggest:

The standard should be whatever standard is likeliest to cause spending parity in criminal trials (that is, the prosecutor and the defendant spend exactly the same amount of money, or money-equivalent, in doing the trial).

Why I suggest this metastandard:

In criminal trials, the probability of guilt, as perceived by a juror, will depend strongly upon which side spends more money. In cases where the two sides spend evenly (including investigators, experts and lawyers), then 90% probability of guilt means what you think it means. However, if the prosecution outspends the defense by a factor of 10 or 100 or 1000 or 1000000, then a 90% probability of guilt at the end of the trial does not mean what you think it means. It would correspond to a much, much lower objective probability of guilt. To put it a different way, if they had enforced spending parity, that same juror would not see a 90% chance of guilt, but rather a much, much lower chance.

This concern about spending disparity dominates the type of concern expressed by Blackstone.

So how does this impact the quantification of reasonable doubt:

The probability number corresponding to reasonable doubt should be a function of the proportional disparity in funding as between defense and prosecution.

If the prosecution outspends the defense by a factor of 100 or more, then the RD cutoff should be 99.999%

If the spending is equal, then it should go down to 90%

If the defense spends more (see, eg, the OJ case), then the RD cutoff should go below 90%.

14. Alan Wexelblat

I think you’re missing a significant factor here, which is what I think Blackstone was getting at: what is the relationship between citizens and a justice system?

If I believe that the system in my country has (what I consider to be) a high chance of convicting an innocent then that changes my relationship to, and interaction with, that system. You can’t retroactively restore that relationship by saying “well, we’re doing this because we believe it gives a greater overall chance of protecting people from harm.”

I depend on law enforcement to protect me from harm (rightly or wrongly) as well as my own available force of arms, good sense, and whatnot. The purpose of the justice system is to mete out justice, which is why cops don’t serve on juries and judges aren’t given arrest powers.

By framing a discussion of one in terms of the other you’re creating a question that is unanswerable, or whose answers will be wholly nonsensical because people come to the question with models unaccounted-for in your assumptions.

15. Steve Landsburg

Alan Wexelblat: I want two things from my justice system. I want it to protect me from criminals and I want it not to falsely convict me of anything. Insofar as there is some conflict between those goals, I have some preferences about the ratio at which they should be traded off.

What you seem to be advocating is that the justice system should choose a ratio that differs from the one I would choose for myself. Which is to say that you want the justice system to undercut me. I don’t see how that would improve my relationship with the justice system.

16. Thomas Bayes

Steve Landsburg:

I think we are confusing three different probabilities in this discussion.

The first is the probability of guilt given the evidence:
Pr[guilt | evidence]
This is determined by our prior assumptions about guilt and the quality of the evidence. For this one, we can ask how certain of guilt we need to be to make a conviction.

The next is the probability of a conviction given guilt:
Pr[conviction | guilt]
This is determined by the standard we place on the first probability when we make decisions. This probability is needed to answer the question about how many of the guilty people are walking the streets.

The final is the probability of guilt given a conviction:
Pr[guilt | conviction]
This is determined by the standard we place on the first probability, and by the prior assumptions we make about the population. This is the probability needed to understand how many of the convicted people are guilty.

None of these probabilities are equal, but we seem to be using one of them (the first) to answer all of the questions. Does this make sense, or am I off track?

17. John Faben

Thomas Bayes, I nearly wrote the same comment yesterday, but couldn’t quite figure out which the different conditional probabilities at stake were. I think you have it.

18. Steve Landsburg

Thomas Bayes:

None of these probabilities are equal, but we seem to be using one of them (the first) to answer all of the questions. Does this make sense, or am I off track?

The original question about urns was an exercise in computing Pr[guilt|evidence].

The policy question is: For what values of Pr[guilt|evidence] would you be willing to convict? (So far I am reiterating what you said.)

Once we make that policy decision, Pr[conviction|guilt] already depends on assumptions about the population. To see this, suppose that in Lower Slobbovia, all crime scenes have evidence of a quality that allows the criminal to be identified correctly with probability 95%, whereas in Upper Slobbovia, all crime scenes have evidence of a quality that allows us to identify the criminal for sure. If both countries adopt a 98% reasonable doubt standard, then Pr[conviction|guilt=yes] will be 0% in LS and 100% in US.

And of course Pr[guilt|conviction] depends also on assumptions about the population as you say.

19. Jonathan Campbell

Thomas B: Let’s label your 3 probabilities A, B, and C, in order. We have been debating the level that A should reach (call it A_0) before we convict. If we do this, then B can range between 0 and 100%, depending on the quality of the evidence brought in the trials of guilty people. C will equal A over the long term assuming enough trials.

20. Jonathan Campbell

Sorry, C will range between A_0 and 100% depending on quality of evidence.

21. Harold

The definition of a 90% standard is that it’s a standard according to which 90% of those who are convicted are actually guilty. This is Pr[conviction | guilt], isn’t it? There is no mention of evidence.

Your post today asks “What is the right standard for “reasonable doubt”?” This is Pr[guilt | evidence]. It is what doubt should we convict given the evidence.

So you are using one probability to answer the question based on another. In the case of the urn, we can be pretty sure we have all the evidence, because you have defined it so. Therefore the two probabilities are the same. In a real court, we can only use Pr[guilt | evidence]. We must try to establish what level of this probability equates to the 90% Pr[conviction | guilt] in order to answer the question posed. Is this right?

22. Thomas Bayes

Jonathan Campbell and Steve Landsburg:
Excellent points and explanations.

Here is my summary of what you’ve said:
If conviction is defined by this test
Pr[guilty|evidence] > P0

then

Pr[guilty|conviction] is also greater than P0, but Pr[conviction|guilty] will depend on how good the evidence is. You can always hold your standard high enough to ensure that convicted people are in fact guilty, but your evidence system needs to be good to ensure guilty people are convicted. It seems so obvious when it is said this way, but I think it is wonderful that all of these statements are mathematical facts that can be proven and quantified.

All of these points can be illustrated by replacing the balls in the two urns from the original example with (45 black; 55 red) and (55 black; 45 red). The evidence would be of much lower quality, so you would need to see 11 or 12 black balls (out of 12) to say they were drawn from the urn on the right with at least 90% certainty. This would happen less than 1% of the time, so you would rarely have certainty beyond reasonable doubt.
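The urn figures in this comment can be checked with a short Python sketch. It rests on assumptions that the comment does not spell out: each urn holds 100 balls, the 12 balls are drawn without replacement, both urns are equally likely beforehand, and "the urn on the right" is the one with 55 black balls. The function names are invented for this illustration.

```python
from math import comb

BALLS_PER_URN = 100   # assumption: 45 + 55 balls in each urn
DRAWS = 12            # from the comment: 12 balls drawn


def prob_black_count(k: int, blacks_in_urn: int) -> float:
    """Hypergeometric chance of exactly k black balls in DRAWS draws."""
    reds_in_urn = BALLS_PER_URN - blacks_in_urn
    ways = comb(blacks_in_urn, k) * comb(reds_in_urn, DRAWS - k)
    return ways / comb(BALLS_PER_URN, DRAWS)


def posterior_mostly_black_urn(k: int, prior: float = 0.5) -> float:
    """Pr[urn with 55 black | k black balls seen], by Bayes' rule."""
    like_mostly_black = prob_black_count(k, blacks_in_urn=55)
    like_mostly_red = prob_black_count(k, blacks_in_urn=45)
    return prior * like_mostly_black / (
        prior * like_mostly_black + (1 - prior) * like_mostly_red
    )


# Posterior certainty for the most lopsided draws.
for k in range(9, DRAWS + 1):
    print(k, round(posterior_mostly_black_urn(k), 3))

# How often the 55-black urn actually yields 11 or 12 black balls.
print(sum(prob_black_count(k, 55) for k in (11, 12)))
```

Under these assumptions the 90% line falls between 10 and 11 black balls. If the draws were made with replacement instead, the cutoff could shift, so the drawing rule matters for the exact count.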

23. Thomas Bayes

“The definition of a 90% standard is that it’s a standard according to which 90% of those who are convicted are actually guilty. This is Pr[conviction | guilt], isn’t it? There is no mention of evidence.”

No. You are describing Pr[guilt|conviction], which is the probability of guilt given that you’ve been convicted. This will be at or above the 90% standard. The probability of a conviction given that you are guilty, Pr[conviction|guilt], will, as Steve and Jonathan explained, depend on the quality of the evidence.

Both of these probabilities can be computed once you pick a P0 and decide to declare guilt when

Pr[guilt|evidence] > P0.

24. Harold

OK, let’s see if I have got this. Say we set Pr[guilt|conviction] (C) at 90%. This is our target from the 90% definition. Say we set Pr[guilt|evidence] (A) at 90%. Then C must be equal to or larger than A. It is impossible for more innocents to be convicted than is defined by the level of A. This is because lower quality evidence (such as the 55/45 ball split) can only make conviction less likely.

I may be getting the hang of this (or not)

Steve has said that C can be less than A (but not in those words) if manipulative prosecutors abuse the system by faking evidence. I would like to re-iterate that C could be less than A by honest prosecutors not gathering all the evidence, or not presenting it properly.

25. Mike H

Your last paragraph gives a clue – convicting the guilty is actually a means of protecting the innocent.

Whether you execute an innocent party or a murderer, there is some deterrent effect. If you release either one, you expect some extra murders. Naively, let’s assume these aren’t affected by the public perception of the case. Or that the judge is able to determine these effects for any particular case.

Let A = number of extra murders if we release an innocent party.
Let B = number of extra murders if we release a murderer.
Let C = number of extra murders if we execute an innocent party (probably C < 0).
Let D = number of extra murders if we execute a murderer (probably D < 0).
Let q be the amount we value the life of a murderer relative to a member of the general population, presumably 0<q<1.
Let p be the probability that our accused prisoner is actually guilty.

If we acquit, we expect a social cost of (1-p)A + pB deaths.
If we condemn, we expect a social cost of (1-p)(C+1) + p(D+q) deaths.

This implies a cutoff of (1-A+C)/(1-q-A+B+C-D)

Using numbers from your article and book (and some that I cook), A=0 (say), B=0.01, C=0 (say), D=-8 (from your 1994 book) this gives rather scary cutoffs of only 11% or 12%. In other words, give ever

Perhaps my values for A, B, C and D are unreasonable. If D=-1 instead, we get values between 50% and 99%, depending on what we choose for q.
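The cutoff formula in this comment is easy to explore numerically. Below is a minimal Python sketch; the function name `execution_cutoff`, the clamping to the range 0 to 1, and the refusal to handle a non-positive denominator are choices made for this illustration, while the formula and the values of A, B, C and D come from the comment.

```python
def execution_cutoff(A: float, B: float, C: float, D: float, q: float) -> float:
    """Guilt probability above which condemning costs fewer expected deaths.

    Acquit:  (1 - p) * A + p * B
    Condemn: (1 - p) * (C + 1) + p * (D + q)
    Setting these equal and solving for p gives the comment's cutoff.
    """
    numerator = 1 - A + C
    denominator = 1 - q - A + B + C - D
    if denominator <= 0:
        # The inequality flips here, so the simple cutoff no longer applies.
        raise ValueError("formula assumes a positive denominator")
    # Clamp to [0, 1]: values outside mean "always" or "never" condemn.
    return min(1.0, max(0.0, numerator / denominator))


# Values from the comment: A = 0, B = 0.01, C = 0, with D = -8 or D = -1,
# and q at the two ends of its range.
for D in (-8, -1):
    for q in (0.0, 1.0):
        cutoff = execution_cutoff(A=0, B=0.01, C=0, D=D, q=q)
        print(f"D={D}, q={q}: cutoff = {cutoff:.3f}")
```

Because the result is so sensitive to D, the sketch is mainly useful for seeing how quickly the cutoff moves as the assumed deterrent effect changes.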

26. rapscallion

“No matter which meta-standard you adopt, you’ll probably also want to adjust it for the fact that when the reasonable-doubt threshold is set too low, it invites abuse by manipulative prosecutors. It’s easier to fake evidence that someone is 70%-likely to be a murderer than to fake evidence that someone is 90%-likely.”

Are we not assuming rational expectations, in which case prosecutor manipulation would be factored into the juries’ p-values? If so, then the focus should be more on particular types of evidence and arguments than on jurors’ ability to apply Bayes’ Theorem.

27. Steve Landsburg

Mike H: Your calculation looks right, and does seem to give a frighteningly low cutoff (this is based on a *very* quick mental calculation, but I assume you checked this more carefully than I did). The reason, I think, why we don’t want that low a cutoff is that it’s much easier to falsify evidence when the cutoff is that low, so the police and prosecutors would be tempted to falsify a lot of evidence, either against people they don’t like or against people they want to convict just to clear their dockets.

28. Mike H

Unfortunately, if C is less than A-1, it implies a cutoff of 0%. Eg, if A=0, as long as executing an innocent party deters even one murder, it means everyone before the dock should get the chop. Unless, amazingly, D is bigger than B+q.

29. Thomas Bayes

Mike H:
You are setting B, which is the expected number of extra murders if we release a murderer, to be equal to 0.01. Does this mean that a typical murderer will kill 0.01 people, on average, if they are released?

You are setting D, which is the number of extra murders if we execute a murderer, to be equal to -8. Does this mean that a typical murderer would have killed 8 people, on average, if we wouldn’t have executed them?

Why isn’t D equal to -B? It seems to me that you are saying that releasing one hundred murderers will result in 1 additional murder, but if we execute a single murderer, then we will prevent 8 extra murders (plus the murderer’s death, to which you’ve assigned a reduced cost of q).

I see that you have set A = -C, which makes sense to me. What am I missing about the meaning of D and B?

30. Steve Landsburg

Thomas Bayes:

You are setting D, which is the number of extra murders if we execute a murderer, to be equal to -8. Does this mean that a typical murderer would have killed 8 people, on average, if we wouldn’t have executed them?

Mike H got this number from my book, The Armchair Economist, where I give it as a more-or-less-consensus estimate of the deterrent effect of executing a murderer. So it does not mean this murderer will kill 8 people; it means that the failure to execute will weaken deterrence sufficiently so that 8 additional murders will be committed (presumably by other murderers).

31. Mike H

@Thomas I got B from the blog post, and I got D from Steve’s book, as he points out.

@Steve It does give a frightfully low cutoff – but it’s applied to the very simplistic scenario where convicted killers are executed only, and social benefit is only measured in terms of lives saved.

Note also : the prosecutor could make a ‘social benefit’ decision about whether to arrest a suspect and bring them to trial – based on his or her own estimate of their guilt.

Note that if
* q=1 (say, for philosophical reasons)
* A=B (because, say, acquittals don’t make the news, but the trial process makes murderers swear off their violent ways and become decent, law-abiding citizens)
* D > C (because, say, the general public – including future murderers – don’t identify much with those stinky murderers, but everyone is terrified into non-violence when it is discovered that the executed prisoner was actually innocent)

then to minimise the social cost (measured in deaths) the court should *acquit* prisoners who are *more likely* to have done the dirty deed.

The maths doesn’t lie, but the thing it is telling the truth about is the model.

32. Harold

I am probably revealing my lack of understanding again, but why doesn’t C=D? Surely the deterrent effect is the same whether the person was actually guilty or not, since the executed person is assumed to be guilty?

33. Dave W.

Police and prosecutors falsify evidence sometimes, and that is a big problem. It is not the biggest problem, however.

The biggest problem is that police and prosecutors cherry pick what evidence to present, and then hope that the defense does not have the resources to find and put on evidence probative of innocence or of Constitutional violations or mitigation.

Take the case of Professor Gates and Sergeant Crowley. The way the story played out in the media was that Sergeant Crowley was called to Gates’ house because somebody thought it was being burgled. Crowley went to the house and saw Gates inside. Gates did not want to let Crowley in his house. Crowley did not want to let Gates go fetch his ID without Crowley following him through his own house. So Crowley came in, uninvited, and followed Gates to get Gates’ ID. Gates got angry at this and was arrested because he yelled insults at Crowley on his front porch. With this understanding of the facts (and knowing the applicable Fourth Amendment law), Crowley was arguably in the right.

However, there are missing facts. It turns out that the person who called the police didn’t particularly think that Gates and his driver (who was helping him with his luggage) were burgling the house. Rather, the caller stated to dispatch that another person thought the house might be being burgled, but the caller had her doubts about that. Dispatch did not ask the caller about her doubts. Rather, dispatch merely told Crowley (on the radio) that a caller thought the house was being burgled. Then, when Crowley got to the scene, someone (probably the caller) tried to speak with Crowley before he approached the house. There is a good chance that this bystander was trying to tell Crowley that it had become apparent that there was no burglary. Crowley refused to speak to the witness (he told the witness to leave) and proceeded to go up to the house, hoping to go inside.

This understanding of the case changes things considerably from a legal standpoint. More particularly it means that Crowley did not have sufficient cause to go in the house and was (in cahoots with dispatch) ignoring facts that suggested that Crowley had no right to go into the house uninvited. Which would mean that Gates had the right to yell, as his Constitutional rights were being violated.

It is not surprising that the charges against Gates were dropped. He would have hired a lawyer who would have hired an investigator who would have found that caller and also the witness in front of Gates’ house (assuming they were different people). The lawyer would have put on a good presentation convincing the jury that Crowley was out of line, not Gates.

But imagine that Gates had been poor. His public defender would not have developed these facts at trial. Instead, the well-funded prosecution would have thoroughly cleaned every skeleton out of the closet of Gates’ poor person equivalent. The unfavorable (to poor-Gates) facts would have been presented to the jury in a professional and eloquent manner. The favorable (to poor-Gates) facts would lie hidden and dormant.

In the main, this is the problem with a low RD cut-off. It is not that police and prosecutors outright falsify evidence so much as they over-develop the facts needed to get a conviction and ignore or passively obscure facts that disfavor conviction. Juries systematically get a distorted picture due to the funding inequity. At least when the defendant is poor.

When the defendant is rich, it is a different story. What do you think is the probability that the Ramsey parents lied to the police about what happened to little JonBenét? I’ll tell you this: if the Ramseys had been poor, just about any juror would have found both parents guilty of murder to 99.9%. The prosecution would have completely steamrolled them, and it would have done so without falsifying a darn thing. Does this mean that the (real) Ramseys were guilty of murder to 99.9%?

34. Thomas Bayes

“. . . it means that the failure to execute will weaken deterrence sufficiently so that 8 additional murders will be committed . . . ”

Aha! I see. Thank you. (Harold’s suggestion that this weakening effect should be there for the execution of innocent people too seems reasonable, though, but maybe I’m missing something again.)

Does execution strengthen deterrence in a similar manner? I suspect criminal psychologists have studied this. I should reread “The Armchair Economist”. Excellent book.

35. Harold

Dave W – this is pretty much my point also.
The 8 prevented murders is the result of an economic model. As Mike H says, the maths doesn’t lie, but all it tells you about is the model. The effect of these 8 saved murders is not large enough for anyone to actually find by analysis of the data using other methods. Everyone who uses different analysis techniques fails to find it. This could be because the execution rate compared to the murder rate is so low that “prevented” murders are lost in the noise. There are something like 20,000 murders annually in the US, and say 50 executions. At 8 murders prevented per execution, that is 400 murders, so you would be looking for the difference between 20,000 and 20,400. I would have thought that would show up in the data, especially since the executions are clustered, but it could be lost in the noise.

A key point is that if you seriously want to prevent 400 murders a year, there are probably plenty of ways to do it other than executing people.
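To make the arithmetic behind the 20,000 versus 20,400 comparison explicit, here is a minimal sketch. The inputs are the rough figures quoted in this comment (about 20,000 murders and 50 executions a year, 8 murders deterred per execution); the variable names are illustrative only.

```python
# Rough figures quoted in the comment above; all are approximations.
baseline_murders = 20_000      # annual US murders (approximate)
executions_per_year = 50       # annual executions (approximate)
deterred_per_execution = 8     # the disputed deterrence estimate

# Murders that would be prevented if the estimate were correct.
prevented = executions_per_year * deterred_per_execution

# Murder count we would expect to see without any executions.
without_executions = baseline_murders + prevented

# Size of the effect relative to the baseline.
relative_effect = prevented / baseline_murders

print(prevented, without_executions, f"{relative_effect:.1%}")
```

On these numbers the effect is a 2% shift in the annual total, which is the quantity one would have to pick out from year-to-year variation.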

36. Steve Landsburg

Harold:

Surely the deterrent effect is the same whether the person was actually guilty or not, since the executed person is assumed to be guilty?

I should think that the deterrent effect depends on the number of murderers executed.

If we execute 10 100%-sure murderers, we’ve executed 10 murderers. If we execute 10 90%-sure murderers, we’ve executed (in expectation) only 9.

37. Harold

OK, so if we put the numbers in and get a cut-off of 10%, then we know that we are executing lots of innocent people. We know we need to execute 10 people to get 1 guilty one, which will then prevent 8 extra murders.

I think I see a problem here: D is not an independent variable. The deterrence effect of 8 only applies if nearly all executions are of guilty parties. The lower the threshold, the lower the deterrence. D needs to be a function of p. This is beyond my abilities to formulate.
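One simple way to write D as a function of p, offered only as a sketch: assume, following the suggestion earlier in the thread that deterrence depends on the number of actual murderers executed, that each execution deters in proportion to the probability p that the executed person is guilty. The linear form and the function names below are assumptions made for illustration, not something established in this discussion.

```python
def deterrence_per_execution(p_guilty: float, d_if_guilty: float = 8.0) -> float:
    """Assumed linear form: D(p) = d_if_guilty * p."""
    return d_if_guilty * p_guilty


def murders_prevented(num_executed: int, p_guilty: float,
                      d_if_guilty: float = 8.0) -> float:
    """Expected murders prevented when everyone executed is guilty with probability p."""
    return num_executed * deterrence_per_execution(p_guilty, d_if_guilty)


# Ten executions at a 10% threshold: one guilty person in expectation.
low_threshold = murders_prevented(10, 0.10)

# Ten executions of people who are certainly guilty.
certain_guilt = murders_prevented(10, 1.0)

print(low_threshold, certain_guilt)
```

Under this assumption, ten executions at p = 10% prevent about as many murders as a single execution of a certain murderer, which matches the "execute 10 people to get 1 guilty one" arithmetic above. A different shape for D(p), for instance one that turns negative at low p, would change the conclusion.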

38. Alan Wexelblat

Steve: I’m not advocating that the justice system undercut you. I’m saying that asking the justice system to protect you is an error of kind. It’s rather like expecting the bus driver to teach you how to drive your car. The safe operation of a bus may improve your car-driving but it’s not the bus driver’s job to teach you. And it’s not the justice system’s job to protect you.

Here I’m referring to the bit of the justice system we’re discussing, which is the bit that decides guilt and innocence. A larger model that includes law enforcement in the system doesn’t help with the reasonableness problem, I think.

Even within the bit of the system that determines guilt or innocence, that determination is separated from the penalty determination. Guilt or innocence should be based on the facts at hand, not on externalities such as the likelihood that erroneously releasing a repeat offender increases the likelihood of more offenses being committed.

In the penalty phase of a trial, AFTER guilt is determined, one is allowed to introduce evidence such as a defendant’s past criminal record, recidivism rates for individuals convicted of the crimes at hand, etc. These all may be relevant factors to appropriate punishment, but they are NOT part of the determination of whether or not the person should be found guilty.

My claim is that the “choosing of a ratio” is a category error that fundamentally misunderstands how a justice system works.

39. Jonathan Campbell

Mike H: That is interesting. I bet D will flip to positive if too many people are murdered. If they start putting people to death for being 11% likely to have murdered someone, we’ll see more murders, not less.

40. Steve Landsburg

Harold:

The 8 prevented murders is the result of an economic model.

No, it’s the result of empirical data analysis.

41. Steve Landsburg

Alan Wexelblat: If the justice system isn’t protecting anyone from anything, then I suppose we might as well dispense with it entirely.

42. Ken B

@Steven Landsburg:
>Harold:
>The 8 prevented murders is the result of an economic model.
>No, it’s the result of empirical data analysis.

I double. The claim you make is not just that there is a statistical correlation, but also causality. (It’s not relevant to this discussion if it’s just a coincidence.) That inference depends on an assumption — some might call it a model — of how incentives work. Ergo, an “economic model”.

43. Ken B

@Steve Landsburg (again)
To AW you wrote “Alan Wexelblat: I want two things from my justice system. I want it to protect me from criminals and I want it not to falsely convict me of anything. Insofar as there is some conflict between those goals, I have some preferences about the ratio at which they should be traded off.” But that really isn’t an accurate summary of how you started this discussion. After all, a system which gives you peremptory and unlimited powers fits those two desiderata pretty well. You also want to consider the welfare of the murderers, or of the other falsely accused. So your rebuttal to AW misses the point, I think.

44. Steve Landsburg

Ken B:

That inference depends on an assumption — some might call it a model — of how incentives work. Ergo, an “economic model”.

The inference does not depend on any assumption about how incentives work, but it does (of course) depend on auxiliary assumptions about the validity of various instrumental variables, etc.

45. Ken B

@Steve Landsburg
Nah, that’s wrong. Your point about 8 is (stripped down): executions have a causal effect on the murder rate. That is not a purely empirical finding like I have 10 toes.

46. Steve Landsburg

Ken B:

That is not a purely empirical finding like I have 10 toes.

Right. It is a purely econometric finding, which does rely on some assumptions. But causality is not among those assumptions. Causality *follows from the data*, given those assumptions.

47. Harold

There is some semantic nit-picking here, I think. The Ehrlich approach is an econometric one, I think. I don’t fully understand it, but it is not simply an examination of the data of murder rates with and without executions to find the correlation. I think it is an attempt to control for all the other factors to isolate particular variables. I had thought that it depended on a model of utility in some form. Whether you call this an economic model is beside the point. I cannot say whether this is the best approach, but I think it is the only approach that finds this correlation. This may be because it is the only approach that is sensitive enough to detect it, or it may be some other reason.

I think it also found that there was a much stronger effect with increasing chance of getting caught. Since there is also evidence that juries are less likely to convict if the sentence is death, perhaps the abolition of capital punishment would deter more murders through the increased conviction rate. Perhaps this is controlled for in the empirical data analysis.

However, the cost of all the appeals for death row convicts is considerable. The money saved here could perhaps deter more murders if it were diverted to better detection. The key finding, perhaps, is that the total number of murders prevented by executions is at best rather small.

48. Mike H

@Harold asks why not C=D. If you want to make C=D, go for it. I don’t see why C should necessarily exactly equal D in all possible universes. E.g., executions might get reviewed by the media, and executions of innocents might be more likely than other executions to provoke public rage.

49. Ken B

@Steve
I think the other poster’s point was that causality only flows from the data if you have a model (in this instance about incentives). This is correct. You cannot prove a causal link without an experiment. You can infer a causal link from data and a theory. Otherwise you are just equating correlation with cause. That being the case, there IS an economic theory here.

50. Steve Landsburg

Ken B:

You cannot prove a causal link without an experiment. You can infer a causal link from data and a theory.

But you don’t need a *structural* theory, which in this case means you don’t need any assumptions about incentives.