How to Be Reasonable

Listening to Season One of NPR’s podcast Serial, which is the story of a real-life murder case, I came away about 80% sure that the defendant was guilty and 100% sure that I’d vote to convict him. This got me to pondering whether my standards for reasonable doubt (apparently less than 80% in this case) are in fact reasonable.

So I wrote down the simplest model I could think of — a model too simple to give useful numerical cutoffs, but still a starting point — and I learned something surprising. Namely (at least in this very simple model), the harsher the prospective punishment, the laxer you should be about reasonable doubt. Or to say this another way: When the penalty is a year in jail, you should vote to convict only when the evidence is very strong. When the penalty is 50 years, you should vote to convict even when it’s pretty weak.

(The standard here for what you “should” do is this: When you lower your standards, you increase the chance that Mr. or Ms. Average will be convicted of a crime, and lower the chance that the same Mr. or Ms. Average will become a crime victim. The right standard is the one that balances those risks in the way that Mr. or Ms. Average finds the least distasteful.)

Here (I think) is what’s going on: A weak penalty has very little deterrent effect — so little that it’s not worth convicting an innocent person over. But a strong penalty can have such a large deterrent effect that it’s worth tolerating a lot of false convictions to get a few true ones.

In case I’ve made any mistakes (and it wouldn’t be the first time), you can check this for yourself. (Trigger warning: This might get slightly geeky.) I assumed each crime has a fixed cost C to the victim and a random benefit B to the perpetrator. For concreteness, we can take C=2 and take Log(B) to be a standard normal distribution, though the results are pretty robust to these particulars. (Or, much more simply and probably more sensibly, take B to be uniformly distributed from 0 to C — the qualitative results are unchanged by this.)

Now assume that the size of the punishment is P and the number of convictions necessary to get one true conviction is always N. More precisely: I’ve assumed that for every crime, there are N suspects, one of whom is guilty, and all of whom are equally damned by the evidence. The question then is: Should we, given the strength of that evidence, always convict or never convict?

If we never convict, then all possible crimes are committed, so the net cost of crime is the average value of C-B, as B ranges over all values greater than zero. If we always convict, then crimes are committed only when B>P, and we have to add the cost of punishment to the cost of the crime, so the net cost of crime is the average value of C-B+NP, as B ranges over all values greater than P. (This assumes that punishment is a pure social cost, so we’re talking about prison sentences, not fines.)

Comparing the two costs, it turns out that for fixed N, the “never convict” policy is better when P is small, and the “always convict” policy is better when P is big — though the threshold value of P increases with increasing N.
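
If you’d like to check the comparison numerically, here is a minimal Python sketch of it. It discretizes the benefit distribution and searches for the smallest punishment P at which “always convict” beats “never convict,” for both the uniform and the lognormal versions of B described above; the particular grids and printed numbers are illustrative only.

```python
import numpy as np

C = 2.0  # cost of each crime to its victim (the value used in the post)

# Put the benefit B on a fine grid so both expectations are simple weighted sums.
b, db = np.linspace(1e-6, 60.0, 200_000, retstep=True)
dens_uniform = np.where(b < C, 1.0 / C, 0.0)                             # B ~ Uniform(0, C)
dens_lognorm = np.exp(-0.5 * np.log(b) ** 2) / (b * np.sqrt(2 * np.pi))  # log(B) ~ N(0, 1)

def cost_never(dens):
    """Never convict: every crime with B > 0 is committed, so the expected cost is E[C - B]."""
    return np.sum((C - b) * dens) * db

def cost_always(dens, N, P):
    """Always convict: crimes occur only when B > P, and each one punishes N people at cost P apiece."""
    mask = b > P
    return np.sum((C - b[mask] + N * P) * dens[mask]) * db

def threshold_P(dens, N, grid=np.linspace(0.01, 30.0, 3000)):
    """Smallest punishment on the grid at which 'always convict' becomes cheaper than 'never convict'."""
    base = cost_never(dens)
    return next((P for P in grid if cost_always(dens, N, P) < base), None)

for N in (2, 3, 5, 10):
    print(N, round(threshold_P(dens_uniform, N), 2), round(threshold_P(dens_lognorm, N), 2))
# In the uniform case the cutoff has the closed form 4(N-1)/(2N-1): about 1.33, 1.60, 1.78, 1.89.
# It rises with N, i.e. the harsher the punishment, the weaker the evidence at which convicting pays.
```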

In other words, the cutoffs look like this, so that in the blue-dot case — with quite weak evidence and a quite harsh punishment — the recommendation is to convict:

Whereas I might naively have expected something more like this:

I find this result surprising and wonder how it would hold up in a more fully fleshed-out model. The obvious place to start tinkering is with the assumption that all cases are identical. It’s pretty clear that in a world where the quality of the evidence against a murder defendant never rises above the 50% level, we’d want to convict at that level — the alternative, after all, is to effectively legalize murder. But if some cases are stronger than others, and if defendants don’t know in advance how strong the evidence against them will be, then we can achieve some deterrence with a much higher cutoff. This will surely, then, shift the red line in the graph to the right, but I don’t know whether it will affect the downward slope.


61 Responses to “How to Be Reasonable”


  1. 1 1 Harold

    What one “ought” to do and what people tend to do are different. I believe one argument against the death penalty is that the harsher penalty makes juries less likely to convict.

    Each graph works for one crime only – the “harshness of penalty” for shoplifting might range from 1 day to say 3 years, but for murder would range from 10 years to life perhaps.

    Something else to think about is “all possible crimes.” In this example it is all crimes that would be committed in the absence of punishment. In fact, nearly all crime that is *not* committed is not committed for reasons other than fear of punishment. I think most of us refrain from shoplifting, bashing a granny, or taking sweets from children out of some sense of right and wrong rather than fear of punishment.

    The model assumes that the punishment regime does not affect this, but we must be very, very careful here. A slightly increased sense of injustice could cause much more crime through reduction in “moral” abstinence than it reduces crime through fear of punishment. The model assumes no interaction between this “moral” crime reduction and the punishment regime. We can speculate that lots of false convictions could easily encourage some people to commit more crime as they feel less responsibility for other individuals.

    Another factor is that the length of punishment is far less significant than the chance of being caught. Most criminals probably believe the chances of serving any sentence are small, so it makes little difference how long the sentence is. If the money saved from shorter sentences were used to improve detection, that could possibly reduce crime much more.

    I am still thinking about the model – the results are interesting and counterintuitive.

  2. 2 2 jonny957

    “When you lower your standards, you increase the chance that Mr. or Ms. Average will be convicted of a crime, and raise the chance that the same Mr. or Ms. Average will become a crime victim.”

    I think you mean lower the chance that the same Mr. or Ms. Average will become a crime victim. Otherwise both of those options seem bad, and there is no trade-off.

  3. 3 3 Steve Landsburg

    jonny957: Thanks for catching this. I’ve fixed it.

  4. 4 4 Ken B

    Well, I haven’t looked carefully but there is an obvious hole here. Most compliance with the law is voluntary and depends on the populace perceiving the law and the government as legitimate. Jailing or executing a lot of innocent people almost at random will corrode that legitimacy pretty fast.

  5. 5 5 Jon

    This is an interesting post. I see why you’d vote to convict based on your reasoning (though I worry this line of reasoning suggests we should have extremely stiff punishments for all crime and create an authoritarian, punishment-driven state to reduce crime to zero), but could you elaborate on why you find him guilty?

    I’ve found people’s response to the Serial case is kind of a litmus test for their politics and worldview. Conservative lawyer friends are all convinced he’s guilty, but they also share the worldview that most people who end up on trial are guilty.

    On the other hand, I came away from the podcast feeling like Adnan may have been guilty, but there’s overwhelming room for reasonable doubt so I would not have voted to convict. It’s been a while since I listened to it, but I remember the prosecution’s evidence hinged on Jay’s testimony. The cellphone record showed objectively that Jay was lying (at least about some of what he was saying) and therefore, I felt, everything he had to say was open to question. On paper, I see nothing but reasonable doubt, and suspect most people who believe Adnan decidedly guilty are acting on intuition – he seems sketchy, and dishonest, and therefore must be guilty. But intuition is the opposite of what we want in a justice system, right?

    Of course, I have my own litmus-test bias here. I live in Virginia, where we’re seeing case after case after case of convicts being exonerated by DNA evidence years after the fact – primarily black defendants found guilty on intuition, proven innocent by objective facts. So I’m inclined to mistrust testimony, memory, and police motivations.

  6. 6 6 Steve Landsburg

    Jon: I’m a little hesitant to get into this discussion, as I have no more expertise than any other casual listener to the podcast. I’m sure there’s a lot that wasn’t in the podcast, and I’m sure there’s a lot in the podcast that I failed to digest.

    That said, I found Jay clearly credible overall, despite the fact that he was clearly dishonest about some details. The fact that he knew the location of the victim’s car strikes me as clear confirmation that he was involved in more or less the way he said he was involved. As for the obfuscated details, especially in the early police interviews, these seemed easily explained by Jay’s reluctance to reveal anything about his drug dealing operation.

    I also think we learn something from the defense lawyers’ attempt to pin the whole thing on a recently released serial killer with no plausible connection to Jay, and hence, it seems to me, no plausible involvement in the case. If they’re so desperate that they’re jumping at something this implausible, and so dishonest as to try to sell it, then I tend to discount everything else they have to say.

    Of course if I were actually on a jury, I’d pay far more attention to all this stuff and maybe my mind would change. But if I remained about as convinced as I am now (which is still about 80%), I’d certainly vote to convict.

    This, incidentally, seems to put me solidly in the mainstream, as empirical studies have found that real-life juries tend to draw the reasonable doubt line somewhere around 75%.

  7. 7 7 Dan

    The social costs (N*P) associated with convicting the guilty vs the innocent are probably not the same. The guilty who are not convicted are likely to commit more crimes in the future, and the innocent who are convicted can no longer contribute to society in a positive way.

  8. 8 8 Pete

    Along the lines of what Dan had to say,

    Once you convict the wrong person, you give up on searching for the right one. The lower the standards of not convicting the wrong person, the more likely the right one gets away.

    Or

    This takes power away from the juries and judges and puts it into the hands of the investigators and prosecutors, for better or worse.

  9. 9 9 Ken B

    @Pete 8
    Interesting. Doesn’t Steve’s model suggest we should keep searching and keep convicting? I mean, for some parameters it must pay to drop the threshold below 50%.
    https://en.wikipedia.org/wiki/Maximilien_Robespierre#/media/File:Robespierre_ex%C3%A9cutant_le_bourreau.jpg

  10. 10 10 Alan Wexelblat

    My instincts go along the same lines as Dan’s, which is to say I think your basic model is wrong in the sense of considering the wrong things. Let me see if this makes sense:

    Assuming all costs are equal we’d like to have a system that minimizes false positives, even at the expense of increased false negatives. This is the present justice system, which is weighted in favor of letting some guilty people go free in an effort to ensure fewer non-guilty people go to prison.

    Now let’s add costs into the situation. If you agree that in fact a crime has been committed in each case (that is, ignore cases of malicious or erroneous prosecution) then the cost of one erroneous conviction is ALWAYS going to be higher than the cost of an erroneous innocent finding. The reason is that if we incorrectly convict an innocent person we thereby must always also incur the cost of erroneously NOT convicting the correct guilty person. The two always go hand-in-hand as far as I can see, and this is independent of the severity of the punishment.

    Your model says, “When you lower your standards, you increase the chance that Mr. or Ms. Average will be convicted of a crime, and lower the chance that the same Mr. or Ms. Average will become a crime victim.” I think you have that backward, by the reasoning above. With lower standards you increase the rate of erroneous convictions, which is also increasing the rate at which true criminals go free. By increasing the rate of ‘escape’ of true criminals I think you are INCREASING the chance that Mr/Ms A will become a crime victim.

    Does this make sense?

  11. 11 11 Pat

    If you could make the punishment infinite, would that make the burden of proof approach 0%?

    I understand your logic but positive externalities do not justify it. The harsher the punishment, the more you need to be sure that the person is guilty.

  12. 12 12 Steve S

    “But a strong penalty can have such a large deterrent effect that it’s worth tolerating a lot of false convictions to get a few true ones.”

    You lost me here. I get what you’re trying to do but putting innocent people in jail has such a high negative personal cost that I don’t really care what the positive societal benefits might be.

    Has the deterrent effect of strong penalties even been measured sufficiently for you to claim it is “large”? I don’t have research in front of me but I thought most states moved away from capital punishment after realizing it didn’t really deter anything.

  13. 13 13 Thomas Purzycki

    If we tried to apply the model to punishments that WERE fines instead of prison sentences, “always convict” would be the winner for any N or P (keeping the assumption of zero-cost enforcement). I don’t think focusing on prison-style punishments magically renders irrelevant whatever limitations produce that silliness.

  14. 14 14 Thomas

    I have the following problems with your analysis:

    1. It’s rare that there are N suspects against all of whom the evidence is equally damning.

    2. Not all N suspects are prosecuted; only 1 is. And that 1 is usually the person against whom the evidence is at least strong if not compelling.

    3. Cases are tried one at a time, and what happens in other trials involving the same crime doesn’t matter. What matters is the evidence against a particular defendant in a particular trial. I can’t imagine being on a jury and voting to convict or acquit on the basis of your model.

    4. Your model, as I understand it, depends on a concept of net social cost. But net social cost is irrelevant to the victims of crime. They don’t bear a net social cost; they bear a cost, period. And, for crimes in which victims can’t or won’t receive restitution — because they’re killed or maimed, or because (as is usually true) the criminals can’t afford restitution — that cost is borne by them and them alone. They couldn’t care less about net social cost. Nor should they.

  15. 15 15 Andrew_M_Garland

    Mr. Landsburg,

    Your post is a very complicated presentation of the standard moral dilemma. I adapt it to the proposed case.

    Assume that 120 grisly public executions each year could stop almost all murders, if murders were so punished. Would it be justified to lower the standards of evidence so that at least 120 people were found guilty and executed this way each year?

    Note that one would have to convict some innocent people to be executed each year after the first, because murder would no longer happen often, being suppressed by these executions. But, one still needs the executions to have the suppressing effect.

    If you are a socialist, then the greater good wins, and you break a few unlucky eggs to further the utopia. The greater populace benefits by this policy.

    If you are an individualist, then it is a shocking crime to convict and execute someone on less than the best evidence so that their example can be used as advertising.

    I am sure that your math supports the socialist case. Sacrificing 120 people to prevent thousands of murders is a good deal, even in terms of lives versus lives. It is horrifying to me that you would embed these principles into a math problem and say “It goes against our instincts about justice, but that is what the math tells us.”

    Another objection. You assume that bigger punishments have proportionally greater effects at suppressing the targeted crimes. What is the evidence for that?

    I have read that criminals do adjust their behavior to avoid harsher penalties, but they don’t stop being criminals. For example, criminals will kill all witnesses if they fear severe penalties. There is also the example of pickpockets working the crowd at hangings.

    http://www.time.com/time/magazine/article/0,9171,894775,00.html
    == ==
    [edited] Around 1800, British hangings were attended by huge crowds, and since spectators were preoccupied with watching the gallows, hangings were favorite hunting grounds for pickpockets. Naturally, picking a pocket was a capital offense.

    If opponents of capital punishment had to sum up their entire case in one image, it would be a 19th-century English pickpocket reaching for the wallet of a spectator at the hanging of a pickpocket.
    == ==

  16. 16 16 Roger

    Steve, I am surprised that you accept 80% and others accept 75%. There are 3 major legal standards for proof, and I would have guessed:

    preponderance of the evidence 51%
    clear and convincing evidence 80%+
    beyond reasonable doubt 98%+

    Also, I would have thought that the main argument for convicting on a grisly crime with relaxed standards is this: society needs closure and people hate the idea of a horrible murderer going free. (And they hate it enough to convict someone on weak evidence.)

  17. 17 17 Ian

    In addition to the problems pointed out above, why assume that B is always less than C? I’m sure we can all think of cases where this doesn’t hold. Further, why assume that the cost to the individual is the same as the cost to society?

  18. 18 18 Harold

    Andrew 15. The moral dilemma you describe is interesting. The problem with translating it into action is that we do not know that such execution would suppress murders to zero. In fact, we are certain that it would not have this effect. When we consider the dilemma, we cannot help but think about it in terms of reality, so we reject the solution that might be a good one because we know it will not work in practice.

    Trolley problems are variants of these dilemmas. Most people think that we should pull a lever that will kill one to save 6. In this example, we can easily see that the action we propose (pulling the lever) will very likely have the effect described (saving 6 by killing one).

    If we think it is right to take this action, why is it not right to “pull the lever” of false conviction for one in order to save 6 others from being killed?

  19. 19 19 Steve Landsburg

    Thomas:

    Your model, as I understand it, depends on a concept of net social cost. But net social cost is irrelevant to the victims of crime. They don’t bear a net social cost; they bear a cost, period. And, for crimes in which victims can’t or won’t receive restitution — because they’re killed or maimed, or because (as is usually true) the criminals can’t afford restitution — that cost is borne by them and them alone. They couldn’t care less about net social cost. Nor should they.

    Of course. But this post wasn’t about what the victims should care about; it’s about what the jurors should care about.

  20. 20 20 Steve Landsburg

    Andrew Garland:

    Your post is a very complicated presentation of the standard moral dilemma. I adapt it to the proposed case.

    It seems to me that if you can say this, then you’ve missed the whole point of the post, which of course could be my fault as much as yours.

    The main point of the post was the (to me, highly surprising) fact that in a simple model, the red curve slopes downward when I would have expected it to slope upward — and to raise the question of whether this is something we should now expect to see in a more realistic model.

    In other words, the question is: Is the P-cutoff an increasing or a decreasing function of N? This is, first of all, not a moral question. It is also, second of all, a question I’ve never seen raised anywhere else. If it’s in fact standard, I’ll be glad for a reference.

  21. 21 21 Steve Landsburg

    Roger:

    98% ?!??!!? Really? I’d be stunned if anyone has ever applied such a stringent standard, or ever seriously argued for it. Even the famously stringent Blackstone standard (freeing 10 guilty men to avoid convicting 1 innocent party) is only about 91%.

  22. 22 22 Ken B

    “crimes are committed only when B>P”

    Hmmm. If I as prospective perpetrator am considering a crime, don’t I care about P/N? Assuming I will be one of the N defendants, I pay P only 1/N of the time. So crimes are committed when BN>P I think.

    Am I missing something?

  23. 23 23 Roger

    Yes, it is quite common to be 98% sure of criminal guilt. There may be surveillance photos, fingerprints, DNA, confessions, possession of stolen goods, backstabbing ex-partner, etc. The Serial case was picked because it happened to be a relatively weak case. Most cases are very strong.

  24. 24 24 Steve Landsburg

    Ken B: If you commit a crime, you and N-1 others will all be accused and convicted. Each of you suffers punishment P. You care only about your own punishment, not the others. So you compare B to P.

  25. 25 25 Harold

    “If you commit a crime, you and N-1 others will all be accused and convicted.”
    This was mentioned above, and I am not sure I have understood your position. Only one person is convicted of each crime. If Ken B was the first, there would be no others, since the crime is then solved, even if there are N-1 others with sufficient evidence against them.

  26. 26 26 John Alcorn

    Ken B (comment no. 4) notes, “Most compliance with the law is voluntary and depends on the populace perceiving the law and the government as legitimate.” To extend this line of reasoning, accurate enforcement of laws requires more than mere compliance by the law-abiding citizens. It requires active cooperation by citizens who are in a position to supply necessary evidence in the form of tips and testimony. Information (evidential warrant) is a scarce, strategic good. Citizens refrain from active cooperation with law enforcement, when they believe that verdicts and sentences violate norms of fairness. If jurors follow Prof. Landsburg’s model of what is reasonable, might they violate community norms of fair punishment, and thereby deter active cooperation with law enforcement by citizens who often have the keys to evidence?
    See Akerlof and Yellen (1993), “Gang behavior, law enforcement, & community values,” (Brookings Institution): http://www.brookings.edu/research/articles/2013/10/gang-behavior-law-enforcement-community-akerlof-yellen

  27. 27 27 Jon

    This model framework is clearly wrong because if we modify it for the case of a fine, the net social cost is zero, but we will still deter crimes where the benefit is less than the fine times the probability of conviction.

    This would lead to the conclusion that we should be indifferent to imposing large fines on people with little evidence, which is of course not acceptable. In fact as a juror I should always vote for the fine because I would get a net benefit (tiny) from having the fine offset my taxes.

    There are several sources of this flaw. Mostly, a system that randomly punishes innocent people will lose public confidence. Secondly criminals often perceive a lower cost of punishment than innocent people.

    Also, I believe there is a fundamental mathematical error in this. The implicit assumption is that if I commit a crime, the probability of getting punished is proportional to the probability of being convicted conditional on my being tried. That is:

    prob(convict | guilty ) = constant X prob(guilty|level of evidence & trial)

    But I don’t believe this constant is “constant” because as one lowers the threshold for determining someone is guilty, the pool of potential innocents increases. Thus there is an increased likelihood of convicting the wrong person and a decreasing likelihood that the guilty will face punishment – unless this is offset by a corresponding decrease in the probability that nobody will be tried.

  28. 28 28 delurking

    First, this:
    “If you commit a crime, you and N-1 others will all be accused and convicted.”
    is nonsensical. You cannot run a system of justice that way. The less evidence there is for a particular crime, the larger N will be for that crime.

    Second, this:
    “98% ?!??!!? Really? I’d be stunned if anyone has ever applied such a stringent standard, or ever seriously argued for it. Even the famously stringent Blackstone standard (freeing 10 guilty men to avoid convicting 1 innocent party) is only about 91%.”
    is simply false.

    You cannot calculate what the confidence for conviction needs to be given only the threshold that it is better to free 10 guilty people than convict one innocent person (but not better to free 11 per falsely convicted innocent person). You need another input to the equation, namely the ratio of false to true accusations.

  29. 29 29 Ken B

    There is still something wrong with the numbers governing my decision to commit a crime. We have N guys convicted, but the total population is X. Your cost-to-the-malefactor calculation seems to require that I be in the N. Why assume the perpetrator is always part of the N cases? (Why not let N = 1 if we can assume that?)

  30. 30 30 danyzn

    This makes perfect sense. The case is clearest if B has a positive lower as well as upper bound, say B_min and B_max respectively. Now if P is lower than B_min (“slap on the wrist”), the punishment is pointless as no one is deterred. You should not annoy people by ineffectually slapping their wrists. If P is higher than B_max (“painful death”), then always convicting would result in zero crime as the deterrence is perfect. You should always vote for painful death since you can thereby eliminate crime, and with zero crime, you would never actually have to impose painful death on anyone.

    Maybe a story can highlight the reasoning and also what’s wrong with it in practice. Alien social scientists give everyone on earth a Universal Tickling Button (UTB). Any human being can, by pressing a button (in secret, unobserved by anyone else) cause 5 seconds of mild discomfort to everyone on earth. Once any UTB has been pressed, all UTBs self-destruct, so this would only be a one-off bout of discomfort.

    The aliens also give the World Government a Universal Torture Device (UTD) which if activated will cause everyone on earth to undergo eternal torture. Some economist advises the World Government to adopt the policy that, if anyone were to ever press their UTB, it would immediately activate the UTD in response. Since no one wants eternal torture, everyone would refrain from pressing the UTB, and the policy would be successful in preventing 5 seconds of mild discomfort for everyone on earth, at zero cost. Everyone on earth! Think of the utility improvement from this costless policy! Thanks brilliant economist!

  31. 31 31 John

    Your proposed trade-off is false. A lower burden of proof means more innocent people will satisfy it, and the less likely the true criminal will be caught, making criminals bolder and more likely to commit crimes. In other words, the lower the evidentiary standard, the more likely Mr Average will be falsely accused, and the MORE crime Mr Average will suffer, a lose-lose proposition.

    The correct trade-off, in my opinion, is the one that minimizes criminal incentive. If evidentiary standards are too low, patsies get caught, criminal incentives are high. If evidentiary standards are too high, nobody gets caught, and criminal incentives are high. Somewhere in the middle, criminal incentive is minimized.

  32. 32 32 Graeme

    While this mathematical model holds water, it suffers from the same shortcoming as the homo economicus arguments of late 20th century economics: assume all humans are rational and that they act in their own best interests. You assume humans who are likely to commit crimes respond to deterrence in some linear or non-linear rational fashion. Everything we know about human psychology, though, shows us beyond any doubt that this is not true.

  33. 33 33 Sisyphus

    The premise of this article appears to be that:

    a. A criminal act is a rational decision that weighs prospective costs vs. benefits of the act
    b. Prospect of punishment is a deterrent to crime, and the deterrent effect is proportional to the severity of punishment
    c. Expected probability of being caught and convicted is perceived by the criminal to be high

    Empirical evidence, on the contrary, suggests that criminals, at least the ones who are actually caught and so can be evaluated, are quite poor at relating the above points to their own situations. Rather, they tend to be impulsive, exhibit very short-term thinking about rewards and punishments and have poor ability to imagine long-term consequences. Consequently, the prospect of punishment is in no way a deterrent to criminal behavior.

    Effective deterrent to crime, it would follow, would consist of removing the crime-prone individuals from society. In fact, we do see a correlation between the rise of mass incarceration and reduction in violent crime, even if we do not know whether there is a causal relationship.

    Fundamentally, the original function of law enforcement (broadly defined, to include police, the judiciary and corrections) has been to subsume the retributory function into itself and away from the victims’ families, as a way of preventing blood feuds. Insofar as punishment is meted out with sufficient certainty, it is clear that this function is fulfilled.

    What is far less clear is how just the punishment in fact is. Even if the above-mentioned mass incarceration is the direct cause of the drop in crime, is it actually just that offenders serve sentences as long as 30 years to life for relatively minor crimes, such as grand theft, assault, or drug dealing, simply because of their prior rap sheets? For that matter, is it just for people to serve very long sentences for murder? After all, murder is a crime with the lowest rate of recidivism.

    Fundamentally, to summarize, we, as a society, determine the forms of retribution that we consider to be just without any regard for their efficacy in reducing crime rates.

  34. 34 34 Howie

    Does your model account for the diminishing marginal deterrent effect of additional punishment as sentence lengths increase? There’s substantial evidence that the difference in deterrence between a one year and a two year sentence is much bigger than the difference between a 49 year sentence and a 50 year sentence. Folks have used this to argue that punishments that are swift and certain can achieve much more deterrent effect at much lower cost.

    https://en.wikipedia.org/wiki/Swift,_Certain,_and_Fair
    http://nnscommunities.org/our-work/strategy/swift-certain-fair

  35. 35 35 Jon

    Another problem stems from the fact that the formulation assumes there is a numerical “benefit” that determines whether the crime will be committed.

    Actual observations show that many crimes are committed regardless of the severity or likelihood of severe punishment; arguing that the fact the crime is committed proves that the benefit exceeded the value of the expected punishment sheds no light whatsoever.

    Thus any realistic model would assume that for any crime there is some base level that is independent of the severity, timing, and likelihood of punishment. You then will get a qualitatively different result.

    BTW: it only makes sense for the severity, timing, and likelihood to be within feasible limits; arguing that a child molester could be deterred by the 100% likelihood of extreme torture 1 second after the incident is irrelevant because that ain’t going to happen.

  36. 36 36 Don

    This is an extremely silly idea. I’d like to be professional and use better, more respectful words, but trust me that silly is the nicest I could be. I teach economic approaches to law, specifically with theories of punishment, and this is a unique form of bastardization of these concepts, the like of which I have never encountered before. The author speaks purely in the language of deterrence and applies a rigid algorithm, but in doing so betrays his complete lack of understanding of the concept. A pure cost and benefit analysis of deterrence as the author has attempted should not in any way be concerned with confidence levels of the guilt of the accused; cold, calculating deterrence only cares about using people as instruments via punishment-driven incentives. In other words, a cold application of strict deterrence would say that you should sentence someone to life in prison even if you are 100% certain of their innocence, so long as there is some belief in society that the person is guilty and the punishment will produce a net gain in society even at the accused’s expense. It seems to me that the author is either not fully cognizant of the philosophy of deterrence or is mixing deterrence principles with principles of incapacitation, yet applying some rigid formula in the name of deterrence. It’s all very disconcerting to read, to be frank.

  37. 37 37 Leo

    As a long-time prosecutor I can assure you this is how things work in real life. Getting a jury to convict on a petty crime, such as a shoplift or minor in possession of alcohol, is a near impossible task, while jurors will often convict someone charged with rape of a child or murder on limited evidence. If you let a shoplifter walk free no one really cares. But people tend to err on the side of caution when it involves extraordinarily serious offenses. No one wants to be responsible for setting a child rapist or a murderer free.

  38. 38 38 Miguel Madeira

    “Comparing the two costs, it turns out that for fixed N”

    I think this is the problem with your model – one of the results of having lax vs. strict criteria for conviction is a change in the number of people who are convicted, so I think it does not make much sense to assume a fixed “N”.

  39. 39 39 Miguel Madeira

    Let’s try a different model:

    L – laxness criterion for conviction
    p1(L) – probability of convicting the guilty
    p2(L) – probability of convicting an innocent

    Crime occurs if B>p1(L)*P

    Net cost of crime:

    C-B+[p1(L)+p2(L)]*P

    A higher “L” reduces the crime level, but simultaneously raises the cost of crime, and this increase is larger when the punishment is larger (in practice, the higher the punishment, the higher the social cost of punishing an innocent).
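
Here is a small Python sketch of one way to instantiate this. The specific shapes assumed for p1(L) and p2(L) below are illustrative assumptions only, since the comment leaves them abstract, and B is taken uniform on (0, 2) as in the post’s simpler variant.

```python
import numpy as np

C = 2.0
b, db = np.linspace(1e-6, C, 50_000, retstep=True)
dens = np.full_like(b, 1.0 / C)  # B ~ Uniform(0, C)

def p1(L):  # assumed shape: chance of convicting the guilty rises with laxness, capped at 1
    return min(1.0, 1.5 * L)

def p2(L):  # assumed shape: chance of convicting an innocent grows faster as standards loosen
    return L ** 2

def net_cost(L, P):
    """Crime occurs when B > p1(L)*P; each committed crime costs C - B + [p1(L) + p2(L)]*P."""
    mask = b > p1(L) * P
    return np.sum((C - b[mask] + (p1(L) + p2(L)) * P) * dens[mask]) * db

for P in (0.5, 1.0, 2.0, 4.0):
    laxness_grid = np.linspace(0.01, 1.0, 100)
    best = min(laxness_grid, key=lambda L: net_cost(L, P))
    print(P, round(float(best), 2), round(net_cost(best, P), 3))
# Under different assumed shapes for p1 and p2, the cost-minimizing laxness can move with P
# in different ways, which is the point of the comment: a fixed N assumes that away.
```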

  40. 40 40 Harold

    Don, in my view the idea is not presented as a policy proposal, but as a puzzle. The question that is interesting to me is why is this policy not appropriate? I don’t think you have answered this.

    Here is where I think the error lies. “so the net cost of crime is the average value of C-B+NP.” For ONE crime, we have one C, one B and N x P. That is, for one crime we convict N people.

    Now, this is not how it works, nor how most people think it works. We only ever convict one person for each crime.

    If we are proposing changing the system to one that convicts everybody who passes a threshold of evidence, then the model would have some validity. But we are not.

    Therefore, the comment of John is correct “In other words, the lower the evidentiary standard, the more likely Mr Average will be falsely accused, and the MORE crime Mr Average will suffer, a lose-lose proposition.”

    For each crime, we will still only convict one person, but it is much more likely to be the wrong person. Therefore the true conviction rate falls, and since it is fear of being caught rather than fear of punishment that is the biggest deterrent, we will get more crime.

  41. 41 41 Ken B

    @ Miguel 38
    I think Steve is just noting a partial derivative. For a fixed N the slope is down, hence for all N the slope is down. That is, pick a standard of proof, and the slope is down. He is not saying you need a fixed N.

  42. 42 42 Ken B

    Let’s assume for the moment Steve’s model is right.
    What does that tell us? That juries do or should apply looser standards? No. But let’s get away from juries for a second, and replace the 12 man committee by a vaguer judgmental forum: public opinion, or for a more particular example, the collective judgment of a community.
    Can this model tell us things about communities prone to collective judgment?
    What it tells us is that *if* a social enforcement mechanism arises and if it follows this rule *then* it will for a time succeed in the sense that the community members will feel it is working. This is because the rewards exceed the costs.
    A hair-trigger rush to judgment form of heretic hunting will feel like a good thing to the community, at least for a while.

    So before you scoff: Twitter; Tumblr; Salem; Mizzou.

  43. 43 43 Daphne Sylk

    Nice math, and for all I know, accurate. People aren’t math, nor are they reasonable.
    First, the justice system is flawed. One defendant gets a public defender six months out of law school, another defendant charged with a similar crime can afford a gaggle of experienced and connected attorneys with a vast array of expert witnesses.
    Second, we’re all chock full of biases that trump reason every time. For instance, a person selected for jury duty was, as a child, humiliated or punished by an adult wearing a bright red tie. She doesn’t recall the incident, much less the tie. One of the lawyers, however, is wearing a bright red tie. There’s subconscious distrust and anger. The juror will mentally take the opposite side of whatever the lawyer says. There is no way she’s letting that attorney get a win.

  44. 44 44 A Critic

    The deterrent effect of the severity of punishment is a myth without basis. Look at nations such as Iran and Thailand, which have the death penalty for drug dealing yet have not seen any decline in drug dealing rates.

    It is certainty of punishment that is a deterrent. If innocent people are punished then guilty ones shall go free and punishment is anything but certain for the guilty, thus removing any deterrent effect of punishment.

  45. 45 45 Steve Davenport

    You have some risky assumptions. 1) Criminals know the punishment for their crime and incorporate that rationally into a cost-benefit analysis. Let’s let that one slide. 2) That a punishment that is twice as severe is twice as powerful a deterrent. If you modeled deterrence as log(P), but the social costs of punishment as P, then I think that would dramatically change your model results, since punishment would then have sharply diminishing marginal returns but constant marginal social costs. There’s a big literature on how punishment deters criminals; the “Swift-Certain-Fair” paradigm of probation/parole programs is generating very good evidence that punishment has the best deterrent effect when it is SWIFT and CERTAIN; severity is a much less effective (but more expensive) way of generating deterrence.
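
For what it’s worth, here is a minimal Python sketch of that variant: deterrence is modeled as log(1+P) (that particular functional form is an assumption chosen purely for illustration, since the comment doesn’t pin one down) while the social cost of punishment still scales like N*P, with B uniform on (0, 2) as in the post’s simpler case.

```python
import numpy as np

C = 2.0
b, db = np.linspace(1e-6, C, 50_000, retstep=True)
dens = np.full_like(b, 1.0 / C)  # B ~ Uniform(0, C)

def cost_never():
    return np.sum((C - b) * dens) * db                       # every potential crime is committed

def cost_always(N, P):
    deter = np.log1p(P)                                      # assumed deterrence grows like log(1+P)
    mask = b > deter                                         # crimes occur only when B exceeds the deterrent
    return np.sum((C - b[mask] + N * P) * dens[mask]) * db   # punishment cost still N*P per crime

def threshold_P(N, grid=np.linspace(0.01, 10.0, 2000)):
    base = cost_never()
    return next((P for P in grid if cost_always(N, P) < base), None)

print([(N, round(threshold_P(N), 2)) for N in (2, 3, 5, 10)])
# Compare these cutoffs with the linear-deterrence ones to see how much the assumed form matters.
```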

  46. 46 46 iceman

    Remind me again why we care about B? Is it some pure form of utilitarianism? What would happen to the model if B>C?

    Harold #40 makes a good deal of sense to me

  47. 47 47 Don

    Harold, I implied why the policy is not appropriate, but if it wasn’t fully clear that is my fault and I will clarify. 1) It is inappropriate as it purports to have deterrence as its sole goal but operates in a way that is inconsistent with the principles of pure deterrence. 2) Wanting to craft systems of punishment that function exclusively in a calculating deterrence manner is extremely unethical and fraught with problems; see the example in my post of deterrence demanding extreme punishment even with certainty of the innocence of the accused.

  48. 48 48 Steve Landsburg

    A Critic:

    If innocent people are punished then guilty ones shall go free and punishment is anything but certain for the guilty,

    No, you misread. N people are all equally likely to be guilty. They’re either all punished or none are. So the guilty are punished only if the innocent are.

  49. 49 49 Harold

    “They’re either all punished or none are.”

    Have I got this right here. A woman is murdered. We are 95% certain that the husband did it. There is also evidence that suggests the butler, the delivery man, her sister and the baker may have done it, with evidence reducing in that order of 90%, 80%, 70% and 60% certainty. If we set the standard at 95%, we convict the husband. At 90% we convict the butler as well. At 70% we convict the delivery man as well, etc.

    Under the current system, whatever level we set the burden of proof at, we will only ever convict one person. If we actually gather all the evidence we will convict the husband. Or we might get it wrong and only ever convict the butler, if we miss the evidence against the husband and have a 90% burden. If we missed the evidence against the butler and the husband we would not convict the delivery man at 90%, but we would at 80%. We will still only ever get one conviction.

    It seems to me that you are proposing a completely new system of justice. If we had such a system where everyone with a certain amount of evidence against them was to be convicted then your model would be valid (as far as I can tell). However, if we are to stick with the current system where we have one conviction for each crime then it is not valid. The conclusions have no bearing on the current system, so we cannot conclude anything about the costs and benefits of altering the penalties and the burden of proof in the system we have now.

    This seems a fundamental problem with the model, but there has been no response to my earlier comments. Have I misunderstood what is going on?

  50. 50 50 Harold

    Don,
    “1) it is inappropriate as it purports to be desiring deterrence as its sole goal but operates in a way that inconsistent with the principles of pure deterrence.”
    It is surely desiring lowest social cost as the sole goal.

    Your example is that using this scheme we should convict a known innocent if we know it will have a net social benefit, and you contend that this is unethical. I mentioned earlier the trolley problems. These are ethically quite knotty. Most people believe it is ethical to switch the trolley to kill one and save 6. Most people also think it unethical to convict one known innocent to benefit many. But exactly why one is ethical and the other not is not straightforward.

    A utilitarian would believe that such an act would be ethical, and this is not such an obscure philosophy that we can discard it out of hand.

    In practice, even a utilitarian may argue against such actions, because we do not in fact know with sufficient certainty that there will be a net gain – there are certainly all sorts of other factors not included in the model. That is a practical argument, not a philosophical one.

    The fact is that with the current system we do know that we convict innocent people. Unless we require absolute certainty we know that some convictions will be wrong. We just don’t know which ones they are. Taken as a whole, if we agree that 90% confidence is OK, then we know that 1 in 10 convicts are innocent people. In practice, we must lock up some innocent people if we are ever to convict the guilty. Why is it different if we know the identity of one of them?

    I am not saying that there is no difference, but it is not simple to pin it down.

  51. 51 51 Steve Landsburg

    Harold: Have I got this right here. A woman is murdered. We are 95% certain that the husband did it. There is also evidence that suggests the butler, the delivery man, her sister and the baker may have done it, with evidence reducing in that order of 90%, 80%, 70% and 60% certainty. If we set the standard at 95%, we convict the husband. At 90% we convict the butler as well. At 70% we convict the delivery man as well, etc.

    The model is not even that sophisticated. If we are 95% certain that the husband did it, then there are exactly 1.05263 people, one of whom is the husband, and all of whom are equally likely to have done it. (1 is 95% of 1.05263). (Somehow we manage to envision these fractional people.) At a standard of 95% or less, we convict them all; at a higher standard we convict none.

    This is obviously unrealistic in a thousand ways, but is meant to capture the fundamental thing about a 95% standard: Each time you convict one guilty person who just meets that standard, you also convict (on average) .05263 innocent people. I expect that the current model tells us something about what to expect in more sensible models that have that same characteristic.

  52. 52 52 Harold

    Thanks for the clarification. I am not sure if my point makes any difference, but you say “Each time you convict one guilty person who just meets that standard, you also convict (on average) .05263 innocent people.” This makes a total of 1.05263 people convicted.

    Whereas, in reality, and rounding off, you only convict 0.95 guilty people and 0.05 innocent people, making a total of 1.00 people every time you convict someone.

    Does it make any difference? If so, could it be easily incorporated into the model? It means that the average cost is simply P, not NP, but that seems sensible, since we do in fact only convict one person for each crime.

    Thinking through, in the current model, if N=2, then we need to convict 2 people to get one guilty one. If we always convict, we pay twice for punishment, but we get “full” value for deterrent. That is, since conviction is certain, crime will only be committed when B>P.

    “I expect that the current model tells us something about what to expect in more sensible models that have that same characteristic.”

    If by that characteristic you mean the ratio of convicted guilty to convicted innocent I am not sure.

    In an adjusted model with one conviction per crime (like the real world), with “always convict” we would still lock up one innocent for every guilty person, but we would only convict one of the two. The ratio of innocent to guilty convictions is still the same, but the guilty have only a 50% chance of conviction, or a deterrence of 1/2 P. Now crime is committed if B>0.5P. So the net cost of crime is the average value of C-B+P, and B ranges over all values greater than 0.5P. Does this come out the same?
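
One way to check is with a small Python sketch of the variant described here (the guilty party is convicted only 1 time in N, so deterrence is P/N, and exactly one punishment P is paid per committed crime) alongside the original model, again with B uniform on (0, 2).

```python
import numpy as np

C = 2.0
b, db = np.linspace(1e-6, C, 50_000, retstep=True)
dens = np.full_like(b, 1.0 / C)  # B ~ Uniform(0, C)

def cost_never():
    return np.sum((C - b) * dens) * db

def cost_original(N, P):
    """The post's model: deterrence P, with N people punished per committed crime."""
    mask = b > P
    return np.sum((C - b[mask] + N * P) * dens[mask]) * db

def cost_one_conviction(N, P):
    """Harold's variant: deterrence P/N (guilty convicted 1 time in N), one punishment P per crime."""
    mask = b > P / N
    return np.sum((C - b[mask] + P) * dens[mask]) * db

def threshold(cost_fn, N, grid=np.linspace(0.01, 25.0, 2500)):
    base = cost_never()
    return next((P for P in grid if cost_fn(N, P) < base), None)

for N in (2, 3, 5, 10):
    print(N, round(threshold(cost_original, N), 2), round(threshold(cost_one_conviction, N), 2))
# If the algebra is right, cost_one_conviction(N, P) equals cost_original(N, P/N), so this
# variant's cutoff should come out to roughly N times the original one: larger, but still rising with N.
```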

  53. 53 53 Enrique Guerra-Pujol

    Landsburg’s model is a fascinating one, but is it really a puzzle? I developed a Bayesian model of criminal and civil litigation that is relevant to this discussion. The link to my model is here:
    http://arxiv.org/abs/1506.07854

  54. 54 54 Enrique Guerra-Pujol

    Following up on my previous comment, my Bayesian model includes a scenario in which the outcome of a trial is purely random and in which the moving party is either “risk-averse” or “risk-loving” (i.e. in which the prosecutor is 90% confident the defendant is guilty or only 60% confident). The surprising result about my Bayesian model is that in either scenario, the posterior probability the defendant is, in fact, guilty is pretty high.

  55. 55 55 Ken B
  56. 56 56 Harold

    Enrique Guerra-Pujol. I found your paper very interesting. In a situation where the trial is a 50:50 affair, the probability “reverts” to the certainty of the prosecutor. A casual examination might suggest that this would happen when the trial was to convict 100% of the time, or in effect to have no trial and rely on the prosecutor. That is, if the prosecutor is 60% certain, then the chances that a person convicted of a wrongful act is actually guilty would be 60%. It is interesting that this result is only the outcome if we impose a coin toss on top of the prosecutor’s choice.

    If we were to have a totalitarian system where the prosecutors choice is simply rubber-stamped, that is 100% of trials convict, what is the probability that the convicted person is guilty, given risk averse (90% certain) and risky (60% certain) prosecutors?

  57. 57 57 iceman

    Why do we care about B again?

    #55 uh-oh, math is “too good to be true”?

  58. 58 58 Harold

    Ken B. The headline is not quite accurate: it is not more evidence, but more agreement. This only applies when there is real uncertainty in the evidence. When there is no error, unanimity is good. Pc is the error rate, so Pc of 10^-2 is 1 in 100. It is surprising, if not astounding, that at this very low error rate we see a drop in certainty after only 3 witnesses agree. Whilst the principle is reasonable, I would naively have thought that with 1% error it would take many more to agree before we would be less certain.

  59. 59 59 Ken B

    Harold 58 Yes, agreed. But it seems relevant to this discussion, and I noticed the 95% level mentioned as the common standard. Who can pass up a chance to get in a poke at both Steve and Roger?

  60. 60 60 Enrique

    To Harold (#56): You pose the following intriguing question: “If we were to have a totalitarian system where the prosecutors choice is simply rubber-stamped, that is 100% of trials convict, what is the probability that the convicted person is guilty, given risk averse (90% certain) and risky (60% certain) prosecutors?”

    It’s a great question, but the idea of a rubber stamp (or Kangaroo court) is antithetical to the spirit of my original model in #53 & #54. Remember that I was presenting a “Bayesian model” of civil and criminal litigation, so under any condition (i.e. level of risk aversion), my model would require a process in which judges or jurors are willing to “update” or revise their beliefs in light of the evidence presented at trial.

  61. 61 61 Harold

    Enrique, Thanks for the reply.

  1. 1 Wednesday assorted links - Marginal REVOLUTION
  2. 2 Justice Remodeled - CURATIO Magazine
