Listening to Season One of NPR’s podcast Serial, which is the story of a real-life murder case, I came away about 80% sure that the defendant was guilty and 100% sure that I’d vote to convict him. This got me to pondering whether my standards for reasonable doubt (apparently less than 80% in this case) are in fact reasonable.
So I wrote down the simplest model I could think of — a model too simple to give useful numerical cutoffs, but still a starting point — and I learned something surprising. Namely (at least in this very simple model), the harsher the prospective punishment, the laxer you should be about reasonable doubt. Or to say this another way: When the penalty is a year in jail, you should vote to convict only when the evidence is very strong. When the penalty is 50 years, you should vote to convict even when it’s pretty weak.
(The standard here for what you “should” do is this: When you lower your standards, you increase the chance that Mr. or Ms. Average will be convicted of a crime, and lower the chance that the same Mr. or Ms. Average will become a crime victim. The right standard is the one that balances those risks in the way that Mr. or Ms. Average finds the least distasteful.)
Here (I think) is what’s going on: A weak penalty has very little deterrent effect — so little that it’s not worth convicting an innocent person over. But a strong penalty can have such a large deterrent effect that it’s worth tolerating a lot of false convictions to get a few true ones.
In case I’ve made any mistakes (and it wouldn’t be the first time), you can check this for yourself. (Trigger warning: This might get slightly geeky.) I assumed each crime has a fixed cost C to the victim and a random benefit B to the perpetrator. For concreteness, we can take C=2 and take log B to be a standard normal random variable (so B is lognormally distributed), though the results are pretty robust to these particulars. (Or, much more simply and probably more sensibly, take B to be uniformly distributed from 0 to C — the qualitative results are unchanged by this.)
Now assume that the size of the punishment is P and the number of convictions necessary to get one true conviction is always N. More precisely: I’ve assumed that for every crime, there are N suspects, one of whom is guilty, and all of whom are equally damned by the evidence. The question then is: Should we, given the strength of that evidence, always convict or never convict?
If we never convict, then all possible crimes are committed, so the net cost of crime is the average value of C-B, as B ranges over all values greater than zero. If we always convict, then crimes are committed only when B>P, and we have to add the cost of punishment to the cost of the crime, so the net cost of crime is the average value of C-B+NP, as B ranges over all values greater than P (averaging, in both cases, over all potential crimes, and counting a cost of zero whenever the crime is deterred). (This assumes that punishment is a pure social cost, so we’re talking about prison sentences, not fines.)
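If you'd rather check this numerically than by integration, here's a minimal Monte Carlo sketch using the simpler uniform-B variant (C=2, B uniform on (0, C)). I'm reading "average" as an expectation over all potential crimes, counting zero whenever the crime is deterred; the function names are mine, not part of the original model.

```python
# Minimal sketch of the two policies, assuming C = 2 and B uniform on (0, C).
# Costs are expectations per *potential* crime: a deterred crime contributes 0.
import random

random.seed(0)
C = 2.0          # fixed cost of each crime to the victim
DRAWS = 200_000  # Monte Carlo sample size

def cost_never(samples):
    """Expected net cost if we never convict: every crime is committed,
    each costing C - B."""
    return sum(C - b for b in samples) / len(samples)

def cost_always(samples, N, P):
    """Expected net cost if we always convict: the crime happens only when
    B > P, and each crime brings N convictions, each a pure social cost of P."""
    return sum(C - b + N * P for b in samples if b > P) / len(samples)

samples = [random.uniform(0.0, C) for _ in range(DRAWS)]

# With B uniform on (0, 2), never convicting costs E[C - B] = 1 on average.
print(cost_never(samples))                # ~1.0
# Weak evidence (N = 3 suspects per crime) and a mild punishment:
print(cost_always(samples, N=3, P=0.5))   # worse than never convicting
# The same weak evidence with a harsh punishment:
print(cost_always(samples, N=3, P=1.9))   # better than never convicting
```

With N=3, the always-convict policy costs about 1.69 at P=0.5 but only about 0.29 at P=1.9, which is the surprising pattern described above: the harsher punishment is the one worth convicting over.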
Comparing the two costs, it turns out that for fixed N, the “never convict” policy is better when P is small, and the “always convict” policy is better when P is big — though the threshold value of P increases with increasing N.
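The threshold claim can also be checked directly. The sketch below uses the uniform-B variant mentioned earlier (C=2), with the same reading of "average cost" as an expectation over all potential crimes; the closed forms in the comments follow from integrating B uniformly over (0, C), and the sweep itself is mine, not the original computation.

```python
# For each N, find the smallest punishment P at which "always convict"
# becomes cheaper than "never convict", assuming C = 2 and B uniform on (0, C).
C = 2.0

def cost_never():
    # E[C - B] with B uniform on (0, C) is C - C/2.
    return C / 2

def cost_always(N, P):
    # E[(C - B + N*P) * 1{B > P}] with B uniform on (0, C):
    # integrate (C - b + N*P)/C over b from P to C.
    if P >= C:
        return 0.0
    return (C - P) * (C + (2 * N - 1) * P) / (2 * C)

def threshold(N, step=1e-4):
    """Smallest P (on a grid) where always-convict beats never-convict."""
    P = step
    while cost_always(N, P) >= cost_never():
        P += step
    return P

for N in (2, 3, 5, 10):
    print(N, round(threshold(N), 3))
```

In this variant the crossover works out to P = (4N-4)/(2N-1), so the thresholds for N = 2, 3, 5, 10 come out near 1.33, 1.6, 1.78, and 1.89: the weaker the evidence (larger N), the harsher the punishment has to be before convicting pays, but for any N the threshold stays below C.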
In other words, the cutoffs look like this, so that in the blue-dot case — with quite weak evidence and a quite harsh punishment — the recommendation is to convict:
Whereas I might naively have expected something more like this:
I find this result surprising and wonder how it would hold up in a more fully fleshed-out model. The obvious place to start tinkering is with the assumption that all cases are identical. It’s pretty clear that in a world where the quality of the evidence against a murder defendant never rises above the 50% level, we’d want to convict at that level — the alternative, after all, is to effectively legalize murder. But if some cases are stronger than others, and if defendants don’t know in advance how strong the evidence against them will be, then we can achieve some deterrence with a much higher cutoff. This will surely, then, shift the red line in the graph to the right, but I don’t know whether it will affect the downward slope.