Here I have a couple of urns. The one on the left contains 70 red balls and 30 black. The one on the right contains 30 red and 70 black.

While you weren’t looking, I reached into one of these urns and randomly drew out a dozen balls. As you can see, 4 of them were red and 8 were black.

Here are three questions that I think you ought to be able to answer if you want to be in the business of assessing evidence:

- If you had to guess, which urn would you guess I drew from?
- What’s your estimate of the odds that you’re right?
- Do you think you’re right beyond a reasonable doubt?

Further discussion to follow later in the week. (Hat tip to Howard Raiffa, who will also figure in the upcoming discussion.)

Well, I’m obviously going to pick the right urn.

I did a really rough estimate and got that your balls are almost 80 times more likely to have come from the right urn. Using a calculator, I get almost 50. I think 2% allows some room for reasonable doubt? But not a lot of it.

To get that result, all I did was take the product of some independent probabilities; all the other factors I could think of (e.g. summing over order rearrangements) would end up canceling when I take the ratio of left urn/right urn probabilities.

So Professor Landsburg, what’s the first thing I did wrong here?

1) The second urn

2) Some quick calculations lead me to believe that it is around 50 times more likely that it came from the second urn rather than the first. There is probably something I am missing that makes this number much bigger or much smaller.

3) Yes.

1. The urn on the right.

2. It will actually be the urn on the left 2% of the time.

3. I’m unwilling to accept a justice system where 1/50 death row inmates are innocent. I am not right “beyond a reasonable doubt”.

1. The right urn.

2. I’ll assume there’s a 98% chance I’m correct based on the previous two comments.

3. This depends what you mean by reasonable doubt. Where I practice, a finding beyond a reasonable doubt can be made based solely on circumstantial evidence only if the circumstantial evidence is consistent with that finding and inconsistent with any other rational conclusion.

Of course this definition doesn’t help all that much, because it just raises the question of what a “rational conclusion” is. In my view, it is fortunate for the legal profession that we never deal with cases where the probabilities are so exact.

Still, I am not convinced beyond a reasonable doubt that you drew the balls from the right urn.

Yeah I get 50 to 1 odds as well.

I assume this is leading up to a post about how sloppily the legal system defines “reasonable doubt”? :)

Agreed, and I would add that to the list of many, many other ways in which the legal system falls far short of the standards that (good) academic researchers set for themselves in pursuit of the truth. Most obviously, the legal system is structured in a way that favors the side with more money (even when the judge isn’t taking bribes directly).

One of the worst things you could say about an academic research institution would be that they bias their results in favor of the rich or powerful. But the legal system barely even pretends that it doesn’t do this.

My first instinct (without calculation) was the right urn, but probably not that much more likely, so with reasonable doubt.

If this was important evidence presented to me on a jury, I would expect expert evidence from a statistician.

Hoping someone posts the workings so I don’t have to remember my statistics from a long time ago.

I think that is the same urn in both pictures. You don’t have 2 urns at all! This sort of sloppy approach to evidence could get you into trouble in court.

Lawyers are supposed to be able to assess evidence? I think a much better quiz would be:

Ambulance A contains a man who slipped on his neighbor’s kitchen floor.

Ambulance B contains a woman who was burned spilling hot coffee on herself.

1) Which ambulance ought to be chased?

2) What would you do while skulking in the hospital parking lot?

3) Do you have any plans of entering politics after practicing law?

If they answer yes to the third question, have them quietly disappeared.

I would guess the jar with 70 black.

There is a 72% chance of drawing at least eight black balls from that jar. There is less than a 1% chance of drawing eight black balls from the jar with 30 black balls. At p = .01, I would reject the idea that the draw came from the jar with 30 black balls, so I don’t think there is reasonable doubt.
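The comment does not show how these figures were computed. They are consistent with a binomial model, in which each draw is treated as independent (as if made with replacement). Assuming that model, which is a guess about the commenter's method, a minimal sketch of the two tail probabilities looks like this (`binom_tail` is a helper name introduced here for illustration):

```python
from math import comb

def binom_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Assumption: each draw is black with probability 0.7 (jar with 70 black)
# or 0.3 (jar with 30 black), independently of the other draws.
tail_right = binom_tail(12, 8, 0.7)
tail_left = binom_tail(12, 8, 0.3)

print(f"at least 8 black, 70-black jar: {tail_right:.4f}")
print(f"at least 8 black, 30-black jar: {tail_left:.4f}")
```

Under this approximation the first figure comes out near 0.72 and the second just under 0.01. Because the balls are really drawn without replacement, exact hypergeometric tails would differ slightly.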

Please show your workings for maximum marks.

Clearly the commenters here are not lawyers since they haven’t yet gotten into a discussion/debate about what amount of doubt is reasonable.

Bayes rules and kick asses.

Quick(ish) calculation is that there is a 24.6% chance of drawing that combination from the right jar, and a 0.5% chance from the left.

Given that you have drawn that combination, 24.6% / (24.6% + 0.5%) = 98% chance that it is from the right.

so:

1. Right Jar

2. 98%

3. No, could have missed something, could be a trick, would I want more certainty before imprisoning somebody,….yep.
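The percentages in this comment can be reproduced with the hypergeometric distribution (drawing without replacement). The sketch below assumes each urn was equally likely to be chosen, which the puzzle does not state; `hypergeom_pmf` is a helper name introduced here for illustration.

```python
from math import comb

def hypergeom_pmf(black_in_urn, red_in_urn, black_drawn, red_drawn):
    """Probability of drawing exactly this mix of colours without replacement."""
    total = black_in_urn + red_in_urn
    ways = comb(black_in_urn, black_drawn) * comb(red_in_urn, red_drawn)
    return ways / comb(total, black_drawn + red_drawn)

p_right = hypergeom_pmf(70, 30, 8, 4)  # right urn: 70 black, 30 red
p_left = hypergeom_pmf(30, 70, 8, 4)   # left urn: 30 black, 70 red

# Assumption: prior probability 1/2 for each urn.
posterior_right = 0.5 * p_right / (0.5 * p_right + 0.5 * p_left)

print(f"P(draw | right) = {p_right:.4f}")
print(f"P(draw | left)  = {p_left:.4f}")
print(f"P(right | draw) = {posterior_right:.4f}")
```

With equal priors the factor of 0.5 cancels, so the posterior depends only on the ratio of the two likelihoods.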

Answering the reasonable doubt question requires knowing the stakes. The cost of the error changes based on the expected consequences.

The problem with asking if it’s reasonable is that most people wouldn’t interpret the answer as “It is (is not) reasonable to assume that 70% yes is more likely than 30% yes” but as “It is (is not) reasonable to assume that yes is more likely than no.”

If the former is the conclusion, then reasonable in all cases, if the latter, it depends on the cost of the error as 70/30 has been turned into an absolute.

Harold,

Think of it as a matter of finding the ratio of how many (non-duplicative) ways you could pull 4 reds & 8 blacks from the first container vs. how many ways you could do it from the second container. The first is 30C8 * 70C4, and the second is 30C4 * 70C8. (Just as a reminder, for any integers x & y, xCy = x!/y!/(x-y)!, which means that if you set this question up with paper & pencil, you find a whole lot of those factorials cancel out.)
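The cancellation mentioned above can be written out explicitly. Here $N_L$ and $N_R$ (symbols introduced for illustration) are the number of ways to draw 4 red and 8 black from the left and right urns respectively, so $N_L$ is the "first" count and $N_R$ the "second":

```latex
\frac{N_R}{N_L}
  = \frac{\binom{70}{8}\binom{30}{4}}{\binom{30}{8}\binom{70}{4}}
  = \frac{70!}{8!\,62!}\cdot\frac{30!}{4!\,26!}\cdot\frac{8!\,22!}{30!}\cdot\frac{4!\,66!}{70!}
  = \frac{66!}{62!}\cdot\frac{22!}{26!}
  = \frac{66\cdot 65\cdot 64\cdot 63}{26\cdot 25\cdot 24\cdot 23}
  \approx 48.2
```

Everything except four factors in the numerator and four in the denominator drops out, which is why the pencil-and-paper version is manageable.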

I’ll pick the right urn. Instead of guessing, I use my calculator to work out the probabilities. It is around 49 times more likely that it came from the right urn. So I’m right 49 times out of 50. Is this beyond reasonable doubt?

1) The second urn

2) About 1 in 5

3) Definitely not.

My fault..

I was doing 4 out of 10 not out of 12..

It is about 2 percent..

To me that is not reasonable doubt if that is the only evidence.

If there is additional evidence, that 50 to 1 mark is very convincing.

By my spreadsheet:

1. The right urn.

2. Odds of picking 8 black out of the right urn=23%. Odds of picking 8 black out of the left urn=0.8%. Odds that it’s the right urn=97%.

3. I wouldn’t bet my life on it.

I also got that it is roughly 50x more likely you’ll get 4 red and 8 black from the 2nd urn.

1.) If I had to guess I’d guess the 2nd urn

2.) I’m 98% sure I’m right (which doesn’t really mean anything, what I actually believe is that I AM right)

3.) Given this evidence (and assuming my calculations are correct) and nothing else, I’m sure beyond reasonable doubt.

I also recently took the LSAT and am trying to get into law school… so, how’d I do?

Thank you, ryan yin.

Part of the problem with trial by jury is that what the juror considers reasonable doubt depends very much on the situation. I believe that if the juror can relate well to the accused, if he is “one of us”, then a very small doubt will get an acquittal. The juror is looking for reasons to acquit. If the accused is from a “different” group, the doubt has to be large. Also, a very serious case with lots of publicity will require huge doubt to acquit. After a gory murder or terrorist atrocity, there is a feeling “someone has to pay”. The juror is more inclined to look for reasons to convict.

I estimate (based on nothing but a hunch) that “reasonable doubt” can be anything from about 30% to 99.9% probability of guilt.

I turned to Wikipedia for help. :)

Standard of proof

The “standard of proof” is the level of proof required in a legal action to discharge the burden of proof, that is to convince the court that a given proposition is true. The degree of proof required depends on the circumstances of the proposition. Typically, most countries have two levels of proof or the balance of probabilities:

- beyond a reasonable doubt (highest level of proof, used mainly in criminal trials)
- clear and convincing evidence (intermediate level of proof, used mainly in civil trials in the U.S.)
- preponderance of evidence (lowest level of proof, used mainly in civil trials; typically means more likely than not)

There doesn’t seem to be a percentage set in stone, but later in the article:

One of the earliest attempts to quantify reasonable doubt was a 1971 article by Rita Simon and Linda Mahan, “Quantifying Burdens of Proof: A View from the Bench, the Jury, and the Classroom.” [9] In a later analysis of the question (“Distributions of Interest for Quantifying Reasonable Doubt and Their Applications,” 2006 [10]), three students at Valparaiso University presented a trial to groups of students. Half of the students decided the guilt or innocence of the defendant. The other half recorded their perceived likelihood, given as a percentage, that the defendant committed the crime. They then matched the highest likelihoods of guilt with the guilty verdicts and the lowest likelihoods of guilt with the innocent verdicts. From this, the researchers gauged that the cutoff for reasonable doubt fell somewhere between the highest likelihood of guilt matched to an innocent verdict and the lowest likelihood of guilt matched to a guilty verdict. From these samples, they concluded that the standard was between 0.70 and 0.74.

1. I guessed the right urn.

2. Decided not to estimate and used the binomial distribution function in Excel. This gave me a 0.78% chance of picking that combination from the left urn and a 23.1% chance of picking that combination from the right urn. The odds that it was the right urn are 29.6:1. This is a 96.7% chance that it was the right urn.

3. Beyond a reasonable doubt is subjective. I would be willing to convict on a 96.7% chance I’m getting it right.
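The spreadsheet figures in this comment can be checked outside Excel. The sketch below uses the same binomial model, which treats the draws as if made with replacement and is therefore an approximation for this puzzle; the variable and function names are introduced here for illustration.

```python
from math import comb

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 8 black in 12 draws under each urn.
p_right_binom = binom_pmf(12, 8, 0.7)  # right urn: 70% black
p_left_binom = binom_pmf(12, 8, 0.3)   # left urn: 30% black

odds = p_right_binom / p_left_binom    # odds in favour of the right urn
chance_right = odds / (1 + odds)       # assumes equal prior for each urn

print(f"{p_left_binom:.4f} {p_right_binom:.4f} {odds:.1f} {chance_right:.4f}")
```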

1) Expected Value from the Right Urn for 12 trials = 8.4 Black / 3.6 Red (and vice versa)

2) Odds are approximately 99%

3) I don’t think it’s reasonable to doubt in most cases since the vast majority of the justice system deals with probable cause. In this case, it’s highly unlikely that the balls came from the left jar. I find it interesting that everyone jumps to life/death consequences, where a 1-2% chance of error is unacceptable.

Okay, time to put “my rule” to use.

The probability Professor Landsburg drew from urn 2 (on the right) given that he drew 8 black and 4 red is:

Pr[urn 2 | 8 black, 4 red] =

Pr[8 black, 4 red | urn 2]*Pr[urn 2]/Pr[8 black, 4 red]

and the probability he drew from urn 1 is

Pr[urn 1 | 8 black, 4 red] =

Pr[8 black, 4 red | urn 1]*Pr[urn 1]/Pr[8 black, 4 red]

To compute these, I need a prior probability for him drawing from urn 2. Most people have been assuming this is 1/2. If it is, then

Pr[urn 2 | 8 black, 4 red] = 0.9674

and

Pr[urn 1 | 8 black, 4 red] = 0.0326.

So, if I believe it is equally likely for him to draw from each urn, then I’d say there is a 96.74% chance he drew from urn 2 (on the right) based on the evidence. The 3.26% chance that he didn’t could be cause for reasonable doubt.

To have more fun, suppose that drawing from urn 2 was a crime. Would this evidence persuade me to convict? Unlikely, because I doubt that my prior probabilities would be equal in this case. If, for instance, 10 credible character witnesses said that they have known Professor Landsburg all his life, and all but one swore that there is no way he would draw from urn 2, then I might change Pr[urn 2] to 0.1. If I did, then

Pr[urn 2 | 8 black, 4 red] = 0.7671

and

Pr[urn 1 | 8 black, 4 red] = 0.2329

Now I would have reasonable doubt.

It is interesting to ask what level of prior probability for urn 2 would be enough to ‘negate’ the evidence and make:

Pr[urn 2 | 8 black, 4 red] = 0.5

This will happen if, prior to seeing the evidence, I believe that there is only a 3.26% chance that Professor Landsburg would draw from urn 2.

I think that the consideration of prior probabilities is an essential aspect for this problem, and something that should be done when assessing evidence for legal decisions, medical diagnoses, and many other areas. Even when you select your prior to be uniform when assessing evidence, it helps to be aware that you are doing so, and to know how your assessment of the evidence might change if you changed your prior.
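The sensitivity to the prior described in this comment can be sketched in a few lines. The likelihood ratio used here is the binomial one, (0.7/0.3)^4, which is an assumption chosen because it reproduces the figures quoted in this comment; `posterior_urn2` and the other names are introduced for illustration.

```python
def posterior_urn2(prior_urn2, likelihood_ratio):
    """Posterior probability of urn 2 from its prior and the likelihood ratio
    Pr[evidence | urn 2] / Pr[evidence | urn 1]."""
    prior_odds = prior_urn2 / (1 - prior_urn2)
    post_odds = prior_odds * likelihood_ratio
    return post_odds / (1 + post_odds)

# Binomial likelihood ratio: (0.7^8 * 0.3^4) / (0.3^8 * 0.7^4) = (0.7/0.3)^4.
LR_BINOMIAL = (0.7 / 0.3) ** 4

for prior in (0.5, 0.1):
    print(prior, round(posterior_urn2(prior, LR_BINOMIAL), 4))

# The prior that exactly cancels the evidence has prior odds of 1/LR.
break_even_prior = 1 / (1 + LR_BINOMIAL)
print(round(break_even_prior, 4))
```

Working in odds form makes the break-even case easy to see: the posterior is 0.5 exactly when the prior odds are the reciprocal of the likelihood ratio.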

Well, this does depend on how likely I think you are to choose the right urn while I’m not looking, before I see what you drew. If the urns are on different sides of a room and your seat is next to the one on the left, then I might be less sure of reasonable doubt. Yes, I am being a bit pedantic here.

Of course, when we do the natural thing and assume a 50-50 chance of each urn being picked prior to seeing the balls then yes I’d pick the right urn and I’d be right 98% (ish) of the time.

Ok, so these should be questions for a jury questionnaire. What has this got to do with aptitude for law school?

Arguably, if I wanted to use this test to identify a good candidate for law school, I might want to identify the person who can best identify with the thinking of a typical juror. So the “correct” answer would be the answer that best conforms to the answer given by people from the demographic class of people who populate juries.

The previous answers seem to indicate that it is a matter of probabilities, but forget the importance of priors. There is one piece of missing information: you did not tell us how you chose the urn you reached into, i.e. the prior probability that you chose the left one. Indeed, whatever the prior probability was, the 4:8 observation makes us update our posterior a lot toward the right urn, but the conclusion still depends on the prior.

Since it looks like you may want to trick us, I would guess that you chose to pick the balls out from the left one. Given this prior, my probability of mistake is 0. Whether or not my assumption about the prior is correct is a matter of psychology and cannot be concluded from the information provided.

I believe the probability of the right urn is 98.0%, which is given by X/(X+1), where X is 66*65*64*63/(26*25*24*23). X is the ratio of the # of permutations that could give the required result from the right urn to the # for the left urn.

(as Ryan Yin said).

If I were asked this question in a law school admission interview, there are a lot of things I would focus on before trying to compute probabilities. For example:

Why are you drawing balls from urns?

Do you have any stake in the outcome?

Who put the balls in the urns?

Did anyone witness the balls being put in the urns?

Did anyone witness you drawing the balls out of the urns?

How do you know that the urns have the number of balls given?

Are the balls in each urn identical, other than color?

How were the balls in each urn distributed? (i.e., are they layered red/black or black/red? very well shaken to randomize? etc.)

To get the probability of picking BBBBBBBBRRRR (in exactly that order) from the right urn, the numerator is 70*69*68*67*66*65*64*63*30*29*28*27 and the denominator is 100*99*98*97*96*95*94*93*92*91*90*89.

The probability of picking BBBBBBBBRRRR (in exactly that order) from the left urn can be calculated similarly, but it’s 70 through 67 and 30 through 23 in the numerator instead of 70 through 63 and 30 through 27.

The ratio of the first to the second is about 48.2 to 1, which means the probabilities are about 98% and 2%, respectively.

(The “exactly that order” stuff doesn’t matter in this case because the number of possible orders is the same for the left and right urns, and multiplying each of the two probabilities by the same number does not affect the ratio between them.)
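The ordered-sequence calculation above can be carried out with exact fractions. This is a minimal sketch; `sequence_probability` is a helper name introduced here for illustration, and "B"/"R" are just labels for black and red.

```python
from fractions import Fraction

def sequence_probability(black, red, sequence):
    """Exact probability of drawing `sequence` in order, without replacement."""
    prob = Fraction(1)
    for colour in sequence:
        total = black + red
        if colour == "B":
            prob *= Fraction(black, total)
            black -= 1
        else:
            prob *= Fraction(red, total)
            red -= 1
    return prob

seq = "BBBBBBBBRRRR"
p_seq_right = sequence_probability(70, 30, seq)  # right urn: 70 black, 30 red
p_seq_left = sequence_probability(30, 70, seq)   # left urn: 30 black, 70 red

ratio = p_seq_right / p_seq_left
print(float(ratio))
```

Passing any other ordering of eight B's and four R's gives the same probability for a given urn, which is the point of the parenthetical about order not mattering.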

As for whether 98% certainty is beyond reasonable doubt (in the context of finding somebody guilty of a crime), I think that’s a hard question. I vote yes, but that’s pretty close to where I’d intuitively draw the line. (Not that I would want to rely on intuition. I’d prefer to see a comparison of the costs of making type 1 versus type 2 errors in this context; but identifying and quantifying those costs seems rather complicated.)

I think it’s pretty obvious that the prior probability for each urn, based on what we know, is 0.5.

If we were supposed to have information leading us to use different priors, it would have been supplied.

Oops. My earlier calculations used the wrong distribution for

P[8 black, 4 red | urn 1]. (Meant to use hypergeometric, but used binomial.)

So, the numbers are a little different:

Pr[urn 2 | 8 black, 4 red] = 0.9797

and

Pr[urn 1 | 8 black, 4 red] = 0.0203,

when the priors are equal to 1/2.

Believing that there was only a 1/10 chance he would select from urn 2 drops the posterior to 0.8427.

Believing there is only a 2.03% chance of him selecting from urn 2 makes the posterior equal to 0.5.

Sorry about the mistake.

I’m not looking at any of the previous answers.

Suppose P(selecting left) = P(selecting right). That is, the unconditional probabilities are the same.

1) the one with 70 black

2) By Bayes’ rule, P(2 | 8 black, 4 red) = [P(2)*P(8 black, 4 red | 2)] / [P(2)*P(8 black, 4 red | 2) + P(1)*P(8 black, 4 red | 1)]

= 0.9797

3) I’m less doubtful than if the answer was 0.96

I get about 30 to 1, being about (7/3)^4. This is a rough-and-ready estimate done in my head. Whether this is ‘beyond reasonable doubt’ is surely a question about law, not probability?
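The (7/3)^4 figure is what results from treating the twelve draws as independent, i.e. a with-replacement (binomial) approximation. The comment does not spell out its model, so that reading is an assumption. Under it, the binomial coefficients and eight of the twelve factors cancel:

```latex
\frac{\Pr[\text{8 black, 4 red} \mid \text{right}]}{\Pr[\text{8 black, 4 red} \mid \text{left}]}
  = \frac{\binom{12}{8}\,(0.7)^{8}\,(0.3)^{4}}{\binom{12}{8}\,(0.3)^{8}\,(0.7)^{4}}
  = \left(\frac{0.7}{0.3}\right)^{4}
  = \left(\frac{7}{3}\right)^{4}
  = \frac{2401}{81}
  \approx 29.6
```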

1. The jar on the right.

2. As someone above said: (/ 2456. (+ 2456 40)) => 98%

In the interest of showing my work:

https://gist.github.com/670234

1.) the right urn.

2.) P=.98

3.) yes, beyond a reasonable doubt. For my normal prediction and decision making activities p=.98 is really, really sure.

Ah, what is beyond reasonable doubt? Good question. In fact, this very question was at the heart of a murder trial where I served on the jury. The foreperson of this jury wanted one of the jurors, and probably the rest of the jurors who disagreed with the foreperson, disqualified from the jury for not being reasonable in their threshold for reasonable doubt. The judge was asked to define what reasonable doubt was, and returned the standard mumbo-jumbo, which was not particularly helpful.

What would have been a lot more helpful is the following definition, which became clear to me after this trial. To be found guilty “beyond reasonable doubt” is to be found guilty beyond what is reasonable doubt for ALL of the 12 jurors at the end of deliberations. This implies that every juror’s reasonable doubt is valid, and they all will be different, some more different than others. A defendant should be criminally convicted only if all jurors are convinced beyond their own definition of reasonable doubt.

Equally troubling about this jury experience was how little the evidence figured into many of the jurors’ decisions. But that is another issue entirely.

The comments include many valid versions of “reasonable doubt.” A 2% chance of being wrong in a murder trial is not beyond reasonable doubt for me.

You’re all committing an inverse gambler’s fallacy. You’re assuming a random distribution of events occurred. It didn’t. Yes, if Steve were to draw from unknown jars an arbitrarily large number of times, 98% of the time these results show up, it will come from the second jar. However, that didn’t happen. Marbles were drawn from a single unknown jar a single time. Either of the two jars is capable of this result, this one time, though one is still a better guess, therefore:

1. the right jar

2. 50%

3. hell no.

To use an example from the field Steve is referencing, the person you should arrest following a shooting is not the person with the most guns – which is what all the probability math would suggest.
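The long-run frequency conceded in this comment (about 98% of the time these results show up, they came from the second jar) can be checked by simulation. The sketch below assumes the urn is chosen by a fair coin on every trial, which the original puzzle does not state; the function name, trial count, and seed are arbitrary choices made for illustration.

```python
import random

def simulate(trials=100_000, seed=0):
    """Repeat the experiment; count trials showing 4 red and 8 black,
    and how many of those came from the right urn."""
    rng = random.Random(seed)
    left = ["R"] * 70 + ["B"] * 30
    right = ["R"] * 30 + ["B"] * 70
    hits = 0
    from_right = 0
    for _ in range(trials):
        use_right = rng.random() < 0.5            # assumption: fair coin picks the urn
        draw = rng.sample(right if use_right else left, 12)  # without replacement
        if draw.count("R") == 4:                  # 4 red, hence 8 black
            hits += 1
            if use_right:
                from_right += 1
    return hits, from_right

hits, from_right = simulate()
print(hits, from_right / hits)
```

The simulation only speaks to the repeated-draw frequency; whether that frequency is the right answer to question 2 for a single draw is exactly what this comment disputes.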

The first two questions, beaten to death in the analyses above, give a simple mathematical answer.

But the third question is (mis)leading. After you’ve computed the mathematical probabilities, you’re left with a Pascal problem. That is, you have to multiply the probability (2%) of your being wrong by the cost of being wrong. As someone posted above, 2% error rate is too high (for him) in a capital case. But it probably wouldn’t be in a drug dealing case or in a burglary.

Hence, the information in the third question is too limited to be able to answer it satisfactorily. 98% is beyond reasonable doubt for most cases, but perhaps there are some where the penalty is so high that you’d like to see 99.5% or something. And, the point, this Pascal calculation is of course going to be subjective among the jurors. What’s your reasonable doubt may not at all be mine.

But it’s still a Pascal problem, in the end.
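One way to make the "probability times cost" step concrete is a simple expected-cost comparison. The symbols are introduced here for illustration and are not from the comment: $p$ is the probability of guilt, $C_I$ the cost of convicting an innocent person, and $C_G$ the cost of acquitting a guilty one. Under that hypothetical framing, convicting has the lower expected cost when:

```latex
(1 - p)\,C_I < p\,C_G
\quad\Longleftrightarrow\quad
p > \frac{C_I}{C_I + C_G}
\quad\Longleftrightarrow\quad
\frac{C_I}{C_G} < \frac{p}{1 - p}
```

With $p = 0.98$ the right-hand side is $0.98 / 0.02 = 49$, so on this sketch a 2% doubt is tolerable only if a wrongful conviction is judged less than 49 times as costly as a wrongful acquittal. How to set those costs is the subjective part.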

I agree with most everyone else on the first two questions. But on the third, I would have to say it depends on the level of punishment that might be handed out if I find the defendant guilty:

* If it’s an infraction, or even a misdemeanor up to the level of littering (90 days jail), 2% would be a small enough doubt that I’d vote guilty.

* Anything much more serious, I’d have to vote not guilty.

If the judge and/or prosecutor won’t tell the jury the potential penalty (for instance, a case that could be the “third strike” in a state with a three-strikes law) I’d feel a moral duty to find that information out for myself, or if prevented, would assume it’s high and vote not guilty.

The balls are more likely to come from the right urn.

The odds I am right using hypergeometric calculations (exact) are .2463 to .0051, or 97.97%. The population is large enough, and the sample is small enough, that binomial distribution will come very close (.2311 to .0078), or 96.74%.

Either way, reasonable doubt exists, unless we are prepared to say that 46771 (based on 2008 prison population of 2,304,000) innocent persons in prison is a reasonable number. I would venture to say if YOU were one of those 47K or a member of their family or a friend of yours, reasonable doubt would be sufficient.

GREAT QUESTION! Thanks for making me think!

IANAL…..

Answer 1: Probably the one with 70 black

Answer 2: Making a number of assumptions here, approx 50:1

Answer 3 (if this is a law school admissions test): Which side am I arguing again?

the issue here is in the answer for #2. With the obvious set of assumptions, there is an overwhelming chance that this was taken from the jar on the right. However, if one was arguing this in court, you’d also have to argue that the assumptions themselves could reasonably be doubted.* Similarly if one was arguing on the other side, I’d be looking at validating those assumptions, and providing additional evidence if it existed.

* For example, maybe this was the highest portion of black balls taken out of the jar out of twenty tries. I would think this would shift the balance of evidence towards the other jar, but I am not a statistician. Or maybe there are differences in the balls which make one more likely to be selected than the other. Or maybe…….

On the whole, I don’t think the mere number of balls is enough to get beyond reasonable doubt without more context as to the selection process, the nature of the balls, and other background stuff.

1. If I accept as true your statement that you reached into one of the two urns and randomly drew out 12 colored balls, I would guess (for purpose of a parlor game) that you drew them out of the urn on the right.

2. My estimate (guess) of the odds that I’m right would be greater than 1 in 2.

3. I would not guess that I am right beyond a reasonable doubt, even for a parlor game, without counting all of the balls remaining in each urn.

If the answer to the above questions had consequences (such as guilt or innocence), I would want to know much more, for example: who witnessed you drawing out the balls, do the balls you drew out match the balls remaining in either urn, if no one witnessed you drawing out the balls – what is your reputation for honesty, if more or less than 100 or 88 balls remain in either urn – what is the explanation for that fact, do you personally have anything to gain or lose if I believe or disbelieve your testimony about the balls? And so forth, and so on.

If beyond a reasonable doubt was a math question, you wouldn’t need juries.

The answers to 1 and 2 are as explained by many comments above. But the answer to 3 has to be no. The error bounds on the estimate in step 2 are massive since we have no way of knowing how you chose which jar to pick from.

For example, if someone offered you $1 for every red ball you picked, then you almost certainly picked the left jar (the one with more red balls) and just got very unlucky. Nothing in the question allows us to rule out this possibility. For any realistic situation, that you may have had some stake in the ball colors is an entirely reasonable supposition.

The question requires us to guess with no way to know how accurate our guess is. There is no way to say there is no reasonable doubt.

David Schwartz: What if you know for certain that I flipped a fair coin to choose the urn?

Steve Landsburg: Then the question comes down to whether a 2% chance is reasonable doubt or not.

I am a criminal defense attorney and can confidently say that no jury or judge with whom I have ever worked thinks mathematically about the concept of beyond a reasonable doubt.

“Thomas Bayes” got what I think is the correct answer. Prior probabilities are very important here. The odds are impossible to calculate or even estimate from the information given.

The answer can only be calculated if these two conditions are met:

- The author picked the urn randomly (e.g., by throwing a coin)

- The author would have asked the question regardless of the outcome.

Otherwise, I can walk around the room, do various things, and whenever an unlikely event occurs, ask you to calculate probabilities.

Even though you’ll think you are 98% certain with each answer, you’ll still be wrong every time.

Igor Ostrovsky: I agree. If you read the followup post from the day after this one, you’ll see this point addressed in full.

The right urn was framed. I don’t think the balls came from the left urn either.

I thought of an easier way to calculate the odds (in answer to number two). We’ve drawn twelve balls. Divide them into three chunks: RRRR, BBBB, and BBBB.

The first two chunks, taken as a whole, are equally likely to have come from either urn; so they can be ignored. The third chunk is the one we care about.

After the first two chunks have been drawn, there are either 66 black balls remaining in the urn, or 26, depending on whether we are drawing from the right urn or the left. If we are drawing from the right urn, there are 66 * 65 * 64 * 63 ways to draw four black balls. If we are drawing from the left, there are 26 * 25 * 24 * 23 ways to draw four black balls.

The ratio of (66 * 65 * 64 * 63) to (26 * 25 * 24 * 23) is about 48.21 to 1. The corresponding probabilities are 97.97 percent and 2.03 percent.

1) If I had to guess, I would say the right urn.

2) There is no evidence one way or the other. The reported outcome is possible regardless of which urn was sampled. The fact that the reported outcome is far more probable with the right urn than with the left urn is irrelevant, as the selection of the urn was not random, but conscious.

If the reported outcome was extremely improbable with one choice (probability less than 0.001) that might constitute such evidence. A probability of less than 0.000001 would be exclusion beyond reasonable doubt of that choice.

3) Obviously not.

IANAL.