The Extra Half Boy

Published on January 7, 2011. 42 Comments

Last week’s series of posts on boys, girls and population ratios drew an astonishing 850 or more comments, and they’re still coming in. There’s a lot of stuff worth reading in those comments, but the conversation — which is spread out over four threads and often involves several overlapping subconversations — has become difficult to follow. Fortunately, one of our more insightful commenters has volunteered to write a guest post summarizing what he sees as some of the most important discoveries to come out of those discussions. He has signed his guest post “Tom M”, but in the comments, he is simply “Tom”.

Without further ado:

The Extra Half Boy

A Guest Post

Tom M.

1. Introduction

Steve Landsburg posed a problem in this post at his blog entitled Are You Smarter than Google?. Here was the problem statement:

There’s a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy; then they stop. What fraction of the population is female?

Well, of course, you can’t know for sure, because, by some extraordinary coincidence, the last 100,000 families in a row might have gotten boys on the first try. But in expectation, what fraction of the population is female? In other words, if there were many such countries, what fraction would you expect to observe on average?

As we’ve worked on that problem, one of the points that came up was something called the “extra half boy.”

In this post I’m going to try to summarize what we’ve said about that critter and some related issues.

A key reference, that everybody should read, is What is the expected proportion of girls?, by ‘Thomas Bayes’.

2. Definitions

2.1. Flip a coin. The flip yields either ‘G’ or ‘B’ with equal probability.

2.2. A family

Flip a single coin repeatedly until the first B comes up, then stop. Record the entire string of results in order.
Call that string a ‘completed family.’

Examples of completed families:
B
GB
GGB
GGGGGGGGGGGGGGGGB

Examples of strings that are not completed families:
BG
BB
BGGGB

2.3. A population of N children

Sometimes we may be given a string of flips without information about families. So long as the string terminates in a B, we can always assign the flips to completed families sequentially.

Example:

String: GGBGBBGB
There are 4 corresponding completed families: GGB GB B GB

When the string terminates in a G, we can never decompose it, in the order given, into completed families.
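The sequential assignment described in 2.3 can be sketched in a few lines of Python. This is only an illustration: the function name split_families is an assumption, not something from the original discussion.

```python
def split_families(births: str) -> list[str]:
    """Split a string of 'G'/'B' births into completed families, in order."""
    if not births or set(births) - {"G", "B"}:
        raise ValueError("expected a non-empty string of 'G' and 'B'")
    if not births.endswith("B"):
        raise ValueError("a string ending in G cannot be split into completed families")
    # Every B closes a family, so cut the string right after each B.
    families, current = [], ""
    for child in births:
        current += child
        if child == "B":
            families.append(current)
            current = ""
    return families


print(split_families("GGBGBBGB"))  # ['GGB', 'GB', 'B', 'GB']
```

The number of families returned is simply the number of Bs in the string, which is the quantity called k below.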

2.4. A ‘completed country’ is a group of completed families

Examples of completed countries:
B (k=1, N=1)
B B (k=2, N=2)
GGGGGGB GGB (k=2, N=10)
GB B GGGGGB B (k=4, N=10)

In order to calculate national statistics, we need to define some set of completed countries. We can define and index sets of completed countries in different ways. Two of those ways are described below.

2.5. The ‘ensemble of k-family completed countries’

This is the set of all possible completed countries that consist of k completed families.
Example: the set of possible countries for k=1:
B
GB
GGB
GGGB
GGGGB
…

This ensemble, like every ensemble of k-family countries for k>0, is infinite.

Note that N varies over this ensemble.

2.6. The ‘ensemble of N-child completed countries’

This is the set of all completed countries that have the same N.

Example: the set of all 8 possible countries for N=4:
BBBB
BBGB
BGBB
BGGB
GBBB
GBGB
GGBB
GGGB

For finite N, every ensemble of N-person countries is finite. Here it has 8 members.

Note that k (the number of Bs) varies over this ensemble.

3. Single-family completed countries

Steve Landsburg covered this in the post A Big Answer on his blog.

I’ll repeat some of that here.

3.1. The possible bit strings are the ensemble of 1-family countries

k=1 (one family –> one B)

B N=1. 1 case. Probability 1/2.
GB N=2. 1 case. Probability 1/4.
GGB N=3. 1 case. Probability 1/8.
GGGB N=4. 1 case. Probability 1/16.
GGGGB N=5. 1 case. Probability 1/32.
GGGGGB N=6. 1 case. Probability 1/64.

The single B can only come at the end of the string, so for each N there is only one possible string.

3.2. Statistics for the ensemble of 1-family countries

E(B) = 1
E(G) = 1
E(G/(G+B)) = 1 – ln(2) ~ 0.307.
[See A Big Answer ].
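For readers who want the intermediate step, here is one way to recover the value 1 – ln(2) from the table in 3.1. It is a sketch using the standard power series for the logarithm, not necessarily the route taken in A Big Answer. A country with N children has N-1 girls and occurs with probability 2^(-N):

```latex
\begin{align*}
E\!\left[\frac{G}{G+B}\right]
  &= \sum_{N=1}^{\infty} 2^{-N}\,\frac{N-1}{N}
   = \sum_{N=1}^{\infty} 2^{-N} \;-\; \sum_{N=1}^{\infty} \frac{2^{-N}}{N} \\
  &= 1 - \bigl(-\ln(1 - \tfrac{1}{2})\bigr)
   = 1 - \ln 2 \approx 0.307,
\end{align*}
% using \sum_{N \ge 1} x^N / N = -\ln(1 - x) at x = 1/2.
```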

4. Multiple-family completed countries for fixed k

Here we’re looking at the ensemble of k-family countries.

4.1. The ensemble of 2-family completed countries

k=2

N=2.
BB
1 case. Probability 1/4

N=3.
GBB
BGB
2 cases. Probability 1/8 each

N=4.
GGBB
GBGB
BGGB
3 cases. Probability 1/16 each

N=5.
GGGBB
GGBGB
GBGGB
BGGGB
…

There are infinitely many cases. For each value of N there are N-1 cases.

One B comes at the end of the string. The second B comes anywhere else in the string.

4.2. Statistics for the ensemble of 2-family completed countries

E(B) = 2
E(G) = 2
E(N) = 4

E(G/(G+B)) ~ 0.386294

This result is due to Anshuman. Please see his comment: he provides the analytical treatment as well as the numerical result.

Notice that the approximate formula

E(G/(G+B)) ~ 1/2 – 1/(2*E(N)) = 1/2 – 1/8 = 0.375

is already a half-decent approximation to the exact value, even for k=2.
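The exact value and the approximation can be compared numerically. The sketch below assumes the counting used throughout this post: for k families and N children there are (N-1 choose k-1) strings, each with probability 2^(-N) and each containing N-k girls. The function name and the cutoff n_max are illustrative choices.

```python
import math


def expected_girl_fraction(k: int, n_max: int = 400) -> float:
    """Truncated series for E(G/(G+B)) over all k-family completed countries."""
    total = 0.0
    for n in range(k, n_max + 1):
        # comb(n-1, k-1) strings of length n, each with probability 2^-n,
        # each holding n-k girls among n children.
        total += math.comb(n - 1, k - 1) * 0.5**n * (n - k) / n
    return total


for k in (1, 2, 5, 20):
    # exact (truncated) value next to the approximation 1/2 - 1/(4k)
    print(k, expected_girl_fraction(k), 0.5 - 1 / (4 * k))
```

For k=2 the series agrees with 2*ln(2) – 1, which is the 0.386294 quoted above, and the gap to 1/2 – 1/(4k) shrinks as k grows.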

If instead we only consider the first N-1 flips, Anshuman’s result is modified to

E(G/(G+B)) | first N-1 flips = (1/4)*1*0 + (1/8)*2*(1/2) + (1/16)*3*(2/3) + (1/32)*4*(3/4) + (1/64)*5*(4/5) + …

That simplifies to the familiar series:

E(G/(G+B)) for the first N-1 births with k=2

= 0/4 + 1/8 + 2/16 + 3/32 + 4/64 + 5/128 + …

= 1/2.

That is, if we ignore the lastborn child, E(B/(B+G))=1/2. That is true for any k>1.
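The k=2 sum can be written out in one line. This is a sketch using the standard identity for the sum of m x^m: each of the N-1 strings of length N has probability 2^(-N), and its first N-1 births contain N-2 girls.

```latex
\begin{align*}
\sum_{N=2}^{\infty} (N-1)\,2^{-N}\,\frac{N-2}{N-1}
  = \sum_{N=2}^{\infty} (N-2)\,2^{-N}
  = \frac{1}{4}\sum_{m=0}^{\infty} m\,2^{-m}
  = \frac{1}{4}\cdot 2
  = \frac{1}{2},
\end{align*}
% using \sum_{m \ge 0} m x^m = x / (1 - x)^2 at x = 1/2.
```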

4.3. The ensemble of k-family completed countries

Begin with the ensemble for the 2-family case, in section 4.1. In this case also, one B comes at the end of the string. Another k-1 Bs come anywhere else. The remaining symbols are Gs. So for each N there are (N-1 k-1) possible strings. Here the notation (a b), a choose b, is the binomial coefficient a!/((a-b)!b!).
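The count (N-1 k-1) can be checked by brute force. A small sketch follows; the function name k_family_countries is made up for illustration.

```python
from itertools import combinations
from math import comb


def k_family_countries(n: int, k: int) -> list[str]:
    """All completed countries with n children and k families, as birth strings."""
    if not 1 <= k <= n:
        return []
    countries = []
    # The last child is always B; choose where the other k-1 boys go
    # among the first n-1 births.
    for boy_slots in combinations(range(n - 1), k - 1):
        births = ["G"] * (n - 1) + ["B"]
        for i in boy_slots:
            births[i] = "B"
        countries.append("".join(births))
    return countries


print(k_family_countries(4, 2))  # ['BGGB', 'GBGB', 'GGBB']
print(len(k_family_countries(10, 4)), comb(9, 3))  # 84 84
```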

4.4. Statistics for the k-family case

Thomas Bayes has given statistics in his report. (Please keep in mind that at publication time Thomas was using the symbol K for the population and N for the number of families. That is the reverse of the convention used here.)

E(B) = E(G) = k.

Over the first N-1 children only, following the argument in section 4.2 above, the formula for general k>1 is

E(G/(G+B)) = sum from N=k to infinity, 2^(-N)(N-1 k-1)(N-k)/(N-1).

Here (a b) is the binomial coefficient “a choose b” again.

In his piece What is the expected proportion of girls?, Thomas estimates the expected proportion of females in a country where every family wants a boy: roughly, E(G/(G+B)) ~ 1/2 – 1/(2*E(N)). (This is an approximation, better for larger k.)

Because E(N) = 2k, the shortfall 1/(2*E(N)) equals 1/(4k), which is the share of half a child in an expected population of 2k. That is why we’ve called this effect an “extra half boy.”

It’s interesting to see, though, that because E(B) = E(G) for an ensemble of k-family countries, the “extra half boy” appears only in the expected ratio, not in the expectations themselves, nor in their ratio.

Nevertheless the impact on the G/(G+B) ratio is entirely associated with the last child. Calculated over the first N-1 children only, E(G/(G+B)) = 1/2 exactly.

Jonathan Campbell pointed out something very important here. That last equation implies that the statistics for the first N-1 children in the population are not those of random coin flips. Here’s the argument:

We know that for all N children E(B)=E(G)=k.

Since the last child is a boy, for the first N-1 children we have E(B) = k-1 and E(G) = k.

For N-1 random flips, with E(N-1) = 2k-1, the expected number of Bs and Gs would be k – 1/2 each.

So for fixed k the last flip is fixed to B, but the first N-1 flips are not completely independent.

We’ve excluded sequences of N flips for which B is not equal to k.

In fact, if we know both k and N, the allowable sequences are exactly the (N-1 k-1) ways to place k-1 boys into N-1 slots in the birth sequence. That set of outcomes is drastically reduced from the complete set we’d get from N-1 coin flips.

You have to be careful thinking about cases where k and N are both fixed. Those ensembles do not resemble strings of coin flips very much at all.
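The claim that the first N-1 births average exactly 1/2 girls for any k>1 can be checked with exact arithmetic. The sketch below assumes the series that follows from the section 4.2 argument: (N-1 choose k-1) strings of probability 2^(-N), whose first N-1 births hold N-k girls. The function name and the cutoff n_max are illustrative.

```python
from fractions import Fraction
from math import comb


def first_births_girl_fraction(k: int, n_max: int = 200) -> Fraction:
    """Partial sum of E(G/(G+B)) over the first N-1 births, for k >= 2 families."""
    if k < 2:
        raise ValueError("need k >= 2 so that the first N-1 births are non-empty")
    total = Fraction(0)
    for n in range(k, n_max + 1):
        # Probability that the country has exactly n children.
        p_n = Fraction(comb(n - 1, k - 1), 2**n)
        # The first n-1 births hold n-k girls and k-1 boys.
        total += p_n * Fraction(n - k, n - 1)
    return total


for k in (2, 3, 10):
    print(k, float(first_births_girl_fraction(k)))
```

The partial sums sit just below 1/2 only because the series is truncated at n_max; raising the cutoff closes the gap.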

5. The ensemble of completed countries for a fixed N

This is a different way of looking at the problem. The first comment I’ve found suggesting this idea is from Neil in a discussion with Phil Birnbaum.

Instead of fixing the number of families (k), we fix the total population (N). We simply have N-1 random coin flips, followed by a single B due to the national stopping rule.

The number of families is determined by the outcome of the coin flips as described in section 2.3 above.

In this case, by inspection of the ensemble,

E(B) = E(G) + 1

E(B) = N/2 + 1/2

E(G) = N/2 – 1/2

E(k) = N/2 + 1/2 = E(B)

E(G/(G+B)) = 1/2 – 1/(2*N)

A terminal half boy appears, along with a missing half girl.

The half-couple appear in the expectation values for B and G in this case, not only in the expected ratio.

This isn’t surprising: we have a sequence of otherwise random births, terminating with a B.

In effect, by imposing the constraint that completed families must terminate with a B, we put in half a boy and took out half a girl.

Example: for N=4,

BBBB k=4 G/(G+B)=0
BBGB k=3 G/(G+B)=1/4
BGBB k=3 G/(G+B)=1/4
BGGB k=2 G/(G+B)=1/2
GBBB k=3 G/(G+B)=1/4
GBGB k=2 G/(G+B)=1/2
GGBB k=2 G/(G+B)=1/2
GGGB k=1 G/(G+B)=3/4

E(B) = 2.5
E(G) = 1.5
E(k) = E(B) = 2.5
E(G/(G+B)) = 3/8
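The N=4 table generalizes directly: enumerate the 2^(N-1) equally likely strings and average. A minimal sketch, with an illustrative function name, using exact fractions so the results can be compared with the formulas above:

```python
from fractions import Fraction
from itertools import product


def fixed_n_statistics(n: int) -> dict[str, Fraction]:
    """Exact averages over the 2^(n-1) equally likely N-child completed countries."""
    # n-1 free coin flips followed by the terminal B.
    countries = ["".join(head) + "B" for head in product("BG", repeat=n - 1)]
    size = len(countries)
    boys = sum(c.count("B") for c in countries)
    girls = sum(c.count("G") for c in countries)
    ratio = sum(Fraction(c.count("G"), n) for c in countries)
    return {
        "E(B)": Fraction(boys, size),  # also E(k): each B closes one family
        "E(G)": Fraction(girls, size),
        "E(G/(G+B))": ratio / size,
    }


print(fixed_n_statistics(4))
```

For N=4 this returns 5/2, 3/2 and 3/8, matching the values listed above.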

This case is very easy to perform calculations on.
As Neil put it in a comment,

The point is the probabilities of the different subsequences are determined exactly as you would get by flipping a fair coin. If that was all there was to it, Lubos would be right, the expected girl proportion is exactly equal to 50%—but there is also the boy on the decision coin, which is why Steve is right, it is less than 50%.

(Note that we’re still talking only about the single-generation case here, however.)

If we want to talk about all possible completed countries, the fixed-N case gives us an easy way to do that.

For every N, E(B)-E(G)=1. So if we average over all N we will get E(B)-E(G)=1 independent of how we weight that average. In that sense, we might say that completed single-generation countries average a half boy more (and a half girl less) than we would get from a totally random sequence of births.

6. Martingales

These haven’t taken up much of the discussion.

But Henry, in this comment, pointed out a parallel to a well-known and widely-analyzed gambling “system.” Just a check to make sure that we haven’t got a moneymaking bonanza here, which would be a bad sign.

Henry adds some proofreading a few comments later.

7. Conclusions

In the single-generation case, it seems clear that the expected ratio of girls is somewhat less than 1/2.

The simplest way to see that may be to consider the set of all completed countries that have N children (Section 5 above).

For any N we get E(B)-E(G)=1.
For every N, E(G/(G+B)) is less than 1/2; in fact E(G/(G+B))=1/2 – 1/(2*N).
For every N the shortfall 1/(2*N) is due entirely to the action of the stopping rule; the remaining births are random coin flips.

When we consider more than one generation, I don’t understand clearly whether the answer to Steve’s problem is 1/2 or something less. It still seems possible that in the absence of a termination, there might be no terminal B and no deviation from 1/2. We certainly haven’t proven that, but I don’t think we’ve disproved it so far.

Thomas Bayes in What is the expected proportion of girls? provided an outline for a solution that would come out to less than 1/2.

Neil and Pietro have suggested a different argument also leading to slightly less than 1/2.

Neil’s interesting argument here is based on the idea that the stopping rule in the problem statement can lead to extinction for some countries in the ensemble. Since most of the ensemble, in fact, consists of families with fewer girls than boys, how can we expect them to produce enough new couples to replace the original parental generation?

Extinction by means of excess males seems to correspond to E(G/(G+B))<1/2, though I haven't seen details. I hope they're right. The more I work on this problem, the more eagerly I begin to look forward to the extinction of these miserable countries and their misguided inhabitants!

8. Acknowledgements and additional references

Most of this material appeared first in posts or comments on three follow-up posts at Steve’s place: A Big Answer, Win Landsburg’s Money!!! and Slippery Lube.

Again, a key reference, that everybody should read, is What is the expected proportion of girls?, by ‘Thomas Bayes’. In that document, Thomas outlines his solution to Steve’s problem and covers some of this same material from a different perspective.

(One note: at publication time Thomas used the symbol K for the population and N for the number of families. That is exactly the reverse of the convention I use here. I’m sticking with the convention I define in Section 2, for compatibility with Steve’s posts and the majority of comments. But I do apologize for the inconsistency.)

Most of the material here was either posted by Steve Landsburg on his blog or worked out among commenters there, including, in no very reliable order, Thomas Bayes, Jonathan Campbell, Neil, Jonatan, Phil Birnbaum, Pietro Poggi-Corradini, Anshuman and Henry. I apologize to the folks I’ve inevitably missed. Not all these people may agree with everything I’ve written!

When I cite comments, I try to hit the key ones.

But of course other related points, corrections, refinements, rants, etc., are often nearby in the comment stream.

42 Responses to “The Extra Half Boy”


1 Steve Landsburg
January 6, 2011 at 11:57 pm

I’d like to add just one observation, which I feel sure Tom will agree with. (He can correct me if I’m wrong.)

In Section 7, Tom writes

In the single-generation case, it seems clear that the expected ratio of girls is somewhat less than 1/2.

The simplest way to see that may be to consider the set of all completed countries that have N children

This may leave some readers with the false impression that the “not 1/2” answer applies only to countries where all families have finished reproducing. (This claim has been made in comments several times, though never with any supporting argument.)

To see that the answer to the original question is not 1/2 in an incomplete country, consider the case of a country with two families, each of which wants a boy, and take a snapshot after two years, when families still have a good chance of being incomplete. In that case, the expected fraction of girls is 7/16, which is not 1/2. The calculations are here.

2 Henry
January 7, 2011 at 5:40 am

Woo, I get a citation! I’m famous!

Unfortunately, the link to my comment and several others don’t work, so I’m not as famous as I could be.

3 Harold
January 7, 2011 at 6:30 am

This has all been a lot to get my head around, so apologies if I state some total howler, and make all above commenters think “What do we have to do to get this across!”

Anyway, my thoughts. For 1st generation only. In the early days of the discussion, it was more or less agreed that E(B) = E(G), I think. Now we have for a completed country E(B)-E(G)=1. This still produces a result <1/2, but seems not to involve difference between ratios of expectation and expected ratios. However, we have selected from the set of "all possible countries" (or simulations), a sub-set of "completed countries" (or simulations). In the totality of probabilistic countries there will always be some "incomplete", with an almost endless string of girls. I suppose this is where the balancing 1/2 girl resides. So overall E(B) = E(G) for any country, as we do not know in advance that it will be complete by the time our simulation ends.

For an incomplete country, in effect the extra boy occurs at the beginning (in small families), with a high probability. The balancing girl occurs in a long string of endless girls, with a low probability. The long string of girls can never quite get long enough to balance the boy.

For multi-generations, including a limited period of fertility and mortality, I am now inclined to the extinction idea. Assuming total monogamy and pairing for life, there must always be some fewer number of families in each generation, since there are more boys than girls, and number of boys cannot be more than the number of original families. Of course, in the real world, total monogamy is not likely to occur.

4 Jonatan
January 7, 2011 at 7:32 am

I feel that there is a more simple proof than that. Some logical argument about that every child is a fair coin flip, except the last one born in the country, because the last one must be a boy.

5 Tom
January 7, 2011 at 8:09 am

Steve,

Absolutely. I should probably have put that in, but I just couldn’t get my mind (and my time) around the incomplete-countries cases too.

6 Tom
January 7, 2011 at 8:12 am

Henry,

Thanks, I’m looking into the comment links.

7 Steve Landsburg
January 7, 2011 at 8:16 am

Henry:

Unfortunately, the link to my comment and several others don’t work, so I’m not as famous as I could be.

Fixed now, I think — my apologies.

8 Tom
January 7, 2011 at 8:25 am

Jonatan,

There should be my best try at an argument like that in the piece, Section 5. Looks like the beginning of it got cut off. I’m looking into what happened.

9 Steve Landsburg
January 7, 2011 at 8:29 am

Comment at 8:30AM EST on Friday, January 7: Several of the comment links were broken. More importantly, a quotation mark in the html had gone missing and this was rendering a chunk of the post invisible, beginning with “Jonathan Campbell” and going through the first paragraph of section 5.

Fortunately, this got caught pretty early. But if you’ve already read the entire post, and if you want to make sure you’ve got it all, you should go back and re-read that part (it’s about a screenful).

10 Steve Landsburg
January 7, 2011 at 8:35 am

Jonatan: The section Tom is referring to is now restored.

11 Tom
January 7, 2011 at 8:39 am

Harold,

No, probably the (previously) missing text is causing the problem. One approach to the problem (our usual one) is used in sections 3 and 4 and gives E(G)=E(B). A different approach is used in section 5 and gives E(B)-E(G)=1.

It’s a pleasantly spooky problem.

12 Harold
January 7, 2011 at 8:40 am

I think Jonathan Campbell’s N-1 case (the bit that was invisible before) not being random applies only to completed countries, doesn’t it? By selecting only completed countries from the set of all countries, you have removed some “randomness” so you wouldn’t expect the results of the coin tosses to be completely random. Is the same as saying “be careful where k and N are both fixed”?

13 Thomas Bayes
January 7, 2011 at 8:41 am

Another simple way to think about the ‘incomplete’ country for which all families have not finished reproducing . . . even if the final child has not been born at the time of your census, you have to account for him in your expectation.

Imagine, for example, that an urn contains one ball with probability 0.9, and contains two balls with probability 0.1. The expected number of balls in the urn is 1.1, but, when you check to see how many balls it actually contains, you might see only one. But the fact that you might not see two balls does not make the expected number of balls equal to 1.

Likewise, the actual census for the country might not contain the last boy, but the expected value for the proportion of girls still has to account for the probability that it might. The only way to make the expected proportion of girls equal to 1/2 is to ensure that the last boy will never be counted in the census. This is one of the key points I try to make in the note Tom cited:

http://www.landsburg.org/tbayes.pdf

Tom: Thanks for doing a great job of sorting through and summarizing the comments and insights that so many people shared on this puzzle.

14 Tom
January 7, 2011 at 9:14 am

Harold, you said

I think Jonathan Campbell’s N-1 case (the bit that was invisible before) not being random applies only to completed countries, doesn’t it?

Yes, we used the terminal boy.

By selecting only completed countries from the set of all countries, you have removed some “randomness” so you wouldn’t expect the results of the coin tosses to be completely random.

Yes.

Is the same as saying “be careful where k and N are both fixed”?

It’s not exactly the same, but it’s the same kind of thing. In both cases we’re restricting what strings can belong to the ensemble. In the first case, what Jonathan was talking about, we only fixed k. But we already get something strange: N varies according to the birth sequence, and then for each N the elements of the ensemble are just the ways to shuffle the first N-1 births into different orders.

If we fix both N and k we’re left with just the shuffling part. We know how many kids, and we know how many boys.

15 Tom
January 7, 2011 at 9:20 am

Thomas & All —

I wanna make it clear, before any controversy has a chance to get started, that I understand & agree with the point Thomas and Steve have been making about incomplete generations. We could all be wrong but we’re in agreement on this point.

I restricted this post to completed families to save time and reader fatigue … but also because E(G/(G+B))<1/2 for an incomplete first generation follows so clearly from the argument in the completed case.

So please (please!) don't come and post "Tom said …" and force me to explain that I didn't say!

16 Tom
January 7, 2011 at 9:55 am

Anybody looking for the link to Anshuman’s comment in section 4.2, this is the correct link.

17 Harold
January 7, 2011 at 11:21 am

Thanks, of course k is equal to number of boys in a completed country.

Regarding again multi-generations, this does not seem to have been conclusively nailed. If your conditions include monogamy and no intergenerational breeding, then I think all countries go extinct. With k families, the absolute maximum families in the next generation is k. So all fluctuations that result in fewer boys or girls will result in un-paired children, and fewer families in each and every generation. Only in the unlikely event of exactly 50% boys will the numbers of families stay the same. It becomes more complex if spare children from one generation can pair with those from the next.

It seems to me the only way to get 1/2 is to make the population trend to infinity. This could be done with a large country, or a small country running for many generations. If all countries go extinct, then running for a long period becomes impossible. The answer to the problem then becomes “something less than 1/2, until the country goes extinct”.

18 Tom
January 7, 2011 at 11:58 am

Argh. An omission from the summary post, sorry folks. I should have credited Lubos for mentioning that extinction was a key issue. It is, and as far as I know he said so before Pietro, Neil and I did.

19 Tom
January 7, 2011 at 12:00 pm

To keep things “fair and balanced,” as befits a summary, here was a response from Steve on Lubos’ extinction point.

Very likely nobody else cares about these fine points, but once you get into this kind of review-paper gig it’s hard to stop.

20 Tom
January 7, 2011 at 12:11 pm

Harold,

Your second paragraph sounds right. Pietro pointed out that an expectation value of exactly 2 kids per family leaves the question of extinction open. He suggested that we use a branching process to analyze things more closely. I think he (or Neil?) was suggesting in effect that we define “family” as “one woman’s sequence of kids.” We abandon monogamy. (“Regrettably,” as Dr. Strangelove noted.) Men, as long as you have at least one per country, become inessential (as Maureen Dowd has pointed out).

I haven’t begun to look at that.

21 Jeff Semel
January 7, 2011 at 12:51 pm

Wonderful post, Tom. I think the end of the last sentence in 3.1 should read “for each N there is only one possible string.”

Thank you for pointing me to Henry’s comment on the Martingale analogy. I missed it the first time around.

22 Tom
January 7, 2011 at 1:44 pm

1. Jeff, thanks! Yes, that sentence should say N not k.

2. And, sorry, folks, another correction. Near the top of section 4.2 I wrote

E(G/(G+B)) ~ 1 – 1/(2*E(N)) = 1-1/8 = 0.375

The formula should be

E(G/(G+B)) ~ 1/2 – 1/(2*E(N)) = 1/2 – 1/8 = 0.375

23 Greg B
January 7, 2011 at 7:04 pm

Steve,

Using your slightly different problem, where each family in a country has children until it has either a boy or two children, then stops…

Assume we had 16 of these countries. According to the probabilities you listed, we should expect:

4 countries (1/4) would have zero girls
4 countries (1/8 + 1/8) would have one girl and two boys
4 countries (1/8 + 1/8) would have two girls and one boy
1 country (1/16) would have two girls and two boys
2 countries (1/16 + 1/16) would have three girls and one boy
1 country (1/16) would have four girls.

Therefore, if you choose one country of the sixteen at random, you have a 50% chance of choosing a country with less than half girls and a 50% chance of choosing a country with half or more girls. Assuming this is correct, why isn’t an answer of 1/2 as good as 7/16?

24 Steve Landsburg
January 7, 2011 at 8:06 pm

Greg B: Sorry, I briefly posted — and then deleted — a completely wrong response to your query. I hope you didn’t see it.

The right response is that you’ve correctly computed that the median country has 1/2 girls. But the problem does not ask for a median; it asks for a mean.

25 Greg B
January 7, 2011 at 8:57 pm

Steve,

I appreciate the prompt reply. I contend the question neither specifies using the median nor the mean. I further contend that only your chosen solution suggests calculating the mean is the proper solution to the problem. The question merely asks one to guess the fraction of girls in the hypothetical country. If we were to play this game using your two family example, and you chose 7/16 while I chose 1/2, you would be closer 50% of the time and I would be closer 50% of the time. Wouldn’t we? On the other hand, if we kept restarting the game until one of us was exactly right, I would eventually win and you could never win.

If you’ll permit me another question: I’m wondering if you could explain in plain English how the following reasoning is flawed. In a country with 1000 families, after the first round of births, I imagine you’ll agree the population is 500 boys and 500 girls. Now, the families with a boy are done and the 500 with a girl continue. After the second round of births, is there some reason I should expect anything other than a population of 750 boys and 750 girls?

I appreciate your time.

26 Steve Landsburg
January 7, 2011 at 9:32 pm

Greg B: If you look at the original post, you will see that the question asks for the expectation of the ratio. That, by definition, is the mean. I also explained that “expectation” means “average” (i.e. mean) for those who weren’t familiar with the terminology.

In a country with 1000 families, after the first round of births, I imagine you’ll agree the population is 500 boys and 500 girls.

Well, that’s one possibility. Another is 498 boys and 502 girls. Another is 400 boys and 600 girls. Another is 0 boys and 1000 girls. All those possibilities go into computing the average.

27 Steve Landsburg
January 7, 2011 at 9:35 pm

Greg B:

PS— Suppose you expect 750 boys and 750 girls. The ratio of those expectations is 1. But the ratio of the expectations is not the same thing as the expectation of the ratio, which is the entire moral of this puzzle.

28 Steve Landsburg
January 7, 2011 at 9:38 pm

Greg B: In fact, it occurs to me that the following unbelievably simple example might illustrate the point as well as anything:

Imagine one family that has one child. What is the expected ratio of boys to girls in the family?

There is a 1/2 chance that there are zero boys and one girl, for a ratio of 0. There is a 1/2 chance that there are 1 boy and zero girls, for a ratio of Infinity. The expected ratio is therefore (1/2) x 0 + (1/2) x Infinity, which, if we allow ourselves to work with Infinity at all, is surely Infinity.

In this problem the expected number of boys equals the expected number of girls, so the ratio of the expectations is 1. But the expectation of the ratio is infinity. And the lesson is not to compute the one when you’re trying to compute the other.

29 Greg B
January 8, 2011 at 12:08 am

Steve,

First, some tedious points of fact.

1.) The first paragraph of your original post contained the question without reference to the “in expectation” phrase. In your second paragraph however, you modified the original question to include the “in expectation” phrase.

2.) The question posed in your slightly different problem where a country with two families would have children until either a boy or two children does not contain the phrase “in expectation”.

So, if adding the phrase “in expectation” or deciding the entire moral of the puzzle is learning the ratio of the expectations is not the same thing as the expectation of the ratio dictate using the formulas and methods you and others have graciously provided, then I concede the point and offer my gratitude for the lesson.

However, in the two family, one boy or two children problem, we agree on the possible outcomes and corresponding probabilities. If, in reality, such a country existed, we know with certainty that the actual fraction of girls cannot be 7/16, don’t we? I’ll ask one more time and then I’ll leave you alone, lest you begin to think I’m a troll. Is there any reason, other than “it’s the entire moral of the puzzle”, to believe that an answer of 7/16 is any better than 1/2? My contention remains 7/16 is a mildly worse answer because there isn’t any reason to anticipate it being closer to being right but it eliminates the possibility of being exactly right.

30 Steve Landsburg
January 8, 2011 at 6:11 am

Greg B: The two-family at-most-one-child example was intended to be read by people who were already following this discussion and therefore knew what question was being asked. Obviously, if you don’t know what question is being asked then you can’t be expected to get the right answer, so 1/2 is as good as 7/16. For that matter, 3/4, 6/Pi^2, and .01001000100001…. are all equally good answers, because they are also answers to questions someone might have asked.

But if you know what the question is, the reason to prefer the answer of 7/16 to the answer of 1/2 is that 7/16 is correct and 1/2 is incorrect.

7/16 is a mildly worse answer because there isn’t any reason to anticipate it being closer to being right but it eliminates the possibility of being exactly right.

On the contrary. 7/16 is the one and only exactly right answer to the question that’s being asked.

31 Greg B
January 8, 2011 at 7:38 am

Steve,

Allow me to follow your example and offer to put some money at stake. I can write and you can verify a computer program to simulate your two family, one-boy or two child scenario. We can run the simulation as many times as you wish and each time the simulation produces a result equal to 7/16, I will donate $1000 to a charity of your choice, and each time the simulation produces a result equal to 1/2, you will donate $1000 to a charity of my choice. Deal?

Getting back to the original puzzle of having children until boy, which allegedly has been asked at Google, which of these statements is false?

1) The author of a puzzle is uniquely entitled to assert the moral of the puzzle.
2) You aren’t the author of the puzzle.
3) The author of the puzzle has not revealed to you the moral of the puzzle.

32 Steve Landsburg
January 8, 2011 at 10:21 am

Greg B:

Your statement number 1) is obviously false.

Now: I know of a continent with many countries, half of which have two lakes and half of which have four lakes. Which of the following statements do you think is false?

1) The average number of lakes per country is three.
2) No country has exactly three lakes.
3) The average must be a number that actually occurs among the things you’re averaging.
33 33 Greg B
January 8, 2011 at 11:19 am

Steve,

Statement number 3 is false and explains why computing the average is not necessarily the best approach when asked to predict what might actually occur.
34 34 Steve Landsburg
January 8, 2011 at 12:45 pm

Greg B: I’m glad we agree that statement number 3 is false. I hope we can also agree that computing the average is the best approach when asked to compute the average.
35 35 Greg B
January 8, 2011 at 2:13 pm

Yes, of course. Where we obviously disagree is twofold:

1) Whether the original puzzle is asking the reader to compute an average. You seem to find it most interesting that the ratio of the expectations is not the same thing as the expectation of the ratio, whereas I find it most interesting that in every example you create to illustrate your point, your calculation results in an answer that isn’t even in the set of possible solutions. Further, I enjoy these kinds of puzzles primarily because it is interesting to try and/or learn different approaches to solving the puzzle and the basis for each.

In case there is any confusion, the original puzzle, according to paragraph one of your initial article entitled “Are You Smarter Than Google” is as follows:

There’s a certain country where everybody wants to have a son. Therefore each couple keeps having children until they have a boy; then they stop. What fraction of the population is female?

2) Whether anyone other than the author can assert with complete certainty the moral/lesson/purpose of the puzzle. You seem to feel you are in a position to assert the moral of the puzzle, whereas I can imagine a variety of possible morals but don’t feel I’m in a position to assert one of them.

Enjoy the rest of your weekend.
36 36 HowardW
January 8, 2011 at 2:30 pm

One can generalize the problem with the assumption that the gender of each child is i.i.d. but the chance of a girl is p < 1, not necessarily one-half. In such a case, the expected fraction of girls in K completed families tends to p – p(1-p)/K, as K becomes large. The expected number of children is K boys and Kp/(1-p) girls, so E[N]=K/(1-p). Combining, E[frac] = p – p/E[N].

So one shouldn't refer to an extra half boy, but p of a boy. ;)
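
The large-K formula above can be compared against a direct numerical sum. This is a minimal sketch, assuming the number of girls in K completed families follows a negative binomial distribution (girls before the K-th boy, each child a girl with probability p). The function names and the truncation parameter `tail` are assumptions made for illustration.

```python
def expected_girl_fraction(K, p, tail=50):
    """Numerically sum E[G / (G + K)] over the number of girls G."""
    prob = (1 - p) ** K  # P(G = 0): every family gets a boy first
    total = 0.0
    max_girls = int(tail * (K + 10) / (1 - p))  # assumed cutoff, far into the tail
    for g in range(max_girls + 1):
        total += prob * g / (g + K)
        # Negative binomial recurrence: P(G = g + 1) from P(G = g)
        prob *= p * (g + K) / (g + 1)
    return total

def large_k_approximation(K, p):
    """The asymptotic value quoted above: p - p(1-p)/K."""
    return p - p * (1 - p) / K

for K in (1, 10, 100):
    print(K, expected_girl_fraction(K, 0.5), large_k_approximation(K, 0.5))
```

The two columns differ noticeably for K = 1 and get close as K grows, which is consistent with the formula being a large-K statement rather than an exact identity.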
37 37 Steve Landsburg
January 8, 2011 at 2:56 pm

Greg B: You did an excellent job of reading the first paragraph of the original post. You might have tried reading the second.
38 38 Steve Landsburg
January 8, 2011 at 10:54 pm

Tom:

It occurs to me that there’s another point worth making here.

Some might be tempted to say “Ah, yes, well, I thought all along that the expected ratio should be 1/2, and now I see that the only thing getting in the way of that is an extra half boy, which is pretty much negligible in a large population. That’s why the expected ratio E(G/(G+B)) goes to 1/2 as k gets large.”

But that’s plain wrong. To see why, apply it to the expected ratio E(B/G). One might equally well argue “Ah, well, boys are as likely as girls, so B/G really ought to be 1. Now I see that my expectations are only off by half a boy, which is negligible in a large population, so that’s why E(B/G) goes to 1 as k gets large”.

The problem with this is that E(B/G) doesn’t go to 1 as k gets large; it goes to infinity.

So any argument that says “half-boys are irrelevant in large populations, therefore we can pretty much ignore them when we compute expectations” is thoroughly invalid, even if it happens to lead to right answers some of the time.
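
One way to see how a single outcome can dominate: in k completed families B is always k, and G = 0 happens with probability (1/2)^k, so B/G involves a division by zero with positive probability for every k. The sketch below assumes a fair coin and uses hypothetical function names. It computes that probability and, separately, the average of B/G over only the outcomes with at least one girl, truncating the infinite sum at an assumed cutoff `max_girls`.

```python
from math import comb

def prob_girls(k, g):
    """P(G = g) in k completed families with a fair coin (negative binomial)."""
    return comb(g + k - 1, g) / 2 ** (g + k)

def mean_ratio_given_some_girls(k, max_girls=1000):
    """Approximate E[B/G | G >= 1], where B = k in k completed families."""
    p_zero = prob_girls(k, 0)  # the outcome that makes B/G undefined
    total = sum(prob_girls(k, g) * k / g for g in range(1, max_girls + 1))
    return total / (1 - p_zero)

for k in (1, 10, 50):
    print(k, prob_girls(k, 0), mean_ratio_given_some_girls(k))
```

With the zero-girl outcome excluded, the average is finite and settles near 1 for large k. Including that one outcome, however unlikely, is what prevents the unconditional E(B/G) from behaving the same way.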
39 39 Tom
January 8, 2011 at 11:16 pm

It’s an interesting point. Depending on what we’re interested in, half a boy could be huge.

Solomon exploited that point to solve a different problem, but the links are in a primitive chapter/verse system that DNS doesn’t recognize.
40 40 Steve Landsburg
January 9, 2011 at 8:21 am

Tom: :)
41 41 Tom
January 9, 2011 at 10:20 am

Steve,

No, but the point “be careful with expected ratios” is clearly important. One point in the space of possible outcomes can blow everything else away.
42 42 polyglot
January 10, 2011 at 2:09 am

Why has this problem attracted so much attention?
So far as I can make out- all the contributors to this debate are male.
Perhaps what appeals about this problem is it appears to justify being a wanker- the sex ratio will be always a little less than 50%, so it’s not my fault I don’t have a date- while at the same time there is the unconscious knowledge that under this rule, the population will crash much sooner than otherwise (which is almost as good as inventing Dr. Strangelove’s doomsday device) or one can fantasize oneself as the Omega man servicing all the women of the planet- all, that is, except the fatties. Fatties need not apply.
