fizbin
@fizbin

Two weeks ago, I wrote a post called "Your explanation of the Monty Hall problem is wrong" about the "Monty Hall Problem" and what I thought was missing from the standard explanation.

That post used Bayesian inference, which involves math that some people find a little more advanced than this problem needs. This is an attempt to explain the same thing as before, but using simpler math. Probability aficionados will likely call this a "frequentist" explanation, but I think of it as explaining probability by using the multiverse concept that so many people are familiar with now from all the ways it shows up in SF and pop culture.

Also, I liberally illustrated this with poorly drawn cartoons composited out of one gigantic image in a CSS crime, so that's something.


lexyeevee
@lexyeevee

this has inspired in me an extremely satisfying explanation of the monty hall problem that doesn't need probability theory or counting cases

consider this version of the game:

there are three doors, A B C. without loss of generality, you choose door A. (if this doesn't sit well with you, just assume we labeled the doors after you picked one.)

monty hall now offers you a choice: would you like to open door A, or open both doors B and C?

and this is exactly the same as the regular problem. after all, what does it matter who opens the doors?

the only reason monty opens one is so that the final choice — stay or switch — seems to be a choice between two equivalent options, each a single door. and he deliberately opens a goat door to keep things ambiguous. but that's all just theatrics and has no bearing on the question.

switching has a ⅔ chance to win because you get to open two doors instead of one. that's it, that's the whole problem.
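
a quick way to check the "two doors instead of one" claim is to play both versions side by side. this is only a sketch: the function names are made up for illustration, and it assumes monty picks at random when both B and C hide goats. in every game, "switch to the remaining door" and "open both B and C" win or lose together.

```python
import random

def play(rng):
    """One game with the contestant on door A; returns (switch_wins, both_wins)."""
    car = rng.choice("ABC")
    # monty opens a goat door among B and C (assumed: at random if both are goats)
    opened = rng.choice([d for d in "BC" if d != car])
    remaining = "C" if opened == "B" else "B"
    switch_wins = remaining == car   # regular game: switch to the one closed door
    both_wins = car in "BC"          # reworded game: open both B and C yourself
    return switch_wins, both_wins

def switch_win_rate(games=100_000, seed=0):
    """Fraction of simulated games won by switching."""
    rng = random.Random(seed)
    return sum(play(rng)[0] for _ in range(games)) / games

if __name__ == "__main__":
    rng = random.Random(1)
    # the two versions of the game agree on every single play
    assert all(s == b for s, b in (play(rng) for _ in range(1000)))
    print(switch_win_rate())  # should land near 2/3
```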

this also neatly explains all variants.

  • say there are five doors, and you choose two of them. monty opens two more, revealing goats. would you like to keep your two doors and open both, or open only the one that remains? switching gives you a ⅗ chance to win.

  • say monty always opens the goat door that's closest to him (and you know this). switching still gives you a ⅔ chance to win! his choice function just reveals extra information that divides the game into more cases: ⅓ of the time you now know exactly where the car is, and ⅔ of the time it's a coinflip. obviously if you know where the car is then you should go for it, but that still means switching, so your chance to win by switching is still ⅔.

  • say monty opens a door at random, and if he accidentally reveals the car, the whole game is restarted from the beginning. this does change your chances to win, but specifically because the game can only restart if you didn't pick the car, which is exactly when you should switch. in half of those games, you would have won by switching, but the game is abandoned before you even get offered a choice. so you end up with ⅓ win by staying, ⅓ win by switching, and ⅓ the whole game resets.
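
the three variants above are small enough to enumerate exactly instead of simulating. a sketch, with made-up function names and two assumptions: the doors are numbered or lettered arbitrarily, and B is the door "closest to monty" in the second variant.

```python
from fractions import Fraction
from itertools import product

def five_doors_switch():
    """Five doors, you hold doors 0 and 1, monty opens two goats among the rest."""
    picks, wins = {0, 1}, 0
    for car in range(5):
        others = [d for d in range(5) if d not in picks]
        goats = [d for d in others if d != car]
        opened = goats[:2]  # which two goats he opens doesn't affect the outcome
        remaining = [d for d in others if d not in opened]
        wins += remaining == [car]
    return Fraction(wins, 5)

def closest_goat():
    """You hold A; monty opens B if it hides a goat, otherwise C.
    Returns {opened door: car positions consistent with that}."""
    outcomes = {}
    for car in "ABC":
        opened = "B" if car != "B" else "C"
        outcomes.setdefault(opened, []).append(car)
    return outcomes

def random_monty():
    """You hold A; monty opens B or C at random, restarting if he shows the car."""
    tally = {"stay": Fraction(0), "switch": Fraction(0), "restart": Fraction(0)}
    for car, opened in product("ABC", "BC"):
        p = Fraction(1, 6)  # 1/3 for the car position times 1/2 for his pick
        if opened == car:
            tally["restart"] += p
        elif car == "A":
            tally["stay"] += p
        else:
            tally["switch"] += p
    return tally

if __name__ == "__main__":
    print(five_doors_switch())
    print(closest_goat())
    print(random_monty())
```

in `closest_goat`, the door-C entry has a single car position (you know where the car is) and the door-B entry has two (a coinflip), but switching wins whenever the car isn't behind A, which is two of the three equally likely setups.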

on the other hand, if your goal is to win a goat, then you should simply pick the door that monty opens



in reply to @fizbin's post:

In the original game show as I remember it (I only saw it some in syndication), there weren't games where contestants would be offered the chance to switch a door they had selected. There were games where the host would offer the contestants various amounts of money to take the money instead of whatever was behind a given door, but that's a different sort of problem and has more to do with the psychology behind a game of chicken than with probability directly.

in reply to @lexyeevee's post:

Fortunately, it's generally possible to exchange a new car for more money than is needed to purchase a goat, though if you had your heart set on a particular goat from the game show stage you may still be disappointed.

that post also led me to my own thoroughly satisfying understanding of the problem which I think taps into the same kind of logic, looking at two very clean-cut definitives rather than probability (in a way that cleanly extends to probability in the additional variants). After over a decade of "believing but not liking the answers" it was very nice to finally just get it.

This explanation basically reverses the order of the steps "the host opens a door" and "the contestant chooses whether to keep her door", which is fine, but you probably want to accompany it with an explanation of why swapping the order is fine.

Because here's an incorrect argument someone could make:

If we're going to consider doing things out of order, we might as well begin the game by having the host open a door. So the host eliminates one of the doors, then the contestant chooses one of the doors remaining, and then the contestant is asked whether to switch. This must be 50/50 (ed. note: it is). Therefore, since this is just an inconsequential reordering of the original (ed. note: it isn't) the original problem must also be 50/50.

In fact, one of the arguments made by the person I linked to as the "will likely never be convinced" example is very similar to this, though he replaces the host with a dartboard.

The incorrect argument fails because in the original problem it isn't important just that "the host opened door B to reveal a goat", but that "when the host's choices were constrained to «show a goat by opening one of doors B and C» they opened door B to show a goat". The incorrect argument removes the constraints on the host's behavior.

One way you could work this into another argument is to say that when the host opens a door to reveal a goat, all the probability "weight" that was behind the door flows to the doors remaining closed which the host could have opened. In the incorrect argument, opening door B makes the initial 1/3, 1/3, 1/3 split into 1/2, 0, 1/2, because in the incorrect argument the host was unconstrained when choosing a door to eliminate. In the standard problem, assuming that the contestant chose door A, the host opening door B changes the 1/3, 1/3, 1/3 split into 1/3, 0, 2/3 because the host was constrained to doors B and C, so the probability weight removed from door B all flows to door C.
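
The two splits can be checked with a small Bayes update. This is a sketch under stated assumptions: the names are illustrative, the constrained host picks evenly between B and C when both hide goats, and the unconstrained host picks evenly among whichever doors hide goats.

```python
from fractions import Fraction

def posterior(likelihood):
    """Bayes update from a uniform 1/3 prior.

    likelihood maps each door to P(host opens B and shows a goat | car is there).
    """
    joint = {door: Fraction(1, 3) * p for door, p in likelihood.items()}
    total = sum(joint.values())
    return {door: j / total for door, j in joint.items()}

# Standard problem: contestant holds A, host must open a goat door among B and C.
constrained = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1)}

# Incorrect argument: host opens first, choosing among all goat doors.
unconstrained = {"A": Fraction(1, 2), "B": Fraction(0), "C": Fraction(1, 2)}

if __name__ == "__main__":
    print(posterior(constrained))    # weight from B lands on C only
    print(posterior(unconstrained))  # weight from B is shared by A and C
```

The only difference between the two cases is the likelihood for "car behind C": a constrained host is forced to open B, while an unconstrained one could just as well have opened A.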

(Formalizing this concept of probability weight gets you to Bayesian inference)

i think it was at least strongly implied: monty's opening a door doesn't give you any new information about where the car is, because you already knew there would be at least one goat among B + C, and all he's done is confirm that. if you decided to switch first and got to open both doors yourself, you would still open at least one door with a goat behind it, and all he's doing is skipping that step for you. what he does has no impact on either decision you make; the key is just viewing "switch" as meaning "switch to B + C" rather than "switch to one of B or C".

on the other hand if he opens a door before you ever get to pick one, then he's eliminated a door you might have chosen, and that of course affects your choice of door.

i'm familiar with the "probability flow" framing — it's the most common explanation, i think — but it's never sat well with me because it's phrased as though chance were a physical thing that's moving around as a result of someone's actions, and i think that's misleading in a different sort of way. it seems much more concrete to say that there's always a ⅔ chance the car is behind B or C, and that is still true after he opens one of the doors.

I agree that the host opening one of door B or C doesn't change the likelihood split between the buckets {prize behind door A} and {prize behind one of doors B, C}. It obviously does rearrange things within that second bucket, but that doesn't matter.

But here's the thing: I worked my way up to these Monty Hall posts by putting some posts of probability things up on Facebook first, and in one of them got into a rather uncomfortable argument with a long-term friend who was being wrong and upset about it. And her reasoning on that other problem was super close to the "we already knew that there was one goat so revealing it doesn't tell us anything" reasoning. So close, in fact, that I'm not sure how to distinguish between her reasoning that led to the wrong answer in that problem and your reasoning here that's leading to the correct answer. (That's also why I have to approach the problem from a completely different angle.)

First off, note that doors B and C are equally likely at the start of the game to be "goat, goat", "goat, car" or "car, goat". If "car" and "goat" were classic alleles, and "car" was recessive, an entity made up of the BC genes would obviously have phenotype goat.

Anyway, the problem was this: two people who are both Rh+ in blood type have two biological children: first Ryan who is Rh- and then Sally who is Rh+. What is the chance that Sally is a carrier of the Rh- gene? Now, secondly, imagine that the Father's Rh+ gene codes for a variant of the Rh protein that can be detected in some test. Sally's blood tests positive for this variant, but the test doesn't tell us whether all the Rh protein in her blood is this variant or only some of it. In this second scenario, what is the chance that Sally is a carrier for Rh-?

My friend argued quite passionately that the answer to both questions should be ½. (In fact, the answer to the first question is ⅔ and to the second ½.) Her reasoning went like this: in the first problem, we know that Sally's gotten at least one Rh+ gene from somewhere, so just take whichever gene was Rh+ and consider the other one: the other parent had a 50/50 chance of contributing an Rh- gene. In the second problem, we now know that the father contributed an Rh+ gene. The mother still has a 50/50 chance of passing along an Rh- gene. So, on her account, that the father's "door was opened" (that is, his genetic contribution was disclosed) is irrelevant.
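
The ⅔ and ½ answers can be checked by listing the four equally likely gene pairs. A minimal sketch, assuming both parents are heterozygous (which Ryan being Rh- implies) and that each passes either gene with probability ½; the function name is made up.

```python
from fractions import Fraction
from itertools import product

def carrier_chance(condition):
    """Chance Sally carries an Rh- gene, given a condition on (father, mother) genes."""
    kids = [k for k in product("+-", repeat=2) if condition(k)]
    carriers = [k for k in kids if "-" in k]
    return Fraction(len(carriers), len(kids))

# First question: all we know is that Sally is Rh+ (at least one "+").
first = carrier_chance(lambda k: "+" in k)

# Second question: the test shows the father's "+" variant is present.
second = carrier_chance(lambda k: k[0] == "+")

if __name__ == "__main__":
    print(first, second)
```

The first condition leaves three gene pairs, two of which include a "-"; the second leaves only two pairs, one of which does.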

Now, these are different problems, but the reasoning that you're using echoes her incorrect reasoning strongly and I haven't quite been able to tease apart where her "determining that the father gave Rh+ is irrelevant" reasoning is faulty but your "showing that a goat was behind door B is irrelevant" reasoning still holds.

i think the difference gets to the heart of why monty hall is unintuitive in the first place—

it feels like monty hall is telling us something new in the middle of the game. but we know upfront that monty will reveal a goat. he has privileged information, and the structure of the game means he can always do this. the only thing he's actually telling us is that (at least) one of the unchosen doors had a goat behind it — and we already knew that! so we don't really learn anything important when he opens a door. we don't even know if his choice was forced or not. if the car were behind B or C, it's still behind B or C; as you point out we've just narrowed down where it would be.

but that's not what your blood test is. you, arbitrarily, without any foreknowledge, choose door B (the father) to peek behind, and the result is that there was a goat behind that door. but this is new information: it eliminates the possibility that the car was ever behind that particular door. monty hall doesn't remove possible initial conditions like this — if the car were behind door B, he would just avoid showing it to you.

the same thing happens if you reword the monty hall problem as "monty hall then opens door B to reveal a goat". essentially that's saying the car will never be behind door B, no matter how many times you play the game... so it must simply be behind either A or C, with equal probability.


hm. this feels like i'm just explaining the problem. it's hard to specifically defend my reasoning without introducing new ideas that i would then have to define for a casual reader. damn

the most compelling thing i've come up with is that the monty hall problem contains extra ambiguity, in that the other two doors are interchangeable, and we don't know which one monty will open for any given game. the "one door vs two doors" framing fixes this by making monty's action and your decision always operate on the same object — the set of doors you didn't choose. but if monty always opens door B, then doors B and C have distinct roles and you know different things about them, so collapsing them together doesn't make sense. (compare with a four-door game where you choose A, monty opens B to reveal a goat, and you can opt to open A or both C and D. switching now has a ⅔ chance to find the car again.)


i also wonder if the blood test question is harder to reason about because it seems like something you could only do once, after which you simply know the answer, whereas you can play the monty hall game as many times as you like. so the idea that "the car can never be behind door B" is harder to grapple with


also if you're still trying to convince the friend, maybe you'll have better luck getting to the heart of it with this:

  • you flip two coins. at least one comes up heads. what's the chance the other comes up heads? (⅓)

  • you flip two coins. the first comes up heads. what's the chance the second comes up heads? (½)
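
both coin answers fall out of counting the four equally likely flips. a tiny sketch with made-up names, reading the first question as "at least one coin is heads":

```python
from fractions import Fraction
from itertools import product

flips = list(product("HT", repeat=2))  # HH, HT, TH, TT

def chance_both_heads(condition):
    """Chance of two heads among the flips that satisfy the condition."""
    kept = [f for f in flips if condition(f)]
    return Fraction(sum(f == ("H", "H") for f in kept), len(kept))

at_least_one = chance_both_heads(lambda f: "H" in f)   # some coin is heads
first_is_heads = chance_both_heads(lambda f: f[0] == "H")  # a named coin is heads

if __name__ == "__main__":
    print(at_least_one, first_is_heads)
```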


you would not believe how many times i've edited this comment haha

Okay, that's fair, though that "the host is giving no information from opening the door" bit is crucial, and I think it deserves exploration. The way I think of it is that in the standard scenario, 2/3 of the time that the host opens door C, the prize is behind (B+C) while 1/3 of the time that the host opens door C the prize is behind A. Same for the host opening door B: 2/3 of the time that the host opens door B, the prize is behind (B+C) and 1/3 of the time that the host opens B the prize is behind door A. These are also the original ratios for door A versus doors (B+C), so no information.
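
Those ratios can be tabulated directly. This is a sketch with illustrative names, assuming the contestant holds door A and the host picks evenly between B and C when the car is behind A.

```python
from fractions import Fraction

def joint():
    """P(car position, door the host opens) for the standard game."""
    table = {}
    for car in "ABC":
        options = [d for d in "BC" if d != car]  # goat doors the host may open
        for opened in options:
            table[(car, opened)] = Fraction(1, 3) / len(options)
    return table

def chance_car_in_bc(opened):
    """P(car is behind B or C | host opened the given door)."""
    given = {k: p for k, p in joint().items() if k[1] == opened}
    hits = sum(p for (car, _), p in given.items() if car in "BC")
    return hits / sum(given.values())

if __name__ == "__main__":
    print(chance_car_in_bc("B"), chance_car_in_bc("C"))
```

Whichever door the host opens, the result matches the starting ⅔ for the (B+C) bucket, which is the "no information" point.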

That is, from the angle of "door A" versus "doors B, C" the host's actions change nothing. Fortunately, this is also the same angle that the switch vs keep decision is made from.


I have seen a few of the edits, and have also edited this comment a few times.


I now want to explore a different game show format in which the producers flip a coin for each door to decide whether to hide a car or a goat behind each door, but reject sets of flips that would have three goats or three cars. The rest of the game is played as before, except ... hrm. Maybe that isn't always such an interesting game.

I still kind of want to explore it because I spent so long getting my css crimes working for the original post and feel like I should use them more if I can, whereas for something with more doors I'd have to go and draw all new sets of images.

yeah the key is probably about what we know from the statement of the problem, though i'm having a time wording it precisely since it's a bit abstract. maybe this gets to the heart of why probability questions are just plain weird.

if you say "he opens door B and there's a goat", that alters the initial setup; we were just told the car could be anywhere, but games where the car is behind door B are apparently impossible (or in some sense exempt from consideration), so probabilities naturally change. but if you say "he opens a door with a goat behind it", that doesn't make any games impossible, so nothing changes.

i suspect another reason monty hall trips people up is that it's so very tempting to make the problem concrete and just picture him opening door B. but now we've thrown out all the games where that's where the car is. his choice of door is part of the individual game, not part of the problem.

"yeah the key is probably about what we know from the statement of the problem"

I think that's part of why another friend of mine in those Facebook discussions mentioned earlier keeps recommending Bayesian inference for all probability problems. (As in my prior unillustrated post from a few weeks ago) Sure it's more machinery and sometimes seems like overkill, but it also tends to "just work" and sidesteps philosophical questions of "wait, what are we asking about really" by viewing all probability as a measure of our personal uncertainty/lack of omniscience.

Maybe one way to get at it is as follows:
Two people who are both Rh+ in blood type have two biological children: first Ryan who is Rh- and then Sally.
What is the chance that Sally is Rh-? (1/4)
What is the chance that Sally is Rh+? (3/4)

The two people are tested and it turns out the father is RhD+ and the mother is RhCE+.
What is the chance that Sally is Rh-? (1/4)
What is the chance that Sally is Rh+? (3/4)
(a) What is the chance that Sally is both RhCE+ and RhD+? (1/4)
(a′) “What is the chance that Sally is a carrier of the Rh- gene?” (1 - (a), 3/4)
What is the chance that Sally is RhD+ only? (1/4)
What is the chance that Sally is at least RhD+? (1/2)
What is the chance that Sally is RhCE+ only? (1/4)
What is the chance that Sally is at least RhCE+? (1/2)

Sally’s blood is tested with a simple coagulation test and it turns out to be Rh+. Given this new information:
What is the chance that Sally is Rh-? (0)
What is the chance that Sally is Rh+? (1)
(a) What is the chance that Sally is both RhCE+ and RhD+? (1/3)
(a′) “What is the chance that Sally is a carrier of the Rh- gene?” (1 - (a), 2/3)
What is the chance that Sally is RhD+ only? (1/3)
What is the chance that Sally is at least RhD+? (2/3)
What is the chance that Sally is RhCE+ only? (1/3)
What is the chance that Sally is at least RhCE+? (2/3)

Sally needs a blood transfusion and the only blood available is RhD+ (by far the most common type). If Sally is RhCE+ only, she can’t use this blood, so she is tested for the presence of RhD and it turns out that she is at least RhD+. Hurray!
What is the chance that Sally is Rh-? (0)
What is the chance that Sally is Rh+? (1)
(a) What is the chance that Sally is both RhCE+ and RhD+? (1/2)
(a′) “What is the chance that Sally is a carrier of the Rh- gene?” (1 - (a), 1/2)
What is the chance that Sally is RhD+ only? (1/2)
What is the chance that Sally is at least RhD+? (1)
What is the chance that Sally is RhCE+ only? (0)
What is the chance that Sally is at least RhCE+? (1/2)

That Sally is RhD+ is not irrelevant to the question, “Does Sally have RhCE only or does she have both RhCE and RhD?” Learning that Sally has RhD eliminates the possibility that she is RhCE-only; it thereby makes it more likely to be the case that Sally has both RhD and RhCE. This is equivalent to saying that it makes it less likely that she is a “carrier of the Rh- gene” (which in this case means that she lacks the RhCE gene or lacks the RhD gene).
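
The three stages above can be reproduced by conditioning on four equally likely cases. This is a sketch of one reading of the setup, not a claim about real Rh genetics: it assumes the father passes on either RhD or nothing, the mother passes on either RhCE or nothing, each with probability ½, and all names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Each case is (has RhD from father, has RhCE from mother).
cases = list(product([True, False], repeat=2))

def chance(event, given=lambda c: True):
    """P(event | given) over the four equally likely cases."""
    kept = [c for c in cases if given(c)]
    return Fraction(sum(1 for c in kept if event(c)), len(kept))

both = lambda c: c[0] and c[1]      # RhCE+ and RhD+
rh_positive = lambda c: c[0] or c[1]  # the simple coagulation test
has_rhd = lambda c: c[0]            # the RhD-specific test

if __name__ == "__main__":
    for label, given in [("no test", lambda c: True),
                         ("Rh+", rh_positive),
                         ("at least RhD+", has_rhd)]:
        a = chance(both, given)
        print(label, "both:", a, "carrier:", 1 - a)
```

Each test removes cases from the list, and the chance of "both" rises from 1/4 to 1/3 to 1/2 as the RhCE-only case and the Rh- case drop out.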

I was thinking about the blogger who will probably never be convinced, and it suddenly projected my mind back, probably over a decade, to watching a Daily Show or similar comedy news show segment on a man who was preparing for the end of the world, because, between the world ending, and the world not ending, that's two possibilities, so that's a fifty-fifty chance, and with those odds, it makes sense to prepare for the worst.