BACKGROUND

A few weeks ago I posted “A Bayesian Puzzle”. I took it down because Bayesianism warranted more careful treatment than I had given it. But while the post was live at Ricochet (where I cross-posted in September-November), I had an exchange with a reader who is an obdurate believer in single-event probabilities, such as “the probability of heads on the next coin flip is 50 percent” and “the probability of 12 on the next roll of a pair of dice is 1/36”. That wasn’t the first exchange of its kind; “Some Thoughts about Probability” reports an earlier and more thoughtful exchange with a believer in single-event probabilities.

DISCUSSION

A believer in single-event probabilities takes the view that a single flip of a coin or roll of dice has a probability. I do not. A probability represents the frequency with which an outcome occurs over the very long run, and it is only an average that conceals random variations.

The outcome of a single coin flip can’t be reduced to a percentage or probability. It can only be described in terms of its discrete, mutually exclusive possibilities: heads (H) or tails (T). The outcome of a single roll of a die or pair of dice can only be described in terms of the number of points that may come up, 1 through 6 or 2 through 12.

Yes, the expected frequencies of H, T, and the various point totals can be computed by simple mathematical operations. But those are only expected frequencies. They say nothing about the next coin flip or dice roll, nor do they do more than approximate the actual frequencies that will occur over the next 100, 1,000, or 10,000 such events.

Of what value is it to know that the probability of H is 0.5 when H fails to occur in 11 consecutive flips of a fair coin? Of what value is it to know that the probability of rolling a 7 is 0.167 — meaning that 7 comes up once every 6 rolls, on average — when 7 may not appear for 56 consecutive rolls? These examples are drawn from simulations of 10,000 coin flips and 1,000 dice rolls. They are simulations that I ran once — not simulations that I cherry-picked from many runs. (The Excel file is at https://drive.google.com/open?id=1FABVTiB_qOe-WqMQkiGFj2f70gSu6a82 — coin flips are on the first tab, dice rolls are on the second tab.)
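For readers who would rather not open the spreadsheet, the same kind of experiment can be run in a few lines of Python. This is an illustrative sketch of my procedure, not the Excel workbook itself; the seed is arbitrary, and different seeds will produce different droughts.

```python
import random

random.seed(1)  # arbitrary fixed seed, so the illustration is reproducible

# 10,000 coin flips: how long can heads fail to appear?
flips = [random.choice("HT") for _ in range(10_000)]
longest_no_heads = max(len(run) for run in "".join(flips).split("H"))

# 1,000 rolls of a pair of dice: how long can a 7 fail to appear?
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(1_000)]
gaps, gap = [], 0
for total in rolls:
    if total == 7:
        gaps.append(gap)  # record how many rolls passed since the last 7
        gap = 0
    else:
        gap += 1

print("frequency of H:", flips.count("H") / len(flips))
print("longest run without H:", longest_no_heads)
print("longest wait for a 7:", max(gaps))
```

The overall frequencies hover near the expected values, but the droughts — long runs without H, long waits for a 7 — are there in every run.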

Let’s take another example, which is more interesting and has generated much controversy over the years. It’s the Monty Hall problem,

a brain teaser, in the form of a probability puzzle, loosely based on the American television game show Let’s Make a Deal and named after its original host, Monty Hall. The problem was originally posed (and solved) in a letter by Steve Selvin to the American Statistician in 1975…. It became famous as a question from a reader’s letter quoted in Marilyn vos Savant’s “Ask Marilyn” column in Parade magazine in 1990 … :

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

Vos Savant’s response was that the contestant should switch to the other door…. Under the standard assumptions, contestants who switch have a 2/3 chance of winning the car, while contestants who stick to their initial choice have only a 1/3 chance.

Vos Savant’s answer is correct, but only if the contestant is allowed to play an unlimited number of games. A player who adopts a strategy of “switch” in every game will, in the long run, win about 2/3 of the time (explanation here). That is, the player has a better chance of winning if he chooses “switch” rather than “stay”.

Read the preceding paragraph carefully and you will spot the logical defect that underlies the belief in single-event probabilities: The long-run winning strategy (“switch”) is transformed into a “better chance” to win a particular game. What does that mean? How does an average frequency of 2/3 improve one’s chances of winning a particular game? It doesn’t. As I show here, game results are utterly random; that is, the average frequency of 2/3 has no bearing on the outcome of a single game.

I’ll try to drive the point home by returning to the coin-flip game, with money thrown into the mix. A \$1 bet on H means a gain of \$1 if H turns up, and a loss of \$1 if T turns up. The expected value of the bet — if repeated over a very large number of trials — is zero. The bettor expects to win and lose the same number of times, and to walk away no richer or poorer than when he started. And over a very large number of games, the bettor will walk away approximately, but not necessarily exactly, as well off as when he started. How many games? In the simulation of 10,000 games mentioned earlier, H occurred 50.6 percent of the time. A very large number of games is probably at least 100,000.
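A rough sketch of the point, in Python (illustrative only, not my original Excel simulation): the average payoff per game drifts toward zero as the number of games grows, but the bankroll itself wanders.

```python
import random

random.seed(7)  # arbitrary seed, for reproducibility

def bankroll_after(n_games: int) -> int:
    """Net winnings after n_games $1 bets on H with a fair coin."""
    return sum(1 if random.random() < 0.5 else -1 for _ in range(n_games))

# The average payoff per game tends toward zero over many games,
# but the bankroll after any finite number of games is rarely exactly zero.
for n in (100, 10_000, 100_000):
    net = bankroll_after(n)
    print(f"{n} games: net {net:+d} dollars ({net / n:+.4f} per game)")
```

The per-game average shrinks toward zero; the absolute deviation of the bankroll does not.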

Let us say, for the sake of argument, that a bettor has played 100,000 coin-flip games at \$1 a game and come out exactly even. What does that mean for the play of the next game? Does it have an expected value of zero?

To see why the answer is “no”, let’s make it interesting and say that the bet on the next game — the next coin flip — is \$10,000. The size of the bet should wonderfully concentrate the bettor’s mind. He should now see the situation for what it really is: There are two possible outcomes, and only one of them will be realized. An average of the two outcomes is meaningless. The single coin flip doesn’t have a “probability” of 0.5 H and 0.5 T and an “expected payoff” of zero. The coin will come up either H or T, and the bettor will either lose \$10,000 or win \$10,000.

To repeat: The outcome of a single coin flip doesn’t have an expected value for the bettor. It has two possible values, and the bettor must decide whether he is willing to lose \$10,000 on the single flip of a coin.

By the same token (or coin), the outcome of a single roll of a pair of dice doesn’t have a 1-in-6 probability of coming up 7. It has 36 possible outcomes and 11 possible point totals, and the bettor must decide how much he is willing to lose if he puts his money on the wrong combination or outcome.
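The 36 combinations and 11 point totals are easy to enumerate. A quick Python check (illustrative, not part of the original post) confirms the counts:

```python
from collections import Counter
from itertools import product

# All 36 equally likely (die1, die2) combinations, tallied by point total.
totals = Counter(a + b for a, b in product(range(1, 7), repeat=2))

print(len(totals), "distinct point totals")     # 11 (2 through 12)
print(totals[7], "of 36 combinations total 7")  # 6, i.e., 1/6 over the long run
print(totals[12], "of 36 combinations total 12")  # 1, i.e., 1/36 over the long run
```

The counts give the long-run frequencies; they say nothing, of course, about the next roll.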

CONCLUSION

It is a logical fallacy to ascribe a probability to a single event. A probability represents the observed or computed average value of a very large number of like events. A single event cannot possess that average value. A single event has a finite number of discrete and mutually exclusive outcomes. Those outcomes will not “average out” — only one of them will obtain, like Schrödinger’s cat.

To say that the outcomes will average out — which is what a probability implies — is tantamount to saying that Jack Sprat and his wife were neither skinny nor fat because their body-mass indices averaged to a normal value. It is tantamount to saying that one can’t drown by walking across a pond with an average depth of 1 foot, when that average conceals the existence of a 100-foot-deep hole.

# “Settled Science” and the Monty Hall Problem

The so-called 97-percent consensus among climate scientists about anthropogenic global warming (AGW) isn’t evidence of anything but the fact that scientists are only human. Even if there were such a consensus, it certainly wouldn’t prove the inchoate theory of AGW, any more than the early consensus against Einstein’s special theory of relativity disproved that theory.

Actually, in the case of AGW, the so-called consensus is far from a consensus about the extent of warming, its causes, and its implications. (See, for example, this post and this one.) But it’s undeniable that a lot of climate scientists believe in a “strong” version of AGW, and in its supposedly dire consequences for humanity.

Why is that? Well, in a field as inchoate as climate science, it’s easy to let one’s prejudices drive one’s research agenda and findings, even if only subconsciously. And isn’t it more comfortable and financially rewarding to be with the crowd and where the money is than to stand athwart the conventional wisdom? (Lennart Bengtsson certainly found that to be the case.) Moreover, there was, in the temperature records of the late 20th century, a circumstantial case for AGW, which led to the development of theories and models that purport to describe a strong relationship between temperature and CO2. That the theories and models are deeply flawed and lacking in predictive value seems not to matter to the 97 percent (or whatever the number is).

In other words, a lot of climate scientists have abandoned the scientific method, which demands skepticism, in order to be on the “winning” side of the AGW issue. How did it come to be thought of as the “winning” side? Credit vocal so-called scientists who were and are (at least) guilty of making up models to fit their preconceptions, and ignoring evidence that human-generated CO2 is a minor determinant of atmospheric temperature. Credit influential non-scientists (e.g., Al Gore) and various branches of the federal government that have spread the gospel of AGW and bestowed grants on those who can furnish evidence of it. Above all, credit the media, which for the past two decades has pumped out volumes of biased, half-baked stories about AGW, in the service of the “liberal” agenda: greater control of the lives and livelihoods of Americans.

Does this mean that the scientists who are on the AGW bandwagon don’t believe in the correctness of AGW theory? I’m sure that most of them do believe in it — to some degree. They believe it at least to the same extent as a religious convert who zealously proclaims his new religion to prove (mainly to himself) his deep commitment to that religion.

What does all of this have to do with the Monty Hall problem? This:

Making progress in the sciences requires that we reach agreement about answers to questions, and then move on. Endless debate (think of global warming) is fruitless debate. In the Monty Hall case, this social process has actually worked quite well. A consensus has indeed been reached; the mathematical community at large has made up its mind and considers the matter settled. But consensus is not the same as unanimity, and dissenters should not be stifled. The fact is, when it comes to matters like Monty Hall, I’m not sufficiently skeptical. I know what answer I’m supposed to get, and I allow that to bias my thinking. It should be welcome news that a few others are willing to think for themselves and challenge the received doctrine. Even though they’re wrong. (Brian Hayes, “Monty Hall Redux” (a book review), American Scientist, September-October 2008)

The admirable part of Hayes’s statement is its candor: Hayes admits that he may have adopted the “consensus” answer because he wants to go with the crowd.

The dismaying part of Hayes’s statement is his smug admonition to accept “consensus” and move on. As it turns out, the “consensus” about the Monty Hall problem isn’t what it’s cracked up to be. A lot of very bright people have solved a tricky probability puzzle, but not the Monty Hall problem. (For the details, see my post, “The Compleat Monty Hall Problem.”)

And the “consensus” about AGW is very far from being the last word, despite the claims of true believers. (See, for example, the relatively short list of recent articles, posts, and presentations given at the end of this post.)

Going with the crowd isn’t the way to do science. It’s certainly not the way to ascertain the contribution of human-generated CO2 to atmospheric warming, or to determine whether the effects of any such warming are dire or beneficial. And it’s most certainly not the way to decide whether AGW theory implies the adoption of policies that would stifle economic growth and hamper the economic betterment of millions of Americans and billions of other human beings — most of whom would love to live as well as the poorest of Americans.

Given the dismal track record of global climate models, with their evident overstatement of the effects of CO2 on temperatures, there should be a lot of doubt as to the causes of rising temperatures in the last quarter of the 20th century, and as to the implications for government action. And even if it could be shown conclusively that human activity will cause temperatures to resume the rising trend of the late 1900s, several important questions remain:

• To what extent would the temperature rise be harmful and to what extent would it be beneficial?
• To what extent would mitigation of the harmful effects negate the beneficial effects?
• What would be the costs of mitigation, and who would bear those costs, both directly and indirectly (e.g., the effects of slower economic growth on the poorer citizens of the world)?
• If warming does resume gradually, as before, why should government dictate precipitous actions — and perhaps technologically dubious and economically damaging actions — instead of letting households and businesses adapt over time by taking advantage of new technologies that are unavailable today?

Those are not issues to be decided by scientists, politicians, and media outlets that have jumped on the AGW bandwagon because it represents a “consensus.” Those are issues to be decided by free, self-reliant, responsible persons acting cooperatively for their mutual benefit through the mechanism of free markets.

*     *     *

Roy Spencer, “95% of Climate Models Agree: The Observations Must Be Wrong,” Roy Spencer, Ph.D., February 7, 2014
Roy Spencer, “Top Ten Good Skeptical Arguments,” Roy Spencer, Ph.D., May 1, 2014
Ross McKitrick, “The ‘Pause’ in Global Warming: Climate Policy Implications,” presentation to the Friends of Science, May 13, 2014 (video here)
Patrick Brennan, “Abuse from Climate Scientists Forces One of Their Own to Resign from Skeptic Group after Week: ‘Reminds Me of McCarthy’,” National Review Online, May 14, 2014
Anthony Watts, “In Climate Science, the More Things Change, the More They Stay the Same,” Watts Up With That?, May 17, 2014
Christopher Monckton of Brenchley, “Pseudoscientists’ Eight Climate Claims Debunked,” Watts Up With That?, May 17, 2014
John Hinderaker, “Why Global Warming Alarmism Isn’t Science,” PowerLine, May 17, 2014
Tom Sheahan, “The Specialized Meaning of Words in the ‘Antarctic Ice Shelf Collapse’ and Other Climate Alarm Stories,” Watts Up With That?, May 21, 2014
Anthony Watts, “Unsettled Science: New Study Challenges the Consensus on CO2 Regulation — Modeled CO2 Projections Exaggerated,” Watts Up With That?, May 22, 2014
Daniel B. Botkin, “Written Testimony to the House Subcommittee on Science, Space, and Technology,” May 29, 2014

# The Compleat Monty Hall Problem

Wherein your humble blogger gets to the bottom of the Monty Hall problem, sorts out the conflicting solutions, and declares that the standard solution is the right solution, but not to the Monty Hall problem as it’s usually posed.

THE MONTY HALL PROBLEM AND THE TWO “SOLUTIONS”

The Monty Hall problem, first posed as a statistical puzzle in 1975, has been notorious since 1990, when Marilyn vos Savant wrote about it in Parade. Her solution to the problem, to which I will come, touched off a controversy that has yet to die down. But her solution is now widely accepted as the correct one; I refer to it here as the standard solution.

This is from the Wikipedia entry for the Monty Hall problem:

The Monty Hall problem is a brain teaser, in the form of a probability puzzle (Gruber, Krauss and others), loosely based on the American television game show Let’s Make a Deal and named after its original host, Monty Hall. The problem was originally posed in a letter by Steve Selvin to the American Statistician in 1975 (Selvin 1975a), (Selvin 1975b). It became famous as a question from a reader’s letter quoted in Marilyn vos Savant’s “Ask Marilyn” column in Parade magazine in 1990 (vos Savant 1990a):

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?

Here’s a complete statement of the problem:

1. A contestant sees three doors. Behind one of the doors is a valuable prize, which I’ll denote as \$. Undesirable or worthless items are behind the other two doors; I’ll denote those items as x.

2. The contestant doesn’t know which door conceals \$ and which doors conceal x.

3. The contestant chooses a door at random.

4. The host, who knows what’s behind each of the doors, opens one of the doors not chosen by the contestant.

5. The door chosen by the host may not conceal \$; it must conceal an x. That is, the host always opens a door to reveal an x.

6. The host then asks the contestant if he wishes to stay with the door he chose initially (“stay”) or switch to the other unopened door (“switch”).

7. The contestant decides whether to stay or switch.

8. The host then opens the door finally chosen by the contestant.

9. If \$ is revealed, the contestant wins; if x is revealed the contestant loses.
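The rules above can be rendered as a short simulation. What follows is a minimal Python sketch of steps 1-9 (door numbers 0-2 and the function names are my own labeling), not a definitive implementation:

```python
import random

def play(switch: bool, rng: random.Random) -> bool:
    """One play of the game in steps 1-9; True means the contestant wins $."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)   # steps 1-2: $ hidden behind one door
    pick = rng.choice(doors)    # step 3: contestant chooses at random
    # Steps 4-5: host opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d not in (pick, prize)])
    if switch:                  # steps 6-7: stay or switch
        pick = next(d for d in doors if d not in (pick, opened))
    return pick == prize        # steps 8-9: reveal and settle

rng = random.Random(42)  # arbitrary seed
n = 100_000
switch_wins = sum(play(True, rng) for _ in range(n))
stay_wins = sum(play(False, rng) for _ in range(n))
print("switch win fraction:", switch_wins / n)  # tends toward 2/3
print("stay win fraction:  ", stay_wins / n)    # tends toward 1/3
```

Over many plays, the win fractions settle near 2/3 and 1/3; any single call to `play` simply returns a win or a loss.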

One solution (the standard solution) is to switch doors because there’s a 2/3 probability that \$ is hidden behind the unopened door that the contestant didn’t choose initially. In vos Savant’s own words:

Yes; you [the contestant] should switch. The first [initially chosen] door has a 1/3 chance of winning, but the second [other unopened] door has a 2/3 chance.

The other solution (the alternative solution) is indifference. Those who propound this solution maintain that there’s an equal chance of finding \$ behind either of the doors that remain unopened after the host has opened a door.

As it turns out, the standard solution doesn’t tell a contestant what to do in a particular game. But the standard solution does point to the right strategy for someone who plays or bets on a large number of games.

The alternative solution accurately captures the unpredictability of any particular game. But indifference is only a break-even strategy for a person who plays or bets on a large number of games.

EXPLANATION OF THE STANDARD SOLUTION

The contestant may choose among three doors, and there are three possible ways of arranging the items behind the doors: \$ x x; x \$ x; and x x \$. The result is nine possible ways in which a game may unfold:

Events 1, 5, and 9 each have two branches. But those branches don’t count as separate events. They’re simply subsets of the same event; when the contestant chooses a door that hides \$, the host must choose between the two doors that hide x, but he can’t open both of them. And his choice doesn’t affect the outcome of the event.

It’s evident that switching would pay off with a win in 2/3 of the possible events, whereas staying with the original choice would pay off in only 1/3 of the possible events. The fractions 1/3 and 2/3 are usually referred to as probabilities: a 2/3 probability of winning \$ by switching doors, as against a 1/3 probability of winning \$ by staying with the initially chosen door.
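Those fractions can be verified by brute enumeration of the nine equally likely (arrangement, initial choice) pairs. A quick Python check, standing in for the diagram:

```python
from itertools import product

# Three arrangements of the prize, three equally likely initial choices.
arrangements = [("$", "x", "x"), ("x", "$", "x"), ("x", "x", "$")]
stay_wins = switch_wins = 0
for doors, pick in product(arrangements, range(3)):
    if doors[pick] == "$":
        stay_wins += 1    # staying wins only when the initial pick hides $
    else:
        switch_wins += 1  # otherwise the host's forced reveal leaves $ behind the other door
print(f"switch wins {switch_wins} of 9 events")  # 6 of 9 = 2/3
print(f"stay wins {stay_wins} of 9 events")      # 3 of 9 = 1/3
```

The host’s occasional choice between two x doors (events 1, 5, and 9) doesn’t enter the count, for the reason given above: it doesn’t affect the outcome of the event.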

Accordingly, proponents of the standard solution — who are now legion — advise the individual (theoretical) contestant to switch. The idea is that switching increases one’s chance (probability) of winning.

A CLOSER LOOK AT THE STANDARD SOLUTION

There are three problems with the standard solution:

1. It incorporates a subtle shift in perspective. The Monty Hall problem, as posed, asks what a contestant should do. The standard solution, on the other hand, represents the expected (long-run average) outcome of many events, that is, many plays of the game. For reasons I’ll come to, the outcome of a single game can’t be described by a probability.

2. Lists of possibilities, such as those in the diagram above, fail to reflect the randomness inherent in real events.

3. Probabilities emerge from many repetitions of the kinds of events listed above. It is meaningless to ascribe a probability to a single event. In the case of the Monty Hall problem, many repetitions of the game will yield probabilities approximating those given in the standard solution, but the outcome of each repetition will be unpredictable. It is therefore meaningless to say that a contestant has a 2/3 chance of winning a game if he switches. A 2/3 chance of winning refers to the expected outcome of many repetitions, where the contestant chooses to switch every time. To put it baldly: How does a person win 2/3 of a game? He either wins or doesn’t win.

Regarding points 2 and 3, I turn to Probability, Statistics and Truth (second revised English edition, 1957), by Richard von Mises:

The rational concept of probability, which is the only basis of probability calculus, applies only to problems in which either the same event repeats itself again and again, or a great number of uniform elements are involved at the same time. Using the language of physics, we may say that in order to apply the theory of probability we must have a practically unlimited sequence of uniform observations. (p. 11)

*     *     *

In games of dice, the individual event is a single throw of the dice from the box and the attribute is the observation of the number of points shown by the dice. In the game of “heads or tails”, each toss of the coin is an individual event, and the side of the coin which is uppermost is the attribute. (p. 11)

*     *     *

We must now introduce a new term…. This term is “the collective”, and it denotes a sequence of uniform events or processes which differ by certain observable attributes…. All the throws of dice made in the course of a game [of many throws] form a collective wherein the attribute of the single event is the number of points thrown…. The definition of probability which we shall give is concerned with “the probability of encountering a single attribute [e.g., winning \$ rather than x] in a given collective [a series of attempts to win \$ rather than x]”. (pp. 11-12)

*     *     *

[A] collective is a mass phenomenon or a repetitive event, or, simply, a long sequence of observations for which there are sufficient reasons to believe that the relative frequency of the observed attribute would tend to a fixed limit if the observations were indefinitely continued. The limit will be called the probability of the attribute considered within the collective [emphasis in the original]. (p. 15)

*     *     *

The result of each calculation … is always … nothing else but a probability, or, using our general definition, the relative frequency of a certain event in a sufficiently long (theoretically, infinitely long) sequence of observations. The theory of probability can never lead to a definite statement concerning a single event. The only question that it can answer is: what is to be expected in the course of a very long sequence of observations? It is important to note that this statement remains valid also if the calculated probability has one of the two extreme values 1 or 0 [emphasis added]. (p. 33)

To bring the point home, here are the results of 50 runs of the Monty Hall problem, where each result represents (i) a random initial choice between Door 1, Door 2, and Door 3; (ii) a random array of \$, x, and x behind the three doors; (iii) the opening of a door (other than the one initially chosen) to reveal an x; and (iv) a decision, in every case, to switch from the initially chosen door to the other unopened door:

What’s relevant here isn’t the fraction of times that \$ appears, which is 3/5 — slightly less than the theoretical value of 2/3. Just look at the utter randomness of the results. The first three outcomes yield the “expected” ratio of two wins to one loss, though in the real game show the two winners and one loser would have been different persons. The same goes for any sequence, even the final — highly “improbable” (i.e., random) — string of nine straight wins (which would have accrued to nine different contestants). And who knows what would have happened in games 51, 52, etc.

If a person wants to win 2/3 of the time, he must find a game show that allows him to continue playing the game until he has reached his goal. As I’ve found in my simulations, it could take as many as 10, 20, 70, or 300 games before the cumulative fraction of wins per game converges on 2/3.
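That slow convergence can be sketched in Python (an illustration with an arbitrary seed; the point is the convergence, not the particular numbers):

```python
import random

rng = random.Random(3)  # arbitrary seed

def switch_game() -> bool:
    """One game with the 'switch' strategy.

    Switching wins exactly when the initial random pick misses the prize,
    because the host's forced reveal then leaves $ behind the other door.
    """
    doors = [0, 1, 2]
    prize, pick = rng.choice(doors), rng.choice(doors)
    return pick != prize

wins = 0
for game in range(1, 301):
    wins += switch_game()
    if game in (10, 50, 300):
        print(f"after {game} games: cumulative win fraction {wins / game:.3f}")
```

The cumulative fraction wanders for dozens of games before it settles anywhere near 2/3, and even after 300 games it need not sit exactly on it.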

That’s what it means to win 2/3 of the time. It’s not possible to win a single game 2/3 of the time, which is the “logic” of the standard solution as it’s usually presented.

The alternative solution doesn’t offer a winning strategy. In this view of the Monty Hall problem, it doesn’t matter which unopened door a contestant chooses. In effect, the contestant is advised to flip a coin.

As discussed above, the outcome of any particular game is unpredictable, so a coin flip will do just as well as any other way of choosing a door. But randomly selecting an unopened door isn’t a good strategy for repeated plays of the game. Over the long run, random selection means winning about 1/2 of all games, as opposed to 2/3 for the “switch” strategy. (To see that the expected probability of winning through random selection approaches 1/2, return to the earlier diagram; there, you’ll see that \$ occurs in 9/18 = 1/2 of the possible outcomes for “stay” and “switch” combined.)
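The break-even character of random selection is easy to confirm by simulation (again a Python sketch of my own, not a formal proof):

```python
import random

rng = random.Random(11)  # arbitrary seed

def coin_flip_game() -> bool:
    """Choose randomly between the two unopened doors; True means a win."""
    doors = [0, 1, 2]
    prize, pick = rng.choice(doors), rng.choice(doors)
    # Host opens a door that is neither the pick nor the prize.
    opened = rng.choice([d for d in doors if d not in (pick, prize)])
    # The contestant flips a coin, in effect, between the unopened doors.
    final = rng.choice([d for d in doors if d != opened])
    return final == prize

n = 100_000
frac = sum(coin_flip_game() for _ in range(n)) / n
print("win fraction with random selection:", frac)  # tends toward 1/2
```

Over many games, ignoring the host’s information costs the player the difference between 1/2 and 2/3.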

Proponents of the alternative solution overlook the importance of the host’s selection of a door to open. His choice isn’t random. Therein lies the secret of the standard solution — as a long-run strategy.

WHY THE STANDARD SOLUTION WORKS IN THE LONG RUN

It’s commonly said by proponents of the standard solution that when the host opens a door, he gives away information that the contestant can use to increase his chance of winning that game. One nonsensical version of this explanation goes like this:

• There’s a 2/3 probability that \$ is behind one of the two doors not chosen initially by the contestant.
• When the host opens a door to reveal x, that 2/3 “collapses” onto the other door that wasn’t chosen initially. (Ooh … a “collapsing” probability. How exotic. Just like Schrödinger’s cat.)

Of course, the host’s action gives away nothing in the context of a single game, the outcome of which is unpredictable. The host’s action does help in the long run, if you’re in a position to play or bet on a large number of games. Here’s how:

• The contestant’s initial choice (IC) will be wrong 2/3 of the time. That is, in 2/3 of a large number of games, the \$ will be behind one of the other two doors.
• Because of the rules of the game, the host must open one of those other two doors (HC1 and HC2); he can’t open IC.
• When IC hides an x (which happens 2/3 of the time), either HC1 or HC2 must conceal the \$; the one that doesn’t conceal the \$ conceals an x.
• The rules require the host to open the door that conceals an x.
• Therefore, about 2/3 of the time the \$ will be behind HC1 or HC2, and in those cases it will always be behind the door (HC1 or HC2) that the host doesn’t open.
• It follows that the contestant, by consistently switching from IC to the remaining unopened door (HC1 or HC2), will win the \$ about 2/3 of the time.

The host’s action transforms the probability — the long-run frequency — of choosing the winning door from 1/2 to 2/3. But it does so if and only if the player or bettor always switches from IC to HC1 or HC2 (whichever one remains unopened).

You can visualize the steps outlined above by looking at the earlier diagram of possible outcomes.

That’s all there is. There isn’t any more.

# Understanding the Monty Hall Problem

I have extensively revised, expanded, and republished this post as “The Compleat Monty Hall Problem.” Please go there.

I first encountered the Monty Hall problem a few years ago. My intuitive answer to the problem was wrong, according to the explanation of the problem that I read at the time. I forgot about the problem until a few days ago, when I gave the answer I had given before and read, again, that I had answered incorrectly. With two strikes, I began to focus on the problem, in an effort to understand why my intuitive answer is incorrect, and why the “correct” answer is indeed correct.

What is the Monty Hall problem? It’s a probability puzzle named for the original host of an old TV game show, Let’s Make a Deal. The puzzle acquired its name because the setup of the puzzle is similar to one of the challenges faced by contestants who appeared on the show. The puzzle goes like this:

1. There are three doors: 1, 2, and 3.

2. Behind one of the doors is an item of some value, which I denote with \$\$.

3. Behind each of the other two doors are items of little or no value, which I denote with ε (epsilon, often used in scientific notation to represent an error term).

4. The contestant chooses one of the doors, but the item behind it is not revealed (yet).

5. The host, who knows what is behind each of the doors, opens a door other than the one chosen by the contestant. The host always opens a door to reveal an ε. He is able to do so because no matter which door the contestant chooses, there is at least one other door that conceals an ε.

6. The host then asks the contestant if he wants to stick with the door he has chosen and accept the prize hidden behind it, or if he wants to switch to the other unopened door and accept the prize hidden behind that door. The host asks the question regardless of what is behind the door that the contestant chose originally; the question is not asked for the purpose of enticing the contestant to abandon a choice that would yield \$\$.

This is where the intuitive answer fails. The intuitive answer is to stick with the door already chosen. Why? It seems that the usual assumption is that there is an equal probability of finding \$\$ behind any door, so that switching won’t necessarily lead to a better outcome. Nor (if the usual assumption is correct) would switching lead to a worse outcome, so the contestant might as well switch. But there seems to be a psychological preference for standing pat.

Unlike many situations in which snap judgments are necessary and often effective — because they are based on training, practice, and the ability to pick up subtle clues (e.g., the spin on a pitched ball, the moment at which an opponent is vulnerable to counterattack) — the usual snap judgment about the Monty Hall problem is wrong. It is wrong because the assumption about probabilities is wrong. This becomes obvious when the problem is modeled:

The usual intuitive answer focuses on one possible outcome (A, for instance), to the exclusion of other possible outcomes (B and C). And the usual intuitive answer would be correct if A were the only possible outcome. But A is not the only possible outcome. The contestant’s choice of Door 1 does not preclude B and C as possible outcomes.

Given that B and C are also possible outcomes, the probability that \$\$ is behind Door 1 is 1/3. And that probability remains unchanged by the host’s revelation.

Further, given that B and C are also possible outcomes, the contestant’s real choice is between Door 1 (in this example) and whichever door the host does not open — not between Door 1 and Door 2 or Door 1 and Door 3. A look at the figure above tells you that if the contestant chooses whichever door the host does not open (either Door 2 or Door 3), the probability of revealing \$\$ is 2/3 (two of the three possible outcomes). (See this video for an elegantly simple explanation of the same principle.)

In general: The array of possible outcomes (A, B, and C) is always the same (even if displayed in a different order), so it does not matter whether the contestant’s initial choice is Door 1, Door 2, or Door 3. The probabilities associated with the contestant’s initial and second choices remain unchanged. You can work it out for yourself.

The Monty Hall problem is the kind of problem that warrants careful thought instead of a snap judgment. The Monty Hall problem should be a lesson to anyone who is confronted with a situation in which a snap judgment is unnecessary. The snap judgment can lead you badly astray.