Expressing Certainty (or Uncertainty)

I have waged war on the misuse of probability for a long time. As I say in the post at the link:

A probability is a statement about a very large number of like events, each of which has an unpredictable (random) outcome. Probability, properly understood, says nothing about the outcome of an individual event. It certainly says nothing about what will happen next.

From a later post:

It is a logical fallacy to ascribe a probability to a single event. A probability represents the observed or computed average value of a very large number of like events. A single event cannot possess that average value. A single event has a finite number of discrete and mutually exclusive outcomes. Those outcomes will not “average out” — only one of them will obtain, like Schrödinger’s cat.

To say that the outcomes will average out — which is what a probability implies — is tantamount to saying that Jack Sprat and his wife were neither skinny nor fat because their body-mass indices averaged to a normal value. It is tantamount to saying that one can’t drown by walking across a pond with an average depth of 1 foot, when that average conceals the existence of a 100-foot-deep hole.
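The pond example can be put into numbers with a few lines of arithmetic. The sketch below is illustrative only: the 199 equal-area soundings and the names `depths_ft`, `average_depth`, and `deepest_point` are assumptions invented for the example, not measurements of any real pond.

```python
# Hypothetical pond: 199 equal-area depth soundings, in feet.
# 198 spots are half a foot deep and one spot is a 100-foot hole.
depths_ft = [0.5] * 198 + [100]

average_depth = sum(depths_ft) / len(depths_ft)  # the summary figure
deepest_point = max(depths_ft)                   # what the summary conceals

print(f"average depth: {average_depth} ft")
print(f"deepest point: {deepest_point} ft")
```

The average describes the pond as a whole; it does not describe the depth at the one spot where a particular walker happens to step.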

But what about hedge words that imply “probability” without saying it: certain, uncertain, likely, unlikely, confident, not confident, sure, unsure, and the like? I admit to using such words, which are common in discussions about possible future events and the causes of past events. But what do I, and presumably others, mean by them?

Hedge words are statements about the validity of hypotheses about phenomena or causal relationships. There are two ways of looking at such hypotheses, frequentist and Bayesian:

While for the frequentist, a hypothesis is a proposition (which must be either true or false) so that the frequentist probability of a hypothesis is either 0 or 1, in Bayesian statistics, the probability that can be assigned to a hypothesis can also be in a range from 0 to 1 if the truth value is uncertain.

Further, as discussed above, there is no such thing as the probability of a single event. For example, the Mafia either did or didn’t have JFK killed, and that’s all there is to say about that. One might claim to be “certain” that the Mafia had JFK killed, but one can be certain only if one is in possession of incontrovertible evidence to that effect. But that certainty isn’t a probability, which can refer only to the frequency with which many events of the same kind have occurred and can be expected to occur.

A Bayesian view about the “probability” of the Mafia having JFK killed is nonsensical. Even if a Bayesian is certain, based on incontrovertible evidence, that the Mafia had JFK killed, there is no probability attached to the occurrence. It simply happened, and that’s that.

Lacking such evidence, a Bayesian (or an unwitting “man on the street”) might say “I believe there’s a 50-50 chance that the Mafia had JFK killed”. Does that mean (1) there’s some evidence to support the hypothesis, but it isn’t conclusive, or (2) the speaker would bet X amount of money, at even odds, that incontrovertible evidence, if any surfaces, will prove that the Mafia had JFK killed? In the first case, attaching a 50-percent probability to the hypothesis is nonsensical; how does the existence of some evidence translate into a statement about the probability of a one-off event that either occurred or didn’t occur? In the second case, the speaker’s willingness to bet on the occurrence of an event at certain odds tells us something about the speaker’s preference for risk-taking but nothing at all about whether or not the event occurred.

What about the familiar use of “probability” (a.k.a., “chance”) in weather forecasts? Here’s my take:

[W]hen you read or hear a statement like “the probability of rain tomorrow is 80 percent”, you should mentally translate it into language like this:

X guesses that Y will (or will not) happen at time Z, and the “probability” that he attaches to his guess indicates his degree of confidence in it.

The guess may be well-informed by systematic observation of relevant events, but it remains a guess, as most Americans have learned and relearned over the years when rain has failed to materialize or has spoiled an outdoor event that was supposed to be rain-free.

Further, it is true that some things happen more often than other things but

only one thing will happen at a given time and place.

[A] clever analyst could concoct a probability of a person’s being shot by writing an equation that includes such variables as his size, the speed with which he walks, the number of shooters, their rate of fire, and the distance across the shooting range.

What would the probability estimate mean? It would mean that if a very large number of persons walked across the shooting range under identical conditions, approximately S percent of them would be shot. But the clever analyst cannot specify which of the walkers would be among the S percent.

Here’s another way to look at it. One person wearing head-to-toe bullet-proof armor could walk across the range a large number of times and expect to be hit by a bullet on S percent of his crossings. But the hardy soul wouldn’t know on which of the crossings he would be hit.
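The frequency reading of S in the two preceding paragraphs can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not a model of any real shooting range: every crossing is treated as an independent draw with the same chance S of a hit (the "identical conditions" of the text), the value 0.75 is illustrative, and the function name `simulate_crossings` is hypothetical.

```python
import random


def simulate_crossings(s, crossings, seed=0):
    """Return one True/False outcome per crossing; True means 'hit'.

    Assumes each crossing is an independent draw with hit chance s.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [rng.random() < s for _ in range(crossings)]


S = 0.75  # illustrative value, assumed for this sketch
outcomes = simulate_crossings(S, 100_000)
share_hit = sum(outcomes) / len(outcomes)

print(f"share of crossings that ended in a hit: {share_hit:.3f}")
print(f"first five individual outcomes: {outcomes[:5]}")
```

Over many crossings the share of hits settles near S, yet every individual entry in `outcomes` is simply True or False; nothing in the list is "75 percent hit".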

Suppose the hardy soul became a foolhardy one and made a bet that he could cross the range without being hit. Further, suppose that S is estimated to be 0.75; that is, 75 percent of a string of walkers would be hit, or a single (bullet-proof) walker would be hit on 75 percent of his crossings. Knowing the value of S, the foolhardy fellow offers to pay out \$1 million (from himself or his estate) if he is shot and to claim \$3 million if he crosses the range unscathed, one time. That’s a break-even bet, isn’t it?

No it isn’t….

The bet should be understood for what it is, an either-or proposition. The foolhardy walker will either lose \$1 million or win \$3 million. The bettor (or bettors) who take the other side of the bet will either win \$1 million or lose \$3 million.

As anyone with elementary reading and reasoning skills should be able to tell, those possible outcomes are not the same as the outcome that would obtain (approximately) if the foolhardy fellow could walk across the shooting range 1,000 times. If he could, he would come very close to breaking even, as would those who bet against him.
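The contrast between one crossing and a long string of crossings can be sketched in code. This is a hedged illustration: the stakes used here (the walker wins 3 million dollars on an unscathed crossing and loses 1 million on a hit) are assumptions chosen so that the long-run average is zero when S = 0.75, each crossing is assumed to be an independent draw, and the function names `expected_per_crossing` and `simulate_bets` are hypothetical.

```python
import random


def expected_per_crossing(s, win, loss):
    """Long-run average result per crossing for the walker.

    He collects `win` with frequency (1 - s) and pays `loss` with frequency s.
    """
    return (1 - s) * win - s * loss


def simulate_bets(s, win, loss, crossings, seed=0):
    """Return the walker's result for each crossing: +win if unscathed, -loss if hit."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [-loss if rng.random() < s else win for _ in range(crossings)]


S = 0.75          # share of crossings that end in a hit
WIN = 3_000_000   # assumed payoff for an unscathed crossing
LOSS = 1_000_000  # assumed payout when hit

results = simulate_bets(S, WIN, LOSS, 1_000)

print("long-run average per crossing:", expected_per_crossing(S, WIN, LOSS))
print("average over 1,000 simulated crossings:", sum(results) / len(results))
print("result of the first crossing alone:", results[0])
```

The average over many crossings hovers around zero relative to the stakes, but the result of any single crossing is always one of the two amounts, never the average.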

I omitted from the preceding quotation a sentence in which I used “more likely”:

If a person walks across a shooting range where live ammunition is being used, he is more likely to be killed than if he walks across the same patch of ground when no one is shooting.

Inasmuch as “more likely” is a hedge word, I seem to have contradicted my own position about the probability of a single event, such as being shot while walking across a shooting range. In that context, however, “more likely” means that something could happen (getting shot) that wouldn’t happen in a different situation. That’s not really a probabilistic statement. It’s a statement about opportunity; thus:

• Crossing a firing range generates many opportunities to be shot.
• Going into a crime-ridden neighborhood certainly generates some opportunities to be shot, but their number and frequency depend on many variables: which neighborhood, where in the neighborhood, the time of day, who else is present, etc.
• Sitting by oneself, unarmed, in a heavy-gauge steel enclosure generates no opportunities to be shot.

The “chance” of being shot is, in turn, “more likely”, “likely”, and “unlikely” — or a similar ordinal pattern that uses “certain”, “confident”, “sure”, etc. But the ordinal pattern, in any case, can never (logically) include statements like “completely certain”, “completely confident”, etc.

An ordinal pattern is logically valid only if it conveys the relative number of opportunities to attain a given kind of outcome — being shot, in the example under discussion.

Ordinal statements about different types of outcome are meaningless. Consider, for example, the claim that the probability that the Mafia had JFK killed is higher than (or lower than or the same as) the probability that the Moon is made of green cheese. First, and to repeat myself for the nth time, the phenomena in question are one-of-a-kind and do not lend themselves to statements about their probability, nor even about the frequency of opportunities for the occurrence of the phenomena. Second, the use of “probability” is just a hifalutin way of saying that the Mafia could have had a hand in the killing of JFK, whereas it is known (based on ample scientific evidence, including eyewitness accounts) that the Moon isn’t made of green cheese. So the ordinal statement is just a cheap rhetorical trick that is meant to (somehow) support the subjective belief that the Mafia “must” have had a hand in the killing of JFK.

Similarly, it is meaningless to say that the “average person” is “more certain” of being killed in an auto accident than in a plane crash, even though one may have many opportunities to die in an auto accident or a plane crash. There is no “average person”; the incidence of auto travel and plane travel varies enormously from person to person; and the conditions that conduce to fatalities in auto travel and plane travel vary just as enormously.

Other examples abound. Be on the lookout for them, and avoid emulating them.