Analysis vs. Reality

In my days as a defense analyst I often encountered military officers who were skeptical of the ability of civilian analysts to draw valid conclusions about the merits of systems and tactics from mathematical models. It took me several years to understand and agree with their position. My growing doubts about the power of quantitative analysis of military matters culminated in a paper in which I wrote that

combat is not a mathematical process…. One may describe the outcome of combat mathematically, but it is difficult, even after the fact, to determine the variables that made a difference in the outcome.

Much as we would like to fold the many different parameters of a weapon, a force, or a strategy into a single number, we can not. An analyst’s notion of which variables matter and how they interact is no substitute for data. Such data as exist, of course, represent observations of discrete events — usually peacetime events. It remains for the analyst to calibrate the observations, but without a benchmark to go by. Calibration by past battles is a method of reconstruction — of cutting one of several coats to fit a single form — but not a method of validation.

Lacking pertinent data, an analyst is likely to resort to models of great complexity. Thus, if useful estimates of detection probabilities are unavailable, the detection process is modeled; if estimates of the outcomes of dogfights are unavailable, aerial combat is reduced to minutiae. Spurious accuracy replaces obvious inaccuracy; untestable hypotheses and unchecked calibrations multiply apace. Yet the analyst claims relative if not absolute accuracy, certifying that he has identified, measured, and properly linked, a priori, the parameters that differentiate weapons, forces, and strategies.

In the end, “reasonableness” is the only defense of warfare models of any stripe.

It is ironic that analysts must fall back upon the appeal to intuition that has been denied to military men — whose intuition at least flows from a life-or-death incentive to make good guesses when choosing weapons, forces, or strategies.

My colleagues were not amused, to say the least.

I was reminded of all this by a recent exchange with a high-school classmate who had enlisted my help in tracking down a woman who, according to a genealogy website, is her first cousin, twice removed. The success of the venture is as yet uncertain. But if it does succeed it will be because of the classmate’s intimate knowledge of her family, not my command of research tools. As I said to my classmate,

You know a lot more about your family than I know about mine. I have all of the names and dates in my genealogy data bank, but I really don’t know much about their lives. After I moved to Virginia … I was out of the loop on family gossip, and my parents didn’t relate it to me. For example, when I visited my parents … for their 50th anniversary I happened to see a newspaper clipping about the death of my father’s sister a year earlier. It was news to me. And I didn’t learn of the death of my mother’s youngest brother (leaving her as the last of 10 children) until my sister happened to mention it to me a few years after he had died. And she didn’t know that I didn’t know.

All of which means that there’s a lot more to life than bare facts — dates of birth, death, etc. That’s why military people (with good reason) don’t trust analysts who draw conclusions about military weapons and tactics based on mathematical models. Those analysts don’t have a “feel” for how weapons and tactics actually work in the heat of battle, which is what matters.

Climate modelers are even more in the dark than military analysts: there is no “climate officer” with relevant experience who can set climate modelers straight — or (more wisely) ignore them.

(See also “Modeling Is Not Science”, “The McNamara Legacy: A Personal Perspective”, “Analysis for Government Decision-Making: Hemi-Science, Hemi-Demi-Science, and Sophistry”, “Analytical and Scientific Arrogance”, “Why I Don’t Believe in ‘Climate Change’”, and “Predicting ‘Global’ Temperatures — An Analogy with Baseball”.)

Analysis for Government Decision-Making: Demi-Science, Hemi-Demi-Science, and Sophistry

Taking a “hard science” like classical mechanics as an epitome of science and, say, mechanical engineering as a rigorous application of it, one travels a goodly conceptual distance before arriving at operations research (OR). Philip M. Morse and George E. Kimball, pioneers of OR in World War II, put it this way:

[S]uccessful application of operations research usually results in improvements by factors of 3 or 10 or more…. In our first study of any operation we are looking for these large factors of possible improvement…. They can be discovered if the [variables] are given only one significant figure,…any greater accuracy simply adds unessential detail.

One might term this type of thinking “hemibel thinking.” A bel is defined as a unit in a logarithmic scale corresponding to a factor of 10. Consequently a hemibel corresponds to a factor of the square root of 10, or approximately 3. (Philip M. Morse and George E. Kimball, Methods of Operations Research, originally published as Operations Evaluation Group Report 54, 1946, p. 38)

This is science-speak for the following proposition: Where there is much variability in the particular circumstances of combat, there is much uncertainty about the contributions of various factors (human, mechanical, and meteorological) to the outcome of combat. It is therefore difficult to assign precise numerical values to the various factors.
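
For concreteness, the hemibel arithmetic in the quoted passage amounts to this:

$$ 1\ \text{bel} = \text{a factor of } 10, \qquad 1\ \text{hemibel} = \tfrac{1}{2}\ \text{bel} = 10^{1/2} \approx 3.16 \approx 3. $$

In other words, Morse and Kimball treat estimates that differ by less than a factor of roughly 3 as showing no practically significant difference.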

OR, even in wartime, is therefore, and at best, a demi-science. From there, we descend to cost-effectiveness analysis and its constituent branches: techniques for designing and estimating the costs of systems that do not yet exist and the effectiveness of such systems in combat. These methods, taken separately and together, are (to coin a term) hemi-demi-scientific — a fact that the application of “rigorous” mathematical and statistical techniques cannot alter.

There is no need to elaborate on the wild inaccuracy of estimates about the costs and physical performance of government-owned and operated systems, whether they are intended for military or civilian use. The gross errors of estimation have been amply documented in the public press for decades.

What is less well known is the difficulty of predicting the performance of systems — especially combat systems — years before they are commanded, operated, and maintained by human beings, under conditions that are likely to be far different than those envisioned when the systems were first proposed. A paper that I wrote thirty years ago gives my view of the great uncertainty that surrounds estimates of the effectiveness of systems that have yet to be developed, or built, or used in combat:

Aside from a natural urge for certainty, faith in quantitative models of warfare springs from the experience of World War II, when they seemed to lead to more effective tactics and equipment. But the foundation of this success was not the quantitative methods themselves. Rather, it was the fact that the methods were applied in wartime. Morse and Kimball put it well:

Operations research done separately from an administrator in charge of operations becomes an empty exercise. To be valuable it must be toughened by the repeated impact of hard operational facts and pressing day-by-day demands, and its scale of values must be repeatedly tested in the acid of use. Otherwise it may be philosophy, but it is hardly science. [Methods of Operations Research, p. 10]

Contrast this attitude with the attempts of analysts … to evaluate weapons, forces, and strategies with abstract models of combat. However elegant and internally consistent the models, they have remained as untested and untestable as the postulates of theology.

There is, of course, no valid test to apply to a warfare model. In peacetime, there is no enemy; in wartime, the enemy’s actions cannot be controlled. Morse and Kimball, accordingly, urge “hemibel thinking”:

Having obtained the constants of the operations under study… we compare the value of the constants obtained in actual operations with the optimum theoretical value, if this can be computed. If the actual value is within a hemibel (…a factor of 3) of the theoretical value, then it is extremely unlikely that any improvement in the details of the operation will result in significant improvement. [When] there is a wide gap between the actual and theoretical results … a hint as to the possible means of improvement can usually be obtained by a crude sorting of the operational data to see whether changes in personnel, equipment, or tactics produce a significant change in the constants. [Ibid., p. 38]

….

Much as we would like to fold the many different parameters of a weapon, a force, or a strategy into a single number, we can not. An analyst’s notion of which variables matter and how they interact is no substitute for data. Such data as exist, of course, represent observations of discrete events — usually peacetime events. It remains for the analyst to calibrate the observations, but without a benchmark to go by. Calibration by past battles is a method of reconstruction — of cutting one of several coats to fit a single form — but not a method of validation. Lacking pertinent data, an analyst is likely to resort to models of great complexity. Thus, if useful estimates of detection probabilities are unavailable, the detection process is modeled; if estimates of the outcomes of dogfights are unavailable, aerial combat is reduced to minutiae. Spurious accuracy replaces obvious inaccuracy; untestable hypotheses and unchecked calibrations multiply apace. Yet the analyst claims relative if not absolute accuracy, certifying that he has identified, measured, and properly linked, a priori, the parameters that differentiate weapons, forces, and strategies….

Should we really attach little significance to differences of less than a hemibel? Consider a five-parameter model, involving the conditional probabilities of detecting, shooting at, hitting, and killing an opponent — and surviving, in the first place, to do any of these things. Such a model might easily yield a cumulative error of a hemibel, given a twenty-five percent error in each parameter. My intuition is that one would be lucky if relative errors in the probabilities assigned to alternative weapons and forces were as low as twenty-five percent.
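
The arithmetic behind that last claim bears spelling out. If each of the five conditional probabilities is mis-estimated by 25 percent in the same direction, the errors compound multiplicatively:

$$ (1.25)^5 \approx 3.05 \approx 10^{1/2} = 1\ \text{hemibel}. $$

A seemingly modest error in every parameter is thus enough, by itself, to reach the hemibel threshold below which, per Morse and Kimball, differences carry little significance.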

The further one travels from an empirical question, such as the likely effectiveness of an extant weapon system under specific, quantifiable conditions, the more likely one is to encounter the kind of sophistry known as policy analysis. This is the kind of analysis that one encounters, more often than not, in the context of broad policy issues (e.g., government policy toward health care, energy, or defense spending). Such analysis is constructed so that it favors the prejudices of the analyst or his client, or supports the client’s political case for a certain policy.

Policy analysis often seems credible, especially on first hearing or reading it. But, on inspection, it is usually found to have at least two of these characteristics:

  • It stipulates or quickly arrives at a preferred policy, then marshals facts, calculations, and opinions that are selected because they support the preferred policy.
  • If it offers and assesses alternative policies, they are not placed on an equal footing with the preferred policy. They are, for example, assessed against criteria that favor the preferred policy, while other criteria (which might be important ones) are ignored or given short shrift.
  • It is wrapped in breathless prose, dripping with words and phrases like “aggressive action,” “grave consequences,” and “sense of urgency.”

No discipline or quantitative method is rigorous enough to redeem policy analysis, but two disciplines are especially suited to it: political “science” and macroeconomics. Both are couched in the language of real science, but both lend themselves perfectly to the old adage: garbage in, garbage out.

Do I mean to suggest that broad policy issues should not be addressed as analytically as possible? Not at all. What I mean to suggest is that because such issues cannot be illuminated with scientific rigor, they are especially fertile ground for sophists with preconceived positions.

In that respect, the model of cost-effectiveness analysis, with all of its limitations, is to be emulated. Put simply, the method is to state a clear objective in a way that does not drive the answer; to reveal the assumptions underlying the analysis; to state the relevant variables (the factors that influence the attainment of the objective); to disclose fully the data, the sources of data, and the analytic methods; and to explore openly and candidly the effects of variations in key assumptions and critical variables.
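
To make that template concrete, here is a minimal, purely hypothetical sketch in Python. The system, the numbers, and the parameter names are placeholders of my own invention, not estimates of anything; the point is only the shape of the exercise: a stated objective, assumptions laid out where they can be inspected, and an open sweep of the critical variables rather than a single confident number.

```python
"""A bare-bones, hypothetical sketch of the cost-effectiveness template
described above: a clearly stated objective, explicit assumptions, and an
open sweep of the critical variables. All values are placeholders."""
from itertools import product

# Objective, stated up front and not framed to force a particular answer:
# estimate the cost per expected kill of a notional weapon system.

# Explicit assumptions, each one visible and easy to vary.
BASELINE = {
    "unit_cost": 10.0e6,           # dollars per system (placeholder)
    "p_detect": 0.8,               # probability of detecting a target
    "p_kill_given_detect": 0.5,    # probability of a kill, given detection
    "sorties_per_system": 100,     # sorties over the system's life
}

def cost_per_kill(unit_cost, p_detect, p_kill_given_detect, sorties_per_system):
    """Cost per expected kill under the stated assumptions."""
    expected_kills = sorties_per_system * p_detect * p_kill_given_detect
    return unit_cost / expected_kills

# Vary each critical probability by a factor of 3 (a hemibel) in either
# direction, clipping probabilities at 1.0, and report every excursion
# rather than a single number.
FACTORS = (1 / 3, 1.0, 3.0)

for f_detect, f_kill in product(FACTORS, FACTORS):
    excursion = dict(BASELINE)
    excursion["p_detect"] = min(1.0, BASELINE["p_detect"] * f_detect)
    excursion["p_kill_given_detect"] = min(1.0, BASELINE["p_kill_given_detect"] * f_kill)
    print(f"detect x{f_detect:.2f}, kill x{f_kill:.2f}: "
          f"${cost_per_kill(**excursion):,.0f} per expected kill")
```

The output of such a sweep is not an “answer” but a picture of how strongly the answer depends on assumptions that cannot be validated, which is exactly the kind of disclosure the preceding paragraph calls for.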