
Analysis for Government Decision-Making: Demi-Science, Hemi-Demi-Science, and Sophistry

Taking a “hard science” like classical mechanics as an epitome of science and, say, mechanical engineering as a rigorous application of it, one travels a goodly conceptual distance before arriving at operations research (OR). Philip M. Morse and George E. Kimball, pioneers of OR in World War II, put it this way:

[S]uccessful application of operations research usually results in improvements by factors of 3 or 10 or more…. In our first study of any operation we are looking for these large factors of possible improvement…. They can be discovered if the [variables] are given only one significant figure,…any greater accuracy simply adds unessential detail.

One might term this type of thinking “hemibel thinking.” A bel is defined as a unit in a logarithmic scale corresponding to a factor of 10. Consequently a hemibel corresponds to a factor of the square root of 10, or approximately 3. (Philip M. Morse and George E. Kimball, Methods of Operations Research, originally published as Operations Evaluation Group Report 54, 1946, p. 38)

This is science-speak for the following proposition: Where there is much variability in the particular circumstances of combat, there is much uncertainty about the contributions of various factors (human, mechanical, and meteorological) to the outcome of combat. It is therefore difficult to assign precise numerical values to the various factors.

OR, even in wartime, is therefore, and at best, a demi-science. From there, we descend to cost-effectiveness analysis and its constituent branches: techniques for designing and estimating the costs of systems that do not yet exist and the effectiveness of such systems in combat. These methods, taken separately and together, are (to coin a term) hemi-demi-scientific — a fact that the application of “rigorous” mathematical and statistical techniques cannot alter.

There is no need to elaborate on the wild inaccuracy of estimates about the costs and physical performance of government-owned and operated systems, whether they are intended for military or civilian use. The gross errors of estimation have been amply documented in the public press for decades.

What is less well known is the difficulty of predicting the performance of systems — especially combat systems — years before they are commanded, operated, and maintained by human beings, under conditions that are likely to be far different than those envisioned when the systems were first proposed. A paper that I wrote thirty years ago gives my view of the great uncertainty that surrounds estimates of the effectiveness of systems that have yet to be developed, or built, or used in combat:

Aside from a natural urge for certainty, faith in quantitative models of warfare springs from the experience of World War II, when they seemed to lead to more effective tactics and equipment. But the foundation of this success was not the quantitative methods themselves. Rather, it was the fact that the methods were applied in wartime. Morse and Kimball put it well:

Operations research done separately from an administrator in charge of operations becomes an empty exercise. To be valuable it must be toughened by the repeated impact of hard operational facts and pressing day-by-day demands, and its scale of values must be repeatedly tested in the acid of use. Otherwise it may be philosophy, but it is hardly science. [Methods of Operations Research, p. 10]

Contrast this attitude with the attempts of analysts … to evaluate weapons, forces, and strategies with abstract models of combat. However elegant and internally consistent the models, they have remained as untested and untestable as the postulates of theology.

There is, of course, no valid test to apply to a warfare model. In peacetime, there is no enemy; in wartime, the enemy’s actions cannot be controlled. Morse and Kimball, accordingly, urge “hemibel thinking”:

Having obtained the constants of the operations under study… we compare the value of the constants obtained in actual operations with the optimum theoretical value, if this can be computed. If the actual value is within a hemibel (…a factor of 3) of the theoretical value, then it is extremely unlikely that any improvement in the details of the operation will result in significant improvement. [When] there is a wide gap between the actual and theoretical results … a hint as to the possible means of improvement can usually be obtained by a crude sorting of the operational data to see whether changes in personnel, equipment, or tactics produce a significant change in the constants. [Ibid., p. 38]

….

Much as we would like to fold the many different parameters of a weapon, a force, or a strategy into a single number, we cannot. An analyst’s notion of which variables matter and how they interact is no substitute for data. Such data as exist, of course, represent observations of discrete events — usually peacetime events. It remains for the analyst to calibrate the observations, but without a benchmark to go by. Calibration by past battles is a method of reconstruction — of cutting one of several coats to fit a single form — but not a method of validation. Lacking pertinent data, an analyst is likely to resort to models of great complexity. Thus, if useful estimates of detection probabilities are unavailable, the detection process is modeled; if estimates of the outcomes of dogfights are unavailable, aerial combat is reduced to minutiae. Spurious accuracy replaces obvious inaccuracy; untestable hypotheses and unchecked calibrations multiply apace. Yet the analyst claims relative if not absolute accuracy, certifying that he has identified, measured, and properly linked, a priori, the parameters that differentiate weapons, forces, and strategies….

Should we really attach little significance to differences of less than a hemibel? Consider a five-parameter model, involving the conditional probabilities of detecting, shooting at, hitting, and killing an opponent — and surviving, in the first place, to do any of these things. Such a model might easily yield a cumulative error of a hemibel, given a twenty-five percent error in each parameter. My intuition is that one would be lucky if relative errors in the probabilities assigned to alternative weapons and forces were as low as twenty-five percent.
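
The arithmetic behind that last claim is easy to check. Here is a minimal sketch in Python (the twenty-five percent figure and the five-step kill chain come from the passage above; nothing else is meant to describe a real system) of how a uniform error in five multiplicative parameters compounds to roughly a hemibel:

    # Back-of-the-envelope check of the compounding-error claim in the quoted passage.
    # The 25 percent figure and the five conditional probabilities come from the text;
    # this is an illustration, not a model of any actual weapon system.

    HEMIBEL = 10 ** 0.5  # a factor of about 3.16

    def compounded_error(per_parameter_error, n_parameters):
        """Cumulative factor when every parameter errs by the same fraction in the same direction."""
        return (1 + per_parameter_error) ** n_parameters

    error = compounded_error(0.25, 5)                  # five probabilities, each off by 25 percent
    print(f"cumulative error factor: {error:.2f}")     # about 3.05
    print(f"one hemibel:             {HEMIBEL:.2f}")   # about 3.16

Modest errors in each link of the chain, in other words, are enough by themselves to swallow any difference smaller than a hemibel.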

The further one travels from an empirical question, such as the likely effectiveness of an extant weapon system under specific, quantifiable conditions, the more likely one is to encounter the kind of sophistry known as policy analysis. It is this kind of analysis that one encounters, more often than not, in the context of broad policy issues (e.g., government policy toward health care, energy, or defense spending). Such analysis is constructed so that it favors the prejudices of the analyst or his client, or supports the client’s political case for a certain policy.

Policy analysis often seems credible, especially on first hearing or reading it. But, on inspection, it is usually found to have at least two of these characteristics:

  • It stipulates or quickly arrives at a preferred policy, then marshals facts, calculations, and opinions that are selected because they support the preferred policy.
  • If it offers and assesses alternative policies, they are not placed on an equal footing with the preferred policy. They are, for example, assessed against criteria that favor the preferred policy, while other criteria (which might be important ones) are ignored or given short shrift.
  • It is wrapped in breathless prose, dripping with words and phrases like “aggressive action,” “grave consequences,” and “sense of urgency.”

No discipline or quantitative method is rigorous enough to redeem policy analysis, but two disciplines are especially suited to it: political “science” and macroeconomics. Both are couched in the language of real science, but both lend themselves perfectly to the old adage: garbage in, garbage out.

Do I mean to suggest that broad policy issues should not be addressed as analytically as possible? Not at all. What I mean to suggest is that because such issues cannot be illuminated with scientific rigor, they are especially fertile ground for sophists with preconceived positions.

In that respect, the model of cost-effectiveness analysis, with all of its limitations, is to be emulated. Put simply, the model is this: state a clear objective in a way that does not drive the answer; reveal the assumptions underlying the analysis; state the relevant variables (the factors that influence attainment of the objective); disclose fully the data, the sources of the data, and the analytic methods; and explore openly and candidly the effects of variations in key assumptions and critical variables.
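
To make that checklist concrete, here is a minimal sketch in Python. Every system name, cost, and probability below is hypothetical, invented only to show the structure of a disciplined analysis: the objective and assumptions are stated up front, and the most uncertain variable is swung through roughly a hemibel in each direction rather than fixed at a convenient point estimate.

    # Illustrative only: hypothetical systems, costs, and probabilities.
    # The point is the structure -- an explicit objective, explicit assumptions,
    # and an open exploration of how the answer moves when a key assumption moves.

    from dataclasses import dataclass

    @dataclass
    class Assumptions:
        unit_cost: float          # dollars per system (assumed)
        kill_probability: float   # per engagement (assumed, and highly uncertain)
        sorties_per_year: float   # utilization rate (assumed)

    def cost_per_expected_kill(a: Assumptions) -> float:
        """Objective, stated up front: minimize cost per expected kill per year."""
        return a.unit_cost / (a.kill_probability * a.sorties_per_year)

    candidates = {
        "System A": Assumptions(unit_cost=50e6, kill_probability=0.3, sorties_per_year=100),
        "System B": Assumptions(unit_cost=30e6, kill_probability=0.2, sorties_per_year=120),
    }

    # Explore variations openly: vary the least-certain variable by a factor of
    # roughly a hemibel in each direction instead of reporting a single number.
    for name, base in candidates.items():
        for factor in (1 / 3, 1.0, 3.0):
            varied = Assumptions(base.unit_cost,
                                 base.kill_probability * factor,
                                 base.sorties_per_year)
            print(f"{name}, kill probability x{factor:.2f}: "
                  f"${cost_per_expected_kill(varied):,.0f} per expected kill")

An analysis laid out this way can still be wrong, but it cannot easily hide where it might be wrong; that is the point of the discipline described above.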