Modeling Is Not Science: Another Demonstration

The title of this post is an allusion to an earlier one: “Modeling Is Not Science”. This post addresses a model that is the antithesis of science. It seems to have been extracted from the ether. It doesn’t prove what its authors claim for it. It proves nothing, in fact, but the ability of some people to dazzle other people with mathematics.

In this case, a writer for MIT Technology Review waxes enthusiastic about

the work of Alessandro Pluchino at the University of Catania in Italy and a couple of colleagues. These guys [sic] have created a computer model of human talent and the way people use it to exploit opportunities in life. The model allows the team to study the role of chance in this process.

The results are something of an eye-opener. Their simulations accurately reproduce the wealth distribution in the real world. But the wealthiest individuals are not the most talented (although they must have a certain level of talent). They are the luckiest. And this has significant implications for the way societies can optimize the returns they get for investments in everything from business to science.

Pluchino and co’s [sic] model is straightforward. It consists of N people, each with a certain level of talent (skill, intelligence, ability, and so on). This talent is distributed normally around some average level, with some standard deviation. So some people are more talented than average and some are less so, but nobody is orders of magnitude more talented than anybody else….

The computer model charts each individual through a working life of 40 years. During this time, the individuals experience lucky events that they can exploit to increase their wealth if they are talented enough.

However, they also experience unlucky events that reduce their wealth. These events occur at random.

At the end of the 40 years, Pluchino and co rank the individuals by wealth and study the characteristics of the most successful. They also calculate the wealth distribution. They then repeat the simulation many times to check the robustness of the outcome.

When the team rank individuals by wealth, the distribution is exactly like that seen in real-world societies. “The ‘80-20’ rule is respected, since 80 percent of the population owns only 20 percent of the total capital, while the remaining 20 percent owns 80 percent of the same capital,” report Pluchino and co.

That may not be surprising or unfair if the wealthiest 20 percent turn out to be the most talented. But that isn’t what happens. The wealthiest individuals are typically not the most talented or anywhere near it. “The maximum success never coincides with the maximum talent, and vice-versa,” say the researchers.

So if not talent, what other factor causes this skewed wealth distribution? “Our simulation clearly shows that such a factor is just pure luck,” say Pluchino and co.

The team shows this by ranking individuals according to the number of lucky and unlucky events they experience throughout their 40-year careers. “It is evident that the most successful individuals are also the luckiest ones,” they say. “And the less successful individuals are also the unluckiest ones.”

The writer, who is dazzled by pseudo-science, gives away his Obamanomic bias (“you didn’t build that“) by invoking fairness. Luck and fairness have nothing to do with each other. Luck is luck, and it doesn’t make the beneficiary any less deserving of the talent, or legally obtained income or wealth, that comes his way.

In any event, the model in question is junk. To call it junk science would be to imply that it’s just bad science. But it isn’t science; it’s a model pulled out of thin air. The modelers admit this in the article cited by the Technology Review writer, “Talent vs. Luck, the Role of Randomness in Success and Failure“:

In what follows we propose an agent-based model, called “Talent vs Luck” (TvL) model, which builds on a small set of very simple assumptions, aiming to describe the evolution of careers of a group of people influenced by lucky or unlucky random events.

We consider N individuals, with talent Ti (intelligence, skills, ability, etc.) normally distributed in the interval [0, 1] around a given mean mT with a standard deviation σT, randomly placed in fixed positions within a square world (see Figure 1) with periodic boundary conditions (i.e. with a toroidal topology) and surrounded by a certain number NE of “moving” events (indicated by dots), someone lucky, someone else unlucky (neutral events are not considered in the model, since they have not relevant effects on the individual life). In Figure 1 we report these events as colored points: lucky ones, in green and with relative percentage pL, and unlucky ones, in red and with percentage (100 − pL). The total number of event-points NE are uniformly distributed, but of course such a distribution would be perfectly uniform only for NE → ∞. In our simulations, typically will be NE ≈ N/2: thus, at the beginning of each simulation, there will be a greater random concentration of lucky or unlucky event-points in different areas of the world, while other areas will be more neutral. The further random movement of the points inside the square lattice, the world, does not change this fundamental feature of the model, which exposes different individuals to different amount of lucky or unlucky events during their life, regardless of their own talent.

In other words, this is a simplistic, completely abstract model set in a simplistic, completely abstract world, using only the authors’ assumptions about the values of a small number of abstract variables and the effects of their interactions. Those variables are “talent” and two kinds of event: “lucky” and “unlucky”.
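To see just how little machinery is involved, here is a minimal sketch, in Python, of a TvL-style simulation, based only on the excerpts quoted above. The capital-doubling and capital-halving rules, the starting capital, and every parameter value are assumptions of my own for illustration; they are not the authors' exact specification.

import numpy as np

rng = np.random.default_rng(0)
N = 1000                                            # individuals
talent = np.clip(rng.normal(0.6, 0.1, N), 0, 1)     # "talent": normal, clipped to [0, 1]
capital = np.full(N, 10.0)                          # everyone starts with the same capital
for year in range(40):                              # a 40-year working life
    hit = rng.random(N) < 0.5                       # assumed chance of encountering an event this year
    lucky = rng.random(N) < 0.5                     # half of the events are lucky, half unlucky
    exploited = rng.random(N) < talent              # a lucky event pays off only if talent "wins" the draw
    capital = np.where(hit & lucky & exploited, capital * 2, capital)
    capital = np.where(hit & ~lucky, capital / 2, capital)
top_20_share = np.sort(capital)[-N // 5:].sum() / capital.sum()
print(f"Share of capital held by the top 20 percent: {top_20_share:.0%}")

Even this toy version yields a heavily skewed distribution of capital; skew falls out of multiplicative luck almost automatically.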

What could be further from science — actual knowledge — than that? The authors effectively admit the model’s complete lack of realism when they describe “talent”:

[B]y the term “talent” we broadly mean intelligence, skill, smartness, stubbornness, determination, hard work, risk taking and so on.

Think of all of the ways that those various — and critical — attributes vary from person to person. “Talent”, in other words, subsumes an array of mostly unmeasured and unmeasurable attributes, without distinguishing among them or attempting to weight them. The authors might as well have called the variable “sex appeal” or “body odor”. For that matter, given the complete abstractness of the model, they might as well have called its three variables “body mass index”, “elevation”, and “race”.

It’s obvious that the model doesn’t account for the actual means by which wealth is acquired. In the model, wealth is just the mathematical result of simulated interactions among an arbitrarily named set of variables. It’s not even a multiple regression model based on statistics. (Although no set of statistics could capture the authors’ broad conception of “talent”.)

The modelers seem surprised that wealth isn’t normally distributed. It wouldn’t be a surprise if they considered that wealth represents a compounding effect, which naturally favors those with higher incomes over those with lower incomes. But they don’t even try to model income.
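That compounding point is easy to demonstrate. Here is a minimal sketch, in Python, under assumptions of my own (a normal distribution of incomes, a uniform 10 percent savings rate, and noisy annual returns averaging 5 percent); none of these numbers comes from the TvL paper or from real data.

import numpy as np

rng = np.random.default_rng(1)
n = 100_000
income = np.clip(rng.normal(50_000, 12_000, n), 10_000, None)   # assumed normal incomes
wealth = np.zeros(n)
for year in range(40):
    wealth += 0.10 * income                         # everyone saves the same share of income
    wealth *= 1 + rng.normal(0.05, 0.15, n)         # savings compound at noisy annual returns
print(f"mean wealth:   {wealth.mean():12,.0f}")
print(f"median wealth: {np.median(wealth):12,.0f}")

Even though incomes and returns are symmetric, compounding makes the resulting wealth distribution right-skewed: the mean runs well ahead of the median, and the biggest accumulations belong to those with both higher incomes and luckier sequences of returns.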

So when wealth (as modeled) doesn’t align with “talent”, the discrepancy — according to the modelers — must be assigned to “luck”. But a model that lacks any nuance in its definition of variables, any empirical estimates of their values, and any explanation of the relationship between income and wealth cannot possibly tell us anything about the role of luck in the determination of wealth.

At any rate, it is meaningless to say that the model is valid because its results mimic the distribution of wealth in the real world. The model itself is meaningless, so any resemblance between its results and the real world is coincidental (“lucky”) or, more likely, contrived. On that score, the authors are suitably vague about the actual distribution of wealth, pointing instead to various estimates.

(See also “Modeling, Science, and Physics Envy” and “Modeling Revisited“.)

Macroeconomic Modeling Revisited

Modeling is not science. Take Professor Ray Fair, for example. He teaches macroeconomic theory, econometrics, and macroeconometric models at Yale University. He has been plying his trade since 1968, first at Princeton, then at M.I.T., and (since 1974) at Yale. Those are big-name schools, so I assume that Prof. Fair is a big name in his field.

Well, since 1983 Professor Fair has been forecasting changes in real GDP four quarters ahead. He has made dozens of forecasts based on a model that he has tweaked many times over the years. The current model can be found here. His forecasting track record is here.

How has he done? Here’s how:

1. The mean absolute error of his forecasts is 70 percent; that is, on average his predictions miss the actual rate of growth by 70 percent of that rate. (The arithmetic is sketched just after this list.)

2. The median absolute error of his forecasts is 33 percent.

3. His forecasts are systematically biased: too high when real four-quarter GDP growth is less than 3 percent; too low when real four-quarter GDP growth is greater than 3 percent. (See figure 1.)

4. His forecasts have grown generally worse — not better — with time. (See figure 2.)

5. In sum, the overall predictive value of the model is weak. (See figures 3 and 4.)
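The error figures in items 1 and 2 are absolute percentage errors: the gap between the predicted and the actual growth rate, taken as a fraction of the actual rate. Here is a sketch of that arithmetic, in Python, with placeholder numbers rather than Fair's series (his data are in Table 4, linked just below):

import numpy as np

predicted = np.array([3.1, 2.5, 4.0, 1.8])   # placeholder four-quarter-ahead forecasts, percent
actual    = np.array([2.0, 3.0, 2.5, 3.6])   # placeholder realized four-quarter growth, percent
abs_pct_error = np.abs(predicted - actual) / np.abs(actual)
print(f"mean absolute error:   {abs_pct_error.mean():.0%}")
print(f"median absolute error: {np.median(abs_pct_error):.0%}")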

FIGURE 1

Figures 1-4 are derived from The Forecasting Record of the U.S. Model, Table 4: Predicted and Actual Values for Four-Quarter Real Growth, at Fair’s website.

FIGURE 2

FIGURE 3

FIGURE 4

Given the foregoing, you might think that Fair’s record reflects the persistent use of a model that’s too simple to capture the dynamics of a multi-trillion-dollar economy. But you’d be wrong. The model changes quarterly. This page lists changes only since late 2009; there are links to archives of earlier versions, but those are password-protected.

As for simplicity, the model is anything but simple. For example, go to Appendix A: The U.S. Model: July 29, 2016, and you’ll find a six-sector model comprising 188 equations and hundreds of variables.

Could I do better? Well, I’ve done better, with the simple model that I devised to estimate the Rahn Curve. It’s described in “The Rahn Curve in Action“, which is part III of “Economic Growth Since World War II“.

The theory behind the Rahn Curve is simple — but not simplistic. A relatively small government with powers limited mainly to the protection of citizens and their property is worth more than its cost to taxpayers because it fosters productive economic activity (not to mention liberty). But additional government spending hinders productive activity in many ways, which are discussed in Daniel Mitchell’s paper, “The Impact of Government Spending on Economic Growth.” (I would add to Mitchell’s list the burden of regulatory activity, which grows even when government does not.)

What does the Rahn Curve look like? Mitchell estimates this relationship between government spending and economic growth:

[Figure: Mitchell's estimate of the Rahn Curve]

The curve is dashed rather than solid at low values of government spending because it has been decades since the governments of developed nations have spent as little as 20 percent of GDP. But as Mitchell and others note, the combined spending of governments in the U.S. was 10 percent of GDP (and less) until the eve of the Great Depression. And it was in the low-spending, laissez-faire era from the end of the Civil War to the early 1900s that the U.S. enjoyed its highest sustained rate of economic growth.

Elsewhere, I estimated the Rahn curve that spans most of the history of the United States. I came up with this relationship (terms modified for simplicity, with a slight cosmetic change in terminology):

Yg = 0.054 - 0.066F

To be precise, it’s the annualized rate of growth over the most recent 10-year span (Yg), as a function of F (fraction of GDP spent by governments at all levels) in the preceding 10 years. The relationship is lagged because it takes time for government spending (and related regulatory activities) to wreak their counterproductive effects on economic activity. Also, I include transfer payments (e.g., Social Security) in my measure of F because there’s no essential difference between transfer payments and many other kinds of government spending. They all take money from those who produce and give it to those who don’t (e.g., government employees engaged in paper-shuffling, unproductive social-engineering schemes, and counterproductive regulatory activities).
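A quick illustration of what the equation implies, using round values of F chosen only for arithmetic convenience:

At F = 0.1 (the minimal-government level discussed just below): Yg = 0.054 - 0.066 × 0.1 = 0.047, or about 4.7 percent annual growth.

At F = 0.4: Yg = 0.054 - 0.066 × 0.4 = 0.028, or about 2.8 percent annual growth.

In other words, every 10-percentage-point rise in the government share of GDP shaves roughly two-thirds of a percentage point off the long-run growth rate.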

When F is greater than the amount needed for national defense and domestic justice — no more than 0.1 (10 percent of GDP) — it discourages productive, growth-producing, job-creating activity. And because government spending weighs most heavily on taxpayers with above-average incomes, higher rates of F also discourage saving, which finances growth-producing investments in new businesses, business expansion, and capital (i.e., new and more productive business assets, both physical and intellectual).

I’ve taken a closer look at the post-World War II numbers because of the marked decline in the rate of growth since the end of the war (Figure 2).

Here’s the revised result, which accounts for more variables:

Yg = 0.0275 - 0.340F + 0.0773A - 0.000336R - 0.131P

Where,

Yg = real rate of GDP growth in a 10-year span (annualized)

F = fraction of GDP spent by governments at all levels during the preceding 10 years

A = the constant-dollar value of private nonresidential assets (business assets) as a fraction of GDP, averaged over the preceding 10 years

R = average number of Federal Register pages, in thousands, for the preceding 10-year period

P = growth in the CPI-U during the preceding 10 years (annualized).

The r-squared of the equation is 0.74, and the significance (p-value) of the F-statistic is 1.60E-13. The p-values of the intercept and coefficients are 0.093, 3.98E-08, 4.83E-09, 6.05E-07, and 0.0071. The standard error of the estimate is 0.0049, that is, about half a percentage point.
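For concreteness, here is a small sketch, in Python, of how the fitted equation is applied. The inputs in the example are arbitrary placeholders chosen only to show the arithmetic; they are not the historical 2008-2017 values used for the forecast below.

def predicted_growth(F, A, R, P):
    # Annualized 10-year real GDP growth implied by the fitted equation
    return 0.0275 - 0.340 * F + 0.0773 * A - 0.000336 * R - 0.131 * P

# Arbitrary placeholder inputs: F = 0.35 (35 percent of GDP), A = 2.0,
# R = 70 (thousand Federal Register pages), P = 0.02 (2 percent inflation)
print(f"implied 10-year growth rate: {predicted_growth(0.35, 2.0, 70, 0.02):.1%}")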

Here’s how the equation stacks up against actual 10-year rates of real GDP growth:

What does the new equation portend for the next 10 years? Based on the values of F, A, R, and P for 2008-2017, the real rate of growth for the next 10 years will be about 2.0 percent.

There are signs of hope, however. The year-over-year rates of real growth in the four most recent quarters (2017Q4 – 2018Q3) were 2.4, 2.6, 2.9, and 3.0 percent, as against the dismal rates of 1.4, 1.2, 1.5, and 1.8 percent for the four quarters of 2016, Obama’s final year in office. A possible explanation is the election of Donald Trump and the well-founded belief that his tax and regulatory policies would be more business-friendly.

I took the data set that I used to estimate the new equation and made a series of out-of-sample estimates of growth over the next 10 years. I began with the data for 1946-1964 to estimate the growth for 1965-1974. I continued by taking the data for 1946-1965 to estimate the growth for 1966-1975, and so on, until I had estimated the growth for every 10-year period from 1965-1974 through 2008-2017. In other words, like Professor Fair, I updated my model to reflect new data, and I estimated the rate of economic growth in the future. (A schematic sketch of the procedure appears below.) How did I do? Here’s a first look:

FIGURE 5

For ease of comparison, I made the scale of the vertical axis of figure 5 the same as the scale of the vertical axis of figure 2. It’s obvious that my estimate of the Rahn Curve does a much better job of predicting the real rate of GDP growth than does Fair’s model.
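As promised, here is a schematic sketch, in Python, of the expanding-window procedure behind figure 5. The data-frame layout and column names are assumptions of my own: one row per 10-year growth span, indexed by the span's final year, holding the realized growth Yg and the lagged regressors F, A, R, and P.

import pandas as pd
import statsmodels.api as sm

def expanding_window_forecasts(df, first_train_end=1964, last_span_end=2017):
    # Re-fit the equation on all data through 'train_end', then predict the
    # 10-year span ending ten years later (e.g., 1946-1964 data -> 1965-1974 growth).
    forecasts = {}
    for train_end in range(first_train_end, last_span_end - 9):
        train = df.loc[:train_end]
        X = sm.add_constant(train[["F", "A", "R", "P"]])
        fit = sm.OLS(train["Yg"], X).fit()
        x_new = sm.add_constant(df.loc[[train_end + 10], ["F", "A", "R", "P"]],
                                has_constant="add")
        forecasts[train_end + 10] = fit.predict(x_new).item()
    return pd.Series(forecasts, name="predicted_Yg")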

Not only that, but my model is less biased:

FIGURE 6

The systematic bias reflected in figure 6 is far weaker than the systematic bias in Fair’s estimates (figure 1).

Finally, unlike Fair’s model (figure 4), my model captures the downward trend in the rate of real growth:

FIGURE 7

The moral of the story: It’s futile to build complex models of the economy. They can’t begin to capture the economy’s real complexity, and they’re likely to obscure the important variables — the ones that will determine the future course of economic growth.

A final note: Elsewhere (e.g., here) I’ve disparaged economic aggregates, of which GDP is the apotheosis. And yet I’ve built this post around estimates of GDP. Am I contradicting myself? Not really. There’s a rough consistency in measures of GDP across time, and I’m not pretending that GDP represents anything but an estimate of the monetary value of those products and services to which monetary values can be ascribed.

As a practical matter, then, if you want to know the likely future direction and value of GDP, stick with simple estimation techniques like the one I’ve demonstrated here. Don’t get bogged down in the inconclusive minutiae of a model like Professor Fair’s.

The Pretence of Knowledge

Updated, with links to a related article and additional posts, and republished.

Friedrich Hayek, in his Nobel Prize lecture of 1974, “The Pretence of Knowledge,” observes that

the great and rapid advance of the physical sciences took place in fields where it proved that explanation and prediction could be based on laws which accounted for the observed phenomena as functions of comparatively few variables.

Hayek’s particular target was the scientism then (and still) rampant in economics. In particular, there was (and is) a quasi-religious belief in the power of central planning (e.g., regulation, “stimulus” spending, control of the money supply) to attain outcomes superior to those that free markets would yield.

But, as Hayek says in closing,

There is danger in the exuberant feeling of ever growing power which the advance of the physical sciences has engendered and which tempts man to try, “dizzy with success” … to subject not only our natural but also our human environment to the control of a human will. The recognition of the insuperable limits to his knowledge ought indeed to teach the student of society a lesson of humility which should guard him against becoming an accomplice in men’s fatal striving to control society – a striving which makes him not only a tyrant over his fellows, but which may well make him the destroyer of a civilization which no brain has designed but which has grown from the free efforts of millions of individuals.

I was reminded of Hayek’s observations by John Cochrane’s post, “Groundhog Day” (The Grumpy Economist, May 11, 2014), wherein Cochrane presents this graph:

[Figure: The Fed's forecasting models are broken]

Cochrane adds:

Every serious forecast looked like this — Fed, yes, but also CBO, private forecasters, and the term structure of forward rates. Everyone has expected bounce-back growth and rise in interest rates to start next year, for the last 6 years. And every year it has not happened. Welcome to the slump. Every year, Sonny and Cher wake us up, and it’s still cold, and it’s still grey. But we keep expecting spring tomorrow.

Whether the corrosive effects of government microeconomic and regulatory policy, or a failure of those (unprintable adjectives) Republicans to just vote enough wasted-spending Keynesian stimulus, or a failure of the Fed to buy another $3 trillion of bonds, the question of the day really should be why we have this slump — which, let us be honest, no serious forecaster expected.

(I add the “serious forecaster” qualification on purpose. I don’t want to hear randomly mined quotes from bloviating prognosticators who got lucky once, and don’t offer a methodology or a track record for their forecasts.)

The Fed’s forecasting models are nothing more than sophisticated charlatanism — a term that Hayek applied to pseudo-scientific endeavors like macroeconomic modeling. Nor is charlatanism confined to economics and the other social “sciences.” It’s rampant in climate “science,” as Roy Spencer has shown. Consider, for example, this graph from Spencer’s post, “95% of Climate Models Agree: The Observations Must Be Wrong” (Roy Spencer, Ph.D., February 7, 2014):

[Figure: 95% of Climate Models Agree: The Observations Must Be Wrong]

Spencer has a lot more to say about the pseudo-scientific aspects of climate “science.” This example is from “Top Ten Good Skeptical Arguments” (May 1, 2014):

1) No Recent Warming. If global warming science is so “settled”, why did global warming stop over 15 years ago (in most temperature datasets), contrary to all “consensus” predictions?

2) Natural or Manmade? If we don’t know how much of the warming in the longer term (say last 50 years) is natural, then how can we know how much is manmade?

3) IPCC Politics and Beliefs. Why does it take a political body (the IPCC) to tell us what scientists “believe”? And when did scientists’ “beliefs” translate into proof? And when was scientific truth determined by a vote…especially when those allowed to vote are from the Global Warming Believers Party?

4) Climate Models Can’t Even Hindcast. How did climate modelers, who already knew the answer, still fail to explain the lack of a significant temperature rise over the last 30+ years? In other words, how do you botch a hindcast?

5) …But We Should Believe Model Forecasts? Why should we believe model predictions of the future, when they can’t even explain the past?

6) Modelers Lie About Their “Physics”. Why do modelers insist their models are based upon established physics, but then hide the fact that the strong warming their models produce is actually based upon very uncertain “fudge factor” tuning?

7) Is Warming Even Bad? Who decided that a small amount of warming is necessarily a bad thing?

8) Is CO2 Bad? How did carbon dioxide, necessary for life on Earth and only 4 parts in 10,000 of our atmosphere, get rebranded as some sort of dangerous gas?

9) Do We Look that Stupid? How do scientists expect to be taken seriously when their “theory” is supported by both floods AND droughts? Too much snow AND too little snow?

10) Selective Pseudo-Explanations. How can scientists claim that the Medieval Warm Period (which lasted hundreds of years), was just a regional fluke…yet claim the single-summer (2003) heat wave in Europe had global significance?

11) (Spinal Tap bonus) Just How Warm is it, Really? Why is it that every subsequent modification/adjustment to the global thermometer data leads to even more warming? What are the chances of that? Either a warmer-still present, or cooling down the past, both of which produce a greater warming trend over time. And none of the adjustments take out a gradual urban heat island (UHI) warming around thermometer sites, which likely exists at virtually all of them — because no one yet knows a good way to do that.

It is no coincidence that leftists believe in the efficacy of central planning and cling tenaciously to a belief in catastrophic anthropogenic global warming. The latter justifies the former, of course. And both beliefs exemplify the left’s penchant for magical thinking, about which I’ve written several times (e.g., here, here, here, here, and here).

Magical thinking is the pretense of knowledge in the nth degree. It conjures “knowledge” from ignorance and hope. And no one better exemplifies magical thinking than our hopey-changey president.


Related reading: Walter E. Williams, “The Experts Have Been Wrong About a Lot of Things, Here’s a Sample“, The Daily Signal, July 25, 2018

Related posts:
Modeling Is Not Science
The Left and Its Delusions
Economics: A Survey
AGW: The Death Knell
The Keynesian Multiplier: Phony Math
Modern Liberalism as Wishful Thinking
“The Science Is Settled”
Is Science Self-Correcting?
“Feelings, Nothing More than Feelings”
“Science” vs. Science: The Case of Evolution, Race, and Intelligence
Modeling Revisited
Bayesian Irrationality
The Fragility of Knowledge
Global-Warming Hype
Pattern-Seeking
Babe Ruth and the Hot-Hand Hypothesis
Hurricane Hysteria
Deduction, Induction, and Knowledge
Much Ado about the Unknown and Unknowable
A (Long) Footnote about Science
The Balderdash Chronicles
The Probability That Something Will Happen
Analytical and Scientific Arrogance

Modeling, Science, and Physics Envy

Climate Skeptic notes the similarity of climate models and macroeconometric models:

The climate modeling approach is so similar to that used by the CEA to score the stimulus that there is even a climate equivalent to the multiplier found in macro-economic models. In climate models, small amounts of warming from man-made CO2 are multiplied many-fold to catastrophic levels by hypothetical positive feedbacks, in the same way that the first-order effects of government spending are multiplied in Keynesian economic models. In both cases, while these multipliers are the single most important drivers of the models’ results, they also tend to be the most controversial assumptions. In an odd parallel, you can find both stimulus and climate debates arguing whether their multiplier is above or below one.
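The parallel can be put in schematic arithmetic (the numbers here are illustrative, not estimates):

Keynesian spending multiplier: k = 1/(1 - c), where c is the marginal propensity to consume; with c = 0.5, k = 2, so a dollar of “stimulus” is credited with two dollars of output.

Climate feedback gain: ΔT = ΔT0/(1 - f), where ΔT0 is the direct (no-feedback) warming and f is the net feedback fraction; a direct warming of about 1.2°C for doubled CO2 becomes about 3°C if f = 0.6.

In both cases the headline result turns almost entirely on the assumed value of c or f.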

Here is my take, from “Modeling Is Not Science“:

The principal lesson to be drawn from the history of massive government programs is that those who were skeptical of those programs were entirely justified in their skepticism. Informed, articulate skepticism of the kind I counsel here is the best weapon — perhaps the only effective one — in the fight to defend what remains of liberty and property against the depredations of massive government programs.

Skepticism often is met with the claim that such-and-such a model is the “best available” on a subject. But the “best available” model — even if it is the best available one — may be terrible indeed. Relying on the “best available” model for the sake of government action is like sending an army into battle — and likely to defeat — on the basis of rumors about the enemy’s position and strength.

With respect to the economy and the climate, there are too many rumor-mongers (“scientists” with an agenda), too many gullible and compliant generals (politicians), and far too many soldiers available as cannon-fodder (the paying public).

Scientists and politicians who stand by models of unfathomably complex processes are guilty of physics envy, at best, and fraud, at worst.

Physics Envy

Max Borders offers a critique of economic modeling, in which he observes that

a scientist’s model, while useful in limited circumstances, is little better than a crystal ball for predicting big phenomena like markets and climate. It is an offshoot of what F. A. Hayek called the “pretence of knowledge.” In other words, modeling is a form of scientism, which is “decidedly unscientific in the true sense of the word, since it involves a mechanical and uncritical application of habits of thought to fields different from those in which they have been formed.” (“The Myth of the Model,” The Freeman, June 10, 2010, volume 60, issue 5)

I’ve said a lot (e.g., here, here, here, here, here, here, here, and here) about modeling, economics, the social sciences in general, and the pseudo-science of climatology.

Models of complex, dynamic systems — especially social systems — are manifestations of physics envy, a term used by Stephen Jay Gould. He describes it in The Mismeasure of Man (1981) as

the allure of numbers, the faith that rigorous measurement could guarantee irrefutable precision, and might mark the transition between subjective speculation and a true science as worthy as Newtonian physics.

But there’s more to science than mere numbers. Quoting, again, from The Mismeasure of Man:

Science is rooted in creative interpretation. Numbers suggest, constrain, and refute; they do not, by themselves, specify the content of scientific theories. Theories are built upon the interpretation of numbers, and interpreters are often trapped by their own rhetoric. They believe in their own objectivity, and fail to discern the prejudice that leads them to one interpretation among many consistent with their numbers.

Ironically, The Mismeasure of Man offers a strongly biased and even dishonest interpretation of numbers (among other things). When a leading critic of physics envy falls prey to it, you know that he’s on to something.