What March Madness Tells Us About Forecasting
Stanford’s Amy Zegart argues that the prediction business is getting easier.
From left, Creighton’s Grant Gibbs, Doug McDermott, and Avery Dingman react after learning their NCAA college basketball tournament assignment during a Selection Sunday viewing party. (AP photo by Nati Harnik)
In the coming days, President Obama will be reviewing intelligence, poring over data, soliciting expert opinions, examining history, mulling options, and fusing analysis with intuition and experience. No, not to order a drone strike, intervene in Syria, or combat Chinese cyber-hacking. Instead, Obama, like millions of Americans, will be picking his NCAA men’s basketball teams for March Madness.
Lots of industries involve predicting future outcomes: weather forecasters, Wall Street traders, doctors, pope-watchers, movie executives, and baseball scouts, to name a few. Each falls somewhere along a spectrum of predictability. March Madness “bracketology” — divining which teams will go how far in this year’s basketball tournament — lies on the easier end. Assessing national security threats and outcomes sits on the opposite, very, very hard end of the spectrum. These two extremes shed light on what factors make some human activities more susceptible to accurate prediction than others. And they suggest why more things are becoming predictable than we might imagine.
The first factor distinguishing the easier end of the predictability spectrum from the hard end is data — how much information there is about similar events in the past. Sports competitions are notoriously data-rich. That’s not to say sportscasters always get it right. March Madness wouldn’t be March Madness if they did: Everyone loves a Cinderella team that wins against the odds. But they don’t call it “winning against the odds” for nothing. Exactly how often does the worst-ranked team in a bracket win it all? Never. The lowest-seeded NCAA champion was Villanova in 1985, seeded eighth of 16 in its bracket. As we Louisville fans know, there’s a reason the usual suspects make it to the Final Four. History isn’t destiny, but in March Madness it’s a pretty darn good guide.
Intelligence analysts don’t have a rich historical store of comparable cases to help assess future outcomes. Consider the current nuclear crisis with Iran. Only nine countries have nuclear weapons. Five got the bomb so long ago that nobody had yet landed on the moon. North Korea is the most recent nuclear rogue, but the Hermit Kingdom’s weird ruling family hardly seems a generalizable model for anything. The only country that developed and then voluntarily dismantled its nuclear arsenal is South Africa — in large part because apartheid was crumbling and the outgoing white regime feared putting the bomb in the hands of a black government. Bracketology database this isn’t.
The second factor is bias, or more specifically how obvious bias is. Biases can’t be eliminated, but their distorting effects can be tempered if everyone knows about them. In sporting events, we wear our biases on our sleeves. Literally. Every March, everyone knows that I will overestimate Louisville’s chances of winning the NCAA because the Cardinals are my hometown team. And because this bias is obvious, my pro-Louisville forecast is taken with the appropriate grain of salt. But in the CIA, nobody walks around wearing a T-shirt that says, “I am often subject to confirmation bias, giving greater weight to data that supports my prior beliefs and discounting information that disconfirms them.” The more that hidden biases creep into analysis unnoticed and unchecked, the more problematic prediction becomes.
The third factor is asymmetric information. In March Madness, everyone has access to the same information, at least theoretically. Expertise depends mostly on how geeky you choose to be, and how much time you spend watching ESPN and digging up past stats. In intelligence, however, information is tightly compartmented by classification restrictions, leaving analysts with different pieces of data and serious barriers to sharing it. Imagine scattering NCAA bracket information across 1,000 people, many of whom do not know each other, some of whom have no idea what a bracket is or the value of the information they possess. They’re all told if they share anything with the wrong person, they could be disciplined, fired, even prosecuted. But somehow they have to collectively pick the winner to succeed.
The fourth factor is whether winning and losing are clear. This is big. Clear metrics make it possible to create feedback loops that analysts can use to improve their predictions in the future. In sports, winning and losing are obvious. In foreign policy, they aren’t. Is al Qaeda on the path to defeat? Is Iraq on the way to stabilization? Will the U.S. invasion of Afghanistan succeed? Who got it right or wrong? It’s hard to say. And the answer today may be different in a year, a decade, or a century. Today’s headlines and tomorrow’s history books are rarely the same. It’s hard for analysts to get better at predicting when they don’t know if their past predictions were ever any good.
Deception is the fifth and final factor. University of Louisville Coach Rick Pitino may be holding a little something back for the tournament — a new play, a different substitution. But this is small-bore stuff compared to what states and transnational actors do to conceal their real intentions and capabilities.
So much for the extremes. The big news is just how much the middle of the predictability spectrum is growing. Smart people are finding clever new ways of generating better data, identifying and unpacking biases, and sharing information unimaginable 20 or even 10 years ago. The result: A growing range of human activity has moved from the world of “analysis-by-gut-check” to “analysis by evidence.” Nobody predicted just how much can be predicted now.
My three favorite examples of this prediction revolution are election forecasting, medical decision-making, and studies of ethnic conflict.
In the 2012 presidential election, the New York Times’s Nate Silver wielded math and polling data to beat the experience and gut feel of longtime election pundits who forecasted a Romney victory. Conservative columnist George Will said his “wild card” was whether Minnesota would go to Romney, edging the Republican to a 321-electoral vote win. Peggy Noonan blogged on the eve of the election, “While everyone is looking at the polls and the storm, Romney’s slipping into the presidency.” Romney was slipping all right — in the polls she wasn’t watching. It was a big, public triumph of big data and good analysis over reasoning-by-anecdote-and-wishful-thinking that made old school pundits look old.
Evidence-based medicine has shown how doctors’ experience and judgment are often wrong. Harvard Professor David S. Jones’s new book, Broken Hearts, recounts how two of the most common treatments for heart disease, coronary bypass surgery and angioplasty, have been widely used for years because doctors believed — falsely, it turns out — that these procedures would extend life expectancy. Physicians reasoned that patients suffering from blocked arteries would live longer lives if the clogs could somehow be removed or circumvented. Bypass surgery did this by grafting veins or arteries from another part of the body into the heart vasculature. Angioplasty involved inserting a balloon into the blocked artery to compress and shrink the blockage, and then inserting a mesh-like stent to keep future clots from forming. In 1996, doctors performed a peak of 600,000 heart bypass operations. In the 2000s, more than a million angioplasties were performed annually. Yet when randomized clinical trials were conducted, results showed clearly that, except for a few of the sickest patients, these surgical treatments did not extend life expectancy any more than medication and lifestyle changes. And surgery imposed significant side-effect risks, including brain damage.
In political science, new large datasets and field experiments are revolutionizing how we think about ethnic conflict. For years, scholars and policymakers assumed that ethnic cleavages were the primary cause of civil wars. But that’s because they never considered how many ethnic cleavages did not cause civil wars. When Professors Jim Fearon and David Laitin did the math, they found that civil wars were more often caused by weak government capacity, low levels of economic development, and mountainous terrain than by religious or ethnic differences.
The prediction business isn’t perfect. Big hairy outliers still happen with alarming frequency. But March Madness reminds us that not all guesswork has to be guesswork. Gut feel is overrated. And increasingly, it’s not the only game in town.
Amy Zegart is a senior fellow at the Hoover Institution, a faculty member at Stanford’s Center for International Security and Cooperation, and professor of political economy, by courtesy, at Stanford GSB. You can follow her on Twitter @AmyZegart.
This piece originally appeared on ForeignPolicy.com.