The Signal and the Noise: Why So Many Predictions Fail--but Some Don’t | Nate Silver
- About the book
- Excerpt from the book
UPDATED FOR 2020 WITH A NEW PREFACE BY NATE SILVER
"One of the more momentous books of the decade." —The New York Times Book Review
Nate Silver built an innovative system for predicting baseball performance, predicted the 2008 election within a hair’s breadth, and became a national sensation as a blogger—all by the time he was thirty. He solidified his standing as the nation's foremost political forecaster with his near-perfect prediction of the 2012 election. Silver is the founder and editor in chief of the website FiveThirtyEight.
Drawing on his own groundbreaking work, Silver examines the world of prediction, investigating how we can distinguish a true signal from a universe of noisy data. Most predictions fail, often at great cost to society, because most of us have a poor understanding of probability and uncertainty. Both experts and laypeople mistake more confident predictions for more accurate ones. But overconfidence is often the reason for failure. If our appreciation of uncertainty improves, our predictions can get better too. This is the “prediction paradox”: The more humility we have about our ability to make predictions, the more successful we can be in planning for the future.
In keeping with his own aim to seek truth from data, Silver visits the most successful forecasters in a range of areas, from hurricanes to baseball to global pandemics, from the poker table to the stock market, from Capitol Hill to the NBA. He explains and evaluates how these forecasters think and what bonds they share. What lies behind their success? Are they good—or just lucky? What patterns have they unraveled? And are their forecasts really right? He explores unanticipated commonalities and exposes unexpected juxtapositions. And sometimes, it is not so much how good a prediction is in an absolute sense that matters but how good it is relative to the competition. In other cases, prediction is still a very rudimentary—and dangerous—science.
Silver observes that the most accurate forecasters tend to have a superior command of probability, and they tend to be both humble and hardworking. They distinguish the predictable from the unpredictable, and they notice a thousand little details that lead them closer to the truth. Because of their appreciation of probability, they can distinguish the signal from the noise.
With everything from the health of the global economy to our ability to fight terrorism dependent on the quality of our predictions, Nate Silver’s insights are an essential read.
“Not so different in spirit from the way public intellectuals like John Kenneth Galbraith once shaped discussions of economic policy and public figures like Walter Cronkite helped sway opinion on the Vietnam War . . . could turn out to be one of the more momentous books of the decade.” —New York Times Book Review
“Mr. Silver, just 34, is an expert at finding signal in noise . . . Lively prose—from energetic to outraged . . . illustrates his dos and don’ts through a series of interesting essays that examine how predictions are made in fields including chess, baseball, weather forecasting, earthquake analysis and politics… [the] chapter on global warming is one of the most objective and honest analyses I’ve seen . . . even the noise makes for a good read.” —New York Times
"A serious treatise about the craft of prediction—without academic mathematics—cheerily aimed at lay readers. Silver's coverage is polymathic, ranging from poker and earthquakes to climate change and terrorism." —New York Review of Books
"Mr. Silver's breezy style makes even the most difficult statistical material accessible. What is more, his arguments and examples are painstakingly researched . . ." —Wall Street Journal
"Nate Silver is the Kurt Cobain of statistics . . . His ambitious new book, The Signal and the Noise, is a practical handbook and a philosophical manifesto in one, following the theme of prediction through a series of case studies ranging from hurricane tracking to professional poker to counterterrorism. It will be a supremely valuable resource for anyone who wants to make good guesses about the future, or who wants to assess the guesses made by others. In other words, everyone." —The Boston Globe
"Silver delivers an improbably breezy read on what is essentially a primer on making predictions." —Washington Post
“The Signal and the Noise is many things—an introduction to the Bayesian theory of probability, a meditation on luck and character, a commentary on poker's insights into life—but its most important function is its most basic and absolutely necessary one right now: a guide to detecting and avoiding bullshit dressed up as data . . . What is most refreshing . . . is its humility. Sometimes we have to deal with not knowing, and we need somebody to tell us that.” —Esquire
“[An] entertaining popularization of a subject that scares many people off . . . Silver’s journey from consulting to baseball analytics to professional poker to political prognosticating is very much that of a restless and curious mind. And this, more than number-crunching, is where real forecasting prowess comes from.” —Slate
“Nate Silver serves as a sort of Zen master to American election-watchers . . . In the spirit of Nassim Nicholas Taleb’s widely read The Black Swan, Mr. Silver asserts that humans are overconfident in their predictive abilities, that they struggle to think in probabilistic terms and build models that do not allow for uncertainty.” —The Economist
"Silver explores our attempts at forecasting stocks, storms, sports, and anything else not set in stone." —Wired
"The Signal and the Noise is essential reading in the era of Big Data that touches every business, every sports event, and every policymaker." —Forbes.com
“Laser sharp. Surprisingly, statistics in Silver’s hands is not without some fun.” —Smithsonian Magazine
“A substantial, wide-ranging, and potentially important gauntlet of probabilistic thinking based on actual data thrown at the feet of a culture determined to sweep away silly liberal notions like ‘facts.’” —The Village Voice
“Silver shines a light on 600 years of human intelligence-gathering—from the advent of the printing press all the way through the Industrial Revolution and up to the current day—and he finds that it's been an inspiring climb. We've learned so much, and we still have so much left to learn.” —MLB.com
“Nate Silver’s The Signal and the Noise is The Soul of a New Machine for the 21st century (a century we thought we’d be a lot better at predicting than we actually are). Our political discourse is already better informed and more data-driven because of Nate’s influence. But here he shows us what he has always been able to see in the numbers—the heart and the ethical imperative of getting the quantitative questions right. A wonderful read—totally engrossing." —Rachel Maddow, author of Drift
“Yogi Berra was right: ‘forecasting is hard, especially about the future.’ In this important book, Nate Silver explains why the performance of experts varies from prescient to useless and why we must plan for the unexpected. Must reading for anyone who cares about what might happen next.” —Richard Thaler, co-author of Nudge
About the Author
Nate Silver is the founder and editor in chief of FiveThirtyEight.com.
Excerpt. © Reprinted by permission. All rights reserved.
At about the time The Signal and the Noise was first published in September 2012, “Big Data” was on its way to becoming a Big Idea. Google searches for the term doubled over the course of a year,1 as did mentions of it in the news media.2 Hundreds of books were published on the subject. If you picked up any business periodical in 2013, advertisements for Big Data were as ubiquitous as cigarettes in an episode of Mad Men.
But by late 2014, there was evidence that the trend had reached its apex. The frequency with which Big Data was mentioned in corporate press releases had slowed down and possibly begun to decline.3 The technology research firm Gartner even declared that Big Data had passed the peak of its “hype cycle.”4
I hope that Gartner is right. Coming to a better understanding of data and statistics is essential to help us navigate our lives. But as with most emerging technologies, the widespread benefits to science, industry, and human welfare will come only after the hype has died down.
FIGURE P-1: BIG DATA MENTIONS IN CORPORATE PRESS RELEASES
I worry that certain events in my life have contributed to the hype cycle. On November 6, 2012, the statistical model at my Web site FiveThirtyEight “called” the winner of the American presidential election correctly in all fifty states. I received a congratulatory phone call from the White House. I was hailed as “lord and god of the algorithm” by The Daily Show’s Jon Stewart. My name briefly received more Google search traffic than the vice president of the United States.
I enjoyed some of the attention, but I felt like an outlier—even a fluke. Mostly I was getting credit for having pointed out the obvious—and most of the rest was luck.*
To be sure, it was reasonably clear by Election Day that President Obama was poised to win reelection. When voters went to the polls on election morning, FiveThirtyEight’s statistical model put his chances of winning the Electoral College at about 90 percent.* A 90 percent chance is not quite a sure thing: Would you board a plane if the pilot told you it had a 90 percent chance of landing successfully? But when there’s only reputation rather than life or limb on the line, it’s a good bet. Obama needed to win only a handful of the swing states where he was tied or ahead in the polls; Mitt Romney would have had to win almost all of them.
But getting every state right was a stroke of luck. In our Election Day forecast, Obama’s chance of winning Florida was just 50.3 percent—the outcome was as random as a coin flip. Considering other states like Virginia, Ohio, Colorado, and North Carolina, our chances of going fifty-for-fifty were only about 20 percent.5 FiveThirtyEight’s “perfect” forecast was fortuitous but contributed to the perception that statisticians are soothsayers—only using computers rather than crystal balls.
This is a wrongheaded and rather dangerous idea. American presidential elections are the exception to the rule—one of the few examples of a complex system in which outcomes are usually more certain than the conventional wisdom implies. (There are a number of reasons for this, not least that the conventional wisdom is often not very wise when it comes to politics.) Far more often, as this book will explain, we overrate our ability to predict the world around us. With some regularity, events that are said to be certain fail to come to fruition—or those that are deemed impossible turn out to occur.
If all of this is so simple, why did so many pundits get the 2012 election wrong? It wasn’t just on the fringe of the blogosphere that conservatives insisted that the polls were “skewed” toward President Obama. Thoughtful conservatives like George F. Will6 and Michael Barone7 also predicted a Romney win, sometimes by near-landslide proportions.
One part of the answer is obvious: the pundits didn’t have much incentive to make the right call. You can get invited back on television with a far worse track record than Barone’s or Will’s—provided you speak with some conviction and have a viewpoint that matches the producer’s goals.
An alternative interpretation is slightly less cynical but potentially harder to swallow: human judgment is intrinsically fallible. It’s hard for any of us (myself included) to recognize how much our relatively narrow range of experience can color our interpretation of the evidence. There’s so much information out there today that none of us can plausibly consume all of it. We’re constantly making decisions about what Web site to read, which television channel to watch, and where to focus our attention.
Having a better understanding of statistics almost certainly helps. Over the past decade, the number of people employed as statisticians in the United States has increased by 35 percent8 even as the overall job market has stagnated. But it’s a necessary rather than sufficient part of the solution. Some of the examples of failed predictions in this book concern people with exceptional intelligence and exemplary statistical training—but whose biases still got in the way.
These problems are not so simple, and so this book does not promote simple answers to them. It makes some recommendations, but they are philosophical as much as technical. Once we’re getting the big stuff right—coming to a better understanding of probability and uncertainty; learning to recognize our biases; appreciating the value of diversity, incentives, and experimentation—we’ll have the luxury of worrying about the finer points of technique.
Gartner’s hype cycle ultimately has a happy ending. After the peak of inflated expectations there’s a “trough of disillusionment”—what happens when people come to recognize that the new technology will still require a lot of hard work.
FIGURE P-2: GARTNER’S HYPE CYCLE
But right when views of the new technology have begun to lapse from healthy skepticism into overt cynicism, that technology can begin to pay some dividends. (We’ve been through this before: after the computer boom in the 1970s and the Internet commerce boom of the late 1990s, among other examples.) Eventually it matures to the point when there are fewer glossy advertisements but more gains in productivity—it may even have become so commonplace that we take it for granted. I hope this book can accelerate the process, however slightly.
This is a book about information, technology, and scientific progress. This is a book about competition, free markets, and the evolution of ideas. This is a book about the things that make us smarter than any computer, and a book about human error. This is a book about how we learn, one step at a time, to come to knowledge of the objective world, and why we sometimes take a step back.
This is a book about prediction, which sits at the intersection of all these things. It is a study of why some predictions succeed and why some fail. My hope is that we might gain a little more insight into planning our futures and become a little less likely to repeat our mistakes.
More Information, More Problems
The original revolution in information technology came not with the microchip, but with the printing press. Johannes Gutenberg’s invention in 1440 made information available to the masses, and the explosion of ideas it produced had unintended consequences and unpredictable effects. It was a spark for the Industrial Revolution in 1775,1 a tipping point in which civilization suddenly went from having made almost no scientific or economic progress for most of its existence to the exponential rates of growth and change that are familiar to us today. It set in motion the events that would produce the European Enlightenment and the founding of the American Republic.
But the printing press would first produce something else: hundreds of years of holy war. As mankind came to believe it could predict its fate and choose its destiny, the bloodiest epoch in human history followed.2
Books had existed prior to Gutenberg, but they were not widely written and they were not widely read. Instead, they were luxury items for the nobility, produced one copy at a time by scribes.3 The going rate for reproducing a single manuscript was about one florin (a gold coin worth about $200 in today’s dollars) per five pages,4 so a book like the one you’re reading now would cost around $20,000. It would probably also come with a litany of transcription errors, since it would be a copy of a copy of a copy, the mistakes having multiplied and mutated through each generation.
This made the accumulation of knowledge extremely difficult. It required heroic effort to prevent the volume of recorded knowledge from actually decreasing, since the books might decay faster than they could be reproduced. Various editions of the Bible survived, along with a small number of canonical texts, such as those of Plato and Aristotle. But an untold amount of wisdom was lost to the ages,5 and there was little incentive to record more of it to the page.
The pursuit of knowledge seemed inherently futile, if not altogether vain. If today we feel a sense of impermanence because things are changing so rapidly, impermanence was a far more literal concern for the generations before us. There was “nothing new under the sun,” as the beautiful Bible verses in Ecclesiastes put it—not so much because everything had been discovered but because everything would be forgotten.6
The printing press changed that, and did so permanently and profoundly. Almost overnight, the cost of producing a book decreased by about three hundred times,7 so a book that might have cost $20,000 in today’s dollars instead cost $70. Printing presses spread very rapidly throughout Europe: from Gutenberg’s Germany to Rome, Seville, Paris, and Basel by 1470, and then to almost all other major European cities within another ten years.8 The number of books being produced grew exponentially, increasing by about thirty times in the first century after the printing press was invented.9 The store of human knowledge had begun to accumulate, and rapidly.
FIGURE I-1: EUROPEAN BOOK PRODUCTION
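The cost figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (one florin per five pages, about $200 per florin, a $20,000 manuscript, and a roughly three-hundredfold drop in cost). The variable names are illustrative, and the page count is not stated in the text; it is simply what those figures imply.

```python
# Figures quoted in the text, all in today's dollars.
FLORIN_USD = 200               # approximate value of one florin
PAGES_PER_FLORIN = 5           # scribes charged about one florin per five pages
MANUSCRIPT_COST_USD = 20_000   # quoted cost of a hand-copied book
COST_REDUCTION_FACTOR = 300    # printing made books "about three hundred times" cheaper

# Cost of copying a single page by hand.
cost_per_page = FLORIN_USD / PAGES_PER_FLORIN

# Page count implied by the quoted manuscript cost (derived, not stated in the text).
implied_pages = MANUSCRIPT_COST_USD / cost_per_page

# Cost of the same book once printed.
printed_cost = MANUSCRIPT_COST_USD / COST_REDUCTION_FACTOR

print(f"Cost per hand-copied page: ${cost_per_page:.0f}")
print(f"Implied length of the book: {implied_pages:.0f} pages")
print(f"Cost of the printed book: about ${printed_cost:.0f}")
```

The printed cost comes out near $67, which matches the quoted $70 once rounded to the nearest ten dollars.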
As was the case during the early days of the World Wide Web, however, the quality of the information was highly varied. While the printing press paid almost immediate dividends in the production of higher-quality maps,10 the bestseller list soon came to be dominated by heretical religious texts and pseudoscientific ones.11 Errors could now be mass-produced, as in the so-called Wicked Bible, which committed the most unfortunate typo in history to the page: thou shalt commit adultery.12 Meanwhile, exposure to so many new ideas was producing mass confusion. The amount of information was increasing much more rapidly than our understanding of what to do with it, or our ability to differentiate the useful information from the mistruths.13 Paradoxically, the result of having so much more shared knowledge was increasing isolation along national and religious lines. The instinctual shortcut that we take when we have “too much information” is to engage with it selectively, picking out the parts we like and ignoring the remainder, making allies with those who have made the same choices and enemies of the rest.
The most enthusiastic early customers of the printing press were those who used it to evangelize. Martin Luther’s Ninety-five Theses were not that radical; similar sentiments had been debated many times over. What was revolutionary, as Elizabeth Eisenstein writes, is that Luther’s theses “did not stay tacked to the church door.”14 Instead, they were reproduced at least three hundred thousand times by Gutenberg’s printing press15—a runaway hit even by modern standards.
The schism that Luther’s Protestant Reformation produced soon plunged Europe into war. From 1524 to 1648, there was the German Peasants’ War, the Schmalkaldic War, the Eighty Years’ War, the Thirty Years’ War, the French Wars of Religion, the Irish Confederate Wars, the Scottish Civil War, and the English Civil War—many of them raging simultaneously. This is not to neglect the Spanish Inquisition, which began in 1480, or the War of the Holy League from 1508 to 1516, although those had less to do with the spread of Protestantism. The Thirty Years’ War alone killed one-third of Germany’s population,16 and the seventeenth century was possibly the bloodiest ever, with the early twentieth staking the main rival claim.17
But somehow in the midst of this, the printing press was starting to produce scientific and literary progress. Galileo was sharing his (censored) ideas, and Shakespeare was producing his plays.
Shakespeare’s plays often turn on the idea of fate, as much drama does. What makes them so tragic is the gap between what his characters might like to accomplish and what fate provides to them. The idea of controlling one’s fate seemed to have become part of the human consciousness by Shakespeare’s time—but not yet the competencies to achieve that end. Instead, those who tested fate usually wound up dead.18
These themes are explored most vividly in The Tragedy of Julius Caesar. Throughout the first half of the play Caesar receives all sorts of apparent warning signs—what he calls predictions19 (“beware the ides of March”)—that his coronation could turn into a slaughter. Caesar of course ignores these signs, quite proudly insisting that they point to someone else’s death—or otherwise reading the evidence selectively. Then Caesar is assassinated.
“[But] men may construe things after their fashion / Clean from the purpose of the things themselves,” Shakespeare warns us through the voice of Cicero—good advice for anyone seeking to pluck through their newfound wealth of information. It was hard to tell the signal from the noise. The story the data tells us is often the one we’d like to hear, and we usually make sure that it has a happy ending.
And yet if The Tragedy of Julius Caesar turned on an ancient idea of prediction—associating it with fatalism, fortune-telling, and superstition—it also introduced a more modern and altogether more radical idea: that we might interpret these signs so as to gain an advantage from them. “Men at some time are masters of their fates,” says Cassius, hoping to persuade Brutus to partake in the conspiracy against Caesar.
The idea of man as master of his fate was gaining currency. The words predict and forecast are largely used interchangeably today, but in Shakespeare’s time, they meant different things. A prediction was what the soothsayer told you; a forecast was something more like Cassius’s idea.
The term forecast came from English’s Germanic roots,20 unlike predict, which is from Latin.21 Forecasting reflected the new Protestant worldliness rather than the otherworldliness of the Holy Roman Empire. Making a forecast typically implied planning under conditions of uncertainty. It suggested having prudence, wisdom, and industriousness, more like the way we now use the word foresight.22
The theological implications of this idea are complicated.23 But they were less so for those hoping to make a gainful existence in the terrestrial world. These qualities were strongly associated with the Protestant work ethic, which Max Weber saw as bringing about capitalism and the Industrial Revolution.24 This notion of forecasting was very much tied in to the notion of progress. All that information in all those books ought to have helped us to plan our lives and profitably predict the world’s course.
• • •
The Protestants who ushered in centuries of holy war were learning how to use their accumulated knowledge to change society. The Industrial Revolution largely began in Protestant countries and largely in those with a free press, where both religious and scientific ideas could flow without fear of censorship.25
The importance of the Industrial Revolution is hard to overstate. Throughout essentially all of human history, economic growth had proceeded at a rate of perhaps 0.1 percent per year, enough to allow for a very gradual increase in population, but not any growth in per capita living standards.26 And then, suddenly, there was progress when there had been none. Economic growth began to zoom upward much faster than the growth rate of the population, as it has continued to do through to the present day, the occasional global financial meltdown notwithstanding.27
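To get a feel for how slow growth of 0.1 percent per year is, the following sketch compounds that rate over time. It assumes a constant annual rate, which is a simplification of the "perhaps 0.1 percent" figure above, and the function name is hypothetical.

```python
import math

ANNUAL_GROWTH = 0.001  # 0.1 percent per year, the rate quoted in the text


def years_to_double(rate: float) -> float:
    """Years needed for an economy growing at a constant annual `rate` to double."""
    return math.log(2) / math.log(1 + rate)


# Assumption: the rate stays constant, so growth compounds geometrically.
doubling_years = years_to_double(ANNUAL_GROWTH)
century_growth = (1 + ANNUAL_GROWTH) ** 100

print(f"Doubling time: about {doubling_years:.0f} years")
print(f"Growth over a century: about {(century_growth - 1) * 100:.1f} percent")
```

Under that assumption, total output would take roughly seven centuries to double, which is consistent with the text's point that such growth could support a slowly rising population but no visible gain in living standards.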