The invention of the printing press in 1440 led to the spread of information around Europe. Seventy-seven years later, the publication of Martin Luther’s Ninety-five Theses plunged the continent into centuries of religious war. But this dissemination of ideas also led to the Industrial Revolution, which brought about extraordinary economic growth.
Since the 2000s, big data has driven huge progress in areas from baseball to betting, yet whole categories of events, from financial catastrophes to natural disasters, are still not being predicted accurately. The amount of data available makes it almost impossible for us to sort through it all, but if we begin to understand our natural biases, we can use the data to help us make better predictions.
Need to Know: Just as the printing press brought a wave of information that destabilized Europe but ultimately led to the Industrial Revolution, the rise of big data will bring a lot of bad information—noise—which we must learn to sift for the truth—the signal.
1: A Catastrophic Failure of Prediction
The financial crisis of 2008 resulted from a failure of prediction by the ratings agencies, which gave excellent ratings to securities known as collateralized debt obligations (CDOs) that included bad mortgages and were incredibly unsafe. The agencies profited from the abundance of CDOs, so they encouraged banks to keep producing more.
The unprecedented American housing bubble of the 2000s was the result of people being encouraged to buy or flip homes even if they couldn’t afford to do so. This was supported by a financial market of lenders, brokers, and ratings agencies, which benefited from every sale, and by the incorrect belief that homeownership was always a profitable investment.
For each dollar spent on housing sales, there were almost fifty dollars’ worth of trades in mortgage-backed securities. Institutions like Lehman Brothers were highly leveraged: they were betting with borrowed money or money they didn’t actually have, which put them in a precarious position if the value of their portfolios declined even a small amount. This should have made investors reluctant to purchase assets, but the positive ratings from the credit agencies convinced unknowing buyers that these were solid purchases.
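The arithmetic behind that precariousness can be sketched in a few lines. This is only an illustration: the 30-to-1 leverage figure and the function names are assumptions chosen for the example, not numbers taken from the text.

```python
def equity_after_decline(equity, leverage, decline):
    """Return the firm's remaining equity after its assets lose value.

    equity   -- the firm's own capital
    leverage -- total assets divided by equity (hypothetical figure)
    decline  -- fractional fall in asset value, e.g. 0.02 for 2%
    """
    assets = equity * leverage
    debt = assets - equity          # borrowed money must still be repaid in full
    return assets * (1 - decline) - debt


def wipeout_decline(leverage):
    """Fractional fall in asset value that erases all of the firm's equity."""
    return 1 / leverage


# Hypothetical firm: $1 of its own capital supporting $30 of assets.
remaining = equity_after_decline(1.0, 30, 0.02)
print(f"Equity left after a 2% decline: {remaining:.2f}")
print(f"Decline that wipes out equity: {wipeout_decline(30):.1%}")
```

Under these assumed numbers, a decline of a few percent in the portfolio removes most or all of the firm's own capital, because the losses fall on the equity while the debt stays fixed.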
When Barack Obama took office in 2009, his stimulus package was meant to keep unemployment in check; in reality, the recession was worse than people knew at the time, causing the unemployment rate to go higher than his administration predicted it would.
The recession was caused by a series of poor predictions, each caused by predictors overlooking key pieces of information. In the lead-up to the housing crash, ratings agencies were creating models that didn’t include all data relevant to the current housing situation, making them useless. An important lesson was learned the hard way. We should always take all data into account, even data that disrupts our models and calls our accuracy into question. A false sense of confidence in accuracy can lead to avoidable disaster.
Need to Know: The housing crisis and subsequent financial collapse and economic recession were caused by a string of prediction errors, each a result of overlooking key information.
2: Are You Smarter Than a Television Pundit?
In the run-up to the 2008 presidential election, many television pundits were unable to predict the obvious: Barack Obama had a solid lead and was going to win. Prompted by this, Silver goes back to evaluate the predictions made on the public affairs show The McLaughlin Group and determines that, overall, the panelists got only about half of their forecasts right. A similar failure could be seen in the 1980s, when political experts did not predict the collapse of the Soviet Union despite clear signals such as Gorbachev’s sincere efforts to reform the country and the dire economic straits within the USSR. Historically, forecasts from experts in a variety of subject areas were barely more accurate than random chance.
While studying the personalities of these experts, Philip Tetlock divided them into two categories—hedgehogs and foxes. Hedgehogs believe in big ideas and maintain that governing principles affect all behavior. Foxes believe in a plethora of small ideas and understand that there is nuance and complexity in the world.
Tetlock then found that the latter group is better at predicting. For instance, foxes would have been able to see that the USSR was an increasingly unstable country for many reasons, while hedgehogs saw only an “evil empire” or a socialist stronghold. But hedgehogs—with their bold, unwavering beliefs—make better TV guests. For hedgehogs, the more information they have, the more likely they are to twist the data to fit with their pre-held beliefs, missing or ignoring any information that would disrupt their forecasts.
FiveThirtyEight grew out of Silver’s desire to approach the 2008 Democratic primary with quantitative analysis rather than cable news fluff. It was founded on three “fox-like” principles:
Principle 1: Think Probabilistically
The forecasts on FiveThirtyEight are probabilistic, meaning they cover a range of likely outcomes and account for real-world uncertainty. In practice, an event with a 90% chance of happening will still fail to happen 10% of the time. A miss like that doesn’t mean the prediction was incorrect.
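One way to check a probabilistic forecaster is to group past forecasts by their stated probability and see how often those events actually happened. The sketch below is a minimal illustration of that idea; the function name and the track record are made up for the example and do not come from FiveThirtyEight.

```python
from collections import defaultdict


def calibration_table(forecasts):
    """Summarize how often events happened at each stated probability.

    forecasts -- iterable of (stated_probability, happened) pairs
    Returns {stated_probability: (number_of_forecasts, observed_frequency)}.
    """
    buckets = defaultdict(list)
    for prob, happened in forecasts:
        buckets[prob].append(1 if happened else 0)
    return {
        prob: (len(results), sum(results) / len(results))
        for prob, results in sorted(buckets.items())
    }


# Hypothetical track record: ten 90% forecasts, one of which missed,
# and five 60% forecasts, two of which missed.
record = (
    [(0.9, True)] * 9 + [(0.9, False)]
    + [(0.6, True)] * 3 + [(0.6, False)] * 2
)

for prob, (count, observed) in calibration_table(record).items():
    print(f"stated {prob:.0%}: {count} forecasts, happened {observed:.0%} of the time")
```

In this invented record, the single missed 90% forecast is exactly what a well-calibrated forecaster should produce: the events happened about as often as the stated probability said they would.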
Principle 2: Today’s Forecast Is the First Forecast of the Rest of Your Life
Probabilities are moving targets and will change as new data is considered each day. A key to FiveThirtyEight predictions is the willingness to change as information becomes available.
Principle 3: Look for Consensus
Hedgehogs want to single-handedly predict a major event and bask in the glory of their skills, but foxes realize that the best way to forecast is to aggregate many predictions and look for consensus in the data.
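A rough sketch of why consensus helps is shown here. It assumes a plain unweighted average of forecasts and scores them with the Brier score (squared error between the stated probability and the outcome, lower is better). The text does not say how FiveThirtyEight combines predictions, so the averaging rule, the function names, and the three forecasts are all illustrative assumptions.

```python
from statistics import mean


def consensus(probabilities):
    """Combine several probability forecasts with a simple unweighted average."""
    return mean(probabilities)


def brier_score(probability, happened):
    """Squared error of one forecast: 0.0 is perfect, 1.0 is as wrong as possible."""
    outcome = 1.0 if happened else 0.0
    return (probability - outcome) ** 2


# Hypothetical forecasts from three sources for an event that did happen.
individual = [0.55, 0.70, 0.90]
happened = True

average_individual_score = mean(brier_score(p, happened) for p in individual)
consensus_score = brier_score(consensus(individual), happened)

print(f"average individual Brier score: {average_individual_score:.4f}")
print(f"consensus Brier score:          {consensus_score:.4f}")
```

Because squared error is convex, the averaged forecast never scores worse than the average of the individual scores, although a single forecaster can still beat the consensus on any given event.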
Need to Know: Confident political pundits (hedgehogs) are likely to predict inaccurately because they are blinded by their own biases, while data-driven forecasters (foxes) can combine many perspectives to see the truth more accurately and make better predictions.
3: All I Care About Is W’s and L’s
Using PECOTA (Player Empirical Comparison and Optimization Test Algorithm), the baseball prediction system he created for Baseball Prospectus, Silver determined that Red Sox player Dustin Pedroia would be a success, despite scouting reports that dismissed him. Silver’s prediction proved to be true, and when he sought an interview with Pedroia about these numbers, he realized that the key to the athlete’s success was his above-it-all attitude. Pedroia hadn’t listened to the scouting reports, which could have brought him down.
Baseball projection systems mus...