YASAMIN MILLER: Welcome to the Survey Research Institute's seventh annual talk. My name's Yasamin Miller, and I'm the director of SRI. I'd like to thank all the co-sponsors of this event today-- the Office of the Senior Vice Provost, Statistical Sciences, Information Science, [INAUDIBLE], Government, Bronfenbrenner Center for Translational Research, Communications, Cognitive Science, and I also want to give a special thanks to the 88th annual Hotel Ezra Cornell organization here. Yes.
[APPLAUSE]
They helped coordinate this venue and integrate it with their own event. And I want to extend a very warm welcome to all the Hotelies.
I'm so pleased to see that this talk has drawn such a large and diverse audience. The goal of these talks is to advance the knowledge and understanding around survey research. And we're really grateful to have a speaker today that's making survey research cool.
[LAUGHTER]
So please allow me to share some statistics and fun data with you. The Survey Research Institute has been doing polls and surveys since 1996, which is about the same time that our speaker won first place in a high school debate competition in his hometown of East Lansing, Michigan. So as many of you know, he's shown an unusual aptitude for math and numbers from an extremely early age. And by the time he was six, he was a huge baseball fan. And this would later prove to serve him very well.
But I'd like to share with you something that's not well known. Before college, he actually worked as a telephone interviewer at the Michigan State Survey Group. And when asked at a college job interview what the value of that was-- like, why would you do telephone interviewing?-- he said, I learned how to call strangers, make them relax, and then get them to cooperate with me. This is a virtue I've been extolling for years now.
So I'm hoping, after this talk, we see a huge surge in our applications for our telephone interviewer positions.
[LAUGHTER]
In 2000, he graduated with a bachelor's in economics from the University of Chicago. And he spent four years at a job that was not really a good fit for him. So out of boredom, he discovered online poker, and for a while he was living the poker dream. He was staying up late and playing poker, and then returning to work the next morning to sleep.
To the students out there, take note and be comforted that your first job out of college and out of Cornell, if it's boring, it could be the best thing that ever happened to you. So the earnings he made from gambling allowed him then to pursue his other love, baseball, and he developed PECOTA, the online modeling system for forecasting the career development of major league baseball players, which he sold to and then managed for Baseball Prospectus. He moved from baseball predictions to election predictions, clearly an obvious [INAUDIBLE], but a pretty bold move, because there are countless numbers of survey organizations out there that are dedicated solely to predicting election outcomes. And as you know, he accurately predicted the 2008 and then again the 2012 presidential results.
In 2009, he was named by Time magazine as one of the 100 most influential people in the world. He was only 31. His award-winning blog, FiveThirtyEight, was licensed for publication by the New York Times in 2010. So how's he able to do this? I mean, how's he able to take all that data out there and predict with such accuracy?
It is reported that something like 90% of the data in the universe has been generated in the last 10 years, mostly thanks to the web, but also to the proliferation of survey research organizations.
[LAUGHTER]
And we know that when we have more information than we can handle, we tend to pick and choose those facts we like to believe. And what our speaker has so eloquently proven is that big data does not mean the end of the scientific method-- in fact, quite the opposite, and he does this partly by aggregating polling data. And when used correctly, survey data can tell you a story that represents the truth. The key is to be able to separate the signal from the noise.
And coincidentally, that just happens to be the title of his New York Times best-selling book, The Signal and the Noise-- Why So Many Predictions Fail, but Some Don't. So to help us understand how to use big data, and in particular, survey data, please join me in giving a warm welcome to SRI's seventh annual speaker, Nate Silver.
[APPLAUSE]
NATE SILVER: Well, thank you everyone. Can you all hear me OK? I really appreciate the introduction. I'm from the Midwest, so we don't-- you can't hear me? Oh, I should probably turn on the mic. Give me one second here. Testing? Oh, there we go.
I was going to say, I'm from the Midwest, we don't deal very well in the Midwest with that much, like, praise, really. Or technology, apparently, as well.
But I feel somewhat at home here in Cornell-- in Ithaca, rather. I know it's not technically the Midwest, but we're kind of in what I think of as the college hockey belt, basically, which encompasses kind of a certain northern part of the country, not stretching all the way to the Atlantic coast, but pretty far up to Minnesota or so.
But yeah. I'm here to talk to you about big data and what it can do, but really kind of what it also can't do, and why we need people like the people in this room to work really hard to translate this stuff into actual working knowledge in society. So I became a big deal, I guess-- this presentation's not loading, actually. Let me--
[LAUGHTER]
It's all some metaphor for the failures of-- OK, there we go. So this was the map that we had on November 6, 2012, where we have a forecast of the outcome in every state. You should see it's a probabilistic forecast. We got a lot of credit, for example, for calling Florida right when we had Obama winning-- having a 50.01% chance of winning there, basically a coin flip, and I got really lucky on that coin flip. But we got 50 out of 50 states right. And this FiveThirtyEight site had gotten a ton of traffic, had been very much a source of controversy, though, also, throughout the campaign.
And it seemed really out of whack to me. This is a graph of Google search traffic for my name and the vice president's name. It seemed way out of proportion-- this is definitely a humblebrag, by the way-- but why is a data geek getting this much love? Fortunately, though, to make sure America hadn't gone totally insane, Justin Bieber still got 100 times more traffic than the vice president and me combined.
But why was this drawing so much attention? This model is-- I'm downplaying a little bit here, but it's not as complicated as the things that are used in other branches of economics or physics or many other fields. We're just basically averaging the polls, counting to 270-- that's how many electoral votes you need to win the electoral college-- and then making some effort to account for the margin of error.
That last step is a little bit more complicated. Measuring uncertainty and becoming comfortable with uncertainty is something that people are not usually very good at. But still, it's a fairly basic model. Our competitor RealClearPolitics, for example, took an even simpler approach and missed Florida. But just taking a simple average of every poll in the last week would have gotten you the right outcome in, I think, 48 or 49 out of the 50 states as well.
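As a rough illustration of that kind of approach, here is a minimal sketch: average each state's polls and translate the average and its margin of error into a win probability. The states, poll numbers, and 3-point error assumption are invented for illustration; this is not FiveThirtyEight's actual model.

```python
# A minimal sketch of a poll-averaging approach: average each state's
# polls, then turn the average and an assumed margin of error into a
# win probability. All numbers below are made-up placeholders.
from statistics import NormalDist, mean

# state -> recent poll margins (candidate A minus candidate B, in points)
polls = {
    "Florida": [0.2, -1.0, 1.5, 0.0],
    "Ohio": [2.0, 3.5, 1.0, 2.5],
    "Virginia": [1.0, 0.5, 2.0],
}
POLL_ERROR = 3.0  # assumed standard error of a state polling average, in points

for state, margins in polls.items():
    avg = mean(margins)
    # Probability candidate A is really ahead, treating the polling
    # average as normally distributed around the true margin.
    p_win = 1 - NormalDist(mu=avg, sigma=POLL_ERROR).cdf(0.0)
    print(f"{state}: average margin {avg:+.1f}, win probability {p_win:.0%}")
```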
But really, in the context of the book I had written last year, The Signal and the Noise, the fact that this was seen as such an accomplishment I think kind of spoke to how many problems we've really had in the field. And I think part of the problem is because our expectations are not necessarily matching the realities.
This is an article from 2008, from Wired magazine, June, 2008, where Chris Anderson said, because of the deluge of data, we no longer need the scientific method. We no longer need theory, in essence. We will just have the kind of truth rain out from the cloud of data instead.
But consider the timing: this article was written in the middle of 2008, when the world's economy was starting to collapse, pretty much. You had a massive housing bubble. You had a financial crisis triggered by the housing bubble and poor ratings given out by the credit rating agencies and a lot of really bad decisions that were data-driven decisions. Finance is a very data-driven field. But if you have a model that has bad assumptions, that doesn't think through the theory of the problem, then it won't just get you nowhere in a hurry-- it can, in effect, nuke the entire global economy in a hurry instead.
And as I talk about in the book, and we'll go through some of these examples today, this is really more the rule than the exception, where there have been failures to predict earthquakes, there have been over and underpredictions in the spread of flu. September 11 might be thought of as a big misprediction, where there were some signals that unfortunately our defense agencies weren't able to detect and prevent the attack in advance. It's not been the best decade, exactly.
And here's another example, of course, looking at economic growth. We had another fairly bad jobs report today. The red line here represents what GDP growth would be if the economy were at its full productive capacity. The blue line is where we actually are.
So the economy is recovering, but we're still way behind. We never made up for the hit we took to productivity five years ago. So a lot of people are still unemployed as a result.
Even in fields like medicine, for example, this is a very geeky paper published by a guy named John P. A. Ioannidis. He used Bayesian statistics to make a prediction, in essence, that most published research findings in medical journals and other prestigious fields are false, meaning they wouldn't stand up to verification or prediction. And in fact, when Bayer Laboratories sought to take findings from the best medical journals, results that were deemed to be statistically significant and true, and tried to recreate those experiments in their own lab, they found that two thirds of the results failed. So you really have kind of a crisis of science, ironically, in this era of big data.
Even Google. In Google we trust. But Google is not immune from these problems, as well, where they have a product called Google Flu Trends, and the utility here is that they look at search patterns for terms like "flu," for example, or "flu vaccination." And because there's a lag in the CDC's reporting of flu, you can have basically instant numbers, so you know if the flu is spreading in your area.
But this year, they rather badly overpredicted the flu instead. People aren't quite sure why yet. It may be because people are changing their search patterns all the time, and maybe even talking about this affects the way people use Google and use it as a diagnostic tool for the flu. But we've had really a lot of missteps in the era of big data instead.
So that's my kind of premise here, is why isn't big data producing big progress? And it doesn't have a simple answer, but there are a few big themes here, I think. First, I want to put this in some historical context, where what happens when we get a lot more information than we're used to historically?
So here's a graph of big data. As Yasamin said, by some measures, 90% of the information in the world has been created in the past two years. Most of that is stupid, like YouTube memes. But nevertheless, we're getting a lot more data than we had before.
But similarly, if you go back about, oh, 600 years ago or so, the printing press was invented in 1440. Before that, there weren't really any books. There were, but they were hand-transcribed. It cost, in today's dollars, about $25,000 to get a copy of a book. So you think textbooks are expensive now, right, but much, much worse in the 15th century instead.
But the printing press reduced the cost of books to about $100 a copy or so. Still kind of expensive-- well, I guess cheap for a textbook, probably, right? But you had a big exponential increase in the number of people who were reading and literacy rates throughout Europe. This technology spread very, very quickly from one place in central Germany in 1450 to all over the continent 50 years later. Even by modern standards, that's a pretty quick spread of a new technological innovation.
So did you have a lot of growth through all this enlightened perspective people now had? Well, eventually you did. But first you had a lot of war. You had the English Civil War, the Thirty Years' War, the Spanish Inquisition, all in the first 100 years or so after the printing press. This is just a small smattering here of more wars. It was arguably the bloodiest century in European history, the 20th century being a leading co-contender for that distinction.
But what happened is that you had the spread of different ideas, which again, in the long run is a good thing. Take Martin Luther's theses, for example-- there were 200,000 copies of them published because of the printing press now. So instead of having some kind of community consensus about values, however right or wrong it might have been, you suddenly had people testifying to different ideas. And unlike in science, where we have this idea that you get more information, you make more progress, instead people just found different ways to disagree with one another.
Which kind of brings me to the present day, where you have these guys instead. This is a rather nerdy graph from a system called DW-NOMINATE that measures polarization in the US Congress. And so you have-- you probably can't read the small print, but you have time on the x-axis, and polarization on the y-axis.
So we set a record last year for the most polarized Congress in American history. It was also the least productive Congress in history, unsurprisingly, as measured by the number of actual laws passed. Nothing really gets done whatsoever.
But if you look at this graph, there's a couple of inflection points kind of around-- first around 1980 and then in the mid '90s. So this is partly coincidental in fact, but what was happening at this time? Well, in 1980, you had CNN debut, so cable news, right? And then '96, you have Fox News and MSNBC come online instead.
And so people were perceiving the news through a different filter than they might have before. It wasn't just the three major networks, which may have been biased, may have been wrong, but at least presented a kind of common front as far as the news. Instead, you have something like this, where right now, on a typical night, Rachel Maddow's audience is only 1% Republican. Sean Hannity's audience, for some reason, is 5% Democrats-- Democrats who watch Sean Hannity. But people are literally perceiving two separate sets of information.
So why was FiveThirtyEight so controversial in the run-up to the election? It was because people are used to having their news cherry-picked for them. If, for example, you run this experiment where I took the three best poll [INAUDIBLE] in the last weeks of the campaign: for Obama, on the one side, of course he's winning every state. Take the three best polls for Romney, on the other hand, and he's actually not winning Pennsylvania and Michigan and some states, but he's winning enough that he would have won the electoral college instead. That's why people couldn't necessarily believe our simple average, because they were so used to getting a kind of cherry-picked result from the polls instead.
For example, if they were reading the Drudge Report. This is the Drudge Report on November 6, 2012-- or the morning, excuse me, of Election Day-- where by this point, Hurricane Sandy had seemed to work to Obama's benefit. It was several weeks after Romney had had his great debate in Denver. But it was pretty clear that Obama-- not a lock, but-- certainly had the momentum and was likely to win.
But instead, every story I've highlighted in red here is a favorable story for Romney. Every one in blue-- there aren't very many-- is a favorable story for Obama instead. So it literally is as biased as the stereotype, where they're presenting only the cherry-picked kind of news that's going to keep partisans and ideologues happy instead. When we have so much news to filter through, so much information to sort through, you're going to get a biased sample a lot of the time, and if you're not careful, you're going to get a sample that's actually quite detached from the consensus evidence of what reality really looks like.
So it's easy to make fun of people in politics. And they are pretty delusional, by the way. I mean, the reason I can look smart in politics is because the bar is very, very low there. You have an industry which is all about manipulating the truth basically. And some people, I think, sometimes have trouble distinguishing reality, much less the signal from the noise.
But even in academia and other disciplines, there are similar problems. And there's one very basic reason why, which is that you have more and more inputs, more and more data sources. So for example, say you have five variables here, and you're running tests for significance-- for a relationship between any two variables.
So with five variables, you have 10 relationships to look at. But as the number of variables increases, you have a combinatorial explosion in the number of two-way relationships. So you double the number of variables to 10, and you more than double-- actually roughly quadruple-- the number of relationships to test, to 45. Or look at economic data, where there are now 61,000 statistics tracked, in fact, in real time by the Federal Reserve. So just running two-way tests, you have 1.86 billion combinations to look at.
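The arithmetic behind those numbers is just "n choose 2"; a quick sketch:

```python
# The number of two-way relationships among n variables is n choose 2,
# which grows far faster than n itself.
from math import comb

for n in (5, 10, 61_000):
    print(f"{n:>6,} variables -> {comb(n, 2):>13,} pairwise relationships")

#      5 variables ->            10 pairwise relationships
#     10 variables ->            45 pairwise relationships
# 61,000 variables -> 1,860,469,500 pairwise relationships (about 1.86 billion)
```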
Is there useful data in there? I mean, probably yes. The amount of armadillo production in Nevada is probably useful to someone. And there are some insights here. I'm not bemoaning it totally.
But what you really have here is some real, legitimate, new relationships that we're discovering, where we're understanding causality, not just correlation. We're understanding relationships, not just something you found in a stat package, right? But that's not increasing as fast as the number of tests that we're running, so you have a worsening signal-to-noise ratio. The gap between what we think we know and what we really know is increasing. And that's really dangerous, because we're human beings, and that tends to lead us to make some stupid decisions sometimes.
Of course, we also have-- excuse me, before I have a Marco Rubio moment here. You know, Marco Rubio has given speakers a lifetime reprieve, basically. Anything awkward now involving a bottle of water, throughout history, you can always just make the Rubio joke.
But we tend to have very kind of monkey brains, still, for the most part, where our intuitions about how to look at data were honed, in part, by evolutionary advantage. And if, for example, you're a caveman trying to perceive whether the rustling in the wind over yonder is, say, a tiger or a lion or just the wind, then you probably have an incentive to have a very active pattern detection system. There's some biological [INAUDIBLE] that distinguishes us from other creatures, which aren't very good at making predictions, for example.
You do literally have chimpanzees up in trees, right, who have a whole alarm system for when they see a dangerous snake, a cobra or something. But they can't make inferences as we can. They can't say, oh, there's a path through the grass, it's about snake-sized and snake-shaped, therefore we can predict we should get out of the way. They wait till they actually see it, and then sound an alarm.
Instead, we humans have very active, as I said, pattern detection capabilities. But it can lead us astray sometimes when we're presented with big, often random streams of data. This, for example, is a set of six stock market charts, two of which are real. They represent, say, five years of the Dow Jones Industrial Average. And four of them are just fake, where I just flipped a virtual coin, ones and zeros, and it has an uptick or a downtick as a result.
A Princeton economist named Burton Malkiel showed charts like this to a technical trader on Wall Street one time. They were all fake, random noise. And the guy was like, go buy this stock immediately. This stock is a dog. Making all these inferences based on, you know, bullshit, in essence.
By the way, if you guessed right, D and F are the real ones here. You should go apply at CNBC instead of completing your education at Cornell.
[LAUGHTER]
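Here is a minimal sketch of how the fake charts described above can be produced: flip a virtual coin each "day" and tick the series up or down. Every number here is invented for illustration; real charts would have to come from actual market data.

```python
# Generate a coin-flip random walk that can pass for a market chart.
import random

def coin_flip_chart(days=1250, start=10_000, tick=50, seed=None):
    rng = random.Random(seed)
    prices = [start]
    for _ in range(days):
        step = tick if rng.random() < 0.5 else -tick  # heads up, tails down
        prices.append(prices[-1] + step)
    return prices

fake_series = coin_flip_chart(seed=42)
print(fake_series[:10])  # pure noise, yet it will show "trends" and "support levels"
```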
There is a kind of more advanced form of this that I think people who are working with big data can be prone to. It's a problem which is technically called overfitting. So say you have a nice kind of parabolic distribution of data. So this could be, for example, the temperature in Ithaca over the course of the year, where it starts out really cold and gets still pretty cold, and then it's kind of warm, and then it gets cold again before too long. Or it could be Alex Rodriguez's career, where he's productive, productive, productive, and then falls off a cliff. But fairly typical distribution.
So, but usually you don't get this robust a sample. You usually have some subset of the data instead. Maybe here are 20 random cases.
Now, having gotten to play God before and knowing what the true relationship looks like, then we can say, well, what's the real structure here? What's the real underlying relationship? We can say it's parabolic, but what people sometimes do instead is try and track down every last outlier.
This is called an overfit model, and it might give you a better R-squared in your stats package. It might seem superficially better when you're trying to explain away every data point. But the model is so rich that it doesn't predict very well at all.
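A sketch of that overfitting problem, under illustrative assumptions: 20 noisy points from an underlying parabola, fit once with a quadratic (the true structure) and once with a high-degree polynomial that chases every outlier. This assumes numpy and made-up numbers; it is not a specific model from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def truth(t):
    return -(t ** 2)  # the real, parabolic relationship

x = np.sort(rng.uniform(-3, 3, size=20))
y = truth(x) + rng.normal(scale=1.0, size=x.size)  # the noisy sample we observe

simple = np.polyfit(x, y, deg=2)    # matches the true structure
overfit = np.polyfit(x, y, deg=15)  # "explains" nearly every point in-sample

grid = np.linspace(-3, 3, 31)       # check both fits away from the sample points
err_simple = np.mean((np.polyval(simple, grid) - truth(grid)) ** 2)
err_overfit = np.mean((np.polyval(overfit, grid) - truth(grid)) ** 2)
print(f"error vs. truth, quadratic fit: {err_simple:.2f}")
print(f"error vs. truth, degree-15 fit: {err_overfit:.2f}")
```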
And here's kind of one real-world example, where people used data that seemed like big data but was actually very thin and caused some problems. So as I'm sure you guys know, there was a very severe earthquake in Japan in March 2011, magnitude 9.0 or 9.1, depending on which source you look at. But this is a chart showing the long-term relationship of earthquakes in this area.
One interesting empirical pattern that earthquakes follow is what's called the Gutenberg-Richter law. And so what that means is that for every degree of magnitude you go up, the earthquakes become 10 times less likely, meaning that for every 100 magnitude 5 earthquakes, you'll have 10 magnitude 6's over the long run or one magnitude 7, et cetera, et cetera. So you can make inferences about the likelihood of larger earthquakes from smaller ones.
But this is data from this part of Japan, just off the Pacific coast. And they had not had very many large earthquakes in this region over this 40-year period where you have high-quality data [INAUDIBLE]. Lots of magnitude 7's, 7.5's, but not a 7.8, not an 8.0.
So they convinced themselves that there were geological limits on how much of an earthquake you could have in this area instead. Instead of kind of fitting this data this way, which says, yes, you can have a 9.0 from time to time, they had an overfit model that looks like this, where it's like, well, our region is special and different. The seafloor in Japan is different. We can't have magnitude 9.0 earthquakes, so instead we'll design the nuclear reactor to withstand an 8.6 only.
But of course, they did have a 9.0 earthquake, and the reactor did fail, because they were somehow using-- well, the reason why, really, is because they were using data that is actually very thin, where there had not been a magnitude 8.0 earthquake in this region. But you're only supposed to have one every 30 years or so, which means you'd have a magnitude 9 once every 300 years or so.
To not have a 30-year event occur in a 40 or 50-year time span is not very surprising at all. It's like a .300 hitter going 0 for 4 on one particular day. But people not realizing how little history they really had led them astray: instead of looking at the broader global pattern, they fit something overly local, overfit the data, and underpredicted the disaster as a result.
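A worked version of the two calculations in that passage, under the stated assumptions (Gutenberg-Richter scaling and a simple Poisson model for when quakes arrive):

```python
from math import exp

rate_m8 = 1 / 30        # assume roughly one magnitude-8 quake every 30 years
rate_m9 = rate_m8 / 10  # Gutenberg-Richter: each extra magnitude unit is ~10x rarer
print(f"expected wait for a magnitude 9: {1 / rate_m9:.0f} years")  # ~300 years

# Chance of seeing zero magnitude-8 quakes in a 40-year record, even if
# they really do arrive at the 1-in-30-year rate.
years = 40
print(f"P(no magnitude 8 in {years} years) = {exp(-rate_m8 * years):.0%}")  # ~26%
```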
So the idea might be, well, if human beings are so biased in their perceptions, why can't we turn things over to the computers instead? I think this is also kind of an artificial solution that has caused some more problems than you might realize.
So this is a position from-- I'm a poker guy, by the way, as Yasamin mentioned. I don't have the patience for chess, really. But if you are a good chess player, you may recognize this position. This is from the first match between Garry Kasparov, the greatest chess player then and probably still in human history, against the supercomputer Deep Blue. So Kasparov here has the white pieces, and Deep Blue, the computer program, IBM's program, has the black pieces instead.
Now, if you're a really good chess player, you might know which player has the advantage in this position right here. I can't tell, but maybe you can. What's interesting, though, is that Deep Blue and Kasparov might take very different approaches to this question.
So computers have heuristics and code that they use to break down a position into discrete elements. So for example, in chess, there are values that are assigned to different pieces, comparing them to pawns. So for example, a queen is worth as much as nine pawns, or a bishop is worth as much as three pawns.
So Deep Blue can add up these results and say, I have 30 points. Kasparov has 29. I'm ahead. I'm probably going to win.
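Here is a sketch of the material-counting heuristic being described, using the conventional pawn-equivalent values. This is only the crudest ingredient of a chess evaluation and is not Deep Blue's actual evaluation function, which weighed many positional factors as well; the piece counts are hypothetical.

```python
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}  # the king isn't scored

def material_score(pieces):
    """pieces: piece letter -> count for one side, e.g. {'P': 6, 'R': 2, 'Q': 1}."""
    return sum(PIECE_VALUES[piece] * count for piece, count in pieces.items())

# Hypothetical piece counts, just to show the kind of comparison described.
white = {"P": 7, "N": 1, "B": 2, "R": 2, "Q": 1}
black = {"P": 6, "N": 2, "B": 2, "R": 2, "Q": 1}
print("white material:", material_score(white), "| black:", material_score(black))
# A pure material count can say "I'm ahead" while missing that the king
# is about to be checkmated -- exactly the blind spot described next.
```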
Kasparov, though, looking at the same position, he has much less computational power, probably 100,000 times less raw computational speed than Deep Blue does. But what he can see that Deep Blue doesn't is relationships, instead. So in chess, if your king is checkmated, you lose the game no matter how much material you have elsewhere.
And Deep Blue had not thought through its position-- programmed through its position very well strategically, where its king has been descended upon by three different pawns. Kasparov's queen and his bishop are also able to move across the board quite freely, so all this pressure is coming down on one area. In effect, Deep Blue's position is very dire, instead. A good [INAUDIBLE] would much rather have played the white pieces, as Kasparov did.
So score one for Kasparov, it would seem. But it turned out that he lost the six-game match with Deep Blue, partly as a result of a move that would happen later in the same day.
So this is game one, again, and Deep Blue's position has now unraveled quite a bit. But computers don't have any pride, right? They don't get tired. So they'll keep being annoying and fighting on, just in the hope that Kasparov will have an aneurysm or something or will make a big blunder, will need a smoke break and will do something crazy, right?
But Deep Blue's position is really in bad shape here, and it now makes a move which Kasparov can't figure out at all. It seems like a random bug, in essence, where it moves its rook from the square labeled D5 to the square labeled D1, which accomplishes nothing whatsoever. It doesn't move it out of harm's way. It doesn't do anything offensive or defensive or anything else.
But so Kasparov keeps thinking about this-- and Deep Blue would actually quit a little bit later on in the match-- and decides that, hey, I'm Garry Kasparov. I can think 15 moves ahead. So if Deep Blue made this move and I can't understand it, that means it must be able to outthink me, right, it can think 20 moves ahead instead.
But I talked to one of IBM's programmers, a guy named Murray Campbell who helped design Deep Blue, and it turns out that what looks like a random bug actually was just a random bug.
[LAUGHTER]
There's a clock in chess. If you use all your time, then you forfeit no matter what. So Deep Blue had a failsafe in its code-- it's good, if you're coding, by the way, to have some failsafes-- where it would make a random legal move if it was about to time out. That's exactly what it did.
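A generic sketch of that kind of failsafe: if the search hasn't produced a move before the clock runs out, play any legal move rather than forfeit on time. This is an illustration of the idea, not IBM's actual Deep Blue code; the function names are invented.

```python
import random
import time

def choose_move(legal_moves, search_best_move, time_budget_s):
    deadline = time.monotonic() + time_budget_s
    best = search_best_move(legal_moves, deadline)  # may return None if cut short
    if best is not None:
        return best
    # Failsafe: any legal move beats losing on the clock -- and to an
    # opponent, it can look like inscrutable genius.
    return random.choice(legal_moves)
```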
But Kasparov misperceived this as a sign of deep genius instead, and played quite badly throughout the rest of the match as a result. The second game, he resigned a position where he probably could have gotten a draw. Played very defensively, and not his normal attacking style, because he was psyched out by a bug.
And this is a common problem: whenever we have a model that gives us a result that looks very promising but defies common sense, we have to wonder whether it's really getting us anywhere or not. So this is a map showing how to get from point A to point B. This is in Manhattan. I live in Chelsea by Madison Square Garden, basically.
I was going to an event at the Guggenheim on-- where is it-- 89th Street and 5th Avenue, roughly, right? So basically point A to point B. It's a grid. Streets are all numbered. Shouldn't be that hard. But here's the route that my taxi driver took instead.
[LAUGHTER]
So ordinarily you would think, oh, this taxi driver is trying to bilk you for a higher fare, right? But this was actually different, where I was looking at his GPS, and he was dutifully following the GPS the whole time. This is exactly as GPS told him to go.
And the reason why is because there's a road here that cuts north to south through Central Park called, I think, East Drive. It's a little bit longer-- kind of winding and twisty rather than as the crow flies. But still, the GPS said, this road has no traffic on it whatsoever, so it's a really expedient route to take to the Guggenheim.
The reason, though, it had no traffic was because it was closed-- it closes on some random weekdays for cleaning and whatever else. So the GPS was, in effect, exploiting a glitch in its data, and as a result it got very confused and routed me back and forth through Central Park three times. What's interesting, though, is that the cabbie was willing to let the GPS override his common sense.
I'm sure you guys all have friends or partners who'll go to a city that they've been to 100 times and pull out their phone, right, and can't walk straight anymore till the map pops up. But it is a problem. I know that some of you guys are in the hotel business, right? And if you have an algorithm that says a location is perfect, for example, but there are no businesses there, or there are businesses that fail, you should be a little bit careful, right? There might be things you're not seeing.
In fact, this apartment I bought in New York, I can't really tell the whole story-- but you know. I thought the apartment was underpriced relative to what it was worth. And then we found out about the neighbors that live nearby, about whom I shall not say any more. But sometimes there are variables that are not captured in your model-- implicit things-- that can cause a lot of problems instead.
So I used to call this last part Solutions, but I think that's a little presumptuous. It's a lifelong process to get rid of some of these issues, so I'll call this Suggestions instead.
This, by the way, is a neon sign that describes Bayes' theorem. All these approaches are vaguely Bayesian. I'm not going to talk about it in too much [INAUDIBLE] detail. But basically, Bayes' theorem is about how to weigh new information and new evidence against what you already know. And it's trickier than it might seem. That's the essence behind it.
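For reference, the theorem itself is P(H | E) = P(E | H) P(H) / P(E). A tiny illustrative update, with made-up numbers, shows the "weighing new evidence against what you already know" idea:

```python
# Bayes' theorem: P(H | E) = P(E | H) * P(H) / P(E).
def bayes_update(prior, p_evidence_if_true, p_evidence_if_false):
    numerator = p_evidence_if_true * prior
    total_evidence = numerator + p_evidence_if_false * (1 - prior)
    return numerator / total_evidence

# Start at 50-50; then see evidence that is twice as likely if the
# hypothesis is true as if it is false. (Numbers are illustrative.)
posterior = bayes_update(prior=0.5, p_evidence_if_true=0.6, p_evidence_if_false=0.3)
print(f"updated belief: {posterior:.0%}")  # 67% -- it moves, but not all the way
```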
So there are three tips that I think come out of this. Number one is think probabilistically. Number two, know where you're coming from. And number three is trial and error.
So think probabilistically first. One more story here. This is a levee, and I know this is kind of a really tacky looking slide, but imagine that you're in North Dakota. It's about this time of year. Every year in March, April, you have a lot of runoff. The snow finally melts up in Manitoba and Saskatchewan and whatever, right?
It flows into the rivers, and so you have the rivers rise and crest every spring. So in 1997, you had an especially cold and snowy winter in the Great Plains and up in Canada, so the weather service was worried about flooding along the [INAUDIBLE] in North Dakota-- in Grand Forks, North Dakota, in fact.
But they said, look, you guys have a levee that goes up to 51 feet, and our prediction is that the flood will go to 49 feet instead. So people, by and large, didn't worry. Very few people bought insurance. People didn't evacuate unless they were close to the river.
But what the weather service forgot to tell people is that the margin of error on their forecast was plus or minus 9 feet. They were afraid, at that point, that if they communicated uncertainty in their forecast, people wouldn't believe it anymore. You know, there's a misconception that experts don't allow for uncertainty.
In fact, the opposite is true. The more someone is willing to admit to self-doubt, to describe their process, the smarter they are likely to be. That's why Dick Morris is very self-confident and kind of an idiot on TV and everything else.
But the actual flood, by the way, crested at 53 feet, so well within the margin of error. And it did flood three quarters of the town of Grand Forks, North Dakota, and caused billions and billions of dollars in damage. By the way, this may have been a preventable disaster-- you could put sandbags on top of levees. So you could have at least mitigated the damage, or at least had the property be better insured and not require a big bailout instead.
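Here is a sketch of what communicating that uncertainty could have looked like: turn "49 feet, plus or minus 9" into a probability of topping the 51-foot levee. Treating the 9-foot margin as roughly one standard deviation of a normal error is an assumption for illustration, not the weather service's actual method.

```python
from statistics import NormalDist

forecast_crest = 49.0  # feet
margin = 9.0           # feet, treated here as one standard deviation
levee_height = 51.0    # feet

p_overtop = 1 - NormalDist(mu=forecast_crest, sigma=margin).cdf(levee_height)
print(f"chance the crest tops the levee: {p_overtop:.0%}")  # roughly 40%
```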
Since then, though, the weather service has really learned the importance of communicating uncertainty to the public. If you see hurricanes-- I know they probably don't come up to Ithaca, exactly. But if you see forecasts for hurricanes on the East Coast, you'll have what's called a cone of uncertainty, or a cone of chaos. And that's been getting smaller every year, because the weather forecasts are getting more and more accurate.
So it used to be, about 30 years ago, that if you had a hurricane sitting in the Gulf of Mexico, they could only predict where it would go within 350 miles on average, three days in advance. So the entire-- you'd have to evacuate the entire South, basically, not a very useful thing to do.
But now, they're within about 100 miles instead. It's much more plausible to prepare a given region. And some of the big storms that did hit New York and the Northeast recently, like Irene and Sandy, were predicted especially well. I mean, those storms, you know, the exact county of landfall in New Jersey was predicted correctly four or five days in advance.
You can't stop nature, but you can evacuate people and minimize loss of life a lot. And the fact that the weather service has made so much progress, and the fact that they understand uncertainty and literally the chaos of the weather-- I think these two things are tied together: being unafraid of ambiguity and uncertainty. This is a lot of what causes political partisanship, by the way-- people want attitudes that remove all doubt, so they don't have to think hard and answer political questions, right? If you just stick to the party line, you don't have to do very much mental work. And it might be a good shortcut, but it can also lead people astray when there are big problems that we face as a society, or even when they're just trying to forecast an election, potentially, instead.
But that gets into point two, which is what I call know where you're coming from. Another way to put it, though, is just know what your biases are, as hard as that might be to do. So the metaphor here, or the map, is from Pearl Harbor, where we, in 1942, had naval stations-- 1941, excuse me-- in Dutch Harbor, Alaska, the Midway Islands, Wake Island, and Guam, but we didn't have GPS satellites. Those stations had only a very limited range, about 200 miles across, that they could patrol.
So say you're Japan, and you want to move a fleet of six gigantic aircraft carriers through the Pacific. Well, where would you go? Well, how about where we have no naval stations at all?
They took a path-- this is kind of like the Dallas Cowboys' defense or something, right? But this giant fleet takes two weeks to move here. And we have no way to detect it at all, because there were holes in our defense. You had an asymmetric situation. Any time in national security where you have a weak point, you have a squeaky side door, then you're going to be attacked there and not where you have the armed guards instead.
But of course, it's hard to know what your biases are. And that's why, as I'll talk about in a moment, testing yourself is really important. But so is being aware and alert to the fact that you have biases in the first place.
One anecdote from-- one study, rather, mentioned in Sheryl Sandberg's new book, Lean In, talks about what happens when CEOs or executives or HR managers are presented with resumes that have the same career record, the same jobs, but one has a female name attached and one has a male name. And often, the female is discriminated against.
But the study also found that the people who do the most discrimination are the ones who say they have no gender bias at all. The ones who are alert to the fact that they might perceive things differently based on a name and an implied gender tend to be more alert to self-correction, where people who think they're totally immune to all of it actually had much bigger biases instead.
The other bias to watch for, though-- the other thing to remember-- is that what we really want is accuracy, which is just another way to say truth. People, I think, sometimes have a bias toward having a very precise answer, because it removes the appearance of uncertainty without removing the actual fact of uncertainty instead.
You see this in polling-- for example, the Zogby interactive polls. I don't want to demean all online polls. Some have gotten pretty good by now.
But the Zogby polls are not. They're pretty much total crap. But they take giant sample sizes. They'll sample 20,000 or 30,000 people, and they think that makes them reliable. If you take a biased sample, though, of 20,000 people who are the types who are bored enough to answer a Zogby online poll, then you're going to have an inaccurate result. You can survey everyone in the world who's bored enough to answer a Zogby online poll. It still won't be representative. It's precise, maybe, but not accurate at all.
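A sketch of that precision-versus-accuracy point: a huge but biased sample gives a tight (precise) estimate of the wrong number, while a much smaller unbiased sample is noisier but centered on the truth. The population and the 6-point bias below are invented for illustration.

```python
import random

random.seed(1)
TRUE_SUPPORT = 0.50  # true share of the population supporting a candidate

def survey(n, bias=0.0):
    """Simulate n respondents; bias shifts who ends up in the sample."""
    p = TRUE_SUPPORT + bias
    return sum(random.random() < p for _ in range(n)) / n

print("biased poll of 30,000:", round(survey(30_000, bias=0.06), 3))  # precise, but ~0.56
print("unbiased poll of 800: ", round(survey(800), 3))                # noisy, but near 0.50
```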
But the third suggestion here, and kind of moving toward a more optimistic take, is that this is really a trial and error process. In most fields where you have a data analysis problem, you encounter diminishing returns: if you put in just a little effort to get better at quantifying and measuring things, you'll make a lot of progress in the near term, and then you kind of hit speed bumps down the road as you add more variables, more complexity, to your model instead. So if you're in an organization or are working on a problem, think about where you stand on this kind of learning curve. Sometimes just measuring things can be really quite valuable. And the converse is that the failure to measure things can be really quite damaging.
So for example, I spoke recently with a group of state school superintendents who want to use more analytics to measure educational performance. The problem there, though, is that we don't even have any agreement on how to measure student progress, really. There are great disputes about standardized tests and whether they really are measuring learning in a good way or not.
There's no tracking of students. If you have a student who goes from one part of New York state and moves from here to Syracuse or something, you actually lose track of that student instead. So in many fields, just measuring things, getting things down on paper, having high-quality data, can get you a lot of the way there.
In other fields that are more competitive, though, where people are getting the basics right, the water level rises, you have competition, and your advantage comes from the margins. That's why you can't just say, I'll do the easy 20% part and then profit.
You know, we are in a market economy, and so everyone else can do that as well. You have to kind of work hard for your edges, but you don't usually expect miracles. Instead, it's hard work. Big data is not going to absolve us of having to be smart and to work hard and to compete rigorously in the economy.
But there is happiness in the end, potentially. If you apply the Bayesian method-- meaning that you have a careful way of weighing new evidence-- then you eventually converge toward the truth. These aren't a sufficient set of conditions, but they are necessary: first of all, you have to be willing to test your ideas by making actual verifiable predictions.
Some academic departments I know, not here, tend to resist this idea that, oh, we should make forecasts and make predictions, right? We're just theoretical scientists instead. But OK, theory is great. I'm not a guy who says you should abandon theory at all. But theory is useful because it can help you make better conclusions in the long run and test your ideas out. So prediction is a very intrinsic and important part of science.
And number two is be willing to revise your beliefs. People tend to be very stubborn, for the most part. There are exceptions, again-- in the political press, every new poll is a game-changer, right? But most of us tend to be too slow to revise our beliefs when the evidence contradicts the tradition or the status quo instead.
So that's why Bayes' theorem can be useful in giving us some guidance toward moving toward the truth in the long run, because there is progress in the end. I told you the unhappy story about having all these wars, but eventually, in part because of the spread of knowledge-- having books means you can actually spread and share ideas and record technology and pass it down from generation to generation-- and between that and a lot of other things going on, like having kind of a commercial economy, you eventually had economic growth when you had had none before. You had this exponential growth pattern where, even with the occasional global financial meltdown, we've had progress when we didn't have it before.
So I know this presentation might seem like it's a bit pessimistic at times. What I would say, though, is that through the broader course of human history, until about 1775, there wasn't really any economic growth at all, just enough to sustain a very slow increase in the population. So maintaining progress is hard. We're always going to have new technological developments that help us to do that. But we also need smart people who know how to marshal it and to turn big data and fast computers into actually producing benefits for society, for industry, and for our personal lives. So thank you very much.
[APPLAUSE]
And so I think we have a fair bit of time for questions here. So there should be mics coming around the room. I'm happy to talk about politics, sports, whatever you guys are curious about. Yeah? Oh, should I-- yeah.
AUDIENCE: So we're told the modern era of using computers in elections began in 1952, when Walter Cronkite and CBS contracted with UNIVAC, which on the basis of roughly the first 3.5 million votes, concluded 100 to one odds for Eisenhower, a result which seemed so unlikely in what was thought to be a close election that they didn't even report it at first. But on the basis of this incredible success, of course, as we know, in all subsequent elections this became the methodology.
And my question for you is, now that your methodology has become so successful-- where, as you accurately describe it, it's not rocket science, it's relatively obvious, but it nonetheless takes a lot of talent to do it correctly and to properly hedge the results-- I'm wondering if your expectation is now that you'll disappear in the noise, because in the next election, everybody will be employing this methodology.
NATE SILVER: No, look. I'm sure there will be a lot of versions of the FiveThirtyEight model in 2016. I mean, there were a few different versions, frankly, last year that were also pretty good. You had a guy at Emory University who had a model that also got all 50 states right, and in fact, had all 50 states right in June. You know, you had other people who had 48 or 49.
I mean, any model based on polls is kind of hoping the polls have a good year, number one. You can quantify the uncertainty. But you know you're not going to have that much credibility, wrongly or rightly, if you say, well, you know, these polls were wrong, therefore I was wrong too.
But no, there's going to be more competition, I think. And some of it will be bad. But [INAUDIBLE] that look, a lot of people who [INAUDIBLE] politics for a living are kind of hardcore partisans, and do you want kind of the Drudge Report style of filter, where they're getting only good news, basically, or only news that they're going to be interested in?
But there is a market for people who come from academic backgrounds or tech backgrounds or economics backgrounds who want more wonky politics coverage. And so it's kind of a market response to that, I think. And so yeah, I expect there to be a lot more FiveThirtyEight competitors four years hence.
I also think, though, that-- and this is something to remember if you guys are in a kind of data-driven discipline-- the fact that at FiveThirtyEight we take a lot of care with how we write about politics and how we present the information in terms of data and charts and everything else, that's really important, too. That's a bigger differentiator for us from, say, the other academics who have really good models than maybe the modeling itself is.
So even if you're a science person, it's really important to work on your writing skills. Don't write like an academic, even though you'll be encouraged to do so at different times. But that communication function is quite key.
AUDIENCE: Your results for the election actually seemed too accurate, given that they were-- I really appreciate the fact that they were probabilities, and leading into the election, a lot of the time it was sort of three to one Obama. And if you were an Obama fan, you're thinking, wow, one in four, that's not that unlikely. But to get every single state, I presume after the fact, you look at how your model did and did some sort of analysis. And was it really sort of probabilistically sound after the fact?
NATE SILVER: So there are a couple of issues here, which are a tiny bit technical. But one issue is that we assume that the error in different states is correlated, right? So some elections, you will have a very good result, where you get zero or one or two states wrong, and then some, you'll have an election where it's a total disaster and you get 10 states wrong instead.
And so because of that pattern, that's why we would expect a [INAUDIBLE] basically to have a disastrous election every once in a while. Frankly, with response rates to polling going down and down every year, it's a miracle that we haven't had one where the polls have been way off recently. But that's part of what that's trying to measure. Even so, yeah, getting all 50 right, in my view, was lucky. The model would have given itself about a 20% or 25% chance of doing that. So yeah, that was quite fortunate.
But even there, by the way, there were some states where we were several points off-- they all just happened to miss in the direction where Obama did better than the polls predicted, but it's still a mistake. So in Colorado, he won by five and a half instead of one and a half points. You call the winner right, but it's still a four-point miss. Had it been in the other direction, then Romney would have won Colorado instead. Same with Florida, et cetera, et cetera, and you would have missed several states instead.
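A sketch of what "correlated error" means in a simulation like the one being described: each run draws one shared national polling error, so when the polls miss, they tend to miss in many states at once. The states, margins, and error sizes are illustrative, not the actual FiveThirtyEight model.

```python
import random

state_margins = {"FL": 0.5, "OH": 2.5, "VA": 1.5, "CO": 1.5, "PA": 4.0}  # points

def simulate(n_sims=10_000, national_sd=2.0, state_sd=2.0):
    miss_counts = []
    for _ in range(n_sims):
        national_error = random.gauss(0, national_sd)  # shared across every state
        misses = 0
        for margin in state_margins.values():
            outcome = margin + national_error + random.gauss(0, state_sd)
            if (outcome > 0) != (margin > 0):  # called the wrong winner
                misses += 1
        miss_counts.append(misses)
    return miss_counts

results = simulate()
print("runs with zero misses:", sum(m == 0 for m in results) / len(results))
print("runs with 3+ misses:  ", sum(m >= 3 for m in results) / len(results))
```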
AUDIENCE: [INAUDIBLE]
NATE SILVER: Well, I don't know that intuition is all that well understood, in terms of it can mean a lot of different things. And the way I tend to think of it is kind of, what's a good-- intuition boils down to kind of shortcuts that we take when we don't have time or the inclination to make a more thorough analysis.
In my days playing poker, one thing poker players are pretty good at is sizing up a fairly complicated problem and coming to a fairly good first-cut estimate of how likely they are to be ahead or behind in the hand. Given the way the hand's played, the cards on the board, I'd say, well, I think there's a 70% chance I'm ahead, so I'm going to be pretty aggressive about getting my money. But I'll change my mind if he makes a big bet, goes all in on top of me.
I do think people sometimes, though, fall in love with their intuitions. They forget that, yes, we have a lot of things to do in everyday life. We can't apply analytics to figure out how to get to the grocery store faster. It's just not worth the effort, necessarily.
But sometimes people fall in love with the shortcuts because they're neater and cleaner than the actual more complicated solution. And again, that's part of what happens with political partisanship, is you kind of have a heuristic saying, oh, taxes are bad, or you're just saying, oh, we want more economic equality. And that might be a good principle, but then people can apply them so dogmatically, sometimes, that they kind of override rational policy-making in some cases.
So intuition is great. And work to develop your intuition. But remember, it is an approximation. And so is a model. A model is an approximation as well, maybe a second or a third approximation.
But remember that we are kind of humble creatures relative to this whole gigantic world, and our subjective point of view necessarily comes with biases and hang-ups and perverse incentives of different kinds. Being aware of that, and not falling in love with your model of the world-- whether it comes from your intuition or even from a stat package-- I think is pretty important. Oh.
AUDIENCE: [INAUDIBLE]
NATE SILVER: So I get this question a lot. And people sometimes say, well, if I read FiveThirtyEight and I know that Obama, or I guess Rubio or whoever in 2016, Hillary is going to win, then why bother voting? And that's a very, very dangerous precedent, I think.
One story I know from New York, for instance, is we had a mayoral race there four years ago, 2009, where you had Bloomberg, who was pretty popular, but Bloomberg kind of changed the law to give himself a third term, basically. And so Bloomberg was, like, 15 points ahead in the polls, so I had a lot of friends who said, you know, I was going to vote for Bloomberg. He's done an OK job. But you know, he's going to win, and this whole third term thing is kind of sketchy, so I cast a protest vote for Bill Thompson instead. And Thompson almost won, came within three points, because people reacted to the polls and changed their behavior as a result.
Conversely, in the Iowa caucuses last year, you had Rick Santorum who suddenly surged in a CNN poll, with a 300-person sample, really high margin of error, probably just a random fluke. But it triggered lots of news coverage, right, it gets really slow in Iowa in December. A CNN poll showing movement in the race is pretty exciting, right? So it became a self-fulfilling prophecy instead.
So yeah, people can react to polls, can react to the news more broadly. I'm not sure that FiveThirtyEight is necessarily any different from that. But yeah, I mean, because people sometimes perceive, oh, you have Romney with an 80% chance-- or, excuse me, Obama with an 80% chance of winning, well, that means that Romney has a 20% chance of winning as well. It's a pretty large probability in context.
If you're a Democrat, that's a really catastrophic thing, for Romney to win. If I said, oh, there's a 20% chance we'll be hit by a meteor tomorrow and the Earth will blow up, you'd take that pretty seriously, probably, right? So people should remember that the forecast is contingent not only on people not reacting to what the model is saying, but also on them actually fulfilling their duty, or their interest, in voting.
AUDIENCE: Mr. Silver, I was actually just going to ask the question the gentleman before me asked, but you know, polls, how we can change, maybe, how a person might vote, supposedly, because Obama was showing to be winning by a large percentage, a voter might think that, you know, maybe I shouldn't vote for Romney, I should vote for Obama. So aren't polls, in a way, biased themselves? Aren't they supporting a candidate by showing that this person has a higher chance of winning?
NATE SILVER: Well, you know, in some countries-- I think France, for example, maybe Italy as well-- there's a blackout period where you can't do polling in the 72 hours before people vote. And I think the concern is exactly that.
Although keep in mind also that polls can be a way to detect electoral fraud, potentially. I mean, be very careful about this. I don't mean in a US context where you're a couple points off. But if you have a poll that's grossly off in result, then you can-- it could be one sign that something untoward has occurred, potentially.
But there's kind of this question of, oh, what's the utility of polling in general? I think the 2012 election kind of demonstrates this, in part. I don't mean oh, it's cool to predict the election in advance. But the point is that, instead of having the media say, oh, Romney is surging in Pennsylvania because I saw a lot of Romney yard signs driving around Harrisburg, right, you actually take a random sample, in theory, of actual Pennsylvanians and see what they have to say instead.
And the reason why a lot of news organizations like the New York Times still pay a lot for polling-- it's getting really expensive, or it is expensive to do really good polling-- is because you can represent the man or woman on the street and not have a reporter or a pundit make false inferences instead. So I think polling still serves some useful functions, even though it can get sticky in terms of how people, how voters react to polls in a tactical way sometimes.
AUDIENCE: What is your opinion on prediction markets?
NATE SILVER: So my opinion on prediction markets is kind of like my opinion on markets in general, which is that they have lots of flaws, but probably every other system is worse, right? So we're stuck with them, I think.
I mean, the one good thing that prediction markets can do is that they give people good incentives, where money is not a perfect incentive. It's a transparent incentive where people are putting bets on the line in order to be right. A lot of the heroes in my book are gamblers, basically, because gamblers don't have to deal with any BS, right? You put a bet down. If you're right, then you win money, and if you're wrong, then you lose money. There's a lot of luck in the short term, but in the long term, it evens out.
With that said, you know, I mean, there have been different studies about whether the markets are better than, say, the FiveThirtyEight method. In theory, they should be, because they can take that information and then layer on top of that other models that are also good, or whatever extraordinary circumstance you might have. In fact, it looks like the [INAUDIBLE] are a little better than the markets instead.
I think one reason why is that you don't really have professional traders in these markets, where you have-- you know, the most you can win on [INAUDIBLE] trade is several thousands of dollars or tens of thousands. But you know, for example, oil stocks on November 7 went down by, I think, $15 billion, because Obama winning another term was seen as bad for oil. It was seen as good for health care device manufacturers. Those gained $1 billion in market cap.
So people are making bets on the election and making them in the stock market. The person you get on [INAUDIBLE] traders thing is probably not as sophisticated. But still, I would rather have that as a reality check against the pundits and so forth. People are willing to put their money where their mouth is. I think that's a good thing.
AUDIENCE: So one question on your thoughts on our ability to process this data, where, I mean, as you said, we're generating more and more data. Like, in my field, biology, we're generating data at a pace that outpaces Moore's law by a factor of 10. So, like, the computational power simply cannot keep up with the rate that we are generating data.
And do you have any insight or opinion on how to actually process this? I mean, leaving aside making smart models, it's getting harder and harder to make stupid models about this data. So do you have any ideas on that?
NATE SILVER: Well, and so this is kind of-- I can't speak to the hardware issues. I don't know very much about that. But this is one reason why, for example, this idea that we're going to have this kind of singularity occurring seems quite dubious to me, because you always have a limiting factor, for example, where we have cell phones that are getting more and more powerful every day, but the battery lasts for about 45 minutes now, it seems. Maybe you can last through a subway ride playing some Angry Birds or something, and then you can't make a phone call the rest of the day.
So there are always limiting factors here. And if you kind of remember that chart from before, one reason why the printing press did not really produce progress was because you had other limiting factors in place, like you didn't really have a market economy in place that rewarded competition and gave people incentives to make products and to build technologies. And humans are the ultimate limiting factor, really, I think. There aren't going to be that many cases where we have this big data set and we know what we want to do with it but the computers aren't fast enough. I think we're more likely going to have the reverse occur, where you have plenty of technology, essentially infinite data, infinite technology, but we are limited in our capacity to make good use of it.
AUDIENCE: Hi. With younger generations polling very differently from older generations, particularly on issues such as gay marriage and marijuana legalization, do you see-- are you able to, I guess I'd say, like, predict the future, in terms of a lot of policy changes and things like that, just account for that? I mean, I know there are lots of changing opinions for a whole number of different reasons. But could you just talk about that, like, perhaps anything that you think is inevitable or almost inevitable in the future?
NATE SILVER: Well, I mean, gay marriage continually gaining ground is a pretty safe bet, because of this generational aspect, where if you replace-- "replace" is the polite term, right? But when an 80-year-old passes on and is replaced [INAUDIBLE] by an 18-year-old who just registers to vote, the 80-year-old is about 25% likely to support gay marriage rights. This 18-year-old is about 70%, 75% likely to do so, so you have a very organic process. And the process of people actually coming out to their coworkers and their families and their friends also produces shifts of opinion there as well.
But apart from that, I think people sometimes have to be careful about assuming that a trend that is moving in one direction will continue to do so. So there's a lot of momentum now behind, for example, marijuana legalization or decriminalization. But there also was a lot of it in the 1970s, and then the kind of crime wave you had in the '80s and the Just Say No campaign really reversed that very sharply instead.
So I think in general, you know, if there's not a generational component-- and there is some to marijuana use, right-- but it's hard to assume things are going to change. Abortion, for example, is another case where it hasn't fluctuated for many years. It's been the same number of people across different generations, and it's been 50-50 for 40 years now, ever since Roe v. Wade. So every issue is a little bit different. And gay marriage is unique in that it's the one that seems like it's really predictable instead.
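A toy cohort-replacement sketch, just to make the arithmetic above concrete: it assumes the departing oldest voters support the issue at about 25% and the incoming 18-year-olds at about 72.5%, roughly the figures Silver cites, while the starting level of support and the yearly turnover rate are invented for illustration.

```python
# Toy cohort-replacement projection (illustrative only, not Silver's model).
# old_rate and young_rate follow the rough figures in the talk; start and churn are invented.

def project_support(years=20, start=0.50, old_rate=0.25, young_rate=0.725, churn=0.015):
    """Each year, a small slice of the electorate holding the oldest cohort's view
    is replaced by new voters who enter at the youngest cohort's rate."""
    support = start
    path = [round(support, 3)]
    for _ in range(years):
        support += churn * (young_rate - old_rate)
        path.append(round(support, 3))
    return path

if __name__ == "__main__":
    # Aggregate support drifts up roughly 0.7 points a year, with no individual changing their mind.
    print(project_support())
```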
AUDIENCE: Over here. Could you talk a little bit about outliers and why you stopped playing poker and why Warren Buffett is an outlier and why there are baseball players and poker players who seem to be consistent outliers when most great performers can't achieve that level consistently forever?
NATE SILVER: Look, in most things, you have a kind of bell curve distribution of talent, and so you have a few people who are really talented. You also have a lot of people who are really lucky, though, right? Of all the people who have gotten rich from Wall Street, you have a population of, say, five Warren Buffetts who are true geniuses. You also have a population of thousands of average to good to very good but not Warren Buffett-level traders. Some of them are going to make lucky bets over the course of a career. A few of them are actually going to cheat as well, right?
But of all highly successful traders, probably one's a real genius, 90 are lucky, and nine are cheaters or something, right? It's some ratio like that instead. So it doesn't mean there aren't people who can beat the markets. There probably are.
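A quick Monte Carlo sketch of the luck point: if every trader were identical and simply flipped a coin against the market each year, a population of this size would still produce a handful of decade-long winning streaks. The population size, horizon, and 50/50 odds are all invented for illustration, not drawn from the talk.

```python
# Toy simulation: how many purely lucky "geniuses" does chance alone produce?
# Every trader here is identical, with a 50/50 shot of beating the market each year.
import random

def lucky_geniuses(n_traders=10_000, n_years=10, seed=0):
    """Count traders who beat the market every single year purely by luck."""
    rng = random.Random(seed)
    streaks = 0
    for _ in range(n_traders):
        if all(rng.random() < 0.5 for _ in range(n_years)):
            streaks += 1
    return streaks

if __name__ == "__main__":
    # Expected value is n_traders * 0.5**n_years, about 10 out of 10,000 here.
    print(lucky_geniuses())
```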
If you look at poker, and we can do some online data mining, there really are some players who, over the long run, really are clearly better than their peers. But it really is the 1% or 2%, whereas you have a big middle class of people who are kind of treading water and are very vulnerable to environmental conditions. So every stock trader looked like a genius during the dot-com boom, and then an idiot during the dot-com bust, just by kind of getting the group mean instead. Yeah?
AUDIENCE: [INAUDIBLE]
NATE SILVER: No, so, yeah. So we do weight different polls differently. And we make kind of a strong prior assumption that-- well, there are two different things we do.
Number one, we weight polls differently, and number two, we debias the polls. So for example, if a pollster, call them, like, Rasmussen Reports, right, always has a poll that is two points more GOP-leaning than the median poll or the average poll, then we just strip that two points of bias right out instead. It's pretty predictable a lot of the time.
But we weight polls that do call cell phones a lot more heavily. Like, there's no excuse now not to call cell phones along with land lines. You're missing about a third of the American population outright.
And a lot of people, like me, I think I technically have a land line, which came with my cable package, but I never pick it up, right? Only a pollster, or a telemarketer, or maybe my mom, right, but she has email now, would dare call the land line at home, right? So you never answer it, right? And that causes big problems as well.
But yeah, I think we really did see this year that some of the cheapo kind of robo-pollsters did quite badly. There's also evidence, by the way, that they cheat off the good polls, that if the good polls-- if, for example, if you have a case in the primaries where you only had robo-polls, and they did quite badly, but when you have a good pollster weigh in as well, then they all calibrate to that and herd toward it as a result. And so I think the really high-quality stuff is still important.
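A minimal sketch of the two adjustments described above, debiasing and weighting. This is not the actual FiveThirtyEight model; the pollster names, margins, and quality weights are invented for illustration.

```python
# Sketch: strip out each pollster's average "house effect" relative to the field,
# then take a weighted average that favors polls which call cell phones.
# Not the FiveThirtyEight model; all numbers below are hypothetical.
from statistics import mean

polls = [
    {"pollster": "Pollster A", "margin": 3.0, "calls_cells": True},
    {"pollster": "Pollster B", "margin": 1.0, "calls_cells": False},
    {"pollster": "Pollster A", "margin": 4.0, "calls_cells": True},
    {"pollster": "Pollster C", "margin": 2.0, "calls_cells": True},
]

def house_effects(polls):
    """Each pollster's average deviation from the overall average margin."""
    overall = mean(p["margin"] for p in polls)
    return {
        name: mean(p["margin"] for p in polls if p["pollster"] == name) - overall
        for name in {p["pollster"] for p in polls}
    }

def adjusted_average(polls):
    """Debias each poll by its house effect, then favor cell-phone-inclusive polls."""
    effects = house_effects(polls)
    weighted_sum, total_weight = 0.0, 0.0
    for p in polls:
        debiased = p["margin"] - effects[p["pollster"]]
        weight = 1.0 if p["calls_cells"] else 0.5  # hypothetical quality weights
        weighted_sum += weight * debiased
        total_weight += weight
    return weighted_sum / total_weight

if __name__ == "__main__":
    print(round(adjusted_average(polls), 2))
```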
AUDIENCE: [INAUDIBLE] this is a change of habit, what do you think about cyber-terrorism [INAUDIBLE]
NATE SILVER: So cyber-terrorism, it worries me a bit, in part because it's something that hasn't happened before, and I think we're bad at dealing with threats that might be known in theory, but which have not been actuated yet instead. The fact that so many systems in our financial system are now automated, for instance, where the-- I thought the flash crash which happened, I think, in 2010, where the stock market went down by 700 points, the Dow, and up by the same amount all within 15 minutes, like, that was really quite scary, I think. In some ways, that's scarier than the 1987 crash, almost, but didn't get nearly as much attention as well.
So yeah, I mean, can you have people manipulate information? If we're kind of in an information-based world, can you cause cascades of misinformation? And as a result, then, yeah, maybe, right? And that can be kind of the form of cyber-terrorism, apart from just kind of finding some EM wave that disables our computers. It can be more like, are there misinformation campaigns that occur that can cause bad results based on people making decisions very, very quickly?
AUDIENCE: You've talked a lot about [INAUDIBLE]
NATE SILVER: Well, what do you mean by trends?
AUDIENCE: [INAUDIBLE] The type of cars that people will be driving in 10 years.
NATE SILVER: So I think trend analysis is usually one of two things. Either it's BS, or some people who are taste makers get credit for being predictors instead.
Right, so say, for example, you have a consortium of fashion designers who say, well, we think brown is going to be the in color this season and we're going to manufacture a lot of dresses in brown, and throw big marketing dollars behind earth tones, and have our best models wear these on the runway. Well, you can be sure that brown clothes are going to sell really well. But it's because they're marketed, not because they correctly predicted how public taste will evolve.
But yeah, I'm suspicious of people who predict trends. I think usually what that means is they're describing something that's happened in the recent past, and not really making a prediction about what will happen in the future, necessarily. But I don't know. Frankly, just to determine what's going to happen in the near term, in the here and now, is difficult enough sometimes. So what's going to happen 10 years down the road, maybe I'm not as worried about.
AUDIENCE: Excuse me? Thank you. What are your thoughts on people who don't quite understand metrics like [INAUDIBLE] replacement and VORP and win shares being used by baseball announcers and general managers like the Seattle Mariners, who don't quite understand how to use them and are kind of misinterpreting them and screwing up their teams, instead of just sticking to the role of observation that kind of propelled the pre-Moneyball Oakland Athletics?
NATE SILVER: Well, I mean, look. With anything, [INAUDIBLE] parents using Twitter or something, right? Technology in the hands of the uninitiated can be quite dangerous sometimes.
And you had teams, you know, so the Toronto Blue Jays, for example, kind of in the Moneyball era, right, they would take it too far and draft these mediocre college players who no one else wanted who were very low-variance players. So you were guaranteed that they'd be mediocre major league benchwarmers. Most players don't even make the major leagues, so it's good, in a sense, right, but it also means you're never really going to build a championship team. You have a lot of 4A players, so to speak.
But, I mean, the analytic stuff in baseball now is really quite cutthroat. I went to the Sloan Sports Analytics Conference this year in Boston, where kind of like the stat geeks are like stars now. It's very, very strange.
But almost every team now, teams you wouldn't think of-- the Cleveland Browns are a very stat-driven team now, for example. They just hired a whole new front office to try and do new things for that franchise. And so it's kind of at the point now where if you're not doing analytics in baseball, you're guaranteed to do pretty badly. But also some teams that are doing analytics wrong are going to struggle as well.
AUDIENCE: I was wondering if you could talk a little bit about if you think there's a difference in the theoretical maximum of how accurate you can be when you're predicting something as complex as human performance, like in baseball, versus predicting something like a vote, which is more like a binary action. Someone's going to go one way or the other. Do you think that we reach a point where we can't get any more accurate when it comes to baseball, or does that sort of always push forward?
NATE SILVER: Oh, sure. No, absolutely, right? And there are two things people sometimes say, well, you know, is baseball predictable or not? Well, in one sense, it's predictable, in the sense that we can quantify and measure what happens on the baseball field really, really well. It's probably the world's best data set. In the other sense, though, even the best teams only win two thirds of their games. Even the best hitters only get on base four out of 10 times.
So there are kind of two questions. One is, in a cosmic sense, how predictable is this phenomenon or not, and then how close are we to maximizing the space that's available to us as human beings? In baseball, it's not that predictable on a cosmic scale, but we're doing a very good job of predicting what we can instead. And so we're kind of brushing up, I think, against the limits in a lot of respects there.
AUDIENCE: In baseball, do you think we're going to reach a point where we have defensive metrics that are as useful for predictive purposes as offensive metrics at the moment?
NATE SILVER: Yeah, I mean, I think I'm not quite as up on the state of the art there, but I think we're certainly getting there. If you're now at the point where you basically have a three-dimensional recording of whatever happens on the field, I mean, there's going to be some creativity required in how you manipulate that data. But it seems like that's really what you want in the long term.
There are issues, I think, related to-- people forget that just as offensive data can be noisy, so will defensive data, especially from, like, say, the outfield positions where a guy might get a couple of put-outs per game. It's not a terribly large sample size, necessarily.
So people sometimes distrust defensive stats, because they'll have a guy who's not perceived to be a very good defender who had a really good rating one year. Well, Brady Anderson hit 50 home runs one year, right? He was not a great baseball player, right? But you have flukes occur from time to time.
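A small simulation of the small-sample point: a fielder who is exactly average (here given a hypothetical 90% conversion rate on his chances) will post an elite-looking season a meaningful fraction of the time just from noise. The chance counts and thresholds are invented for illustration.

```python
# Toy illustration of small-sample flukes in defensive ratings.
# true_skill, chances, and elite_threshold are hypothetical numbers.
import random

def fluke_rate(true_skill=0.90, chances=250, elite_threshold=0.93, seasons=10_000, seed=1):
    """Fraction of simulated seasons in which an exactly-average fielder looks elite."""
    rng = random.Random(seed)
    flukes = 0
    for _ in range(seasons):
        made = sum(1 for _ in range(chances) if rng.random() < true_skill)
        if made / chances >= elite_threshold:
            flukes += 1
    return flukes / seasons

if __name__ == "__main__":
    # Roughly 5% of seasons come out looking elite purely by chance under these assumptions.
    print(fluke_rate())
```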
So I think the defensive stats have undoubtedly improved a lot. Like I said, I don't know enough about the details of them to recommend one system. But I think we're already pretty much there, and I think some of the smarter teams have begun to understand the value of defense and really incorporate it into their player evaluations. OK.
AUDIENCE: So thanks, Nate. It was a great talk. You mentioned earlier the importance of communication, and I was wondering if you could talk a little bit about sort of the difference in perception when reporting things like election outcomes as percentages. You know, Obama has an 80% chance of winning, versus talking about things like spreads. You know, a 51 to 49 spread-- it might come out, statistically, as the same thing, but it communicates something very different about the numbers and the shape of what's going on. Can you talk about--
NATE SILVER: Yeah, it's really hard, right, in part because maybe we need a different vocabulary to distinguish kind of percentages in polls from the percent probability. Maybe we just have to invent different words that we start to use, because you would have people say, if we have Obama with a 70% chance of winning, Romney 30%, people would say, oh, Nate's predicting Obama will win 70 to 30. It's like, no. That's not really it at all.
I tried to use sports analogies a lot, where for example, if you're watching the NCAA tournament, and you have a team that has possession of the ball up by three points with 30 seconds to go, right, that's a really close game. But that team apparently is going to win, 90% or 95% of the time, right?
So what I was trying to say is, 2012 was a pretty close election. It wasn't a landslide. But you also had a lot of certainty toward the end, where Obama's lead was pretty robust in the electoral college. You had a lot of polling. So you were 90% sure what was going to happen instead.
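A hedged sketch of the distinction being drawn here: under a simple normal model for the final margin, a modest lead combined with modest uncertainty already implies a win probability around 90%. The two-point lead and 1.5-point standard deviation are illustrative numbers, not FiveThirtyEight's actual inputs.

```python
# Sketch: translating a polling lead plus uncertainty into a win probability,
# assuming the final margin is normally distributed. Illustrative numbers only.
from math import erf, sqrt

def win_probability(lead, sigma):
    """P(final margin > 0) when the final margin ~ Normal(lead, sigma**2)."""
    return 0.5 * (1 + erf(lead / (sigma * sqrt(2))))

if __name__ == "__main__":
    # A 2-point lead with 1.5 points of uncertainty: a close race, but about 91% to win.
    print(round(win_probability(2.0, 1.5), 2))
```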
But no, it's really hard. People don't-- if you look at-- I mean, a lot of behavioral economics is really about how people don't perceive probabilities as they would match the objective function here. And it's really hard to do. And you try and educate your readership to some extent, but that's also why my risks last November were very, very much unhedged, right?
You know, in theory you know you're going to get this 10% outcome occurring some of the time. Heidi Heitkamp, the new Democratic senator in North Dakota, we only gave her an 8% chance of winning, and she won instead. But no one cared about the North Dakota Senate race. A lot of people cared about the presidency instead. So you're always kind of dodging bullets, I suppose.
SPEAKER 1: And we have time for one more question.
AUDIENCE: Thank you. It was a great talk. So I had an opportunity last month to hear another talk about big data from somebody else who was sort of interested in the political realm as well. This was the chief technology officer of Blue State Digital.
And when he talked about big data in the context of the Obama campaign, he was looking at it from a very organizational perspective. He used language like "big data will be the backbone of the organization moving forward." And he talked a lot about the relationship of big data to culture. And he used the phrase, "where big data is enabling a culture of optimization." So I was wondering if you have thoughts on the relationship of big data to culture and where you see things going. Thank you.
NATE SILVER: So when I hear a phrase like "the culture of optimization," that sounds a little Orwellian to me, almost. And you kind of wonder, who is optimizing for whom? If there's some surplus produced by an efficiency, does that benefit the consumer or the kind of producer instead?
And it depends on the field. Obviously, in a lot of ways, for example, I think about restaurant reviews a lot. I really like going out to nice restaurants, and how much do Yelp reviews affect your business, potentially? In fact, there have been some studies saying that if you give a negative Yelp review to a business that hasn't been reviewed that many times, it can do thousands of dollars of damage to their bottom line. Or a hotel, potentially.
And how does that change incentives? How much gaming of the system might there be? Will businesses ever go too far in saying the customer is never wrong, and disadvantage other customers as a result?
So yeah. I mean, you do have fields-- so we talk about polls. But they have a lot more data, the Obama campaign, where they have kind of a file on literally every person in the country, with a lot of information about them.
By the way, just knowing someone's first and last name, you can make a lot of predictions about them. You can probably infer someone's gender, almost for sure. Very often their race. Very often their age, because certain names go into or out of style. If you know someone named George, for example, he's probably older. If you know their zip code, because voting is so geographic now, plus their first name, you can have a very good model of how they might vote.
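A toy version of the inference described above, using invented lookup tables rather than real voter-file data; the names, probabilities, and zip codes below are placeholders.

```python
# Toy sketch: crude probabilistic guesses from a first name and a zip code.
# All tables and probabilities are invented placeholders, not real data.

# Hypothetical name table: (probability male, modal birth decade).
NAME_PRIORS = {
    "george": (0.98, 1940),
    "brittany": (0.01, 1990),
    "linda": (0.02, 1950),
}

# Hypothetical zip-code table: share of the area that voted Democratic last cycle.
ZIP_DEM_SHARE = {
    "14850": 0.70,
    "79901": 0.45,
}

def guess_profile(first_name, zip_code):
    """Combine the two lookups into a rough profile, with neutral fallbacks."""
    p_male, decade = NAME_PRIORS.get(first_name.lower(), (0.5, 1970))
    dem_share = ZIP_DEM_SHARE.get(zip_code, 0.50)
    return {
        "p_male": p_male,
        "likely_birth_decade": decade,
        "p_votes_democratic": dem_share,
    }

if __name__ == "__main__":
    print(guess_profile("George", "14850"))
```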
But you know, so it's an interesting statistical problem, but how that data is used to manipulate them, then sometimes I wonder whether that's really going to help us in the end. But do we have any more time? Or are we-- OK.
YASAMIN MILLER: We're about done. I want to thank you very much.
[APPLAUSE]
Over the past decade, we have mispredicted earthquakes, flu rates and even terrorist attacks. Yet we seem to have access to more data and computing power than ever.
"Why isn't big data producing big progress?" asked statistician, author and NY Times blogger Nate Silver during an April 5, 2013 talk at Cornell.
Known for his innovative analyses of political polling, Silver is author of "The Signal and the Noise: Why So Many Predictions Fail - but Some Don't" and author of the New York Times political blog FiveThirtyEight. Silver first gained national attention during the 2008 presidential election, when he correctly predicted the results of the primaries and the presidential winner in 49 states.
The talk was part of the Survey Research Institute Speaker Series.