Last night, Maria asked me how the polls could have been so incorrect as to predict a Hillary Clinton victory. Since she was not the only person who asked this, and not everyone reads the comments in my blog – shock – I thought I’d follow up with a post about it.
First of all, in terms of the popular vote, exactly what the polls predicted happened. Hillary Clinton won the popular vote by a small amount – currently, 400,000 votes and counting. So, in this respect, the polls – and the Central Limit Theorem – got it correct.
There is the fact, though, that in America, the person with the largest number of votes does not necessarily win the presidency. This also happened with Al Gore, who had the most votes but lost the election. We have an electoral college, which counts votes from people in Wyoming more than, say, votes from people in California. Strange, but true.
So, why were the polls wrong in predicting the state votes? One could say that they violated the large-sample assumption of the Central Limit Theorem, since there were fewer polls of each state than national polls. I don’t think that is the explanation, though. If that were the case, errors should have occurred about equally in both directions, both under- and over-predicting the Clinton vote. That did not happen.
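To see why small samples alone should give errors in both directions, here is a rough simulation sketch. Everything in it is made up for illustration: the 48% true share, the sample size of 400, the number of polls, and the function name `simulate_poll` are assumptions, not figures from any actual 2016 poll.

```python
import random


def simulate_poll(true_share, sample_size, rng):
    """Share of sampled respondents who back the candidate (honest answers)."""
    hits = sum(1 for _ in range(sample_size) if rng.random() < true_share)
    return hits / sample_size


rng = random.Random(2016)  # fixed seed so the run is repeatable
TRUE_SHARE = 0.48          # hypothetical true support in one state
SAMPLE_SIZE = 400          # hypothetical small state poll
N_POLLS = 2000             # hypothetical number of repeated polls

# Error of each poll = polled share minus the true share.
errors = [
    simulate_poll(TRUE_SHARE, SAMPLE_SIZE, rng) - TRUE_SHARE
    for _ in range(N_POLLS)
]

share_over = sum(e > 0 for e in errors) / N_POLLS
share_under = sum(e < 0 for e in errors) / N_POLLS
mean_error = sum(errors) / N_POLLS

print(f"polls that over-predicted:  {share_over:.1%}")
print(f"polls that under-predicted: {share_under:.1%}")
print(f"average error:              {mean_error:+.4f}")
```

With honest respondents, roughly half of the simulated polls miss high and half miss low, and the average error sits close to zero. Small samples make each poll noisier, but they do not push every poll the same way.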
There is a statistical term for when the error is predominantly in one direction – bias. I said in my previous post:
Another assumption of every statistical technique and hence usually unstated is that your data are reliable, in this case that people didn’t lie to you because they were too embarrassed to tell you that they were voting for Trump.
This is usually called ‘social desirability bias’: you give the answer you think you are SUPPOSED to give as opposed to what you really think. So, if you realize, as someone on Twitter says, that voting for Trump means people will think you are either racist, sexist or stupid, or at the very least, that racism and sexism in a candidate are not deal-breakers for you, then you are more likely to say you will vote for Clinton. It’s the poll version of people who don’t laugh at a racist joke when their one black friend is around.
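A quick sketch of what that kind of bias does to a poll, as opposed to ordinary sampling error. The numbers are invented for illustration only: a 48% true Clinton share, 10% of non-Clinton voters telling the pollster "Clinton" anyway, and 1,000 respondents per poll. None of these are estimates of what actually happened, and `biased_poll` is just a hypothetical helper name.

```python
import random


def biased_poll(true_clinton_share, shy_rate, sample_size, rng):
    """Polled Clinton share when some non-Clinton voters say 'Clinton'."""
    clinton_answers = 0
    for _ in range(sample_size):
        votes_clinton = rng.random() < true_clinton_share
        if votes_clinton:
            clinton_answers += 1
        elif rng.random() < shy_rate:
            # Votes for someone else but gives the "acceptable" answer.
            clinton_answers += 1
    return clinton_answers / sample_size


rng = random.Random(2016)   # fixed seed so the run is repeatable
TRUE_CLINTON_SHARE = 0.48   # hypothetical true share in one state
SHY_RATE = 0.10             # hypothetical share of non-Clinton voters who misreport
SAMPLE_SIZE = 1000          # hypothetical respondents per poll
N_POLLS = 500               # hypothetical number of repeated polls

# Error of each poll = polled Clinton share minus the true Clinton share.
errors = [
    biased_poll(TRUE_CLINTON_SHARE, SHY_RATE, SAMPLE_SIZE, rng) - TRUE_CLINTON_SHARE
    for _ in range(N_POLLS)
]

share_over = sum(e > 0 for e in errors) / N_POLLS
mean_error = sum(errors) / N_POLLS

print(f"polls that over-predicted Clinton: {share_over:.1%}")
print(f"average error:                     {mean_error:+.4f}")
```

Under these assumptions nearly every simulated poll overstates Clinton, and the average error stays well above zero. Taking a bigger sample would shrink the noise around that wrong number, but it would not move the number itself, because the problem is in the answers rather than in the sample size.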