What Went Wrong With the 2020 Presidential Election Polling? Some Early Theories!

Dear Commons Community,

Once again, the polling during our recent presidential election left a lot to be desired. We were given to believe that our pollsters had learned from the errors of 2016. Some observers are theorizing that they did, and that the 2020 election posed a new set of problems. Nate Cohn has an article in today’s New York Times that reviews what went wrong with the polling in this year’s election. He leans toward the idea that this election was different from 2016 for a number of reasons, including the overwhelming turnout and the coronavirus pandemic. Cohn’s entire article is below. It is a good analysis.

Tony

———————————————————————————————

New York Times

What Went Wrong With Polling? Some Early Theories

By Nate Cohn

November 11, 2020

Asking for a polling post-mortem at this stage is a little bit like asking a coroner for the cause of death while the body is still at the crime scene. You’re going to have to wait to conduct a full autopsy.

But make no mistake: It’s not too early to say that the polls’ systematic understatement of President Trump’s support was very similar to the polling misfire of four years ago, and might have exceeded it.

For now, there is no easy excuse. After 2016, pollsters arrived at plausible explanations for why surveys had systematically underestimated Mr. Trump in the battleground states. One was that state polls didn’t properly weight respondents without a college degree. Another was that there were factors beyond the scope of polling, like the large number of undecided voters who appeared to break sharply to Mr. Trump in the final stretch.

This year, there seemed to be less cause for concern: In 2020, most state polls weighted by education, and there were far fewer undecided voters.

But in the end, the polling error in states was virtually identical to the miss from 2016, despite the steps taken to fix things. The Upshot’s handy “If the polls were as wrong as they were in 2016” chart turned out to be more useful than expected, and it nailed Joe Biden’s one-point-or-less leads in Pennsylvania, Georgia and Arizona.

[Chart: The polls were off in 2020 in almost the same ways they were off in 2016.]

Chart note: In Maine and Nebraska, two electoral votes are apportioned to the winner of the state popular vote, and the rest of the votes are given to the winner of the popular vote in each congressional district. (Maine has two congressional districts, and Nebraska has three.) Poll error in 2016 is calculated using averages of state polls conducted within one week of Election Day.

The national polls were even worse than they were four years ago, when the industry’s most highly respected and rigorous survey houses generally found Hillary Clinton leading by four points or less — close to her 2.1-point popular-vote victory. This year, Mr. Biden is on track to win the national vote by around five percentage points; no major national live-interview telephone survey showed him leading by less than eight percentage points over the final month of the race.

The New York Times/Siena College polls were also less accurate than they were in 2018 or four years ago. In 2016, the last two Times/Siena polls were among a very small group of polls to show Mr. Trump tied or ahead in Florida and North Carolina. This time, nearly all of the Times/Siena surveys overestimated Mr. Biden to about the same extent as other surveys.

In the months ahead, troves of data will help add context to exactly what happened in this election, like final turnout data, the results by precinct, and updated records of which voters turned out or stayed home. All of this data can be appended to our polling, to nail down where the polls were off most and help point toward why. But for now, it’s still too soon for a confident answer.

In the broadest sense, there are two ways to interpret the repeat of 2016’s polling error. One is that pollsters were entirely wrong about what happened in 2016. As a result, the steps they took to address it left them no better off. Another is that survey research has gotten even more challenging since 2016, and whatever steps pollsters took to improve after 2016 were canceled out by a new set of problems.

Of these two, the latter interpretation — real improvements canceled out by new challenges — may make the most sense.

“I think our polls would have been even worse this year had we employed a pre-2016 methodology,” said Nick Gourevitch of Global Strategy Group, a Democratic polling firm that took steps to better represent Mr. Trump’s supporters. “These things helped make our data more conservative, though clearly they were not enough on their own to solve the problem.”

The explanation for 2016’s polling error, while not necessarily complete or definitive, was not contrived. Many state pollsters badly underrepresented the number of voters without a college degree, who backed Mr. Trump in huge numbers. The pollsters went back to their data after 2016, and found that they would have been much closer to the election result if they had employed the standard education adjustments that national surveys have long used. An Upshot analysis of national surveys found that failing to weight by education cost Mr. Trump about four points in polling support — enough to cover much of the 2016 polling error. Other pollsters had similar findings.

But this time, education weighting didn’t seem to help. State and national polls consistently showed Mr. Biden faring far better than Mrs. Clinton did among white voters without a degree. Last week’s results made it clear that he didn’t.

Over all, the final national surveys in 2020 showed Mr. Trump leading by a margin of 58 percent to 37 percent among white voters without a degree. In 2016, they showed Mr. Trump ahead by far more, 59-30. The results by county suggest that Mr. Biden made few gains at all among white voters without a degree nationwide, and even did worse than Mrs. Clinton’s 2016 showing in many critical states.

In contrast, the 2016 polls did show the decisive and sharp shift among white voters without a degree, but underestimated its effect in many states because they underestimated the size of the group. Many state polls showed college graduates representing half of the likely electorate in 2016, compared with about 35 percent in census estimates.
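
To make the mechanism concrete, here is a minimal sketch of how weighting by education moves a poll's topline. The 50 percent and 35 percent college-graduate shares come from the paragraph above. The margins within each group are invented for illustration and are not taken from any poll, and `weighted_margin` is a hypothetical helper, not any pollster's actual method.

```python
# Illustrative only: these within-group margins are assumptions, not poll results.
# Margin = Trump share minus Clinton share, in percentage points.
ASSUMED_MARGIN = {"college": -15.0, "non_college": 10.0}


def weighted_margin(college_share, margins=ASSUMED_MARGIN):
    """Topline margin when college graduates are `college_share` of the sample."""
    non_college_share = 1.0 - college_share
    return (college_share * margins["college"]
            + non_college_share * margins["non_college"])


# Sample composition as many 2016 state polls had it (half college graduates).
unweighted = weighted_margin(0.50)
# Same respondents, reweighted to the census estimate (about 35 percent).
weighted = weighted_margin(0.35)
shift = weighted - unweighted

print(f"unweighted margin: {unweighted:+.2f}")
print(f"weighted margin:   {weighted:+.2f}")
print(f"shift toward Trump from weighting: {shift:+.2f} points")
```

Under these assumed margins, correcting the college share moves the topline several points toward Mr. Trump even though no respondent changed an answer. The size of the shift depends entirely on the assumed gap between the two groups.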

The poll results among seniors are another symptom of a deeper failure in this year’s polling. Unlike in 2016, surveys consistently showed Mr. Biden winning by comfortable margins among voters 65 and over. The final NBC/WSJ poll showed Mr. Biden up 23 points among the group; the final Times/Siena poll showed him up by 10. In the final account, there will be no reason to believe any of it was real.

This is a deeper kind of error than ones from 2016. It suggests a fundamental mismeasurement of the attitudes of a large demographic group, not just an underestimate of its share of the electorate. Put differently, the underlying raw survey data got worse over the last four years, canceling out the changes that pollsters made to address what went wrong in 2016.

It helps explain why the national surveys were worse than in 2016; they did weight by education four years ago and have made few to no changes since. It also helps explain why the error is so tightly correlated with what happened in 2016: It focuses on the same demographic group, even if the underlying source of the error among the group is quite different.

Polling clearly has some serious challenges. The industry has always relied on statistical adjustments to ensure that each group, like white voters without a degree, represents its proper share of the sample. But this helps only if the respondents you reach are representative of those you don’t. In 2016, they seemed to be representative enough for many purposes. In 2020, they were not.
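
A rough sketch of why weighting cannot repair this kind of problem: if one candidate's supporters within a group answer surveys at a lower rate, the group's measured support is off no matter how the group is weighted. Every number below is an assumption chosen for illustration, and `observed_support` is a hypothetical helper.

```python
def observed_support(true_support, rate_supporters, rate_others):
    """Share of a group's respondents backing a candidate when the
    candidate's supporters and everyone else respond at different rates."""
    supporters = true_support * rate_supporters
    others = (1.0 - true_support) * rate_others
    return supporters / (supporters + others)


# Assumed: 65 percent of the group truly backs the candidate.
TRUE_SUPPORT = 0.65

# Equal response rates: the respondents mirror the group.
unbiased = observed_support(TRUE_SUPPORT, 0.010, 0.010)
# Assumed: supporters respond at 0.8 percent, everyone else at 1.0 percent.
biased = observed_support(TRUE_SUPPORT, 0.008, 0.010)

print(f"equal response rates:   {unbiased:.3f}")
print(f"unequal response rates: {biased:.3f}")
```

With these assumed rates, measured support falls from 65 percent to about 60 percent. Weighting the group up to its proper share of the sample scales that wrong number; it does not correct it.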

So how did the polls get worse over the last four years? This is mainly speculation, but consider just a few possibilities:

The president (and the polls) hurt the polls. There was no real indication of a “hidden Trump” vote in 2016. But maybe there was one in 2020. For years, the president attacked the news media and polling, among other institutions. The polls themselves lost quite a bit of credibility in 2016.

It’s hard not to wonder whether the president’s supporters became less likely to respond to surveys as their skepticism of institutions mounted, leaving the polls in a worse spot than they were four years ago.

“We now have to take seriously some version of the Shy Trump hypothesis,” said Patrick Ruffini, a Republican pollster for Echelon Insights. It would be a “problem of the polls simply not reaching large elements of the Trump coalition, which is causing them to underestimate Republicans across the board when he’s on the ballot.”

(This is different from the typical Shy Trump theory that Trump supporters don’t tell pollsters the truth.)

A related possibility: During his term, Mr. Trump might have made gains among the kinds of voters who would be less likely to respond to surveys, and might have lost additional ground among voters who would be more likely to respond to surveys. College education, of course, is only a proxy for the traits that predict whether someone might back Mr. Trump or respond to a poll. There are other proxies as well, like whether you trust your neighbor; volunteer your time; are politically engaged.

Another proxy is turnout: People who vote are likelier to take political surveys. The Times/Siena surveys go to great lengths to reach nonvoters, which was a major reason our surveys were more favorable for the president than others in 2016. In 2020, the nonvoters reached by The Times were generally more favorable for Mr. Biden than those with a track record of turning out in recent elections. It’s possible that, in the end, the final data will suggest that Mr. Trump did a better job of turning out nonvoters who backed him. But it’s also possible that we reached the wrong low-turnout voters.

The resistance hurt the polls. It’s well established that politically engaged voters are likelier to respond to political surveys, and it’s clear that the election of President Trump led to a surge of political engagement on the left. Millions attended the Women’s March or took part in Black Lives Matter protests. Progressive activists donated enormous sums and turned out in record numbers for special elections that would have never earned serious national attention in a different era.

This surge of political participation might have also meant that the resistance became likelier to respond to political surveys, controlling for their demographic characteristics. Are the “MSNBC moms” now excited to take a poll while they put Rachel Maddow on mute in the background? Like most of the other theories presented here, there’s no hard evidence for it — but it does fit with some well-established facts about propensity to respond to surveys.

The turnout hurt the polls. Political pollsters have often assumed that higher turnout makes polling easier, since it means that there’s less uncertainty about the composition of the electorate. Maybe that’s not how it worked out.

Heading into the election, many surveys showed something unusual: Democrats faring better among likely voters than among registered voters. Usually, Republicans hold the turnout edge.

Take Pennsylvania. The final CNN/SSRS poll of the state showed Mr. Biden up by 10 points among likely voters, but by just five among registered voters. Monmouth showed Mr. Biden up by seven among likely voters in a “high-turnout” scenario (which it ended up being), but by five points among registered voters. Marist? It had a lead of six points among likely voters and five points among registered voters. The ABC/Washington Post showed a seven-point lead for Mr. Biden among likely voters and a four-point lead among registered voters.
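
The gaps in those four polls can be tallied directly. This sketch uses only the figures quoted in the paragraph above; the variable names are made up for illustration.

```python
# Biden's lead in the final Pennsylvania polls, in points, as quoted above:
# (lead among likely voters, lead among registered voters)
PA_POLLS = {
    "CNN/SSRS": (10, 5),
    "Monmouth (high-turnout scenario)": (7, 5),
    "Marist": (6, 5),
    "ABC/Washington Post": (7, 4),
}

# How much better Mr. Biden did among likely voters than registered voters.
gaps = {name: lv - rv for name, (lv, rv) in PA_POLLS.items()}
average_gap = sum(gaps.values()) / len(gaps)

for name, gap in gaps.items():
    print(f"{name}: likely-voter edge of {gap} points")
print(f"average edge: {average_gap:.2f} points")
```

All four polls showed Mr. Biden doing better among likely voters, by anywhere from one to five points. That is the pattern the turnout theory treats as suspect.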

It’s still too soon to say whether Republican turnout beat Democratic turnout, but it sure seems possible. In Florida, the one state where we do have hard turnout data, registered Republicans outnumbered registered Democrats by about two percentage points among those who actually voted, even though Democrats outnumber Republicans among registered voters by about 1.5 points in the state. Here, there is no doubt that the turnout was better for the president than the polls suggested, whether they’re private polls or the final Times/Siena poll — which showed registered Republicans with an edge of 0.7 points.

If Mr. Trump fared better among likely voters than among registered voters in Pennsylvania, a fundamental misfire on the estimate of turnout could very quickly explain some of the miss.

Unlike the other theories presented here, this one can be proved false or true. States will eventually update their voter registration files with a record of whether voters turned out in the election. We’ll be able to see the exact composition of the electorate by party registration, and we’ll also be able to see which of our respondents voted. Perhaps Mr. Trump’s supporters were likelier to follow through. We might start to get data from North Carolina and Georgia in the next few weeks. Other states might take longer.

The pandemic hurt the polls. Remember those Times/Siena polls from October 2019 that showed Mr. Biden narrowly leading Mr. Trump? They turned out to be very close to the actual result, at least outside of Florida. They were certainly closer than the Times/Siena polls conducted since.

It wasn’t just the Times/Siena polls that were closer to the mark further ahead of the election. Results from pollsters in February and March look just about dead-on in retrospect, with Mr. Biden leading by about six points among registered voters nationwide, with a very narrow lead in the “blue wall” states, including a tied race in Wisconsin.

One possibility is that the polls were just as poor in October 2019 as in October 2020. If so, Mr. Trump actually held a clear lead during the winter. Maybe. Another possibility is that the polls got worse over the last year. And something really big did happen in American life over that time: the coronavirus pandemic.

“The basic story is that after lockdown, Democrats just started taking surveys, because they were locked at home and didn’t have anything else to do,” said David Shor, a Democratic pollster who worked for the Obama campaign in 2012. “Nearly all of the national polling error can be explained by the post-Covid jump in response rates among Dems,” he said.

Circumstantial evidence is consistent with that theory. We know that the virus had an effect on the polls: Pollsters giddily reported an increase in response rates. High-powered studies showed Mr. Biden gaining in coronavirus hot spots, seeming to confirm the assumption that the pandemic was hurting the president.

But if Mr. Shor is right, the studies weren’t showing a shift in the attitudes of voters in hot spots; rather, it was a shift in the tendency for supporters of Mr. Biden to respond to surveys.

Adding to the intrigue: There is no evidence that the president fared worse in coronavirus hot spots, contrary to the expectations of pundits or studies. Instead, Mr. Trump fared slightly better in places with high coronavirus cases than in places with lower coronavirus cases, controlling for demographics, based on the preliminary results by county so far. This is most obviously true in Wisconsin, one of the nation’s current hot spots and the battleground state where the polls underestimated Mr. Trump the most. The final polls in Wisconsin — including the final Times/Siena poll — showed Mr. Biden gaining in the state, even as polls elsewhere showed Mr. Trump making gains.

Don’t forget the Hispanic vote. There’s one state in particular where the polls were much worse in 2020 than in 2016: Florida, where Mr. Trump made huge gains among Hispanic voters.

What happened in Miami-Dade County was stunning. Mr. Biden won by just seven points in a county where Mrs. Clinton won by 29 points. No pollster saw the extent of it coming, not even those conducting polls of Miami-Dade County or its competitive congressional districts.

Most polls probably weren’t even in the ballpark. The final Times/Siena poll of Florida showed Mr. Biden with a 55-33 lead among Hispanic voters. In the final account, Mr. Biden may barely win the Hispanic vote in the state.

What happened in Miami-Dade was not just about Cuban-Americans. Although Democrats flipped a Senate seat and are leading the presidential race in Arizona, Mr. Trump made huge gains in many Hispanic communities across the country, from the agricultural Imperial Valley and the border towns along the Rio Grande to more urban Houston or Philadelphia.

Many national surveys don’t release results for Hispanic voters because any given survey usually has only a small sample of the group. It will be some time until the major pollsters post their results to the Roper Center, a repository of detailed polling data. Then we’ll be able to dig in and see exactly what the national polls showed among this group.

But if the Florida polls are any indication, it’s at least possible that national surveys missed Mr. Trump’s strength among Hispanic voters. It seems entirely possible that the polls could have missed by 10 points among the group. If true, it would account for a modest but significant part — maybe one-fourth — of the national polling error.
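
The arithmetic behind that estimate can be sketched as follows. The 10-point miss and the one-fourth figure come from the paragraph above. The Hispanic share of the electorate is an assumption added here for illustration; the article does not state one.

```python
# From the article: a possible 10-point miss on the margin among Hispanic voters.
GROUP_MISS_POINTS = 10.0
# From the article: this would be "maybe one-fourth" of the national error.
FRACTION_OF_ERROR = 0.25
# ASSUMPTION (not in the article): Hispanic voters are about 10 percent
# of the national electorate.
ASSUMED_GROUP_SHARE = 0.10

# A miss within one group moves the national margin in proportion
# to that group's share of the electorate.
contribution = ASSUMED_GROUP_SHARE * GROUP_MISS_POINTS
implied_total_error = contribution / FRACTION_OF_ERROR

print(f"contribution to national error: {contribution:.1f} points")
print(f"implied total national error:   {implied_total_error:.1f} points")
```

Under the assumed share, a 10-point miss among Hispanic voters adds about one point to the national error, which is one-fourth of a roughly four-point total. A different assumed share would change both figures proportionally.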

This entry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
