Polling, statistics, and data science

Thad · Postby **Thad** » Fri Sep 16, 2022 12:43 pm

So I've kinda been talking about polls just wherever over the years, most recently in the Barefoot and Pregnant thread because of the obvious importance of Dobbs in the upcoming election and how historical models are ill-prepared to deal with this kind of major event. But I figured hey, how about a dedicated thread to talk about polls and related topics?

And before I continue, standard Nate Silver disclaimer: Nate Silver says a lot of stupid shit, but he's still very good at statistics and data analysis, and at explaining those things in easy-to-understand terms. He is a valid source to cite in his area of expertise! And pretty much nowhere else.

Anyway, Silver's got an article up today titled "Will The Polls Overestimate Democrats Again?" The title's a little clickbaity (tl;dr maybe but there's no good data-driven reason to assume so) but the article is good and thorough.

Here's a bit I found interesting:

As I mentioned, the Deluxe version of our forecast gives Democrats a 71 percent and 29 percent chance of keeping the Senate and House, respectively. But the Deluxe forecast isn’t just based on polls: It incorporates the fundamentals I mentioned earlier, along with expert ratings about these races. Furthermore, it accounts for the historical tendency of the president’s party to perform poorly at the midterms, President Biden’s mediocre (although improving) approval rating and the fact that Democrats may not perform as well in polls of likely voters as among registered voters. As the election approaches, it tends to put more weight on the polls and less on these other factors, but it never zeros them out completely. (In this respect, it differs from our presidential forecast.)

By contrast, the Lite version of our forecast, which is more or less a “polls-only” view of the race, gives Democrats an 81 percent chance of keeping the Senate and a 41 percent chance of keeping the House. It also suggests that they’ll win somewhat more seats: There are 52.4 Democratic Senate seats in an average Lite simulation as compared with 50.8 in a Deluxe simulation, or 212 Democratic House seats in an average Lite simulation versus 209 in a Deluxe simulation. Notably, this corresponds to current polls overstating Democrats’ position by the equivalent of 1.5 or 2 percentage points. Put another way, we should think of a race in which the polling average shows Democrats 2 points ahead as being tied.

(links not included in copy-paste)

Those gaps are pretty massive -- the polls-only forecast shows Democrats 10 points more likely to keep the Senate, and 12 points more likely to keep the House, than the "Deluxe" forecast, which considers other factors and the correlations they've had with historical outcomes.
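For anyone who wants to check that arithmetic, here's a quick Python sketch. The numbers are the ones quoted from the article above; the dictionary layout and the `gap` helper are just my own illustration, not anything FiveThirtyEight publishes.

```python
# Figures quoted from the article: Lite (polls-only) vs. Deluxe forecasts.
# The structure and names here are illustrative assumptions.
win_chance_pct = {
    "Senate": {"lite": 81, "deluxe": 71},
    "House": {"lite": 41, "deluxe": 29},
}
avg_dem_seats = {
    "Senate": {"lite": 52.4, "deluxe": 50.8},
    "House": {"lite": 212, "deluxe": 209},
}

def gap(table, chamber):
    """Lite minus Deluxe for one chamber, rounded to one decimal place."""
    row = table[chamber]
    return round(row["lite"] - row["deluxe"], 1)

for chamber in win_chance_pct:
    print(
        f"{chamber}: Lite is {gap(win_chance_pct, chamber)} points higher "
        f"on win chance and {gap(avg_dem_seats, chamber)} seats higher on average"
    )
```

Note that these are gaps in win probability (percentage points), which is a different thing from the 1.5-to-2-point polling-margin adjustment the article mentions.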

The "Deluxe" forecast is a good model -- remember when people were pointing and laughing at Silver in 2016 because his model gave Trump a 30% chance of winning and obviously if you looked at the polls there was no way it was that high? -- but, by definition, its reliance on historical data means it's going to be accurate most of the time but is ill-prepared for dealing with major unexpected events.

Polls aren't perfect, but they factor in people's reactions to Dobbs. Historical data doesn't.

It's not that Biden's approval rating and the economy and this being a midterm and the election still being two months away don't matter; they do. But a model that assumes they have the same weights relative to reproductive rights that they have historically is missing key context that's not easy to model statistically. I think this time, the polls might be the more reliable indicator of how the election is going to go, even though it's still two months out, and I think they may still be underrating Democrats' chances.

The evidence for that view isn't really statistically significant yet, and as always I could be wrong. But merely asking the question of which statistical model is closer to what the final result will be highlights a lot of what polling analysis is about -- it's not just a question of "What does the data say?" but "Which data?" and "Why might it be biased or just plain wrong?"

Thad · Postby **Thad** » Thu Sep 22, 2022 1:23 pm

Enthusiasm for upcoming midterms is at all-time high, NBC News poll shows

Here’s our poll, going back to 2006, on those expressing high interest two months before the midterm:

Sept. 2006: 55%
Aug/Sept. 2010: 53%
Sept. 2014: 51%
Sept. 2018: 58%
Sept. 2022: 64%

Here was the eventual turnout in these elections (total votes for U.S. House):

2006: 81 million (40% of voting eligible population)
2010: 87 million (41%)
2014: 79 million (37%)
2018: 114 million (50%)
2022: ???

Another positive sign that turnout could be high enough to overcome the GOP's disenfranchisement efforts.
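If you want to see how closely interest has tracked turnout in those numbers, here's a rough Python sketch that computes a plain Pearson correlation over the four midterms with known turnout. The figures are copied from the lists above; the `pearson` helper is my own hand-rolled illustration. Four data points is nowhere near enough to call anything statistically significant, so treat this as a sanity check and not a forecast.

```python
import math

# Figures copied from the NBC lists above (2022 turnout is not known yet).
years = [2006, 2010, 2014, 2018]
high_interest_pct = [55, 53, 51, 58]  # share expressing high interest
turnout_pct = [40, 41, 37, 50]        # share of voting-eligible population

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r = pearson(high_interest_pct, turnout_pct)
print(f"correlation over {len(years)} midterms: {r:.2f}")
```

A high value here mostly reflects 2018 being the standout on both measures, which is exactly the kind of thing a sample this small can't distinguish from coincidence.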

Thad · Postby **Thad** » Wed Mar 20, 2024 7:45 pm

Trump is leading the polls, but there's plenty of time for Biden to catch up

If you see anyone taking current presidential polls too seriously, tell them at this point in 1980 Carter was polling 14 points ahead of Reagan.

beatbandito · Postby **beatbandito** » Wed Mar 20, 2024 9:00 pm

At this point in 2016 Clinton was polling 10 points ahead of Trump.

Cait · Postby **Cait** » Wed Mar 20, 2024 10:05 pm

"Polls remain as accurate as ever," writes pollster with vested interest in being perceived as contributing something useful.

Thad · Postby **Thad** » Thu Mar 21, 2024 12:21 am

...you, ah...know that writing about polls doesn't make you a pollster, right?

Like, if writing about a thing made you the thing I'd be a goddamn Ninja Turtle by now.

Mongrel · Postby **Mongrel** » Thu Mar 21, 2024 12:59 am

Nice try, DONATELLO.

Upthorn · Postby **Upthorn** » Thu Mar 21, 2024 10:55 am

I think Cait was taking a shot at Nate Silver, rather than Thad? The Nate post may be old, but it's still on the current page of this thread.

Cait · Postby **Cait** » Fri Mar 22, 2024 7:57 am

No, I was talking about the author of the article Thad linked, the current lead on 538, the 'editorial director of data analytics'.

Thad · Postby **Thad** » Thu Mar 28, 2024 12:50 pm

Philip Bump: Americans broadly support abortion access. Will it win Biden reelection?

Political coalitions are formed out of people with differing and at times competing priorities. There are an enormous number of people for whom abortion access is a key motivator this November. There is also an enormous number of people for whom the economy is. There is an enormous number of people for whom both are.

The question that’s most worth asking, then, isn’t whether the issue of abortion will pull people to the polls to vote for Biden (or against Trump). It will. The question is whether the scale of that pull will be significant and whether it will outweigh other motivators for other voters, particularly in a presidential election year.

That is a very difficult question to answer.

The polling data here is probably more useful at this point in the election than individual approval for Biden or Trump, but it's still pretty ambiguous, and relies on a lot of facts we don't know yet.

For example, I think if the abortion access amendment is on the ballot in Arizona, then Biden will win Arizona. If it isn't, then I don't know what happens. There are questions like that that matter a lot.
