2022 U.S. House of Representatives Elections Predictions

Grid of 8 subplots by proprietary congressional district type showing distribution curves of simulated two-way contest margin per district. The 50 races we're watching most closely are in bright colors & the rest are in grey. Details in text.

Greetings and welcome to our House midterm predictions! Since the previous post, we’ve analyzed the entire national voter file and identified 21 ideological voter clusters. We then grouped the newly drawn congressional districts by “type” based on the relative proportions of the ideological voter clusters within them. (We will elaborate on this methodology in a future post but are releasing these predictions now as Election Day approaches. As a reminder, the features we use for ideological voter clustering are a subset of HaystaqDNA’s issue scores and do not explicitly use any information about demographics or partisan preference.)

After assigning each voter to an ideological voter cluster and each district to a congressional district type, we simulate election margins. The general idea is that for each district we know a priori how many voters fall in each voter cluster, so we only need priors on the remaining random variables to run simulations. Using historical turnout data (2018 general election), we arrive at beta priors for each voter cluster’s turnout rate specific to district type, and using partisan primary participation and party registration we arrive at beta priors for the probability that a voter who turns out casts their vote for the Democrat, the Republican, or another candidate. Using those beta priors we run Monte Carlo simulations to predict generic three-way vote share, from which we calculate the two-way major party margin. (While some districts’ third party vote share in the simulations was proportionally significant, in no instance did it exceed either of the major parties’ vote share.)
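As an illustration only, the simulation loop for a single district can be sketched as below. The cluster names, voter counts, and beta parameters are made-up placeholders, and normalizing three independent beta draws into vote shares is an assumption of this sketch rather than the method behind these predictions.

```python
import random

PARTIES = ("D", "R", "other")

def simulate_margins(cluster_counts, turnout_priors, vote_priors, n_sims=2000, seed=0):
    """Simulate two-way margins (D minus R, as a share of the D+R vote) for one district."""
    rng = random.Random(seed)
    margins = []
    for _ in range(n_sims):
        votes = dict.fromkeys(PARTIES, 0.0)
        for cluster, n_voters in cluster_counts.items():
            # Draw this cluster's turnout rate from its beta prior.
            turnout = rng.betavariate(*turnout_priors[cluster])
            # ASSUMPTION: one beta draw per option, normalized so the shares sum to 1.
            raw = {p: rng.betavariate(*vote_priors[cluster][p]) for p in PARTIES}
            total = sum(raw.values())
            for p in PARTIES:
                votes[p] += n_voters * turnout * raw[p] / total
        # Two-way margin ignores the third-party vote.
        margins.append((votes["D"] - votes["R"]) / (votes["D"] + votes["R"]))
    return margins

# Hypothetical district with two voter clusters (all numbers are placeholders).
cluster_counts = {"cluster_a": 6000, "cluster_b": 4000}
turnout_priors = {"cluster_a": (50, 50), "cluster_b": (60, 40)}  # (alpha, beta)
vote_priors = {
    "cluster_a": {"D": (80, 20), "R": (15, 85), "other": (2, 98)},
    "cluster_b": {"D": (20, 80), "R": (75, 25), "other": (2, 98)},
}

margins = simulate_margins(cluster_counts, turnout_priors, vote_priors)
d_win_rate = sum(m > 0 for m in margins) / len(margins)
print(f"Generic Democrat wins in {d_win_rate:.0%} of simulations")
```

The share of simulations with a positive margin is what a statement like "wins 25 times out of 100" summarizes.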

This is a good time to mention some of our assumptions. Since our priors are based on historical data, they are not specific to any candidate. This cycle we consider every district as though it has one generic Republican and one generic Democrat running. Louisiana’s 4th congressional district (LA04) is on our Top 50 list even though there is a Republican candidate running unopposed because our calculations show that in a generic contest a Democrat would win 25 times out of 100 against a Republican. The nature of our priors this cycle also means that our predictions do not change based on new data other than updated voter lists; we have some ideas for candidacy-specific adjustments in the future but kept things simpler this time as a baseline. Contest scenarios that do exist this cycle but do not match our conditions — such as same-party opposition, minor party opposition, unopposed — did not make it to the Top 50 list with the exception of LA04.

We calculate a 99.5% chance that the electorates in anywhere from 234 to 254 districts would elect a generic Democrat. The top ten most common tipping point districts — i.e., per simulation, the district with the 218th largest margin of victory in favor of whichever party de facto wins the House — are as follows, in descending order of frequency of appearance: CA49, IA03, PA04, NY03, NY18, NJ03, IA01, NJ05, NH01, and WI03.
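To make the tipping point definition concrete, here is a minimal sketch. It assumes each simulation is a dict mapping a district label to its two-way margin (positive meaning the Democrat wins); the function name, the `majority` parameter, and the toy five-district data are hypothetical and only for illustration.

```python
from collections import Counter

def tipping_point(margins, majority=218):
    """Return the district holding the majority-th largest margin for the winning party."""
    d_seats = sum(1 for m in margins.values() if m > 0)
    d_wins = d_seats >= majority
    # Rank districts from the winner's safest seat to its weakest:
    # descending D margin if Democrats win, ascending if Republicans do.
    ranked = sorted(margins, key=margins.get, reverse=d_wins)
    return ranked[majority - 1]

# Toy example: five districts, so a majority is three seats.
sims = [
    {"A": 0.30, "B": 0.10, "C": 0.02, "D": -0.05, "E": -0.20},   # D majority, tips at C
    {"A": 0.30, "B": -0.02, "C": -0.10, "D": -0.15, "E": 0.20},  # R majority, tips at B
    {"A": 0.25, "B": 0.08, "C": 0.01, "D": -0.07, "E": -0.30},   # D majority, tips at C
]

# Frequency of each district appearing as the tipping point across simulations.
tipping_counts = Counter(tipping_point(sim, majority=3) for sim in sims)
print(tipping_counts.most_common())
```

With the full map, `majority` stays at its default of 218 and the counter is tallied over every simulation.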

While you can get a decent idea of the distribution of margin predictions in the Top 50 plot above, here are some hard numbers to accompany the visual. (NE02 was slightly over 50 before rounding to the nearest integer, so it’s in the R-favored list. The tables are .png exports from a Jupyter notebook, so please ignore the aspect ratios.)

Of districts not in our 50 most competitive list, the ones whose electorates we believe would flip the seat from Democrat to Republican given a generic contest are, in order of increasing probability (in each, the Republican wins at least 86 times out of 100): NJ07, MI09, AK00, AZ02, FL21, CA05, TN05, GA06, AZ09, FL05, and CA20.

Conversely, of districts not in our 50 most competitive list, the ones whose electorates we believe would flip the seat from Republican to Democrat given a generic contest are, in order of increasing probability (in each, the Democrat wins at least 86 times out of 100): IA01, TX23, OH10, WI01, CA22, WA03, AZ04, NM02, OH01, FL27, MI04, NY11, CA25, CA21, MI07, CA50, CA39, IL13, MI10, TX34, FL25, MI03, CA04, MI06, CA42, PA12, and CA08.

Of the districts without current officeholders, including new districts, in a generic contest we believe the electorates in MT02, TX38, and IN02 would elect the Republican whereas the electorates in CO08, FL22, FL28, NC14, OR06, and TX37 would elect the Democrat. All of these predicted outcome probabilities are at least 96 wins out of 100. Note that FL13 has no incumbent and is in our most competitive list.

Finally, we predict all districts not mentioned above are not among the most competitive and have electorates which would elect a generic candidate of the same party affiliation as the current officeholder.

We’ve tried to keep this predictions announcement straightforward; if you have any questions, please do not hesitate to reach out to contact@volsweep.com & thank you for reading!

NOTE: For coding purposes we rename the at-large district for state xx to “xx01.” We try to revert to “xx00” for display, but if you see the former please interpret it as the at-large district.


