With the election about a week away (and plenty of material stored up for the upcoming quiz), this seems like a good spot to take a break from quizzable material and focus on the coming election.

We've glanced at FiveThirtyEight.com a few times this semester. We're going to take a closer look today.

FiveThirtyEight got its start in 2008 and correctly predicted the results of the presidential election in 49 states. In 2012, they did a little better, correctly calling all 50 states.

In 2016, FiveThirtyEight gave Trump a 30% chance of winning - a much higher chance than most other outlets gave him. When you look at it from the probabilistic perspective, Trump's victory is hardly shocking.

FiveThirtyEight maintains an election forecast page, where we see that they currently give Joe Biden an 87% chance of victory.

To be clear, 13% events happen a lot - about 13% of the time! Kevin Durant misses a free throw almost 13% of the time. If he were shooting a free throw to win the NBA championship, most sports fans wouldn't just walk away!

One thing that's so great about FiveThirtyEight's work is how clearly they explain it, with published data. If you check out the explanation of their forecast, though, you'll probably notice a lot of unfamiliar terminology - particularly when they discuss simulation.

Part of our objective here today is to explain what that is and why it's important when dealing with the complications introduced by the electoral college.

A recent poll of 965 likely voters by Investor's Business Daily suggests that 49.6% of them will vote for Joe Biden while 44.7% of them will vote for Donald Trump.

What does this say in terms of assigning a probability to the outcome of the popular vote?

Recalling our recent work on the difference between two multinomial proportions, we might compute $$T = \frac{p_B-p_T}{SE},$$ where $p_B=0.496$ is the proportion of folks indicating support for Biden, $p_T=0.447$ is the proportion of folks indicating support for Trump, and $$SE=\sqrt{\frac{(p_B+p_T)-(p_B-p_T)^2}{n}} \approx0.03122.$$

As it turns out, $T\approx1.569486$. If we compare this against the standard normal, we find that $$P(Z < T) \approx 0.94173.$$ I guess this could be interpreted as saying that Biden has a 94% chance of winning the popular vote.
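The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names (`p_b`, `p_t`, `se`, `t`, `prob`) are our own choices and simply mirror the symbols in the formulas.

```python
from math import sqrt
from statistics import NormalDist

n = 965        # likely voters in the poll
p_b = 0.496    # proportion indicating support for Biden
p_t = 0.447    # proportion indicating support for Trump

# Standard error for the difference of two multinomial proportions
se = sqrt(((p_b + p_t) - (p_b - p_t) ** 2) / n)

# Test statistic and its position on the standard normal
t = (p_b - p_t) / se
prob = NormalDist().cdf(t)   # P(Z < T)

print(f"SE = {se:.5f}, T = {t:.4f}, P(Z < T) = {prob:.4f}")
```

The printed values should agree with the ones quoted above, up to rounding.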

Of course, winning the popular vote is not the same as winning the election.

The president is chosen directly by the Electoral College. Each state appoints, in such manner as the legislature thereof may direct, a number of electors equal to its number of senators and representatives. Together with the District of Columbia's electors, that yields 538 total electors; thus, a candidate needs 270 votes from the Electoral College to win the presidency.

Every state but Maine and Nebraska allocates (by state law) all its electors to the candidate who wins that state's popular vote.

Maine and Nebraska each award 2 electoral votes to the statewide winner and 1 more vote to the winner of each congressional district.

In addition, the District of Columbia has 3 electors.

The village of Bexley needs a new mayor to be elected (in an emergency situation) by three fickle city council members. The candidates are Joe and Don.

The electors (with probability $P_J$ of voting for Joe) are:

Name: | Nancy | Chuck | Mitch |
---|---|---|---|
$P_J$: | 0.8 | 0.7 | 0.2 |

What's the probability that Joe wins?

In order to win, Joe needs at least two of Nancy (N), Chuck (C), and Mitch (M) to vote for him. Assuming the council members vote independently, the possibilities (labeled by who votes for Joe) with their probabilities are:

NC | NM | CM | NCM |
---|---|---|---|
$0.8\times0.7\times0.8$ | $0.8\times0.3\times0.2$ | $0.2\times0.7\times0.2$ | $0.8\times0.7\times0.2$ |
$0.448$ | $0.048$ | $0.028$ | $0.112$ |

If we add those probabilities up, we get $0.636$.
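The same calculation can be done by brute force: list every way the three council members might vote, compute the probability of each, and add up the ones where Joe gets a majority. Here's a minimal sketch, assuming the members vote independently; the function name `joe_win_probability` is just our own label.

```python
from itertools import product

# Probability that each council member votes for Joe
p_joe = {"Nancy": 0.8, "Chuck": 0.7, "Mitch": 0.2}

def joe_win_probability(p_joe):
    """Add up the probabilities of every outcome where Joe gets a majority."""
    names = list(p_joe)
    total = 0.0
    # Each outcome is a tuple of True/False: did that member vote for Joe?
    for votes in product([True, False], repeat=len(names)):
        prob = 1.0
        for name, for_joe in zip(names, votes):
            prob *= p_joe[name] if for_joe else 1 - p_joe[name]
        if sum(votes) > len(names) / 2:
            total += prob
    return total

exact = joe_win_probability(p_joe)
print(round(exact, 3))  # 0.636
```

With three electors there are only $2^3 = 8$ outcomes to check, so this runs instantly.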

Suppose we want to apply that technique to the Electoral College. We need to assess the probability associated with 56 regions: the 50 states, the congressional districts of Maine and Nebraska, and the District of Columbia.

Since each of these regions can go one of two possible ways, that leads to $$2^{56} = 72057594037927936$$ different scenarios to consider. That's far too many scenarios to enumerate one at a time in practice.

Simulation is the process of running a mathematical model of a process on a computer.

In the context of statistics, we can run the model a large number of times and count the number of occurrences of an event of interest to estimate the probability of that event.

We discussed simulation of coin flips earlier this semester as a way to understand the basics of probability theory.
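The Bexley election makes a nice test case, since we already know the exact answer is $0.636$. The sketch below treats each council member as a biased coin, runs the election many times, and reports the fraction of runs that Joe wins. The function name `simulate_bexley`, the number of trials, and the seed are all arbitrary choices of ours.

```python
import random

# Probability that Nancy, Chuck, and Mitch (in that order) vote for Joe
p_joe = [0.8, 0.7, 0.2]

def simulate_bexley(p_joe, trials, rng):
    """Estimate P(Joe wins) as the fraction of simulated elections he wins."""
    joe_wins = 0
    for _ in range(trials):
        # Each member votes for Joe with their own probability
        votes_for_joe = sum(rng.random() < p for p in p_joe)
        if votes_for_joe >= 2:
            joe_wins += 1
    return joe_wins / trials

rng = random.Random(538)   # fixed seed so the run is repeatable
estimate = simulate_bexley(p_joe, 100_000, rng)
print(estimate)
```

The estimate won't be exactly $0.636$, but with this many trials it should land within a few thousandths of it.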

We can use our probability estimates for each electoral region to simulate the current electoral process. For each region, we randomly pick a winner (Trump or Biden) according to the assessed probability. We then add up the votes and determine the winner. We do that a large number of times and count the number of Trump victories vs the number of Biden victories to estimate the probabilities.
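Here's what that procedure might look like in code. To be clear, this is only a sketch on a made-up map: the five regions, their electoral votes, and their probabilities are hypothetical numbers chosen for illustration, not real estimates, and the candidates are just called A and B.

```python
import random

# Hypothetical toy map: (region, electoral votes, P(candidate A wins region)).
# These numbers are invented for illustration only.
regions = [
    ("Region 1", 10, 0.90),
    ("Region 2", 7, 0.65),
    ("Region 3", 5, 0.50),
    ("Region 4", 7, 0.35),
    ("Region 5", 10, 0.10),
]

def simulate_election(regions, trials, rng):
    """Estimate P(A wins) by simulating the whole election many times."""
    total_votes = sum(votes for _, votes, _ in regions)
    needed = total_votes // 2 + 1   # a majority of the electoral votes
    a_wins = 0
    for _ in range(trials):
        # Pick a winner in each region, then add up A's electoral votes
        a_votes = sum(votes for _, votes, p in regions if rng.random() < p)
        if a_votes >= needed:
            a_wins += 1
    return a_wins / trials

rng = random.Random(270)   # fixed seed so the run is repeatable
estimate = simulate_election(regions, 50_000, rng)
print(f"Estimated P(A wins) = {estimate:.3f}")
```

The toy map is deliberately symmetric between A and B, so the estimate should come out close to $0.5$. That gives us a quick sanity check before swapping in all 56 real regions, at which point each trial still takes just 56 random draws rather than $2^{56}$ scenarios.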

FiveThirtyEight's explanation of their forecast considers a huge number of other factors. These include

- External factors (like COVID and the economy)
- Correlation between states
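To get a feel for why correlation between states matters, here's a toy experiment. It is not FiveThirtyEight's model; it's just one simple way to make regions move together, by giving every region a shared random "national" shock in addition to its own. The function name `win_probability`, the parameter `rho`, and all the numbers are our own assumptions.

```python
import random
from statistics import NormalDist

def win_probability(n_regions, p, rho, trials, rng):
    """Estimate P(candidate A carries a majority of equally weighted regions).

    Each region goes to A with probability p on its own. A shared standard
    normal shock, weighted by rho, makes the regions tend to move together;
    rho = 0 means the regions are independent.
    """
    threshold = NormalDist().inv_cdf(p)
    wins = 0
    for _ in range(trials):
        shared = rng.gauss(0, 1)          # same shock for every region
        carried = 0
        for _ in range(n_regions):
            own = rng.gauss(0, 1)         # region-specific shock
            latent = rho * shared + (1 - rho ** 2) ** 0.5 * own
            if latent < threshold:
                carried += 1
        if carried > n_regions / 2:
            wins += 1
    return wins / trials

rng = random.Random(2020)   # fixed seed so the run is repeatable
independent = win_probability(21, 0.6, 0.0, 20_000, rng)
correlated = win_probability(21, 0.6, 0.7, 20_000, rng)
print(f"independent: {independent:.3f}, correlated: {correlated:.3f}")
```

In both runs, candidate A has a 60% chance in each individual region. When the regions are independent, A's small edges add up and the overall win probability is well above 60%. When the regions are strongly correlated, a bad night in one region tends to be a bad night everywhere, and the overall win probability falls back toward 60%.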

Another way to look at this process that focuses on key states is called the electoral tree.

Here's the electoral tree for this year: