It's NCAA tournament time! So, over the course of the next few weeks, we'll have 63 opportunities to ask the question:

What's the likelihood that This Team beats That Team?

The question comes up in, for example,

- Brackets,
- Kaggle, and
- Kaggle Brackets.

Suppose we have $N$ teams that play a lot of games and, at the end of the season, play a tournament. How can we assess the relative strengths of the teams and use that to make predictions?

The Log5 formula is a simple estimate based only on winning percentages. It states that the probability that Team A beats Team B, denoted $p_{A,B}$, should be $$ p_{A,B} = \frac{p_A - p_A \times p_B}{p_A+p_B - 2\times p_A \times p_B}, $$ where $p_A$ and $p_B$ denote the teams' winning percentages, expressed as fractions between 0 and 1.
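As a minimal sketch, the formula translates directly into a few lines of Python. The function name `log5` and the choice to return NaN when the denominator vanishes are assumptions made for illustration, not part of the formula itself.

```python
import math

def log5(p_a: float, p_b: float) -> float:
    """Log5 estimate of the probability that Team A beats Team B.

    p_a and p_b are winning percentages given as fractions in [0, 1].
    """
    numerator = p_a - p_a * p_b
    denominator = p_a + p_b - 2 * p_a * p_b
    if denominator == 0:
        # The formula reads 0/0 here, so report "not a number".
        return math.nan
    return numerator / denominator

print(log5(0.75, 0.5))  # 0.75
```

Against a team that wins exactly half its games, the estimate is just Team A's own winning percentage.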

Here are a few reasons to think that the Log5 formula might be, at least, reasonable:

- If both teams are undefeated or both are winless, then $p_{A,B}=p_{B,A} = 0/0$, which is undefined (NaN).
- Otherwise,
    - if Team A is winless, then $p_{A,B}=0$, and
    - if Team A is undefeated, then $p_{A,B} = 1$ (since the numerator and denominator are both $1-p_B$).

- If $p_A=p_B=p$, then $p_{A,B}=(p-p^2)/(2p-2p^2) =1/2$
- The total probability is one: $$\begin{align} p_{A,B} + p_{B,A} &= \frac{p_A - p_A \times p_B}{p_A+p_B - 2\times p_A \times p_B} + \frac{p_B - p_B \times p_A}{p_B+p_A - 2\times p_B \times p_A} \\ &= \frac{p_A - p_B\times p_A + p_B - p_B \times p_A}{p_B+p_A - 2\times p_B \times p_A} = 1. \end{align}$$
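The properties in this list can also be checked mechanically. The sketch below uses exact rational arithmetic on a grid of winning percentages; the helper `log5`, the grid spacing, and the use of `None` for the 0/0 case are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def log5(p_a, p_b):
    """Log5 probability that A beats B; None stands in for the 0/0 case."""
    denominator = p_a + p_b - 2 * p_a * p_b
    if denominator == 0:
        return None
    return (p_a - p_a * p_b) / denominator

# Winning percentages 0, 1/10, ..., 1 as exact fractions.
grid = [Fraction(k, 10) for k in range(11)]

for p_a, p_b in product(grid, repeat=2):
    p_ab, p_ba = log5(p_a, p_b), log5(p_b, p_a)
    if p_ab is None:
        # Undefined only when both teams are winless or both are undefeated.
        assert p_a == p_b and p_a in (0, 1)
        continue
    assert p_ab + p_ba == 1            # total probability is one
    if p_a == 0:
        assert p_ab == 0               # a winless team never wins
    if p_a == 1:
        assert p_ab == 1               # an undefeated team always wins
    if p_a == p_b:
        assert p_ab == Fraction(1, 2)  # evenly matched teams

print("all Log5 sanity checks passed")
```

Using `Fraction` rather than floats lets the checks use exact equality instead of a tolerance.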

Suppose we have a league of three teams that play a bunch of games. At the end of the season, the results look like so:

| | Team A | Team B | Team C |
|---|---|---|---|
| Team A | 0 | 4 | 1 |
| Team B | 2 | 0 | 2 |
| Team C | 4 | 3 | 0 |

Note that the entry in row $i$ and column $j$ indicates how many times team $i$ beat team $j$. For example, Team A beat Team B 4 times.
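Under this convention, a team's wins are the sum of its row and its losses are the sum of its column. A small sketch of that bookkeeping follows; the variable names `teams` and `wins` and the helper `records` are hypothetical, chosen only for this example.

```python
teams = ["Team A", "Team B", "Team C"]

# wins[i][j] = number of times team i beat team j, copied from the table.
wins = [
    [0, 4, 1],
    [2, 0, 2],
    [4, 3, 0],
]

def records(wins):
    """Return a (won, lost) pair for each team."""
    n = len(wins)
    won = [sum(wins[i]) for i in range(n)]                        # row sums
    lost = [sum(wins[j][i] for j in range(n)) for i in range(n)]  # column sums
    return list(zip(won, lost))

for team, (won, lost) in zip(teams, records(wins)):
    print(f"{team}: {won}-{lost}, winning percentage {won / (won + lost):.3f}")

# Team A: 5-6, winning percentage 0.455
# Team B: 4-7, winning percentage 0.364
# Team C: 7-3, winning percentage 0.700
```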

Returning to the table, we see that Team A won 5 games (its row sum) and lost 6 games (its column sum) for a winning percentage of $$p_A = 5/11 \approx 45\%.$$

We can compute Team B's winning percentage in a similar way: Team B won 4 games and lost 7, so $$p_B = 4/11 \approx 36\%.$$ Thus, $$p_{A,B} = \frac{(5/11)-(5/11)\times(4/11)}{(5/11)+(4/11)-2\times(5/11)\times(4/11)} \approx 0.593.$$ That seems quite reasonable: Team A has the somewhat better record, so it comes out as a modest favorite.
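For readers who want the arithmetic spelled out, one way to organize it is to note that the numerator is $p_A(1-p_B)$ and the denominator is $p_A(1-p_B) + p_B(1-p_A)$. With the values above, this gives $$p_{A,B} = \frac{\frac{5}{11}\times\frac{7}{11}}{\frac{5}{11}\times\frac{7}{11} + \frac{4}{11}\times\frac{6}{11}} = \frac{35}{35+24} = \frac{35}{59} \approx 0.593,$$ and, by the same computation with the roles swapped, $p_{B,A} = 24/59 \approx 0.407$, so the two probabilities sum to one.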