It's NCAA tournament time! So, over the course of the next few weeks, we'll have 63 opportunities to ask the question:
What's the likelihood that This Team beats That Team?
For example,
Suppose we have $N$ teams that play a lot of games and, at the end of the season, play a tournament. How can we assess the relative strengths of the teams and use that to make predictions?
The Log5 formula is a simple formula based just on winning percentages. It states that the probability that Team A beats Team B, denoted $p_{A,B}$, should be $$ p_{A,B} = \frac{p_A - p_A \times p_B}{p_A+p_B - 2\times p_A \times p_B}, $$ where $p_A$ and $p_B$ simply denote the winning percentages of the teams.
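To make the formula concrete, here is a minimal Python sketch. The function name `log5` and its error handling are illustrative choices, not part of the original discussion; it assumes both winning percentages are given as fractions between 0 and 1.

```python
def log5(p_a: float, p_b: float) -> float:
    """Log5 estimate of the probability that Team A beats Team B.

    p_a and p_b are winning percentages expressed as fractions in [0, 1].
    """
    denominator = p_a + p_b - 2 * p_a * p_b
    if denominator == 0:
        # For inputs in [0, 1] this happens only when both teams are at 0
        # or both are at 1, where the formula is undefined.
        raise ValueError("Log5 is undefined for these winning percentages")
    return (p_a - p_a * p_b) / denominator


# A .750 team against a .500 team.
print(log5(0.75, 0.5))  # 0.75
```

Against a team that wins exactly half its games, the formula simply returns Team A's own winning percentage.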
Here are a few reasons to think that the Log5 formula might be, at least, reasonable:
Suppose we have a league of three teams that play a bunch of games. At the end of the season, the results look like so:
| | Team A | Team B | Team C |
|---|---|---|---|
| Team A | 0 | 4 | 1 |
| Team B | 2 | 0 | 2 |
| Team C | 4 | 3 | 0 |
Note that the entry in row $i$ and column $j$ indicates how many times team $i$ beat team $j$. For example, Team A beat Team B 4 times.
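The row-and-column convention can be sketched in Python as a nested list. The names `teams`, `wins`, `games_won`, and `games_lost` are hypothetical and used only for illustration: a team's wins are the sum of its row, and its losses are the sum of its column.

```python
teams = ["Team A", "Team B", "Team C"]

# wins[i][j] is the number of times team i beat team j.
wins = [
    [0, 4, 1],  # Team A
    [2, 0, 2],  # Team B
    [4, 3, 0],  # Team C
]


def games_won(i: int) -> int:
    """Total wins for team i: the sum of row i."""
    return sum(wins[i])


def games_lost(i: int) -> int:
    """Total losses for team i: the sum of column i."""
    return sum(row[i] for row in wins)


print(wins[0][1])  # 4, since Team A beat Team B 4 times

for i, name in enumerate(teams):
    print(name, "won", games_won(i), "and lost", games_lost(i))
```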
Continuing with the same table,
| | Team A | Team B | Team C |
|---|---|---|---|
| Team A | 0 | 4 | 1 |
| Team B | 2 | 0 | 2 |
| Team C | 4 | 3 | 0 |
we see that Team A won 5 games and lost 6 games for a winning percentage of $$p_A = 5/11 \approx 45\%.$$
We can compute Team B's winning percentage in a similar way and we get $$p_B = 4/11 \approx 36\%.$$ Thus, $$p_{A,B} = \frac{(5/11)-(5/11)\times(4/11)}{(5/11)+(4/11)-2\times(5/11)\times(4/11)} = \frac{35}{59} \approx 0.593.$$ That seems quite reasonable!
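The whole calculation can be checked with a short Python sketch. It uses the standard library's `fractions.Fraction` to keep the arithmetic exact; the helper names `winning_percentage` and `log5` are illustrative assumptions rather than anything defined in the text.

```python
from fractions import Fraction

# wins[i][j] is the number of times team i beat team j (order: A, B, C).
wins = [
    [0, 4, 1],
    [2, 0, 2],
    [4, 3, 0],
]


def winning_percentage(i: int) -> Fraction:
    """Wins divided by games played for team i."""
    won = sum(wins[i])                  # row sum
    lost = sum(row[i] for row in wins)  # column sum
    return Fraction(won, won + lost)


def log5(p_a: Fraction, p_b: Fraction) -> Fraction:
    """Log5 estimate of the probability that Team A beats Team B."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)


p_a = winning_percentage(0)  # 5/11
p_b = winning_percentage(1)  # 4/11
p_ab = log5(p_a, p_b)

print(p_a, p_b)                   # 5/11 4/11
print(p_ab, round(float(p_ab), 3))  # 35/59 0.593
```

Swapping the arguments gives Team B's chance of beating Team A, and the two probabilities sum to 1.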