An archive of Mark's Spring 2018 Numerical Analysis course.

Eigenranking

mark

(10 pts)

Use an eigenvalue analysis to rank your favorite team’s conference in your favorite season. The conference should have at least 5 teams but, as you’ll be entering a matrix by hand, I recommend that it not be too large. Be sure to include the source of your data and all the code you use to produce your answer. You might double check to make sure that your code runs when copied from our forum into a Jupyter notebook.

No favorite team?!!? Might I suggest something like the 1968 Big Ten Football Season or the 2012 Big South Basketball season or the 2001 AFC Central or this year’s Women’s Big South? No duplicates, though - everyone gets to do their own analysis!

theoldernoah

The 2015 Panthers and the NFC South/East

I chose the 2015 Panthers as my favorite team. Because the NFC South only has 4 teams, we are combining the NFC South and NFC East, as each team from both divisions played each other at least once.

I got all the stats from here. (If you add the team's name at the end of the link, it takes you to their entire regular season.) Now, let’s jump into the code.

nfc_south_east = [
    'Panthers','Falcons','Saints','Bucs','Giants','Redskins',
    'Eagles','Cowboys'
]
import numpy as np
M = np.matrix([
[0,1,2,2,1,1,1,1],
[1,0,0,0,1,1,1,1],
[0,2,0,1,1,0,0,1],
[0,2,1,0,0,0,1,1],
[0,0,0,1,0,1,0,1],
[0,0,1,1,1,0,2,1],
[0,0,1,0,2,0,0,1],
[0,0,0,0,1,1,1,0]
])
from scipy.linalg import eig
vals, vecs = eig(M)
vals

# Out: 
# array([ 4.28727348+0.j        ,  1.00749950+0.j        ,
#     -0.71794231+1.89827189j, -0.71794231-1.89827189j,
#     -0.94231293+1.55897184j, -0.94231293-1.55897184j,
#      -0.98713125+0.20464143j, -0.98713125-0.20464143j])

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
[nfc_south_east[i] for i in ranking]

# Out:
# ['Panthers',
# 'Falcons',
# 'Redskins',
# 'Bucs',
# 'Saints',
# 'Eagles',
# 'Giants',
# 'Cowboys']

[vec[i] for i in ranking]

# Out 
# [0.62498069505912424,
# 0.3667320837825902,
# 0.34912184168651172,
# 0.34256462501387774,
# 0.34015089155098027,
# 0.21588709630130931,
# 0.20312502737999952,
# 0.17916607590152081]

Walking through the code: first we define the NFC South/East teams in the same order as the rows and columns of the matrix. We then define the matrix, whose entry in row i and column j tells us how many times team i defeated team j. Next we use the eig command from scipy.linalg to find the eigenvalue that is largest in absolute value. The output shows that the first eigenvalue is the largest, so we use vec = abs(vecs[:,0]), ranking = np.argsort(vec).tolist(), and ranking.reverse() to compute the ranking. We then use the list of teams to turn the ranking into team names. The final lines tell us the relative strength of each team according to the eigenvector.
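
Rather than reading the position of the largest eigenvalue off the printed array, the index can also be picked programmatically. The following is only a minimal sketch of that idea, assuming the same matrix and team list as above; the helper name eigen_rank is made up for illustration.

```python
import numpy as np
from scipy.linalg import eig

def eigen_rank(M, names):
    """Rank names using the eigenvector of the eigenvalue with the largest real part."""
    A = np.asarray(M, dtype=float)
    vals, vecs = eig(A)
    k = int(np.argmax(vals.real))   # index of the dominant eigenvalue
    vec = np.abs(vecs[:, k])        # abs removes the arbitrary overall sign
    order = np.argsort(vec)[::-1]   # strongest team first
    return [names[i] for i in order], vec[order], vals[k].real

nfc_south_east = [
    'Panthers', 'Falcons', 'Saints', 'Bucs', 'Giants', 'Redskins',
    'Eagles', 'Cowboys'
]
M = np.array([
    [0, 1, 2, 2, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 1, 1],
    [0, 2, 0, 1, 1, 0, 0, 1],
    [0, 2, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0, 2, 1],
    [0, 0, 1, 0, 2, 0, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
])

teams, strengths, lam = eigen_rank(M, nfc_south_east)
print(lam)
print(teams)
```

This avoids hard-coding the column index, which matters because eig does not promise to return the largest eigenvalue first.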

You may notice that the Falcons are ranked above the Redskins, even though the Redskins had 6 wins and the Falcons had 5. This error can be attributed to the simplicity of our ranking system, though it is strange that it occurred near the top of the rankings.

mark

@theoldernoah Looks great! I made one little edit so that all the output is commented.

Regarding:

I don’t think that “error” is the correct term here, though some folks will certainly find fault with it. I think that this shows that strength of schedule is important. In the following, M2 is the matrix that tells us how many times teams i and j played one another. We then sort the rows and columns according to the rankings:

M2 = M+M.transpose()
M2[ranking][:,ranking]

# Out:
# matrix([
#   [0, 2, 1, 2, 2, 1, 1, 1],
#   [2, 0, 1, 2, 2, 1, 1, 1],
#   [1, 1, 0, 1, 1, 2, 2, 2],
#   [2, 2, 1, 0, 2, 1, 1, 1],
#   [2, 2, 1, 2, 0, 1, 1, 1],
#   [1, 1, 2, 1, 1, 0, 2, 2],
#   [1, 1, 2, 1, 1, 2, 0, 2],
#   [1, 1, 2, 1, 1, 2, 2, 0]])

The Falcons and Redskins correspond to rows 1 and 2 (after row 0, of course). Since the rows and columns are now ordered according to the ranking, we can see that the Redskins played the bottom 3 teams twice each and everyone else only once. The Falcons played the bottom 3 teams once each and everyone else twice except the Redskins.
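
One rough way to put a number on strength of schedule is to average each team's opponents' eigenvector strengths, weighted by the number of games played. This is just a sketch, not part of the analysis above; the name schedule_strength is illustrative, and the matrix is the one from the original post.

```python
import numpy as np
from scipy.linalg import eig

teams = ['Panthers', 'Falcons', 'Saints', 'Bucs', 'Giants', 'Redskins',
         'Eagles', 'Cowboys']
M = np.array([
    [0, 1, 2, 2, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 1, 1, 1],
    [0, 2, 0, 1, 1, 0, 0, 1],
    [0, 2, 1, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 1, 0, 2, 1],
    [0, 0, 1, 0, 2, 0, 0, 1],
    [0, 0, 0, 0, 1, 1, 1, 0],
])

vals, vecs = eig(M)
vec = np.abs(vecs[:, np.argmax(vals.real)])   # dominant eigenvector

M2 = M + M.T                                  # games played between teams i and j
# Weighted average of opponents' strengths, weights = games played
schedule_strength = (M2 @ vec) / M2.sum(axis=1)

for name, s in zip(teams, schedule_strength):
    print(f"{name:10s} {s:.3f}")
```

With this measure, a team that played the stronger teams twice gets a larger number than one that played the weaker teams twice.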

anonymous_user

I chose the 1996 Florida Gators, who won the SEC and National Championships that year. I was in the stands for several of these games, though my seats were generally terrible.

Here we look at the Gators’ conference, the 1996 SEC. All data was sourced from https://www.sports-reference.com/cfb/conferences/sec/1996-schedule.html.

I used the following code:

from scipy.linalg import eig
import numpy as np
SEC = [
    'alabama','arkansas','auburn','florida','georgia',
    'kentucky','louisiana state','mississippi','mississippi state',
    'south carolina','tennessee','vanderbilt']

M=np.matrix([
    [0,1,1,0,0,1,1,1,0,0,0,1],
    [0,0,0,0,0,0,0,1,1,0,0,0],
    [0,1,0,0,0,0,0,1,1,1,0,0],
    [1,1,1,0,1,1,1,0,0,1,1,1],
    [0,0,1,0,0,0,0,0,1,0,0,1],
    [0,0,0,0,1,0,0,0,1,0,0,1],
    [0,1,1,0,0,1,0,1,1,0,0,1],
    [0,0,0,0,1,0,0,0,0,0,0,1],
    [1,0,0,0,0,0,0,1,0,1,0,0],
    [0,1,0,0,1,1,0,0,0,0,0,1],
    [1,1,0,0,1,1,0,1,0,1,0,1],
    [0,0,0,0,0,0,0,0,0,0,0,0]
])
vals, vecs = eig(M)
vals
#array([  0.00000000e+00+0.j        ,   0.00000000e+00+0.j        ,
     #2.57238438e+00+0.j        ,  -7.56393108e-01+1.65010183j,
    #-7.56393108e-01-1.65010183j,  -2.36408675e-01+1.19048775j,
    #-2.36408675e-01-1.19048775j,  -7.64715836e-01+0.j        ,
     #8.89675094e-02+0.31114877j,   8.89675094e-02-0.31114877j,
     #3.86403335e-16+0.j        ,   0.00000000e+00+0.j        ])
vec = abs(vecs[:,2])
ranking = np.argsort(vec).tolist()
ranking.reverse()
ranking
[SEC[i] for i in ranking]
#['florida',
# 'tennessee',
# 'alabama',
# 'louisiana state',
#'auburn',
#'mississippi state',
#'georgia',
#'south carolina',
#'kentucky',
#'arkansas',
#'mississippi',
#'vanderbilt']

We first import the necessary packages for python, and define a list of the team names in the SEC, as well as the adjacency matrix M, which at position i,j stores the number of times team i beat team j, as indexed by the list of team names. We use the eig command to solve the eigensystem of this matrix, and find that the largest eigenvalue is in position 2. We use the corresponding eigenvector to assign a rank to each team, matched by the index order.

Ultimately, the ranking makes sense: the National and Conference champs are up top, and nationally ranked Tennessee, led by junior quarterback Peyton Manning, comes in at second place. Tennessee would go on to win the National Championship two seasons later.

funmanbobyjo

I chose the 2016-2017 Big South Conference for women’s basketball. Each team played each other twice. Data was used from http://bigsouthsports.com/standings.aspx?path=wbball&standings=1067.

Code:

womans_big_south = [
'Liberty','Radford','UNC Asheville','High Point','Gardner-Webb','Campbell',
'Presbyterian','Charleston Southern','Longwood', "Winthrop"
]
import numpy as np
M = np.matrix([
[0,1,0,1,1,2,1,2,2,2],
[1,0,2,1,2,2,1,1,2,2],
[2,0,0,0,1,0,1,1,2,2],
[1,1,2,0,2,0,2,1,2,2],
[1,0,1,0,0,0,0,2,2,2],
[0,0,2,2,2,0,2,1,2,2],
[1,1,1,0,2,0,0,1,2,2],
[0,1,1,1,0,1,1,0,2,2],
[0,0,0,0,0,0,0,0,0,1],
[0,0,0,0,0,0,0,0,1,0]
])
from scipy.linalg import eig
vals, vecs = eig(M)
vals

This code puts the teams into rows and columns from left to right and up to down in order of the teams listed. So the top row represents the wins of Liberty, the second row Radford and so on.

Out[1]:
      array([ 6.43047348+0.j        , -0.88617868+2.74048792j,
       -0.88617868-2.74048792j, -0.87808449+0.j        ,
       -0.97048036+1.11533513j, -0.97048036-1.11533513j,
       -0.91953546+1.37093199j, -0.91953546-1.37093199j,
       1.00000000+0.j        , -1.00000000+0.j        ])

This is the output of eigenvalues. The first eigenvalue, about 6.43, is the largest, so we will use index 0 in the following code.

 vec = abs(vecs[:,0])
 ranking = np.argsort(vec).tolist()
 ranking.reverse()
 ranking

[womans_big_south[i] for i in ranking]

This orders the teams from best to worst according to the size of their entries in the dominant eigenvector.

 Out[5]:
   ['Radford',
   'Liberty',
    'High Point',
    'Campbell',
    'Presbyterian',
    'Charleston Southern',
    'UNC Asheville',
    'Gardner-Webb',
    'Winthrop',
    'Longwood']


 In [6]:
         [vec[i] for i in ranking]
 Out[6]:
         [0.4801749546207964,
          0.42492306684237974,
          0.41117584485363995,
          0.39830800052249909,
          0.28391994556352101,
          0.2836578900733201,
          0.25047579482447974,
          0.19325398745239719,
          0.0,
          0.0]

This is a list of the entries of the dominant eigenvector, which show the relative strength of each team.
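
The two 0.0 entries at the bottom have a structural explanation. In M, Longwood and Winthrop only ever beat each other, so their two rows say v_9 = λ v_8 and v_8 = λ v_9; with λ ≈ 6.43 the only solution is v_8 = v_9 = 0. In graph terms, no chain of wins leads from those two teams back to the rest of the conference, so the matrix is reducible. One way to check this is to look at the strongly connected components of the "beats" graph. A sketch using scipy.sparse.csgraph.connected_components (variable names are illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

womans_big_south = [
    'Liberty', 'Radford', 'UNC Asheville', 'High Point', 'Gardner-Webb',
    'Campbell', 'Presbyterian', 'Charleston Southern', 'Longwood', 'Winthrop'
]
M = np.array([
    [0, 1, 0, 1, 1, 2, 1, 2, 2, 2],
    [1, 0, 2, 1, 2, 2, 1, 1, 2, 2],
    [2, 0, 0, 0, 1, 0, 1, 1, 2, 2],
    [1, 1, 2, 0, 2, 0, 2, 1, 2, 2],
    [1, 0, 1, 0, 0, 0, 0, 2, 2, 2],
    [0, 0, 2, 2, 2, 0, 2, 1, 2, 2],
    [1, 1, 1, 0, 2, 0, 0, 1, 2, 2],
    [0, 1, 1, 1, 0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
])

# An edge i -> j means team i beat team j at least once.
n_components, labels = connected_components(
    csr_matrix(M), directed=True, connection='strong'
)
print(n_components)   # more than 1 means the matrix is reducible
for name, label in zip(womans_big_south, labels):
    print(label, name)
```

Teams that share a label can each be reached from the other through a chain of wins; teams in a separate bottom component cannot be compared with each other by this eigenvector.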

jorho85

So, I wanted to look at my favorite baseball season of all time, which would be 1995, when the Braves won the World Series. I was able to find a win-loss table that showed the games played against each individual team here at the 1995 baseball season. Note I used the Record vs. opponents table.

Before we start, I thought it would be interesting to look at the NL East standings at the end of the season and see how the ranking analysis, which only looks at the games played within the division, changes these standings.

  1. Atlanta Braves 90-54
  2. New York Mets 69-75
  3. Philadelphia Phillies 69-75
  4. Florida Marlins 67-76
  5. Montreal Expos 66-78

First, I made a list of team names so that I could have Python keep track of which team was in which spot.

NL_east = ['Atlanta Braves','New York Mets','Philadelphia Phillies','Florida Marlins','Montreal Expos']

So with this list I have made the Braves team 0 and the Expos team 4. Then I recorded the wins each team had against the others and stored that information in a matrix.

import numpy as np
M = np.matrix([
[0,5,7,10,9],
[8,0,7,6,6],
[6,6,0,7,5],
[3,7,6,0,6],
[4,7,8,7,0]])

Then using the same method from the EigenRankings page, I had python find the eigenvalues and eigenvectors of matrix M.

from scipy.linalg import eig
vals, vecs = eig(M)
vals

Which gave me an array of eigenvalues:

array([ 25.71733914+0.j        ,  -6.39686350+3.77976441j,
    -6.39686350-3.77976441j,  -6.46180607+1.03872398j,
    -6.46180607-1.03872398j])

From this we see that 25.71733914 is both the only real eigenvalue and the largest eigenvalue of the matrix M. Thus, from the Perron-Frobenius theorem, we know that this eigenvalue and its corresponding eigenvector are dominant, and the dominant eigenvector will tell us the strength of the teams we are ranking. Since that eigenvalue was in the 0^{th} position, I used the same method from EigenRankings to get the corresponding eigenvector and then order its entries from largest to smallest.

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
ranking
[0, 1, 4, 2, 3]

Now I can have python tell me what team each of these numbers corresponded to using the following code

[NL_east[i] for i in ranking]
['Atlanta Braves',
'New York Mets',
'Montreal Expos',
'Philadelphia Phillies',
'Florida Marlins'] 

From this we see that while the Expos' overall record for the '95 season had them in last place, their strength relative to the other teams in the NL East has them ranked 3rd, which suggests that if they had only played teams in their own division they would have had a better season. Anyway, to view these strength ratings we can use the code:

[vec[i] for i in ranking]
[0.51039006086196836,
 0.46656331153731462,
 0.4423501351720362,
 0.41942918069871438,
 0.38759022484105227]

From the strength ratings it looks as if the top four teams were relatively close in strength, but the Braves still managed to win 21 more games than the Mets, which implies that the Braves did most of their winning outside of the division.
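
As a cross-check on the eigenvector ranking, the in-division win-loss records can be read straight off M: row sums are wins and column sums are losses. A short sketch, assuming the same M and NL_east list as above:

```python
import numpy as np

NL_east = ['Atlanta Braves', 'New York Mets', 'Philadelphia Phillies',
           'Florida Marlins', 'Montreal Expos']
M = np.array([
    [0, 5, 7, 10, 9],
    [8, 0, 7, 6, 6],
    [6, 6, 0, 7, 5],
    [3, 7, 6, 0, 6],
    [4, 7, 8, 7, 0],
])

wins = M.sum(axis=1)     # row i: games team i won inside the division
losses = M.sum(axis=0)   # column i: games team i lost inside the division

# Print the in-division standings, most wins first
for name, w, l in sorted(zip(NL_east, wins, losses), key=lambda t: -t[1]):
    print(f"{name:22s} {w}-{l}")
```

The in-division records come out as Braves 31-21, Mets 27-25, Expos 26-26, Phillies 24-28, and Marlins 22-30, which is the same order as the eigenvector ranking.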

nathan

2004-2005 ACC Basketball season

(AKA the last time Wake Forest was ever any good at anything in sports)

(Also happens to be the last season they had Chris Paul)

If you ask any Wake Forest fan, they will tell you that this season was the last time that WFU was a powerhouse in basketball. After Chris Paul left for the NBA, Wake Forest went from being 13-3 and 2nd in conference in the 04-05 season, to 3-13 and last in conference in the 05-06 season. WFU basketball has consistently been one of the worst teams in the ACC after this season (cries internally).

So while we’re at it, let’s reminisce on the good 'ol days.

We will need the following packages:

from scipy.linalg import eig
import numpy as np

The matrix for the 2004-2005 ACC season is below. The sum of values in row i is the total number of conference wins, and the sum of values in column i is the total number of conference losses for the corresponding team.

ACC = [
    'North Carolina','Wake Forest','Duke','VA Tech','Miami','GA Tech','NC State','Maryland',
    'Clemson','Florida State','Virginia'
]

M = np.matrix([
[0, 0, 1, 0, 1, 1, 2, 2, 3, 2, 2],#UNC
[1, 0, 1, 1, 2, 1, 2, 1, 1, 1, 2],#WFU
[1, 1, 0, 1, 2, 2, 1, 0, 0, 1, 2],#Duke
[0, 0, 1, 0, 2, 1, 1, 1, 1, 0, 1],#VT
[0, 0, 0, 0, 0, 0, 1, 1, 2, 2, 1],#Miami
[0, 1, 0, 0, 2, 0, 0, 0, 2, 2, 1],#GT
[0, 0, 0, 1, 0, 2, 0, 2, 1, 0, 1],#NCSU
[0, 0, 2, 1, 0, 1, 0, 0, 0, 1, 2],#Mary
[0, 0, 0, 2, 0, 0, 0, 1, 0, 2, 0],#Clem
[0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1],#FSU
[0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0],#UVA
])

vals, vecs = eig(M)
vals

Out[8]:
array([ 6.68450325+0.j        , -0.35534622+2.56618006j,
       -0.35534622-2.56618006j, -0.80045596+2.4349769j ,
       -0.80045596-2.4349769j , -1.95263259+0.j        ,
        0.17927545+0.j        , -0.79511589+0.4613889j ,
       -0.79511589-0.4613889j , -0.50465498+1.09103758j,
       -0.50465498-1.09103758j])

As we can see, the largest eigenvalue is at position 0. Thus, the dominant eigenvector is in the zeroth position. For simplicity, we can sort this eigenvector from largest to smallest. I include the eigenvector entries with the team names.

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
[ACC[i] + ' ---------------- ' + str(vec[i]) for i in ranking]

Out[27]:
['Wake Forest ---------------- 0.47494803476',
 'North Carolina ---------------- 0.453235112579',
 'Duke ---------------- 0.418093946278',
 'VA Tech ---------------- 0.283211171401',
 'Maryland ---------------- 0.269904962174',
 'GA Tech ---------------- 0.253788977773',
 'NC State ---------------- 0.24491747343',
 'Miami ---------------- 0.199934020145',
 'Clemson ---------------- 0.175861970045',
 'Florida State ---------------- 0.169611302918',
 'Virginia ---------------- 0.130690626476']

Look at this! WFU is ranked first despite having a slightly worse record than UNC. Go Deacs! This may be because WFU beat UNC in their only match. UNC would then go on to win the NCAA tournament.

When looking more closely at the eigenvector values, we can see that this season was heavily dominated by WFU, UNC, and Duke. After Duke, the 4th place team drops by 0.14 points.
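
The drop after Duke can be located directly from the printed strengths by taking differences between consecutive entries. A small sketch using the values from the output above (the variable names are illustrative):

```python
import numpy as np

teams = ['Wake Forest', 'North Carolina', 'Duke', 'VA Tech', 'Maryland',
         'GA Tech', 'NC State', 'Miami', 'Clemson', 'Florida State', 'Virginia']
strengths = np.array([
    0.47494803476, 0.453235112579, 0.418093946278, 0.283211171401,
    0.269904962174, 0.253788977773, 0.24491747343, 0.199934020145,
    0.175861970045, 0.169611302918, 0.130690626476,
])

gaps = -np.diff(strengths)    # gap between each team and the next one down
k = int(np.argmax(gaps))      # position of the largest gap
print(f"largest gap: {gaps[k]:.3f} between {teams[k]} and {teams[k + 1]}")
```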

All data was found here: https://www.sports-reference.com/cbb/conferences/acc/2005.html

CestBeau

We all remember the 2015-2016 NCAA Basketball Tournament when Villanova bested UNC in the finals. Well, let's see how dominant Villanova was coming into the Championships. I decided to rank the 2015-2016 Big East Conference, which had 10 teams in total.

The first thing I did was list the teams and create a score matrix.

import numpy as np 
Big_East =["Butler","Creighton","DePaul","Georgetown","Marquette","Providence","Seton Hall","St. John's","Villanova","Xavier"]
Matrix = np.matrix([[0,1,2,2,1,0,2,2,0,0],
               [1,0,2,1,1,0,1,2,0,1],
               [0,0,0,0,1,1,0,1,0,0],
               [0,1,2,0,1,0,0,2,0,1],
               [1,1,1,1,0,2,0,2,0,0],
               [2,2,1,2,0,0,0,2,1,0],
               [0,1,2,2,2,2,0,2,0,1],
               [0,0,1,0,0,0,0,0,0,0],
               [2,2,2,2,2,1,2,2,0,1],
               [2,1,2,1,2,1,1,2,1,0]])

Then I used the scipy package to get the eigenvalues of this matrix

from scipy.linalg import eig
vals, vecs = eig(Matrix)
vals

array([ 6.46667856+0.j        , -0.94925450+2.68593791j,
   -0.94925450-2.68593791j, -0.81051027+1.69727814j,
   -0.81051027-1.69727814j,  0.59413298+0.j        ,
   -1.03767371+0.43890421j, -1.03767371-0.43890421j,
   -0.55926963+0.j        , -0.90666494+0.j        ])

The largest eigenvalue of 6.466 is first so we will use 0 for the following,

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
ranking
[Big_East[i] for i in ranking]

['Villanova', 'Xavier', 'Seton Hall','Providence','Butler', 'Creighton', 'Marquette',
'Georgetown',  'DePaul',
 "St. John's"]

This gives us the output we expected: Villanova leading the pack, with Xavier and Seton Hall close behind… but how close behind? We can find out by looking at the values of the dominant eigenvector.

 [vec[i] for i in ranking]
[0.55991085197552537,
0.451192600185163,
 0.36953279671682088,
 0.32699628363033117,
 0.27730678759987443,
 0.26443080324811691,
 0.23027638182611157,
 0.17780063976753332,
 0.088287256522301905,
 0.013652643424603825]

Villanova ranks far above Xavier who similarly beats out Seton Hall by a large margin.
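
The same dominant eigenvector can also be approximated without computing the whole eigensystem, using power iteration: multiply a starting vector by the matrix repeatedly and renormalize. This is a sketch of that alternative technique, not the method used above; the function name power_iteration and the iteration count are arbitrary choices.

```python
import numpy as np

Big_East = ["Butler", "Creighton", "DePaul", "Georgetown", "Marquette",
            "Providence", "Seton Hall", "St. John's", "Villanova", "Xavier"]
A = np.array([
    [0, 1, 2, 2, 1, 0, 2, 2, 0, 0],
    [1, 0, 2, 1, 1, 0, 1, 2, 0, 1],
    [0, 0, 0, 0, 1, 1, 0, 1, 0, 0],
    [0, 1, 2, 0, 1, 0, 0, 2, 0, 1],
    [1, 1, 1, 1, 0, 2, 0, 2, 0, 0],
    [2, 2, 1, 2, 0, 0, 0, 2, 1, 0],
    [0, 1, 2, 2, 2, 2, 0, 2, 0, 1],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [2, 2, 2, 2, 2, 1, 2, 2, 0, 1],
    [2, 1, 2, 1, 2, 1, 1, 2, 1, 0],
], dtype=float)

def power_iteration(A, iterations=200):
    """Approximate the dominant eigenpair by repeated multiplication."""
    v = np.ones(A.shape[0])           # positive starting vector
    for _ in range(iterations):
        w = A @ v
        v = w / np.linalg.norm(w)     # renormalize to unit length
    lam = v @ (A @ v)                 # eigenvalue estimate for the unit vector v
    return lam, v

lam, v = power_iteration(A)
order = np.argsort(v)[::-1]
print(lam)
print([Big_East[i] for i in order])
```

Because the vector stays nonnegative throughout, no abs call is needed at the end.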

Bara223

I am looking at the 2012-2013 NFL season, when the Baltimore Ravens made it all the way, winning Super Bowl XLVII. Since the AFC North only has 4 teams, I am also pulling in the AFC East.

To start, we need to pull in the appropriate libraries

import numpy as np
from scipy.linalg import eig

And the matrix for the season would look like

afc_teams = [
'Steelers','Ravens','Browns','Bengals','Patriots','Jets','Dolphins','Bills'
]
M = np.matrix([
[0,1,1,1,0,1,0,1],
[1,0,2,1,2,0,0,0],
[1,0,0,1,0,0,0,0],
[1,1,1,0,0,1,0,0],
[0,0,0,0,0,2,2,2],
[0,0,0,0,0,0,1,1],
[0,0,0,1,0,1,0,1], 
[0,0,1,0,0,1,1,0]
])

The matrix is set up so that the element in position (i,j) is how many times the team in row i beat the team in column j. To rank the teams, we compute the eigensystem of the matrix and use the largest eigenvalue.

vals, vecs = eig(M)
vals

array([ 3.37903644+0.j        ,  0.84897022+0.7149968j ,
    0.84897022-0.7149968j , -1.14031087+0.89750678j,
   -1.14031087-0.89750678j, -0.28316098+0.j        ,
   -1.25659708+0.17997325j, -1.25659708-0.17997325j])

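# Note: the largest eigenvalue above (3.379) is at index 0, but this line selects column 2,
# so the output below may not be the dominant-eigenvector ranking; vecs[:,0] is the dominant eigenvector.
vec = abs(vecs[:,2])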
ranking = np.argsort(vec).tolist()
ranking.reverse()
ranking
[afc_teams[i] for i in ranking]
['Patriots',
 'Ravens',
 'Browns',
 'Bengals',
 'Jets',
 'Bills',
 'Steelers',
 'Dolphins']

This shows us the embarrassing result of the Ravens (10-6) coming out second to the Patriots (12-4). This only captures the games that these teams played against each other, including the Ravens beating the Patriots in the AFC Championship. The strength of the teams is shown by

[vec[i] for i in ranking]
 [0.74289583347694177,
 0.39402289687863717,
 0.36168595473028037,
 0.23912296687075568,
 0.2079729104459202,
 0.1641598188064686,
 0.16379877463551606,
 0.088453366480447687]

Looking at the strengths, we see that most of the teams are relatively close to each other. The Bills and Steelers are within thousandths of each other. The Ravens/Browns and Jets/Bills are also relatively close, each pair being roughly 0.03 to 0.04 apart. The Patriots are way above the rest of the division with a whopping 0.74 strength. This may be the only time that they’ve inflated anything.

This data was collected from http://www.nfl.com/schedules/2012/POST

Lorentz

Here is an updated ranking of the Big South Conference. I used data from Massey Ratings (https://www.masseyratings.com/scores.php?s=298892&sub=10672&all=1).
The beginning of the code deals with processing the data, which I believe can be used for any other season downloaded from masseyratings.

import itertools
import numpy as np
from scipy.linalg import eig

teams=open("Big_South2018teams.txt")
games=open("Big_South2018games")

team_names=[];
for line in teams:
    line = line.strip()
    columns = line.split()
    name = columns[1]
    name_id = int(columns[0].replace(',',''))
    team_names.append(name)
    print(name_id,name)
teams.close()
n=10 #number of teams
pairs = itertools.combinations(range(1,n+1),2) #Create possible combinations of match-ups.
s=(n+1,n+1) # Initialize the adjacency matrix (row and column 0 are unused and deleted below).
M=np.zeros(s)
for a, b in pairs:
    aBeatb=0;
    bBeata=0;
    aPlayedb=0;
    games.seek(0) #Reset the iteration through the file.
    for line in games:
        line = line.strip()
        columns = line.split()
        team1 = int(columns[1].replace(',',''))
        team2 = int(columns[4].replace(',',''))
         
        if team1==a and team2==b:
            aPlayedb+=1
            aBeatb+=1
            
        if team1==b and team2==a:
            aPlayedb+=1
            bBeata+=1
        
    M[a,b]=aBeatb
    M[b,a]=bBeata
    
games.close()
M=np.delete(M,(0),axis=0)
M=np.delete(M,(0),axis=1)
print(M)

vals, vecs = eig(M)

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()

[team_names[i] for i in ranking]

results!
M =

[[0. 1. 1. 1. 2. 1. 2. 1. 0. 1.]
[1. 0. 1. 1. 1. 2. 2. 1. 0. 1.]
[1. 1. 0. 1. 1. 1. 2. 1. 1. 0.]
[1. 1. 1. 0. 1. 2. 1. 1. 1. 0.]
[1. 1. 1. 1. 0. 1. 2. 1. 2. 2.]
[1. 0. 1. 1. 1. 0. 0. 0. 0. 0.]
[0. 1. 0. 1. 0. 2. 0. 0. 0. 0.]
[1. 1. 1. 1. 2. 3. 2. 0. 1. 2.]
[2. 3. 1. 1. 1. 2. 2. 1. 0. 1.]
[1. 1. 3. 2. 0. 2. 2. 1. 1. 0.]]

Ranking

['Radford',
'UNC_Asheville',
'Liberty',
'Winthrop',
'Campbell',
'Charleston_So',
'High_Point',
'Gardner_Webb',
'Longwood',
'Presbyterian']

brian

I chose to analyze the 2017-18 Women’s Big South regular season. There are ten teams in this conference and each team played each other twice. The data I used in this analysis can be found here: http://bigsouthsports.com/standings.aspx?path=wbball&standings=1067.

Let’s walk through the code.

Below, a list containing the names of the teams in the conference is created. Below that is the adjacency matrix for the conference. The rows, i, and columns, j, correspond to the teams. Reading across a team’s row gives the number of wins the team had over the opponent designated by j. Likewise, reading down a team’s column gives the number of losses the team suffered against the opponent designated by i.

womens_big_south=['Liberty', 'Radford', 'UNC Asheville', 'High Point', 'Presbyterian', 
                  'Campbell', 'Gardner-Webb', 'Charleston Southern', 'Longwood', 'Winthrop']

import numpy as np
M=np.matrix([
[0, 1, 1, 2, 2, 2, 2, 2, 2, 2],
[1, 0, 1, 2, 2, 2, 1, 2, 2, 2],
[1, 1, 0, 1, 0, 2, 2, 2, 1, 2],
[0, 0, 1, 0, 1, 1, 1, 2, 2, 2],
[0, 0, 2, 1, 0, 1, 2, 1, 0, 2],
[0, 0, 0, 1, 1, 0, 2, 1, 1, 2],
[0, 1, 0, 1, 0, 0, 0, 2, 2, 2],
[0, 0, 0, 0, 1, 1, 0, 0, 2, 1],
[0, 0, 1, 0, 2, 1, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
])

Next, a command from a linear algebra library was imported and used to determine the largest eigenvalue of the adjacency matrix.

from scipy.linalg import eig

vals, vecs=eig(M)
vals
array([ 6.77406933+0.j        , -0.94362080+2.93744799j,
       -0.94362080-2.93744799j, -0.90631504+1.37552192j,
       -0.90631504-1.37552192j, -0.25412708+0.32398093j,
       -0.25412708-0.32398093j, -0.80812856+0.44588597j,
       -0.80812856-0.44588597j, -0.94968637+0.j        ])

With the largest eigenvalue known the next step is to use the command argsort to order the corresponding eigenvector. As you can see, the rankings seem reasonable, with Liberty, Radford, and UNC Asheville, the teams with the top three records, in the top three spots and Winthrop, which has the worst record by far, in last place.

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
[womens_big_south[i] for i in ranking]
['Liberty',
 'Radford',
 'UNC Asheville',
 'Presbyterian',
 'High Point',
 'Gardner-Webb',
 'Campbell',
 'Longwood',
 'Charleston Southern',
 'Winthrop']

Last, one can examine the relative strengths of the teams above as determined by the eigenvector.

[vec[i] for i in ranking]
[0.53250964786287791,
 0.50402667503213971,
 0.40010037537610588,
 0.28788959346948595,
 0.27189021766867499,
 0.22142860537468073,
 0.20832877842272651,
 0.18168945900723249,
 0.13376980610047617,
 0.04656865023996376]
Aisling

I chose Liverpool as my favorite team and looked at Premier League 2, Division 1 so far this season. I got my data from https://www.premierleague.com/results?team=U21. Note that not all of the teams played the same number of games, so each row had to be divided by the number of games that team played to normalize it. Note also that in soccer there are often ties; to account for this, I gave each team 0.5 points instead of 1 for a draw.
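
The row normalization can be done in one step with broadcasting instead of typing each fraction by hand. The sketch below uses a made-up three-team example, not the real Premier League 2 data; points and games_played are hypothetical names.

```python
import numpy as np

# Hypothetical data: points[i, j] = points team i took from team j
# (1 per win, 0.5 per draw).
points = np.array([
    [0.0, 1.5, 1.0],
    [0.5, 0.0, 2.0],
    [0.0, 0.0, 0.0],
])
games_played = np.array([3, 4, 3])   # made-up games played by each team

# Divide every row by that team's number of games played.
M = points / games_played[:, None]
print(M)
```

The [:, None] turns the vector of game counts into a column, so each row of points is divided by its own team's count.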

We have 12 teams:

premier_league = ['Leicester-City','Liverpool', 'Arsenal', 'Everton','Swansea-City',
             'West-Ham-United', 'Chelsea', 'Derby-County', 'Manchester-City',
             'Tottenham-Hotspur','Sunderland','Manchester-United'
            ]

From the data the resulting matrix with normalization constants:

import numpy as np
M = np.matrix([
[0, 2/19, 1/19, 2/19, 1/19, 1.5/19, 1/19, 1/19, 1/19, 1.5/19, .5/19, .5/19],
[0, 0, 1/18, 1/18, 1/18, 1/18, 1/18, 1/18, 1/18, 1/18, 2/18, 1/18],
[1/17, 0, 0, 1/17, 0, 1/17, .5/17, 1.5/17, 2/17, 0, 2/17, 2/17],
[0, 0, 1/17, 0, 2/17, .5/17, 2/17, 1/17, 1/17, 2/17, 0, .5/17],
[0, 1/18, 1/18, 0, 0, 1/18, 2/18, 1.5/18, 1/18, 1/18, 1/18, .5/18],
[.5/17, 1/17, 0, .5/17, 1/17, 0, .5/17, 0, 1/17, 1.5/17, 1/17, 2/17],
[1/17, 0, .5/17, 0, 0, .5/17, 0, 1/17, 1.5/17, 1/17, 1.5/17, 1.5/17],
[0, 1/17, .5/17, 1/17, .5/17, 1/17, 0, 0, 0, 1/17, 1/17, 1.5/17],
[1/17, 1/17, 0, 0, 1/17, 0, .5/17, 1/17, 0, 1/17, 1.5/17, .5/17],
[.5/18, 1/18, 1/18, 0, 1/18, .5/18, 0, 0, 0, 0, .5/18, 2/18],
[.5/18, 0, 0, 1/18, 0, 1/18, .5/18, 1/18, .5/18, 1.5/18, 0, 0],
[1.5/17, 0, 0, .5/17, .5/17, 0, .5/17, .5/17, .5/17, 0, 1/17, 0]
])

Compute the ranking:

from scipy.linalg import eig
vals, vecs = eig(M)
vals
vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
ranking

[premier_league[i] for i in ranking]

Out:

['Leicester-City',
 'Arsenal',
 'Liverpool',
 'Everton',
 'Swansea-City',
 'West-Ham-United',
 'Chelsea',
 'Manchester-City',
 'Derby-County',
 'Tottenham-Hotspur',
 'Sunderland',
 'Manchester-United']

[vec[i] for i in ranking]

Out:

[0.41093745983779228,
 0.34849836880859575,
 0.33250252988866236,
 0.33119113821286705,
 0.30826805608173635,
 0.28502109251610447,
 0.26634602871485447,
 0.25262677037197495,
 0.2449952525856261,
 0.21307732074838048,
 0.19366011069126468,
 0.18835663526068014]

Notice that the program ranked the teams in almost the same order as I put them in, meaning that this ranking mostly agreed with the rankings found on the site above. Arsenal in this ranking is above Liverpool, whereas on the Premier League site, Liverpool is just above Arsenal. The ‘points’ or rating each team has is also different.

dakota

Since most of my family hails from Maryland, I have to have some allegiance to the Maryland Terrapins Women’s Basketball team. I decided to look at last year’s Big 10 Conference, seeing as Maryland won the 2016-2017 Big 10 Tournament. Data can be found here, with information for each individual team obtained by clicking on the appropriate link.

The first task to ranking the teams post-conference is to create an adjacency matrix and an array of the team names. After the matrix is created, it is possible to then find the dominant eigenvector for said matrix. The following code outlines the above process.

big_ten_2017 = ['Maryland', 'Ohio', 'Michigan', 'Indiana', 'Purdue',
                'Michigan State', 'Penn State','Northwestern', 'Iowa',
                'Minnesota', 'Illinois', 'Wisconsin', 'Nebraska', 'Rutgers',
]
import numpy as np
M = np.matrix([
   [0,0,1,1,1,1,1,1,2,2,2,1,1,1], #Maryland
   [1,0,1,1,1,0,1,1,1,2,1,2,2,1], #Ohio
   [0,0,0,1,1,0,0,1,1,1,1,2,2,1], #Michigan
   [0,0,1,0,1,0,2,1,1,1,1,1,0,1], #Indiana
   [0,0,0,0,0,2,1,1,1,1,1,1,1,1], #Purdue
   [0,1,1,1,0,0,1,0,0,2,1,1,1,0], #Michigan State
   [0,0,1,0,1,0,0,1,1,1,2,1,1,0], #Penn State
   [0,0,0,1,1,1,0,0,0,0,1,1,1,2], #Northwestern
   [0,0,0,0,0,1,0,1,0,1,1,1,1,2], #Iowa
   [0,0,0,0,0,0,0,1,0,0,1,1,1,1], #Minnesota
   [0,0,0,0,0,0,0,0,1,0,0,0,1,1], #Illinois
   [0,0,0,0,0,0,0,0,0,0,1,0,1,1], #Wisconsin
   [0,0,0,1,0,1,0,0,0,0,0,0,0,1], #Nebraska
   [0,0,0,0,0,1,1,0,0,0,0,1,0,0], #Rutgers
])
from scipy.linalg import eig
vals, vecs = eig(M)
vals
#Out[]: array([ 5.81307191+0.j        , -0.50294621+2.76406247j,
#               -0.50294621-2.76406247j, -0.20073325+1.55404447j,
#               -0.20073325-1.55404447j, -0.33720888+1.19311789j,
#               -0.33720888-1.19311789j,  0.23130464+0.14723498j,
#                0.23130464-0.14723498j, -0.95209379+0.64741278j,
#               -0.95209379-0.64741278j, -1.04788214+0.j        ,
#               -0.62091740+0.52722402j, -0.62091740-0.52722402j])

Given that the first eigenvalue is the largest, its eigenvector is our dominant eigenvector. We can then use that eigenvector to give us a preliminary ranking of the teams, as shown by the following code.

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
[big_ten_2017[i] for i in ranking]
#Out[]: ['Ohio',
#         'Maryland',
#         'Indiana',
#         'Michigan State',
#         'Purdue',
#         'Michigan',
#         'Penn State',
#         'Northwestern',
#         'Iowa',
#         'Nebraska',
#         'Rutgers',
#         'Minnesota',
#         'Illinois',
#         'Wisconsin']

What is interesting is that this is fairly different from the actual post-conference/pre-tournament ranking assigned to each team, the largest jump being three places down or up for a few teams. I believe this to be because many teams had equal win-loss records and this method of ranking does not take into account any other factors.

Finally, we can use our current information to create an array of relative strengths for each team based on our eigenvector.

[vec[i] for i in ranking]
#Out[]: [0.47544101300304487,
#         0.46283013935113881,
#         0.31489498150366696,
#         0.30504622085468475,
#         0.29827886993275543,
#         0.2867578121473966,
#         0.24613018619686708,
#         0.23626300025249711,
#         0.18881821077068381,
#         0.1244837750650855,
#         0.10369193429466309,
#         0.10111087795354988,
#         0.071733831312807153,
#         0.051592263966288257]
opernie

I chose the 1968 Big Ten Conference because Ohio State won. All data is taken from https://www.sports-reference.com/cfb/conferences/big-ten/1968-schedule.html. There were 10 teams involved.

from scipy.linalg import eig
import numpy as np
BTC = ['Ohio State','Michigan','Purdue','Minnesota','Indiana','Iowa','Michigan State',
       'Illinois','Northwestern','Wisconsin']
M = np.matrix([
[0,0,0,0,0,0,0,0,0,0],
[1,0,0,0,0,0,0,0,0,0],
[1,0,0,1,0,0,0,0,0,0],
[0,1,0,0,0,1,0,0,0,0],
[0,1,1,1,0,0,0,0,0,0],
[1,0,1,0,1,0,0,0,0,0],
[1,1,1,1,1,0,0,0,0,0],
[1,1,1,1,1,1,0,0,0,0],
[1,1,1,0,0,1,1,1,0,0],
[1,1,0,1,1,1,1,0,1,0]
])
vals, vecs = eig(M)
vals

Looks like we just made a matrix whose rows and columns represent teams and whose entries are set such that each row represents the wins of the corresponding team.
This code outputs this array of eigenvalues:

array([ 0.        +0.        j,  0.        +0.        j,
        0.        +0.        j,  0.        +0.        j,
        1.39533699+0.        j, -0.46035519+1.13931768j,
       -0.46035519-1.13931768j, -0.47462662+0.        j,
        0.        +0.        j,  0.        +0.        j])

All those zeros seem to worry me.
We can now rank each team based on the corresponding entries of an eigenvector.
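
Before ranking, one sanity check on the orientation of the matrix may be worthwhile. Ohio State's row is all zeros even though they won the conference, which suggests that the matrix as entered records losses in each row, so that wins are in the columns. The sketch below compares row and column sums and then works with the transpose; treating M.T as the win matrix is an assumption based on that observation, and W is just an illustrative name.

```python
import numpy as np
from scipy.linalg import eig

BTC = ['Ohio State', 'Michigan', 'Purdue', 'Minnesota', 'Indiana', 'Iowa',
       'Michigan State', 'Illinois', 'Northwestern', 'Wisconsin']
M = np.array([
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 1, 1, 1, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 0, 1, 0],
])

row_sums = M.sum(axis=1)
col_sums = M.sum(axis=0)
for name, r, c in zip(BTC, row_sums, col_sums):
    print(f"{name:15s} row sum {r}  column sum {c}")

W = M.T                          # assumed win matrix: W[i, j] = 1 if team i beat team j
vals, vecs = eig(W)
k = int(np.argmax(vals.real))    # index of the largest eigenvalue
vec = np.abs(vecs[:, k])
order = np.argsort(vec)[::-1]
print(k, vals[k])
print([BTC[i] for i in order])   # teams whose entry is 0 cannot be separated from each other
```

The zeros among the eigenvalues come from teams with no wins or no losses, which make the matrix reducible; picking the index with argmax avoids landing on one of those zero eigenvalues.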

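# Note: the largest eigenvalue above (1.395) is at index 4, while this line uses column 0,
# whose eigenvalue is 0; vecs[:,4] is the dominant eigenvector.
vec = abs(vecs[:,0])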
ranking = np.argsort(vec).tolist() 
ranking.reverse()
ranking
[BTC[i] for i in ranking]

Note that this code lists the teams from worst to best; the result looks pretty reasonable.

['Wisconsin',
 'Northwestern',
 'Illinois',
 'Michigan State',
 'Iowa',
 'Indiana',
 'Minnesota',
 'Purdue',
 'Michigan',
 'Ohio State']
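
A hedged aside for anyone reusing this code: the eigenvalues are not necessarily returned largest first. In the array printed above, the largest real eigenvalue (1.39533699) is in position 4, while position 0 holds a zero. Here is a minimal sketch of locating the dominant eigenvalue by value rather than by position, using the eigenvalues copied from that output (the name k is my own):

```python
import numpy as np

# Eigenvalues copied from the output printed above.
vals = np.array([
    0, 0, 0, 0,
    1.39533699,
    -0.46035519 + 1.13931768j,
    -0.46035519 - 1.13931768j,
    -0.47462662,
    0, 0,
])

# Index of the eigenvalue with the largest real part.
k = int(np.argmax(vals.real))
print(k, vals[k].real)
```

Here k comes out as 4, so vecs[:, k] rather than vecs[:, 0] would be the column paired with the dominant eigenvalue.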
dumptruckman

I chose the NBA Eastern Conference for 2016-17. I found the data at http://www.landofbasketball.com/results_by_team/2016_2017_<teamname>.htm, replacing <teamname> with a team name such as cavaliers. There were 8 teams in the playoffs, and I set them up in an array like so:

nba_eastern_teams = ['Celtics', 'Bulls', 'Wizards', 'Hawks', 'Raptors', 'Bucks', 'Cavaliers', 'Pacers']

I then put the number of wins in a matrix: entry (i, j) is the number of times the team in row i beat the team in column j, and the rows and columns follow the order of the teams in nba_eastern_teams.

import numpy as np

matchups = np.matrix([
[0,2,2,1,1,2,1,3],
[2,0,1,1,2,1,4,2],
[2,3,0,3,1,3,1,3],
[2,3,1,0,2,3,3,1],
[3,1,2,1,0,3,1,2],
[1,3,1,1,1,0,1,3],
[3,0,2,1,3,3,0,3],
[0,2,1,2,1,1,1,0]])
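
As a quick sanity check on a hand-entered matrix like this one, the row sums should match each team's total wins against the other seven teams, and the column sums its losses. A small sketch (wins and losses are my own names, and I use a plain np.array here so the sums come back as 1-D arrays):

```python
import numpy as np

nba_eastern_teams = ['Celtics', 'Bulls', 'Wizards', 'Hawks',
                     'Raptors', 'Bucks', 'Cavaliers', 'Pacers']

# Same entries as the matchups matrix above: row i beat column j.
matchups = np.array([
    [0,2,2,1,1,2,1,3],
    [2,0,1,1,2,1,4,2],
    [2,3,0,3,1,3,1,3],
    [2,3,1,0,2,3,3,1],
    [3,1,2,1,0,3,1,2],
    [1,3,1,1,1,0,1,3],
    [3,0,2,1,3,3,0,3],
    [0,2,1,2,1,1,1,0]])

wins = matchups.sum(axis=1)    # total wins per team (row sums)
losses = matchups.sum(axis=0)  # total losses per team (column sums)

for team, w, l in zip(nba_eastern_teams, wins, losses):
    print(team, w, l)
```

If a team's totals disagree with the source data, the mistake is somewhere in that team's row or column.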

I then ran the matrix through scipy’s eig function.

from scipy.linalg import eig

vals, vecs = eig(matchups)
vals

# OUTPUT
# array([ 12.50439193+0.j        ,  -1.60785975+3.63828514j,
#         -1.60785975-3.63828514j,  -1.78356837+1.75905859j,
#         -1.78356837-1.75905859j,  -2.34084941+0.j        ,
#         -1.69034313+0.46030418j,  -1.69034313-0.46030418j])

The highest eigenvalue is in position 0 of the array, so I will use that eigenvector to find the rankings.

vec = abs(vecs[:,5])  # note: column 5 pairs with vals[5] = -2.34; the largest eigenvalue (12.50) is vals[0]
ranking = np.argsort(vec).tolist()
ranking.reverse()
[nba_eastern_teams[i] for i in ranking]

# OUTPUT
# ['Raptors',
#  'Celtics',
#  'Wizards',
#  'Bulls',
#  'Pacers',
#  'Bucks',
#  'Hawks',
#  'Cavaliers']

Unfortunately, this is nowhere close to the results of the playoffs, which can be found here: https://www.basketball-reference.com/leagues/NBA_2017_standings.html
It seems that this ranking system is not sufficient to predict strength in the NBA playoffs.
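
A tentative caveat on that conclusion: the text picks out position 0, but the listing reads a different column of vecs, so the ranking shown may not be the one the method intends. One way to cross-check without depending on column order is power iteration, which should converge to the dominant eigenvector for a matrix like this one. A minimal sketch (the function name power_iteration and the iteration count are my own choices, and I have not rerun the ranking here):

```python
import numpy as np

nba_eastern_teams = ['Celtics', 'Bulls', 'Wizards', 'Hawks',
                     'Raptors', 'Bucks', 'Cavaliers', 'Pacers']

matchups = np.array([
    [0,2,2,1,1,2,1,3],
    [2,0,1,1,2,1,4,2],
    [2,3,0,3,1,3,1,3],
    [2,3,1,0,2,3,3,1],
    [3,1,2,1,0,3,1,2],
    [1,3,1,1,1,0,1,3],
    [3,0,2,1,3,3,0,3],
    [0,2,1,2,1,1,1,0]], dtype=float)

def power_iteration(A, iterations=200):
    """Approximate the dominant eigenvalue and eigenvector of A."""
    v = np.ones(A.shape[0])
    for _ in range(iterations):
        w = A @ v
        v = w / np.linalg.norm(w)  # keep the vector at unit length
    lam = v @ A @ v  # Rayleigh quotient; v has unit length
    return lam, v

lam, vec = power_iteration(matchups)
ranking = np.argsort(vec)[::-1]  # strongest first
print(lam)
print([nba_eastern_teams[i] for i in ranking])
```

The eigenvalue estimate should agree with the 12.50439193 reported by eig, which is a reasonable sign that the vector belongs to the dominant eigenvalue.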

Sampson

So I will start off by saying I don’t sport. With this in mind, I chose the 2007-2008 Big South Conference for no particular reason. My team array is the following:

Big_South_2007 = ('UNC_Asheville','Winthrop','High_Point','Liberty','Virginia_Military_Institute',
'Coastal_Carolina','Radford','Charlston_Southern')

I then made a matrix for all the teams where each entry counts how many times the team in that row beat the team in that column.

import numpy as np
from scipy.linalg import eig

M = np.matrix([
[0,2,1,1,1,2,1,2],
[0,0,1,2,2,1,2,2],
[1,1,0,1,2,0,2,1],
[1,0,1,0,1,2,1,2],
[1,0,0,1,0,2,0,2],
[0,1,2,0,0,0,2,1],
[1,0,0,1,2,0,0,1],
[0,0,1,0,0,1,1,0]
])

Then, using the eig function from scipy.linalg, I produced the eigenvalues and eigenvectors.

vals, vecs = eig(M)
vals
# Out
# array([ 6.29549010 +0.00000000e+00j, -0.99313005 +2.97731726e+00j,
#        -0.99313005 -2.97731726e+00j, -0.82024172 +1.35458277e+00j,
#        -0.82024172 -1.35458277e+00j, -0.66874657 +0.00000000e+00j,
#        -1.00000000 +2.35513869e-16j, -1.00000000 -2.35513869e-16j])

Finally, I produced the rankings from the eigenvector. UNCA is on top, as per usual, followed closely by Winthrop. UNCA and Winthrop played each other in the final, and Winthrop came through with the win. The only difference between my rankings and those of the NCAA is that the Virginia Military Institute and Coastal Carolina are switched. Both teams had identical wins and losses, but obviously the NCAA weighed other variables for their rankings.

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
ranking
[Big_South_2007[i] for i in ranking]

# Out 
#['UNC_Asheville',
#'Winthrop',
#'High_Point',
#'Liberty',
#'Coastal_Carolina',
#'Virginia_Military_Institute',
#'Radford',
#'Charlston_Southern']
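
The VMI and Coastal Carolina tie mentioned above is a nice illustration of what the eigenvector adds over a plain win count. A rough sketch, using the same matrix, confirms that the two teams have identical totals, so any difference in their eigenvector scores has to come from which opponents they beat (the helper name record is my own):

```python
import numpy as np

Big_South_2007 = ('UNC_Asheville','Winthrop','High_Point','Liberty',
                  'Virginia_Military_Institute','Coastal_Carolina',
                  'Radford','Charlston_Southern')

# Same entries as M above: wins of the row team over the column team.
M = np.array([
    [0,2,1,1,1,2,1,2],
    [0,0,1,2,2,1,2,2],
    [1,1,0,1,2,0,2,1],
    [1,0,1,0,1,2,1,2],
    [1,0,0,1,0,2,0,2],
    [0,1,2,0,0,0,2,1],
    [1,0,0,1,2,0,0,1],
    [0,0,1,0,0,1,1,0]
])

def record(i):
    """Return (wins, losses) for team i from its row and column sums."""
    return int(M[i].sum()), int(M[:, i].sum())

vmi = Big_South_2007.index('Virginia_Military_Institute')
coastal = Big_South_2007.index('Coastal_Carolina')
print(record(vmi), record(coastal))
```

Both teams come out at 6 wins and 8 losses within the conference.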
Cornelius

Not only do I not find sports interesting, I am morally opposed to them. I shudder to think about all of the wasted time and resources that have gone into the perpetuation of what amounts to adults moving a ball around an open space.

With that being said I will do the 2017 - 2018 Men’s Big South Tournament.

The list of teams that are playing is named “teams”. The corresponding matrix “M” records the wins and losses of the teams against one another: entry (i, j) is the number of times the team in row i defeated the team in column j.

teams = ["UNCA","Radford","Winthrop","Campbell","Liberty","Gardner_Webb",
     "High Point","Charleston Southern","Presbyterian","Longwood"]

import numpy as np
from scipy.linalg import eig

M = np.matrix([
    [0,1,1,2,1,1,1,3,2,2],
    [1,0,2,1,3,1,1,1,2,3],
    [1,1,0,1,0,3,2,1,2,2],
    [0,1,1,0,2,1,1,2,2,1],
    [2,0,2,1,0,1,1,1,2,1],
    [1,1,0,1,1,0,1,1,2,1],
    [1,1,0,1,1,1,0,1,1,2],
    [0,1,1,0,1,1,2,0,2,2],
    [0,0,0,0,0,0,1,1,0,2],
    [0,0,0,1,1,1,1,0,0,0]
])
vals, vecs = eig(M)
vals

This code produces the array:

array([ 8.86458228+0.j        , -1.42251792+2.22250211j,
   -1.42251792-2.22250211j, -0.41582791+2.08380196j,
   -0.41582791-2.08380196j, -1.36105506+1.40560219j,
   -1.36105506-1.40560219j, -1.34319525+0.j        ,
   -0.56129263+0.54682978j, -0.56129263-0.54682978j])

The largest eigenvalue, 8.86, is in position 0 of this array, and the entries of its eigenvector represent the relative strengths of the teams. Running the following code turns that eigenvector into a ranked list:

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
[teams[i] for i in ranking]

Here is the result:

['Radford',
 'UNCA',
 'Winthrop',
 'Liberty',
 'Campbell',
 'High Point',
 'Charleston Southern',
 'Gardner_Webb',
 'Longwood',
 'Presbyterian']
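
To see how far apart the teams are, and not just their order, the eigenvector entries can be rescaled to sum to 1 and paired with the team names. A minimal sketch under those assumptions (the rescaling and the name scores are my own additions, and I use np.linalg.eig in place of scipy's eig to keep the example to one import):

```python
import numpy as np

teams = ["UNCA","Radford","Winthrop","Campbell","Liberty","Gardner_Webb",
         "High Point","Charleston Southern","Presbyterian","Longwood"]

M = np.array([
    [0,1,1,2,1,1,1,3,2,2],
    [1,0,2,1,3,1,1,1,2,3],
    [1,1,0,1,0,3,2,1,2,2],
    [0,1,1,0,2,1,1,2,2,1],
    [2,0,2,1,0,1,1,1,2,1],
    [1,1,0,1,1,0,1,1,2,1],
    [1,1,0,1,1,1,0,1,1,2],
    [0,1,1,0,1,1,2,0,2,2],
    [0,0,0,0,0,0,1,1,0,2],
    [0,0,0,1,1,1,1,0,0,0]
])

vals, vecs = np.linalg.eig(M)
k = np.argmax(vals.real)      # column of the dominant eigenvalue
vec = np.abs(vecs[:, k])
scores = vec / vec.sum()      # rescale so the scores sum to 1

# Print teams from strongest to weakest with their share of the total.
for i in np.argsort(scores)[::-1]:
    print(f"{teams[i]:20s} {scores[i]:.3f}")
```

Teams with nearly equal scores are effectively tied, which a bare ordered list hides.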
MatheMagician

I decided to do the men's Big South 2018 soccer conference results.

We are not as good at this sport as we are at basketball, so I find it humbling. The code below imports the necessary packages and shows my matrix. Data came from http://bigsouthsports.com/standings.aspx?path=msoc. Please note that ties occur in soccer, so I chose to use 1/2 to represent this phenomenon.

Big_South = [
    'high point','radford','campbell','liberty','gardner webb','presbyterian',
    'longwood','unca','winthrop'
]
import numpy as np
M = np.matrix([
[0,1,0,1,1,1,1,1,1],
[0,0,1,1,1,1/2,1/2,1,1],
[1,1,0,1,1/2,0,1,1,1],
[0,0,0,0,1,1,1,1,1],
[0,0,1/2,0,0,1,1,1,1],
[0,1/2,1,0,0,0,1,1/2,1],
[0,1/2,0,0,0,0,0,1,0],
[0,0,0,0,0,1/2,0,0,1],
[0,0,0,0,0,0,1,0,0]
])
from scipy.linalg import eig
vals, vecs = eig(M)
vals

This code spits out the following array:

array([ 2.95131358+0.j        , -0.35737137+1.37011868j,
       -0.35737137-1.37011868j,  0.35863523+0.j        ,
       -0.46059505+0.96682266j, -0.46059505-0.96682266j,
       -0.51052977+0.j        , -0.58174310+0.64503625j,
       -0.58174310-0.64503625j])

The following code then lists the teams in ranked order.

vec = abs(vecs[:,0])
ranking = np.argsort(vec).tolist()
ranking.reverse()
[Big_South[i] for i in ranking]

['campbell',
 'high point',
 'radford',
 'presbyterian',
 'gardner webb',
 'liberty',
 'longwood',
 'unca',
 'winthrop']
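
The tie convention described above can also be written down mechanically. Here is a small sketch that builds a results matrix from a list of matches, giving 1 to the winner and 1/2 to each side in a draw. The three-team schedule and the function name results_matrix are made up purely for illustration; they are not Big South data:

```python
import numpy as np

def results_matrix(teams, matches):
    """Build a matrix where entry (i, j) is team i's credit against team j.

    Each match is (home, away, home_goals, away_goals).
    A win is worth 1 to the winner; a draw is worth 1/2 to each side.
    """
    index = {name: n for n, name in enumerate(teams)}
    M = np.zeros((len(teams), len(teams)))
    for home, away, hg, ag in matches:
        i, j = index[home], index[away]
        if hg > ag:
            M[i, j] += 1
        elif ag > hg:
            M[j, i] += 1
        else:
            M[i, j] += 0.5
            M[j, i] += 0.5
    return M

# Hypothetical schedule, for illustration only.
teams = ['A', 'B', 'C']
matches = [('A', 'B', 2, 1), ('B', 'C', 0, 0), ('C', 'A', 1, 3)]
M = results_matrix(teams, matches)
print(M)
```

Building the matrix this way keeps the two halves of each draw in matching positions, which is easy to get wrong when typing the entries by hand.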