(20 pts)
In this problem, you're going to perform PCA on a small data set in 2D.
First, generate your data by taking the x coordinates to be the positions in the alphabet of the first four letters in your first name and the y coordinates to be the positions in the alphabet of the first four letters in your last name. These should be the columns of your data matrix.
Then, be sure to center your data. You can use code like the following to accomplish this:
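import numpy as np
# Position of a letter in the alphabet (a=1, ..., z=26), ignoring case
pos = lambda c: ord(c.lower()) - ord('a') + 1
x = np.array([pos(c) for c in 'Mark'])
x = x - x.mean()  # center the x coordinates
y = np.array([pos(c) for c in 'McCl'])
y = y - y.mean()  # center the y coordinates
# Stack x and y as the columns of the data matrix
X = np.matrix([x,y]).transpose()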
X
# Output:
# matrix([[ 2.25, 5.25],
# [-9.75, -4.75],
# [ 7.25, -4.75],
# [ 0.25, 4.25]
# ])
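As an aside, NumPy's documentation no longer recommends `np.matrix` for new code. A minimal sketch of the same construction with a plain array follows; the helper name `centered_positions` is made up for illustration and is not part of the assignment.

```python
import numpy as np

def centered_positions(letters):
    """Alphabet positions (a=1, ..., z=26) of the letters, shifted to mean zero.

    Hypothetical helper, introduced here only for illustration.
    """
    p = np.array([ord(c.lower()) - ord('a') + 1 for c in letters], dtype=float)
    return p - p.mean()

# Columns of X are the centered x and y coordinates, as in the example above.
X = np.column_stack([centered_positions('Mark'), centered_positions('McCl')])

# After centering, each column should average to zero.
print(X.mean(axis=0))
```

The resulting `X` is an ordinary 4-by-2 `ndarray` holding the same values as the matrix shown above.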
Once you've set up your matrix, you can modify the code in this column of slides to perform PCA. When responding to this post, be sure to show all the code that
- defines your matrix,
- computes the principal components, and
- plots the data together with a line showing the direction of the first principal component.
Finally, indicate the variance in the direction of each of the two principal components.
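For orientation, here is one possible way to carry out these steps on the example data, not necessarily the approach taken in the slides. It is a sketch that assumes a singular value decomposition of the centered matrix and a sample variance with an n - 1 divisor; the variable names `components`, `variances`, and `segment` are chosen here for illustration.

```python
import numpy as np

# Centered example data (rows are points, columns are x and y).
X = np.array([[ 2.25,  5.25],
              [-9.75, -4.75],
              [ 7.25, -4.75],
              [ 0.25,  4.25]])

# SVD of the centered data: the rows of Vt are the principal directions,
# ordered from largest to smallest singular value.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
components = Vt  # components[0] is the first principal direction

# Variance of the data along each direction (assumes an n - 1 divisor).
variances = s**2 / (X.shape[0] - 1)

# Endpoints of a segment through the origin along the first principal
# direction, long enough to span the projected data.
scale = np.abs(X @ components[0]).max()
segment = np.vstack([-scale * components[0], scale * components[0]])

print(components)
print(variances)
print(segment)
```

To draw the picture, the rows of `X` can be passed to a scatter plot and the two rows of `segment` to a line plot in whatever plotting library you use. The sign of each principal direction is arbitrary, so your vectors may come out negated relative to someone else's.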




























