Why is the second Principal Component orthogonal to the first one?

Because the second principal component should capture the highest remaining variance after the first principal component has explained as much of the data as it can. (The first principal component is the direction of largest possible variance, that is, it accounts for as much of the variability in the data as possible.)

Then where should we look for the variance left over after the first component?

In the orthogonal direction.
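You can check this orthogonality numerically. The sketch below (my own illustration, not from the original exercise) generates correlated 2-D data, takes the SVD of its covariance matrix, and verifies that the two principal components — the columns of U — have a dot product of essentially zero:

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D data: the second feature is roughly 0.8x the first, plus noise
x = rng.normal(size=200)
X = np.column_stack([x, 0.8 * x + 0.3 * rng.normal(size=200)])
X = X - X.mean(axis=0)  # center the data before PCA

# Principal components are the columns of U from the SVD of the covariance matrix
Sigma = X.T @ X / X.shape[0]
U, S, V = np.linalg.svd(Sigma)
pc1, pc2 = U[:, 0], U[:, 1]

print(np.dot(pc1, pc2))  # ~0: the two components are orthogonal
```

Because U comes from an SVD it is an orthogonal matrix, so this holds for any number of components, not just two.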

Below is the Python code that generates the graph above.

import matplotlib.pyplot as plt

# Before running PCA, it is important to first normalize X
X_norm, mu, sigma = featureNormalize(X)
# Run PCA; the principal components are the columns of U
U, S, V = pca(X_norm)
plt.scatter(X[:, 0], X[:, 1], s=30, facecolors='none', edgecolors='b')
plt.title("PCA - Eigenvectors Shown", fontsize=20)
# Draw each component as a segment from the mean, scaled by its singular value
plt.plot([mu[0], mu[0] + 1.5 * S[0] * U[0, 0]],
         [mu[1], mu[1] + 1.5 * S[0] * U[1, 0]],
         label='First Principal Component')
plt.plot([mu[0], mu[0] + 1.5 * S[1] * U[0, 1]],
         [mu[1], mu[1] + 1.5 * S[1] * U[1, 1]],
         label='Second Principal Component')
plt.legend(loc='lower right')
plt.show()
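The code above assumes featureNormalize and pca helpers, as in the Coursera machine-learning exercises. If you don't have them, minimal sketches consistent with how the code uses them might look like this (these exact implementations are my assumption, not the original exercise files):

```python
import numpy as np

def featureNormalize(X):
    # Zero-mean, unit-variance scaling, computed per feature (column)
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def pca(X):
    # SVD of the covariance matrix; the columns of U are the
    # principal components, and S holds the corresponding variances
    m = X.shape[0]
    Sigma = (X.T @ X) / m
    U, S, V = np.linalg.svd(Sigma)
    return U, S, V
```

Note that mu returned by featureNormalize is the mean of the raw data, which is why the plot can draw the eigenvector segments starting from mu on top of the unnormalized scatter.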

I’m an Engineering Manager at Scale AI and this is my notepad for Applied Math / CS / Deep Learning topics. Follow me on Twitter for more!
