PCA, or Principal Component Analysis, is defined as follows on Wikipedia [1]:

A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.

In other words, it is a technique that reduces data to its most important components by removing correlated features. Personally, I like to think of it as distilling data to its “essence.”
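To make the "removing correlated features" idea concrete, here is a tiny NumPy illustration (not from the article, written in Python rather than MATLAB for convenience): two strongly correlated variables become uncorrelated once expressed in the principal-component basis.

```python
import numpy as np

rng = np.random.default_rng(42)
# Two strongly correlated variables: y is mostly a copy of x plus noise.
x = rng.standard_normal(500)
y = x + 0.1 * rng.standard_normal(500)
X = np.vstack([x, y])
X = X - X.mean(axis=1, keepdims=True)  # center the data

# The eigenvectors of the covariance matrix are the principal components.
cov = X @ X.T / X.shape[1]
eigvals, eigvecs = np.linalg.eigh(cov)
scores = eigvecs.T @ X  # coordinates in the principal-component basis

# Off-diagonal covariance of the transformed variables is (numerically) zero.
off_diag = (scores @ scores.T / scores.shape[1])[0, 1]
```

The original pair has a large covariance (`cov[0, 1]` is close to 1), while `off_diag` is zero up to floating-point error: the correlation has been rotated away.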

PCA Introduction

Principal component analysis, which I will refer to as PCA throughout the rest of this article, is a go-to tool in the machine learning arsenal. It has applications in computer vision, big data analysis, signal processing, speech recognition, and more. Many articles, professors, and textbooks shroud the method behind a wall of text or equations. Let me tear down that shroud!

function [V,E,D] = pca(X)
% do PCA on image patches
%
% INPUT variables:
% X                  matrix with image patches as columns
%
% OUTPUT variables:
% V                  whitening matrix
% E                  principal component transformation (orthogonal)
% D                  variances of the principal components

% Calculate the eigenvalues and eigenvectors of the new covariance matrix.
covarianceMatrix = X*X'/size(X,2);
[E, D] = eig(covarianceMatrix);

% Sort the eigenvalues and recompute matrices.
[dummy,order] = sort(diag(-D));
E = E(:,order);
d = diag(D);
dsqrtinv = real(d.^(-0.5));
Dsqrtinv = diag(dsqrtinv(order));
D = diag(d(order));
V = Dsqrtinv*E';
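For readers without MATLAB, the same procedure can be sketched in Python with NumPy (a translation under the same assumption that the columns of X are zero-mean observations; names mirror the MATLAB function above):

```python
import numpy as np

def pca(X):
    """Eigendecomposition-based PCA; observations are the columns of X.

    Returns:
        V: whitening matrix
        E: principal component transformation (orthogonal)
        D: variances of the principal components (diagonal matrix)
    """
    # Covariance of the (assumed zero-mean) data.
    covariance_matrix = X @ X.T / X.shape[1]
    eigvals, E = np.linalg.eigh(covariance_matrix)

    # Sort eigenvalues and eigenvectors in descending order of variance.
    order = np.argsort(-eigvals)
    eigvals = eigvals[order]
    E = E[:, order]

    # Whitening matrix: rotate to the PC basis, scale to unit variance.
    D = np.diag(eigvals)
    V = np.diag(eigvals ** -0.5) @ E.T
    return V, E, D
```

Applying V to the data (`V @ X`) yields whitened data whose covariance matrix is the identity: every component has unit variance and no two components are correlated.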