Principal component analysis (PCA): computing the mean using Python

https://stackoverflow.com/questions/15851869

02-04-2022

Question

I am a beginner in Python and I am implementing principal component analysis (PCA), but I am having a problem computing the mean. Here is my code:

import Image
import os
from PIL import Image
from numpy import *
import numpy as np


#import images
dirname = "C:\\Users\\Karim\\Downloads\\att_faces\\New folder"
X = [np.asarray(Image.open(os.path.join(dirname, fn))) for fn in os.listdir(dirname)]

#get number of images and dimensions
path, dirs, files = os.walk(dirname).next()
num_images = len(files)
image_file = "C:\\Users\\Karim\\Downloads\\att_faces\\New folder\\2.pgm"
img = Image.open(image_file)
width, height = img.size

print width
print height
print num_images

M = (X-mean(X.T,axis=1)).T # subtract the mean (along columns)

I get the error:

AttributeError: 'list' object has no attribute 'T'

Solution 2

images -= np.mean(images, axis=0)
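The one-liner above does not define `images`. A minimal sketch of how it might be applied, assuming `images` is meant to be a 2-D array with one flattened image per row and that all images have the same size. The helper name `center_images` and the synthetic data are hypothetical, and the code is written for Python 3. The conversion to float matters: in-place subtraction of a float mean from `uint8` pixel data raises a casting error in NumPy.

```python
import numpy as np

def center_images(image_list):
    """Stack same-sized 2-D images into rows and subtract the mean image."""
    # Flatten each image into one row. float64 avoids the casting error
    # that in-place subtraction would raise on uint8 pixel data.
    images = np.array([np.asarray(im, dtype=np.float64).ravel()
                       for im in image_list])
    mean_image = np.mean(images, axis=0)  # per-pixel mean across all images
    images -= mean_image                  # the line from the answer
    return images, mean_image

# Synthetic stand-ins for the loaded face images (assumption: same size).
fake_images = [np.full((3, 4), v, dtype=np.uint8) for v in (10, 20, 30)]
centered, mean_image = center_images(fake_images)
```

After this step every column of `centered` has zero mean, which is the starting point PCA expects.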

OTHER TIPS

The problem is X.T in your last line: X is a Python list, not a numpy.ndarray, so it has no T attribute. It isn't clear what you're trying to do here, but if you want to combine all the image arrays into a single NumPy array, you can convert the list with X = np.array(X) before the last line. This assumes every image has the same dimensions.
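To illustrate the difference, here is a small sketch using synthetic arrays in place of the loaded images (the shapes and values are assumptions for illustration only, written for Python 3). It also shows one way to flatten each image into a row, which is the layout usually wanted for PCA.

```python
import numpy as np

# Stand-in for the list built by the list comprehension in the question:
# three same-sized grayscale "images" of height 2 and width 4.
X = [np.full((2, 4), v, dtype=np.uint8) for v in (0, 1, 2)]

list_has_T = hasattr(X, "T")   # False: a plain list has no .T attribute

X = np.array(X)                # works because every image has the same shape
stacked_shape = X.shape        # (num_images, height, width)
transposed_shape = X.T.shape   # .T is available on an ndarray

# One row per image, one column per pixel.
flat = X.reshape(len(X), -1)
```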

Also, unless you specifically want to roll your own PCA implementation, you can do this much more easily with NumPy: use np.cov to compute the covariance matrix and np.linalg.eig to compute its eigenvalues and eigenvectors.
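A rough sketch of that approach, assuming the data is arranged with one sample per row and one feature per column. The function name `pca`, its return values, and the toy data are hypothetical, and the code is written for Python 3. Note that the covariance matrix is n_features by n_features, so for full-size images it can become very large.

```python
import numpy as np

def pca(data, num_components):
    """PCA via np.cov and np.linalg.eig.

    data: array of shape (n_samples, n_features).
    Returns (components, eigenvalues, projected data).
    """
    data = np.asarray(data, dtype=np.float64)
    centered = data - data.mean(axis=0)          # subtract the per-feature mean
    cov = np.cov(centered, rowvar=False)         # (n_features, n_features)
    eigvals, eigvecs = np.linalg.eig(cov)
    # The covariance matrix is symmetric, so its eigenvalues are real;
    # np.real only drops any numerical imaginary residue.
    eigvals, eigvecs = np.real(eigvals), np.real(eigvecs)
    order = np.argsort(eigvals)[::-1]            # largest variance first
    eigvals = eigvals[order][:num_components]
    components = eigvecs[:, order][:, :num_components]
    return components, eigvals, centered @ components

# Toy data lying exactly on the line y = 2x.
points = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
components, eigvals, projected = pca(points, num_components=2)
```

For this toy data the first component points along the direction (1, 2) up to sign, and the second eigenvalue is zero up to rounding because the points have no spread off that line.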

Licensed under: CC-BY-SA with attribution