# Mean subtraction and the covariance matrix in Principal Component Analysis

Discussion in 'MATLAB' started by fearry, Sep 15, 2005.

1. ### fearry (Guest)

I was just wondering what the advantage is of subtracting the mean before computing the covariance matrix for PCA. Is it valid to perform PCA without first subtracting the mean?

I am working on dimension reduction of images. From my experiments I am finding that the image reconstructions are much more accurate when I don't subtract the mean before performing PCA.

Can anyone help explain this?

fearry, Sep 15, 2005

2. ### Frederick W. Koehler (Guest)

It's my understanding in the Chemometrics (Analytical Spectroscopy) community that it's not a "true" PCA calculation without mean centering.

If you do not mean center, the first factor will be very closely related to the mean.
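
(A minimal sketch, added for illustration with made-up synthetic data, not part of the original reply: the leading singular vector of uncentered data points almost exactly along the mean direction.)

```matlab
% Illustration only (assumed synthetic data): without mean centering, the
% first principal direction of X is nearly parallel to the mean vector.
X  = randn(200, 10) + 5;               % data with a strong common mean
mu = mean(X, 1)';                      % column vector of column means
[~, ~, V] = svd(X, 'econ');            % right singular vectors = PCA directions
abs(V(:,1)' * mu) / norm(mu)           % cosine of angle to the mean: close to 1
```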

In the "old" days, when we used to do this sort of calculation in
Fortran and were limited to single precision, there were some numerical
issues for wanting to remove the mean, but I doubt if that's still
really an issue in double precision.

Frederick Koehler

Frederick W. Koehler, Sep 15, 2005

3. ### John D'Errico (Guest)

Yes, you can do PCA without subtracting the mean. It's entirely valid. But almost certainly the first component will be heavily biased towards the mean.

(By the way, if you do not subtract the mean, it's not called a covariance matrix. I've typically seen it called a second moment matrix in that case.)

Your statement that you get much more accuracy when the mean is not subtracted makes no sense. Are you comparing the fraction of the total variance explained by a fixed number of components? If so, it's an invalid comparison, since with the mean in there the total sum of squares is strongly inflated. In terms of absolute predictive value, the PCA based on the covariance matrix must be a better predictor (unless the mean was zero) when compared to the PCA based on the second moment matrix. This should be a mathematically provable statement.
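
(A small MATLAB sketch of that comparison, added for illustration with made-up synthetic data: reconstruct with k components each way and compare absolute errors rather than the fraction of variance explained.)

```matlab
% Illustration only (synthetic data): compare k-component reconstructions
% from covariance-based PCA (mean removed, then added back) and from
% second-moment-based PCA (mean left in).
n = 500;  p = 20;  k = 5;
X  = randn(n, p) * randn(p) + 3;                 % correlated data, non-zero mean

mu = mean(X, 1);
Xc = X - repmat(mu, n, 1);                       % mean-centered data
[~, ~, Vc] = svd(Xc, 'econ');                    % covariance-based directions
Xhat_cov = Xc * Vc(:,1:k) * Vc(:,1:k)' + repmat(mu, n, 1);

[~, ~, V0] = svd(X, 'econ');                     % second-moment-based directions
Xhat_mom = X * V0(:,1:k) * V0(:,1:k)';

norm(X - Xhat_cov, 'fro')                        % smaller (or equal) error
norm(X - Xhat_mom, 'fro')
```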

My guess is that you have either made a mistake in your
reconstructions or the problem is the one I suggest
above.

HTH,
John D'Errico

John D'Errico, Sep 16, 2005

4. ### Rune Allnor (Guest)

The reason why your analysis works better without the mean subtracted is probably that images are non-negative in the first place.

As for why the mean is subtracted in general PCA, consider two orthogonal sinusoids s1 and s2 defined as, say,

s1 = [sqrt(3)/2 1/2]'
s2 = [-1/2 sqrt(3)/2]'

A signal that comprises these sinusoids will come out fine with respect to the eigenvectors of the covariance matrix, which is Hermitian and thus has orthogonal eigenvectors.

Now, if you add a non-zero mean m to these vectors, the vectors v1 = m + s1 and v2 = m + s2 are no longer orthogonal. And so the relation between the signal components and the eigenvectors of the covariance matrix is no longer "easy" to deal with.
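
(A quick numeric check of this, added for illustration:)

```matlab
% Illustration only: adding a common non-zero mean m destroys the
% orthogonality of s1 and s2.
s1 = [sqrt(3)/2; 1/2];
s2 = [-1/2; sqrt(3)/2];
m  = [1; 1];                     % an arbitrary non-zero mean

dot(s1, s2)                      % 0         -> orthogonal
dot(m + s1, m + s2)              % 2+sqrt(3) -> no longer orthogonal
```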

But again, these considerations apply to the parameter estimation problem for sinusoids. They need not apply to other types of signals, like images.

Rune

Rune Allnor, Sep 16, 2005

5. ### John D'Errico (Guest)

This is an interesting point of view. But one can still find an orthogonal basis for the vectors [v1, v2]. It simply won't be the original trig functions. It's still just as valid, and will still reconstruct the data as well.

In fact, since the eigenvalues for the original set will be equal to each other, even the original trig functions may not be recovered, since eigenvectors corresponding to multiple eigenvalues are not unique.
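
(A small check of this point, added for illustration: with equal power in the two components, their second-moment contribution is a multiple of the identity, so any orthonormal pair is an equally valid eigenbasis.)

```matlab
% Illustration only: two orthonormal components with equal power give a
% covariance contribution proportional to the identity, so the
% eigenvectors are not unique.
s1 = [sqrt(3)/2; 1/2];
s2 = [-1/2; sqrt(3)/2];
C  = s1*s1' + s2*s2'             % = eye(2); any orthonormal basis diagonalizes it
```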

John

John D'Errico, Sep 16, 2005

6. ### Rune Allnor (Guest)

Sure. This is the Karhunen-Loève Transform, if I am not mistaken. My point is merely that one usually imposes some significance on the eigenvectors, which holds in the zero-mean case.

This significance, which usually is the basis for whatever elaborate analysis one is up to, is then lost in the case of a non-zero mean.

There are several ways of getting from a set of eigenvectors to trig functions. MUSIC uses the null space of the covariance matrix of the noise-free signal, and searches for the sines that are orthogonal to the null space.
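
(A rough sketch of the MUSIC idea, added for illustration under the assumption of a single complex sinusoid in noise; it is not code from this thread.)

```matlab
% Illustration only (one assumed complex sinusoid in noise): estimate the
% noise (null) subspace from the sample covariance and search for the
% frequency whose steering vector is orthogonal to it.
N  = 64;  M = 16;  f0 = 0.12;
n  = (0:N-1)';
x  = exp(1j*2*pi*f0*n) + 0.1*(randn(N,1) + 1j*randn(N,1));

X = zeros(M, N-M+1);                        % stack overlapping snapshots
for k = 1:N-M+1
    X(:,k) = x(k:k+M-1);
end
R = (X*X') / (N-M+1);                       % sample covariance

[V, D] = eig(R);
[~, idx] = sort(real(diag(D)), 'descend');
En = V(:, idx(2:end));                      % noise subspace (one signal assumed)

f = linspace(0, 0.5, 512);
P = zeros(size(f));
for k = 1:numel(f)
    a = exp(1j*2*pi*f(k)*(0:M-1)');         % candidate sinusoid (steering vector)
    P(k) = 1 / real(a' * (En*En') * a);     % MUSIC pseudospectrum
end
[~, kmax] = max(P);
f(kmax)                                     % peaks near f0
```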

The Kumaresan-Tufts Forward-Backward Linear Prediction scheme sets up a set of equations from the eigenvectors of the signal space of the covariance matrix, and solves for the frequency terms.

But all these methods provide ambiguous results, due to cos(-x) = cos(x), so you basically need restrictions on the solution to get a unique answer. In some applications it might be useful to convert the data from a real-valued representation to a complex-valued representation to avoid the ambiguity, via Euler's formulas,

2cos(x) = exp(jx) + exp(-jx)
j2sin(x) = exp(jx) - exp(-jx)

Rune

Rune Allnor, Sep 20, 2005