Re: factor loadings

Chris Evans (C.Evans@sghms.ac.uk)
Mon, 27 Jan 1997 09:10:52 +0100

On 27 Jan 97 at 6:50, Bob Green wrote:

> Advice is sought regarding what parameters can affect the signs in a
> PCA/factor analysis. I am trying to work out why the factor loading signs
> vary when two versions of the same program are applied to the same data.
> The loadings are not exactly mirrored.
>
The signs of loadings in PCA are arbitrary: reversing the sign of
every loading on a component gives an equally valid solution.
Computer programs can use different algorithms to calculate the
loadings, and different methods can produce different signs. Many
methods are iterative approximations with a preset precision, i.e.
the program does the calculation once taking a guesstimated answer as
input, then feeds the answer from that run back in and runs again.
After each run it compares some aspect of the latest result with that
from the run before, and it stops when the two are sufficiently close
according to the precision built in by the programmer. All sorts of
things (the starting guess, for one) can cause different programs to
come out with different signs.
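
To make the iterate-until-close idea concrete, here is a minimal sketch
in Python of one such method, power iteration, for the first component
only. It is an illustration under stated assumptions, not the algorithm
used by INGRID, G-PACK, SAS or any other program mentioned here. The
matrix R, the tolerance and the function names are all made up for the
example. The same data and the same method give opposite signs simply
because the starting guess differs.

```python
import math

def power_iteration(matrix, start, tol=1e-10, max_iter=1000):
    """Leading eigenvector of a symmetric matrix by repeated multiplication.

    Stops when successive estimates differ by less than `tol`, which
    plays the role of the preset precision described above.
    """
    n = len(matrix)
    norm = math.sqrt(sum(x * x for x in start))
    v = [x / norm for x in start]
    for _ in range(max_iter):
        # Feed the previous answer back in: w = matrix * v, rescaled to length 1.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        change = max(abs(new - old) for new, old in zip(w, v))
        v = w
        if change < tol:
            break
    return v

def fix_sign(vector):
    """One possible convention: make the largest-magnitude element positive."""
    pivot = max(vector, key=abs)
    return [-x for x in vector] if pivot < 0 else list(vector)

# Hypothetical 3x3 correlation matrix, invented for illustration only.
R = [[1.0, 0.6, 0.3],
     [0.6, 1.0, 0.5],
     [0.3, 0.5, 1.0]]

a = power_iteration(R, [1.0, 1.0, 1.0])     # one starting guess
b = power_iteration(R, [-1.0, -1.0, -1.0])  # the mirrored starting guess

print(a)
print(b)            # same component, every sign reversed
print(fix_sign(b))  # agrees with a once a sign convention is applied
```

Applying a convention like fix_sign to each component from both
programs before comparing them removes the arbitrary sign, so whatever
difference remains is a genuine numerical difference.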

However, slightly different answers are more of a concern. Since
computing power was pretty limited when many programs were written,
they may have been given a low precision setting on the grounds that
grid analysis rarely needs great precision (and, obviously, because
that would allow the program to complete more rapidly). You can set some
aspect of the precision you want in G-PACK though this is not common
in the programs I know. It's hard coded into INGRID; I can't vouch
for other programs. I've been impressed by the precision with which
INGRID and things like SAS/IML concur, as I know Patrick Slater was
using an algorithm written for a very early version of FORTRAN
whereas SAS must represent one of the latest and best bits of code
for this sort of work. It is possible that some other grid programs
didn't select such a good algorithm and that they are actually
spitting out rather inaccurate answers for some or all grids. The
maths/programming of getting these things right is not at all
trivial, as a huge amount of calculating is involved in a PCA and the
precision of the basic operations in computers is finite (and can be
quite limited: 1 in 2^8, i.e. 1 in 256, for an 8-bit value). Of
course, compiler programmers and other clever people have already
worked out good solutions for calculating PCA, as Patrick's example
seems to show. However, it would be quite possible to find a bad
solution and not know it until someone like Bob really checks things
out carefully. It's
also possible that differences would only appear with some sorts of
grids, not others.
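
As a small illustration of how the choice of algorithm interacts with
finite precision, the Python sketch below computes a variance (the
sort of quantity a PCA starts from) in two ways that are equivalent on
paper. It is not taken from any grid program; the function names and
the data are invented, and the ratings are deliberately offset by a
huge constant to exaggerate the effect. Python floats are normally
64-bit doubles, so real grid ratings would not trigger this, but the
same failure appears with far more modest numbers at lower precision.

```python
def variance_one_pass(xs):
    """Textbook shortcut: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total += x
        total_sq += x * x
    # Subtracting two huge, nearly equal numbers loses most of the digits.
    return (total_sq - total * total / n) / (n - 1)

def variance_two_pass(xs):
    """Subtract the mean first, then sum the squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

# Made-up data: the same four "ratings" with and without a large offset.
shifted = [4.0, 7.0, 13.0, 16.0]
ratings = [1e9 + x for x in shifted]

print(variance_one_pass(shifted), variance_two_pass(shifted))
print(variance_one_pass(ratings), variance_two_pass(ratings))
```

Adding a constant to every rating should leave the variance unchanged.
The two-pass version respects that; the one-pass version does not once
the squares become too large to be held exactly. It also shows why
differences might appear with some sorts of grids and not others: both
routines agree on the unshifted data.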

Hm! I think someone pressed my "anorak on" button. Hope this is
a) roughly correct (Richard Bell are you out there? Am I about
right?!)
b) not a complete turn off for those not intrigued by these things!

Best wishes,

Chris

Chris Evans, Senior Lecturer in Psychotherapy.
Dept. Gen. Psychiatry, St. George's Hospital Medical School,
(London University), Cranmer Terrace, London SW17 0RE, Britain
Tel/fax.: (+44|0) 181 725 2540 Email: C.Evans@sghms.ac.uk
http://psyctc.sghms.ac.uk/
