Subgroup mean differences

This page was started on my static site on 31.vii.14 and recreated here from that version on 4.i.19. All content is made available under a Creative Commons License. Please feel free to reuse anything here, but respect the licence, i.e. give attribution back to this site.

I had a running discussion over some years with a good colleague about the problems of conducting exploratory factor analyses (EFA) or Principal Component Analyses (PCA) on variables where there may be groups within the data. The groups might share the same population factor/component structure in the correlations between the variables, yet there may be large mean differences between the groups on some of the variables to be analysed. I quite often get sent papers to peer review that apply factor analytic methods or PCA to data in which there clearly are, or may well be, quite large subgroup mean differences. I'm putting this bit of simulation work, done using my beloved R, here in the hope that it will help make the issues clearer for people.

  • This link takes you to an HTML file that, I think, shows that the issues really are serious and not hard to understand. It starts from the fairly well known bivariate case of correlations or regressions where there is group mean structure as well as a simple within-group regression relationship.
  • This is the PDF version of the file
  • and this is the Rmd file that was used to generate those in RStudio. It can now also be used to generate ODF and M$ Word format files if you have OpenOffice or LibreOffice on any system, or M$ Word on a Windoze system.
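
The bivariate case mentioned above can be sketched in a few lines. This is not the code from the Rmd file: it is a minimal, independent illustration written in Python using only the standard library. The sample size, the seed, the 3 SD mean shift and the helper names `pearson` and `simulate_group` are all assumptions made for the illustration. Two groups are simulated in which x and y are independent within each group, but the second group has higher means on both variables.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate_group(rng, n, mean_x, mean_y):
    """Hypothetical helper: x and y drawn independently, each with SD 1."""
    xs = [rng.gauss(mean_x, 1.0) for _ in range(n)]
    ys = [rng.gauss(mean_y, 1.0) for _ in range(n)]
    return xs, ys


rng = random.Random(12345)  # assumed seed, only for reproducibility
n = 2000                    # assumed size of each group
shift = 3.0                 # assumed mean difference, in within-group SD units

# Group A is centred on (0, 0); group B is shifted on both variables.
xa, ya = simulate_group(rng, n, 0.0, 0.0)
xb, yb = simulate_group(rng, n, shift, shift)

r_within_a = pearson(xa, ya)           # near zero by construction
r_within_b = pearson(xb, yb)           # near zero by construction
r_pooled = pearson(xa + xb, ya + yb)   # group membership ignored

print(f"within group A: r = {r_within_a:.3f}")
print(f"within group B: r = {r_within_b:.3f}")
print(f"pooled data:    r = {r_pooled:.3f}")
```

With two equal-sized groups, unit within-group variances and a mean shift of d on both variables, the population pooled correlation is (d²/4) / (1 + d²/4), which is about 0.69 for d = 3, even though the within-group correlation is zero. Any EFA or PCA run on the pooled correlation matrix would be working from correlations inflated in this way.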