Factor scores

I haven’t come across these for years now so this is going to be a short summary of quite a complicated issue.

Factor scores are scores imputed to individuals whose data were analysed using factor analysis. They can be computed for either exploratory or confirmatory factor analyses. In principle, for a multi-item measure assumed to be measuring more than one latent variable, factor scores are arguably a better way to give each respondent a score on each latent variable than just adding up their item response values on the scales mapping to the latent variables.

Details #

Let’s take the ever handy HADS: Hospital Anxiety and Depression Scales. This measure has two subscales purporting, fairly reasonably within that rather diagnostic/disease model, to measure anxiety and depression, with seven items on each subscale. All the items have responses scored from 0 to 3 so simple scores for each scale range from 0 to 21. Factor scores can be retrieved from factor analysis of a dataset of responses on the measure and would use the loadings of each item on its factor to compute a score for that factor for each respondent. In principle these scores are more reliable and more valid, in construct validity terms, than the raw scores.
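To make the contrast between simple scores and factor scores concrete, here is a minimal sketch in Python using simulated data, not real HADS responses. Everything in it is an assumption for illustration: the loadings, the factor correlation of 0.5, the cut points used to create 0 to 3 responses, and the choice of the regression method to turn loadings into scoring weights. In a real analysis the loadings would be estimated from the data by the factor analysis itself rather than supplied by hand.

```python
import numpy as np

rng = np.random.default_rng(2026)
n = 2000

# Hypothetical loadings (NOT real HADS estimates): items 0-6 load on anxiety,
# items 7-13 on depression, with no cross-loadings.
Lambda = np.zeros((14, 2))
Lambda[:7, 0] = [0.80, 0.75, 0.70, 0.60, 0.50, 0.45, 0.40]
Lambda[7:, 1] = [0.80, 0.70, 0.65, 0.60, 0.55, 0.50, 0.40]
Phi = np.array([[1.0, 0.5], [0.5, 1.0]])  # assumed correlation between factors

# Simulate correlated latent scores, then continuous item responses.
factors = rng.standard_normal((n, 2)) @ np.linalg.cholesky(Phi).T
# With no cross-loadings the unique variance of each item is 1 - loading**2.
unique_sd = np.sqrt(1.0 - (Lambda ** 2).sum(axis=1))
continuous = factors @ Lambda.T + rng.standard_normal((n, 14)) * unique_sd

# Cut the continuous responses into the 0..3 response format (arbitrary cut points).
items = np.digitize(continuous, [-0.5, 0.5, 1.5])

# Simple scores: add up the seven items on each subscale (possible range 0..21).
anxiety_sum = items[:, :7].sum(axis=1)
depression_sum = items[:, 7:].sum(axis=1)

# Regression-method factor scores: standardise the items, then weight them by
# R^{-1} * Lambda * Phi, where R is the sample correlation matrix of the items.
# Reusing the generating loadings on the cut items is an approximation.
Z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
R = np.corrcoef(items, rowvar=False)
weights = np.linalg.solve(R, Lambda @ Phi)  # 14 items x 2 factors
factor_scores = Z @ weights                 # n respondents x 2 factors

r_anx = np.corrcoef(anxiety_sum, factor_scores[:, 0])[0, 1]
print(f"Correlation of anxiety sum score with anxiety factor score: {r_anx:.3f}")
```

The sum score gives every item on a subscale the same weight, whereas the factor score gives more weight to items with higher loadings and, because the factors are correlated, a little weight to items on the other subscale.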

Factor scores for the HADS could be based on either EFA or CFA. Where the data show very good conformity to the CFA model used, which for the HADS would be an oblique two factor model, the CFA scores are preferable to the EFA ones. This is getting into quite small print, though, and for measures whose fit to the CFA model is less good the choice becomes pretty arbitrary. Where that is the case it would be good if reports discussed the choice and the issues of poor fit to the CFA model.

Issues #

“In principle” here means pretty much “assuming that the CTT (Classical Test Theory) model applies”. That is one issue as there are good grounds to believe that the CTT model is only a rough model of what goes on when a large number of people answer the HADS.

Another issue is that these scores are specific to the loadings used to compute them. In general that means the set of factor loadings obtained in that particular factor analysis of that particular dataset. In principle, though, a scoring system that uses loadings from an earlier factor analysis can be used to compute scores. That was sometimes done to give scoring rules for measures, though I haven’t seen it used in many years.
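A scoring rule carried over from an earlier factor analysis can be sketched as follows. This is only an illustration under assumptions: the function name `score_with_fixed_rule`, the three-item toy measure and all the numbers are hypothetical and not taken from any published scoring rule. The point is that the item means, item standard deviations and scoring weights are frozen from the earlier sample, so nothing is re-estimated from the new data.

```python
import numpy as np

def score_with_fixed_rule(responses, ref_means, ref_sds, weights):
    """Score new respondents with a rule frozen from an earlier analysis.

    responses: respondents x items array of raw item scores
    ref_means, ref_sds: item means and SDs from the EARLIER sample
    weights: items x factors scoring weights from the EARLIER analysis
    """
    responses = np.asarray(responses, dtype=float)
    z = (responses - ref_means) / ref_sds  # standardise against the old sample
    return z @ weights

# Toy rule for a hypothetical three-item, one-factor measure.
ref_means = np.array([1.0, 2.0, 1.5])
ref_sds = np.array([1.0, 0.5, 1.0])
weights = np.array([[0.5], [0.3], [0.2]])

new_responses = np.array([[2, 2, 3],
                          [0, 1, 1]])
new_scores = score_with_fixed_rule(new_responses, ref_means, ref_sds, weights)
print(new_scores)
```

Scores computed this way are comparable across datasets because the rule is the same everywhere, but they inherit whatever was specific to the sample that produced the loadings.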

Yet another issue is that there are three different ways of computing factor scores, each with different pros and cons. If you are reading a report that talks about factor scores, it should be clear what method was used to compute them.
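The three methods most often named in the factor analysis literature are the regression (Thurstone) method, the Bartlett method and the Anderson-Rubin method. Assuming those are the three meant here, the sketch below computes the scoring weights for each from a set of model parameters. The six-item, two-factor loadings and the factor correlation are made up for illustration, and the formulas are the usual textbook ones for a model with implied correlation matrix Sigma = Lambda Phi Lambda' + Psi.

```python
import numpy as np

# Hypothetical model: six items, two correlated factors, no cross-loadings.
Lambda = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
Phi = np.array([[1.0, 0.5], [0.5, 1.0]])   # assumed factor correlation
common = Lambda @ Phi @ Lambda.T
Psi = np.diag(1.0 - np.diag(common))       # unique variances
Sigma = common + Psi                       # model-implied correlation matrix

def regression_weights(Lambda, Phi, Sigma):
    """Thurstone regression method: W = Sigma^{-1} Lambda Phi."""
    return np.linalg.solve(Sigma, Lambda @ Phi)

def bartlett_weights(Lambda, Psi):
    """Bartlett method: W = Psi^{-1} Lambda (Lambda' Psi^{-1} Lambda)^{-1}."""
    psi_inv_lambda = np.linalg.solve(Psi, Lambda)
    return psi_inv_lambda @ np.linalg.inv(Lambda.T @ psi_inv_lambda)

def anderson_rubin_weights(Lambda, Psi, Sigma):
    """Anderson-Rubin: W = Psi^{-1} Lambda M^{-1/2},
    with M = Lambda' Psi^{-1} Sigma Psi^{-1} Lambda."""
    psi_inv_lambda = np.linalg.solve(Psi, Lambda)
    M = psi_inv_lambda.T @ Sigma @ psi_inv_lambda
    vals, vecs = np.linalg.eigh(M)          # M is symmetric positive definite
    M_inv_sqrt = (vecs * vals ** -0.5) @ vecs.T
    return psi_inv_lambda @ M_inv_sqrt

W_reg = regression_weights(Lambda, Phi, Sigma)
W_bart = bartlett_weights(Lambda, Psi)
W_ar = anderson_rubin_weights(Lambda, Psi, Sigma)

# Scores for one respondent's standardised item responses (made-up values).
z = np.array([1.0, 0.5, 0.0, -0.5, -1.0, 0.2])
for name, W in [("regression", W_reg), ("Bartlett", W_bart), ("Anderson-Rubin", W_ar)]:
    print(name, z @ W)
```

Roughly, the pros and cons are these: regression scores predict the latent variables as well as a linear combination can but are shrunken towards the mean; Bartlett scores are unbiased in the sense that W'Lambda is the identity matrix; Anderson-Rubin scores are forced to be uncorrelated with unit variance under the model, even when the factors themselves are modelled as correlated. The same respondent therefore gets different scores depending on the method.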

Try also #

Classical Test Theory
Confirmatory Factor Analysis (CFA)
Construct validity
Exploratory Factor Analysis (EFA)
Factor Analysis
Hospital Anxiety and Depression Scales (HADS)
Reliability
Validity

Chapters #

Not covered in the OMbook.

Online resources #

Hm. One day, when I am avoiding other work, I will knock up an Rblog post looking at these issues for some HADS or other data!

Dates #

First created 2.iii.26.


Updated on 1st June 2026