Independence of observations

This is an absolutely fundamental assumption of all basic statistical methods: the idea that every observation, every value on any of the variables, is independent of all the others.

All basic statistical modelling, estimation and “tests” build in the assumption that the observations are “independent”: that is, knowing the value of one observation tells you nothing about whether another is likely to be higher or lower. The name is absolutely descriptive and accurate.

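Violation of this independence can lead to very misleading findings, i.e. our simple tests are not “robust” to non-independence of observations. Typically, when observations are positively associated, standard errors come out too small, so p values and confidence intervals look more convincing than they should.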
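To make that concrete, here is a minimal simulation sketch. It is not taken from any particular study: the function name `false_positive_rate`, the cluster sizes and the fixed critical value of 1.97 (roughly the two-sided 5% point of a t distribution with 198 degrees of freedom) are all assumptions chosen for illustration. Two groups are simulated with no true difference between them, each made up of a handful of clients contributing many observations, and a naive two-group t statistic is computed as if every observation were independent.

```python
import numpy as np

def false_positive_rate(icc, n_clusters=5, n_per_cluster=20, n_sims=2000, seed=1):
    """Share of simulations in which a naive t-test 'finds' a non-existent difference.

    icc is the proportion of the (unit) total variance that lies between
    clusters (e.g. clients); icc = 0 gives truly independent observations.
    """
    rng = np.random.default_rng(seed)
    n = n_clusters * n_per_cluster  # observations per group

    def simulate_group():
        # One shared effect per cluster plus independent noise per observation.
        cluster = rng.normal(0, np.sqrt(icc), size=(n_sims, n_clusters, 1))
        resid = rng.normal(0, np.sqrt(1 - icc), size=(n_sims, n_clusters, n_per_cluster))
        return (cluster + resid).reshape(n_sims, n)

    a = simulate_group()
    b = simulate_group()
    # Naive equal-variance t statistic that ignores the clustering entirely.
    pooled_var = (a.var(axis=1, ddof=1) + b.var(axis=1, ddof=1)) / 2
    t = (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(pooled_var * 2 / n)
    return float(np.mean(np.abs(t) > 1.97))  # 1.97: assumed approximate critical value

rate_independent = false_positive_rate(icc=0.0)
rate_clustered = false_positive_rate(icc=0.5)
print(rate_independent, rate_clustered)
```

With independent observations the false positive rate should sit near the nominal 5%; with half the variance lying between clients it should be many times higher, even though the two groups were generated identically.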

Details

What this means is that, if the observations really are independent, you can apply statistical theory, the maths of statistics, to whatever the model is. However, very often in our field observations are not independent: they are “nested” or “associated”. For example, observations from the same client are likely to be more similar to each other than they are to observations from another client. Similarly, trajectories of change might differ between therapists: we can’t just assume that therapists have no impact. (At its simplest, that wouldn’t really be what we want to find, would it?!)

To handle this situation the realm of “multilevel models” evolved. These models only assume independence, or modelled levels of dependence (see autocorrelation), within “levels”, so we could analyse observations nested within clients, nested within therapists, nested within services. Multilevel modelling (MLM) is probably emerging as the dominant term for these methods but they are also known as “random coefficients regression” and “hierarchical linear modelling” (HLM).
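A full multilevel model needs a proper statistical package, but the core idea of splitting variance between “levels” can be sketched with the one-way intraclass correlation (ICC) for balanced data, which estimates the proportion of variance lying between clients rather than within them. This is only an illustrative sketch: the function name `icc_oneway`, the simulated scores and the choice of 200 clients with 6 occasions each are all assumptions, and a real analysis would fit the model with dedicated MLM routines.

```python
import numpy as np

def icc_oneway(scores):
    """One-way ANOVA estimate of the ICC for balanced data.

    scores: 2-D array with one row per client and one column per occasion.
    """
    scores = np.asarray(scores, dtype=float)
    n_clients, n_occasions = scores.shape
    client_means = scores.mean(axis=1)
    grand_mean = scores.mean()
    # Mean square between clients and mean square within clients.
    ms_between = n_occasions * np.sum((client_means - grand_mean) ** 2) / (n_clients - 1)
    ms_within = np.sum((scores - client_means[:, None]) ** 2) / (n_clients * (n_occasions - 1))
    return float((ms_between - ms_within) / (ms_between + (n_occasions - 1) * ms_within))

rng = np.random.default_rng(42)
client_effect = rng.normal(0, 1, size=(200, 1))  # stable differences between clients
noise = rng.normal(0, 1, size=(200, 6))          # occasion-to-occasion variation
nested_scores = 10 + client_effect + noise       # simulated with a true ICC of 0.5

icc_nested = icc_oneway(nested_scores)
icc_independent = icc_oneway(10 + noise)         # no client effect: ICC should be near 0
print(icc_nested, icc_independent)
```

An ICC well above zero is a warning that observations from the same client cannot be treated as independent, which is exactly the situation multilevel models are built to handle.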

MLn and MLwiN were specialised statistical packages for MLMs that came out of the educational research world in the UK. They were dominant there, and more widely, for a while, but their use has declined as more general statistical packages have acquired the ability to handle MLMs. Similarly, HLM was also the name of a software package for MLM/HLM that was much used in North America but is now probably much less used than it was.

In many ways the methods of “time series analysis” (TSA) can be seen as a subset of multilevel modelling. They are what the name says: ways of analysing data that lack independence of observations because the observations form a series over time, either all from one person or from a collection of such series. However, TSA has evolved rather separately from MLM, and the fact that it started with the analysis of data from individuals (or individual processes) does set it a bit apart from MLM.
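The simplest way to see non-independence within a single series is the lag-1 autocorrelation: the correlation between each observation and the one before it. The sketch below is illustrative only. The helper names `simulate_ar1` and `lag1_autocorrelation`, the series length of 500 and the carry-over value of 0.7 are assumptions, and a first-order autoregressive process is just one convenient way of generating a dependent series.

```python
import numpy as np

def simulate_ar1(phi, n=500, seed=0):
    """Simulate a series in which each value carries over phi times the previous one."""
    rng = np.random.default_rng(seed)
    shocks = rng.normal(size=n)
    x = np.empty(n)
    x[0] = shocks[0]
    for t in range(1, n):
        x[t] = phi * x[t - 1] + shocks[t]
    return x

def lag1_autocorrelation(x):
    """Correlation between each observation and its immediate predecessor."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float(np.sum(d[1:] * d[:-1]) / np.sum(d ** 2))

r_dependent = lag1_autocorrelation(simulate_ar1(phi=0.7))
r_independent = lag1_autocorrelation(simulate_ar1(phi=0.0))
print(r_dependent, r_independent)
```

For the dependent series the estimate should land close to the carry-over value used to generate it, while for the independent series it should be close to zero.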