Gaussian (“Normal”) distribution

This is the famous “bell shaped curve” which maps the probability of a score as this.

Why does this matter? That’s a long story, I’ll try to spell it out but unless you want either to have a much stronger grip of statistics or just intrigued by the history and elegance, you can safely skip the rest of this!

Details #

So what is this? It’s as “probability density function” for a continuous variable which can take values from -infinity to +infinity but has maximum probability at the centre (zero for the “standard Normal” distribution, i.e. halfway between -infinity and +infinity). So it’s a pretty hypothetical thing, why on earth is it generally called “Normal”? Why does it matter?

With others I prefer to call it the Gaussian distribution to avoid this word “Normal” (and please not “normal”: it’s not) and because Carl Friedrich Gauss has one (of several) good claims to have recognised its importance in 1823 (see for a lovely short summary of the actual history). That points out that de Moivre started its recognition a good 80+ years earlier and Laplace took that further in between the two. The Gaussian distribution matters because it makes possible the whole of “parametric statistics” which, until recent developments in computer power, underpinned the explosion of statistics through the 20th Century.

Although it’s not actually something we see a lot in therapy data, can arise from things as simple as tossing dice. Hang on, didn’t I use tossing a die as an example of a uniform distribution (with observed scores of 1, 2, 3, 4, 5 and 6 equally likely if it’s a fair die). How does that discrete (it only takes six values) uniform distribution get to the bell shaped curve above?

Well, if I tossed six dice repeatedly clearly each time I will get scores between 6 and 36. Here’s the theoretical distribution of scores for that game.

The only actual possible scores are the points but it can be seen that the probability distribution of the possible scores is now far from uniform and is starting to look a bit like this famous bell. Here’s the nearest fit of that Gaussian bell to those probabilities added in green.

They are clearly different but it can be seen that the score is starting to show a distribution not unlike the Gaussian. Here are the distributions of scores for throwing a single die through to throwing eight.

Throwing just one die creates that uniform/flat distribution with each possible score from 1 to 6 equiprobable (at probability 1/6 of course). Two dice creates the inverted v-shaped distribution but by the point at which we are looking at scores of eight dice the distribution is looking much more like a Gaussian (and the score range has moved up to be from 8 to 48 of course).

This is part of what led to the Gaussian distribution being called “normal” or “Normal”: as numbers from (almost any) distribution are summed their distribution tends to the Gaussian. This makes it a particularly plausible model for distributions of “error”. A fact about the Gaussian distribution is that the particular set of scores is always defined by just two “parameters”: the population mean score and the population standard deviation. The so called “standard Gaussian” at the top of this page has mean zero and standard deviation one. Here are two different Gaussians perhaps representing scores on a measure for a help-seeking (HS) and a non-help-seeking (NHS) population.

Here the non-help-seeking population has mean 8 and SD 2 and the help-seeking population has mean 12 and SD 1. The smaller SD for the help-seeking population gives it the sharper, thinner shape but the two shapes are the same: if you shifted and stretched the help-seeking curve you could make it fit exactly on the non-help-seeking. Likewise, if you shifted the non-help-seeking curve to the right and squashed it it could fit exactly on the help-seeking curve.

This means that statisticians, mostly through the 20th Century, could work out many ways of exploring data to see if things looked non-random, systematic, based on assumptions about this famous distribution: “parametric statistical methods” so-called because just those two parameters are needed to describe different Gaussian distributions.

If you have reached the end of this and think “I (still) don’t get it!” then I’m sorry but … don’t worry. It really won’t stop you thinking sensibly about data as we have largely moved on from needing to think of things as Gaussian (when generally they are not). It’s just still around in many papers and textbooks.

Try also #

Anderson-Darling test
Arithmetic mean
Binomial distribution
Cramer-von Mises test
Distribution tests
Hypothesis testing, see Inferential testing, “tests”
Kolmogorov-Smirnov test
Parametric statistics
Shapiro-Francia test
Shapiro-Wilk test
Standard deviation
Statistical tests, see Inferential testing, “tests”
Uniform distribution

Chapters #

Mostly 5, 7 and 8.

Online resources #

At some point, I hope in 2024, I should have shiny animations that illustrate more of this but I should also have online calculation apps for various typical therapy change data scenarios that will draw on bootstrapping and liberate us from parametric statistics.
Some progress on that:
Rblog posts
Explore distributions with plots
Tests of fit to Gaussian distribution
Kolmogorov-Smirnov test

Dates #

Minor updates 13.ii.24.

Powered by BetterDocs