We all know what age is, so why put it here?!

  1. Because it’s a pretty universally available demographic variable that should probably always be collected in ROM.
  2. Because it’s a potential de-anonymisation risk: having someone’s age and gender may, combined with other data, identify them to someone who knows something about the source of the data. This can be an important reason, when sharing pseudonymised data, to give age in age groups rather than to the nearest year (and to watch out for changes of age in longitudinal data giving away birthdays, exactly or to the nearest week).
  3. But mostly because it’s a good example of the folly of taking Stevens’ categories of scales too seriously: on the face of it, age is a paradigmatic ratio scale. That’s to say that someone who is 40 really is twice as old as someone who is 20 (OK, we’re not splitting hairs to the nearest day, minute or second here, nor debating the exact timing of birth!). However, the psychological meaning of age is far more complex: a 14 year old is twice as old as a 7 year old but in many ways they are living lives that are qualitatively radically different, so that ratio is pretty meaningless, and it is not the same as the way a 2 year old is twice the age of a 1 year old. Then again, the differences between being 80 and being 40 may be as huge as those between 14 and 7 … but are they? To compound this, the scaling may actually be personal: one person may feel that their 18th birthday was a joyous marker of adulthood while another may be bitterly wondering where the exciting teenage years they were supposed to have had disappeared.
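
To make the second point concrete, here is a minimal Python sketch of replacing age in years with age groups before sharing data. The band edges and the function name `age_band` are purely illustrative assumptions, not a recommended standard: the right bands depend on your dataset and on how many people fall in each group.

```python
from bisect import bisect_right

# Hypothetical lower bounds of each age band, in completed years.
# These are illustrative only; choose bands that suit your own data.
BAND_EDGES = [0, 18, 25, 35, 45, 55, 65, 75]


def age_band(age_years, edges=BAND_EDGES):
    """Return a label such as '25-34' or '75+' for an age in completed years."""
    if age_years < edges[0]:
        raise ValueError("age is below the lowest band")
    # Index of the last edge that is <= age_years.
    i = bisect_right(edges, age_years) - 1
    if i == len(edges) - 1:
        # Open-ended top band.
        return f"{edges[i]}+"
    return f"{edges[i]}-{edges[i + 1] - 1}"


ages = [7, 14, 20, 40, 80]
print([age_band(a) for a in ages])
```

Banding alone does not deal with the longitudinal issue: if someone moves from one band to the next between two dated occasions, their birthday is bracketed by those dates. One possible mitigation, offered here as an assumption rather than a rule, is to compute the band once from age at first contact and reuse it on every later occasion.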

