Outliers and the range — measuring spread

Outliers and the range

An outlier is a value that is much larger or smaller than the rest of the data. A single outlier can pull the mean far up or down, while the median and mode barely move.

The range — a simple spread measure

The range is just the difference between the largest and smallest values.

Range = max − min.

Example. Quiz scores 5, 6, 7, 8, 9 → range = 9 − 5 = 4. Tight spread.

Example. Scores 5, 6, 7, 8, 95 → range = 95 − 5 = 90. The 95 is an outlier that blows up the range.
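The same calculation can be sketched in a few lines of Python. The function name value_range and the variable names are hypothetical, chosen only for this illustration.

```python
def value_range(values):
    """Return the range (largest value minus smallest) of a non-empty list."""
    return max(values) - min(values)


tight = [5, 6, 7, 8, 9]          # quiz scores from the first example
with_outlier = [5, 6, 7, 8, 95]  # same scores, with 9 replaced by 95

print(value_range(tight))         # 4
print(value_range(with_outlier))  # 90
```

Only the largest and smallest values enter the result, so changing one extreme value changes the range by the same amount.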

How outliers affect the mean

The mean is sensitive to outliers: a single extreme value can shift it a lot.

Example. Five donations: 1, 2, 3, 4, 100.
  • Mean = (1+2+3+4+100) ÷ 5 = 22.
  • Median = 3.

The mean of 22 is larger than four of the five donations, so it does not describe a "typical" donation in this group. The median of 3 does.
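A quick way to check this is with Python's standard statistics module. The sketch below also drops the 100 to show how far each measure moves; the variable names are hypothetical.

```python
from statistics import mean, median

donations = [1, 2, 3, 4, 100]
without_outlier = [1, 2, 3, 4]  # the same donations with the 100 removed

# With the outlier: the mean is pulled far above most of the values.
print(mean(donations))    # 22
print(median(donations))  # 3

# Without the outlier: mean and median agree.
print(mean(without_outlier))    # 2.5
print(median(without_outlier))  # 2.5
```

Removing one value moves the mean from 22 to 2.5, while the median only moves from 3 to 2.5.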

How outliers affect the range

The range is also sensitive: it uses only the two extreme values, so an outlier enlarges it directly.

Spotting outliers

Sort the data from smallest to largest and look at it. Any value that sits far from the cluster of the others is a candidate outlier.

  • 10, 11, 12, 13, 45 → 45 is an outlier.
  • 10, 11, 12, 13, 14 → no outlier.
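"Far from the cluster" is a judgment call, and this lesson gives no formula for it. One possible way to make it concrete, shown here purely as an assumption, is to flag values that lie more than k times the median absolute deviation (MAD) away from the median. The function name candidate_outliers and the cutoff k = 3 are hypothetical choices for this sketch, not a rule from the lesson.

```python
from statistics import median


def candidate_outliers(values, k=3.0):
    """Flag values farther than k * MAD from the median.

    Assumption: both the MAD rule and the default k=3.0 are illustrative
    choices; other cutoffs or rules are equally defensible.
    """
    center = median(values)
    # Median absolute deviation: the typical distance from the median.
    mad = median([abs(v - center) for v in values])
    if mad == 0:
        # More than half the values equal the median, so there is no
        # spread to compare against; this simple sketch flags nothing.
        return []
    return [v for v in values if abs(v - center) > k * mad]


print(candidate_outliers([10, 11, 12, 13, 45]))  # [45]
print(candidate_outliers([10, 11, 12, 13, 14]))  # []
```

The rule measures distance using the median rather than the mean, so the outlier itself has little effect on the cutoff. It only suggests candidates; each flagged value still needs to be checked by hand.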

What to do with an outlier

  • Check it. Is it a measurement error? If so, it is often reasonable to discard it.
  • Keep it but use the median, which is robust against outliers.
  • Mention it. When you report results, say "ignoring one outlier of 45…".
