Higher Applications of Mathematics

Box plots and histograms

Comparing distributions visually.

Before you start

  • Be confident reading values from tables and graphs.
  • Check units, sample size and what each variable represents.
  • Use context in written answers, especially when interpreting results.

Statistics lesson

Key idea

  • This topic focuses on comparing distributions, skewness, outliers and grouped numerical data. In Higher Applications, the aim is to use statistical methods to make careful decisions from real data.
  • Good statistical work has three parts: choose a suitable method, carry it out accurately, then explain what the result means in the situation.
  • When writing conclusions, use cautious language such as 'this suggests' or 'there is evidence to suggest'. Data can support a conclusion, but it rarely proves it completely.

Key formulae, definitions and methods

  • A box plot shows the five-number summary: minimum, lower quartile, median, upper quartile and maximum.
  • Interquartile range (IQR) = upper quartile - lower quartile.
  • A histogram uses frequency density when class widths are unequal: frequency density = frequency ÷ class width.
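
As a minimal sketch of the IQR definition, the Python snippet below computes an IQR from two quartiles. It also shows one common way to flag possible outliers, the 1.5 × IQR rule. That rule is an assumption here, since these notes do not say which outlier rule to use, and the quartile values are made up for illustration.

```python
def iqr(lower_quartile, upper_quartile):
    """Interquartile range: upper quartile minus lower quartile."""
    return upper_quartile - lower_quartile


def outlier_fences(lower_quartile, upper_quartile, k=1.5):
    """Return (low, high) fences; values outside them are possible outliers.

    Assumption: the common 1.5 x IQR convention (k = 1.5).
    """
    spread = iqr(lower_quartile, upper_quartile)
    return lower_quartile - k * spread, upper_quartile + k * spread


# Hypothetical quartiles, in minutes, chosen only for illustration.
q1, q3 = 12, 20
print(iqr(q1, q3))             # 8
print(outlier_fences(q1, q3))  # (0.0, 32.0)
```

A value outside the fences is only a possible outlier; check the context before deciding whether it is an error or a genuine unusual value.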

Technology output practice

Interpreting statistical output

Read the simulated output, pick out the key value, then turn it into a written conclusion. This is a learning preview, not a real RStudio environment.

Context

Summary statistics output

A class compares journey times to a sports venue, measured in minutes.

Simulated output

> summary(travel$minutes)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   18.0    26.0    30.5    31.4    36.0    48.0

> sd(travel$minutes)
[1] 6.8

Mean

31.4 min

The average journey time in the sample.

Median

30.5 min

Half the journeys were shorter than this and half were longer.

Standard deviation

6.8 min

A typical distance of a journey time from the mean; a smaller value would mean more consistent times.

What it means

The typical journey took just over 30 minutes. The standard deviation shows there was some variation, so one journey time should not be treated as exact for everyone.
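
The same output can also hint at the shape of the distribution. The Python sketch below uses only the values printed in the simulated output and applies three rough checks for positive skew: mean above median, a longer upper half of the box, and a longer upper whisker. These checks are informal rules of thumb chosen for illustration, not a formal test, and the variable names are hypothetical.

```python
# Values copied from the simulated summary() output (minutes).
minimum, q1, median, mean, q3, maximum = 18.0, 26.0, 30.5, 31.4, 36.0, 48.0

# Each difference compares the upper side of the distribution with the lower side.
mean_minus_median = round(mean - median, 1)   # 0.9
upper_box = q3 - median                       # 5.5
lower_box = median - q1                       # 4.5
upper_whisker = maximum - q3                  # 12.0
lower_whisker = q1 - minimum                  # 8.0

signs_of_positive_skew = [
    mean_minus_median > 0,
    upper_box > lower_box,
    upper_whisker > lower_whisker,
]

if all(signs_of_positive_skew):
    print("This suggests the journey times are slightly positively skewed.")
else:
    print("The summary values do not point clearly to positive skew.")
```

All three checks agree here, but the differences are small, so cautious wording such as 'slightly' is appropriate.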

What to write

The mean journey time was 31.4 minutes and the median was 30.5 minutes, so a typical journey was about 31 minutes. The standard deviation of 6.8 minutes shows that individual journey times typically differed from the mean by around 7 minutes.

Weak answer: The standard deviation is 6.8, so the average is 6.8.

Watch out

Remember that standard deviation is not the average. It describes spread, not centre.

Which value would you quote to describe consistency?

Worked examples

Worked example 1

Choose the method

A school compares travel times for pupils using bus, car and walking routes.

  1. Find the five-number summary for each travel method.
  2. Draw one accurate scale that covers every group.
  3. Mark the box, median and whiskers for each group.

The box plot summarises the distribution without showing every data point.
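
The first step can be sketched in Python with the standard library. The journey times below are hypothetical, and the choice of `method="inclusive"` in `statistics.quantiles` is an assumption; other quartile conventions exist and can give slightly different values.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    # method="inclusive" treats the data as the whole group of interest.
    # Other quartile conventions can give slightly different answers.
    lower_q, median, upper_q = quantiles(values, n=4, method="inclusive")
    return min(values), lower_q, median, upper_q, max(values)


# Hypothetical bus journey times in minutes (not real data).
bus_minutes = [12, 15, 18, 20, 22, 25, 28, 30, 35]
print(five_number_summary(bus_minutes))  # (12, 18.0, 22.0, 28.0, 35)
```

Working by hand with the median of each half gives quartiles of 16.5 and 29 for the same list, so say which method you used when quoting quartiles.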

Worked example 2

Carry out and interpret

A school compares travel times for pupils using bus, car and walking routes.

  1. Compare medians to discuss typical values.
  2. Compare IQRs or ranges to discuss spread.
  3. Comment on skewness or possible outliers.

A stronger comparison mentions centre, spread and context.
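
A comparison of centre and spread for two groups might look like the Python sketch below. Both data sets are hypothetical, the helper name `centre_and_spread` is made up for this example, and the 'inclusive' quartile method is an assumption.

```python
from statistics import quantiles


def centre_and_spread(values):
    """Return (median, IQR) using the 'inclusive' quartile method."""
    lower_q, median, upper_q = quantiles(values, n=4, method="inclusive")
    return median, upper_q - lower_q


# Hypothetical journey times in minutes (not real data).
bus = [20, 24, 26, 30, 31, 35, 40, 44, 50]
walk = [10, 12, 14, 15, 16, 18, 19, 21, 25]

bus_median, bus_iqr = centre_and_spread(bus)      # 31.0, 14.0
walk_median, walk_iqr = centre_and_spread(walk)   # 16.0, 5.0

longer = "bus" if bus_median > walk_median else "walking"
more_varied = "bus" if bus_iqr > walk_iqr else "walking"

print(f"Median: bus {bus_median} min, walking {walk_median} min.")
print(f"IQR: bus {bus_iqr} min, walking {walk_iqr} min.")
print(f"This suggests {longer} journeys were typically longer "
      f"and {more_varied} journey times were less consistent.")
```

The final sentence mirrors the structure of a strong written answer: one statement about centre, one about spread, both tied to the context.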

Worked example 3

Check the conclusion

A school compares travel times for pupils using bus, car and walking routes.

  1. For grouped data with unequal class widths, calculate frequency density (frequency ÷ class width).
  2. Draw each bar with width equal to the class width and height equal to the frequency density.
  3. Use area to represent frequency.

Histograms need careful scales because bar area represents frequency.
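
The frequency density calculation can be sketched in Python as follows. The grouped table is hypothetical and only illustrates unequal class widths; the final column check confirms that bar area (density × width) returns the original frequency.

```python
# Hypothetical grouped journey times: (class start, class end, frequency).
classes = [
    (0, 10, 8),
    (10, 15, 12),
    (15, 20, 15),
    (20, 30, 10),
    (30, 50, 4),
]

rows = []
for start, end, frequency in classes:
    width = end - start
    density = frequency / width      # bar height
    area = density * width           # bar area, should match the frequency
    rows.append((start, end, width, frequency, density, area))

print("class    width  freq  density")
for start, end, width, frequency, density, area in rows:
    print(f"{start:>2}-{end:<5} {width:>5} {frequency:>5} {density:>8.1f}")
```

In this made-up table the 20-30 class has a higher frequency than the 0-10 class (10 against 8) and the same width, so its bar is taller, while the 10-15 class is taller than both because its 12 journeys are packed into half the width.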

Watch out

  • Choosing a method because it is familiar rather than because it matches the data.
  • Giving a numerical answer without explaining what it means in context.
  • Mixing up sample evidence with certainty about the whole population.
  • Ignoring outliers, skewness, units or the scale on a graph.
  • Using causal language when the data only shows association.

Next step

Move into practice

Use the learning notes to choose suitable summaries and conclusions, then try varied data sets, tables, p-values and interpretation prompts.