Topic

Statistics and Data

Statistics and data covers summarising information, reading graphs, comparing groups, measuring spread, and explaining what the results mean.

Topic explanation

Choose the statistic that answers the question. The mean is useful for a typical value, while range and standard deviation describe spread.

Graph questions require careful reading of labels, scales, and units. Do not calculate until you know what each axis shows.

Scattergraphs show relationships between two numerical quantities. A line of best fit can support an estimate, but predictions outside the plotted range are less reliable.

Quartiles, SIQR and standard deviation all describe spread. Use them with an average so the comparison says something about both typical value and consistency.

Standard deviation measures how spread out the values are from the mean. A larger standard deviation means less consistency; a smaller standard deviation means values are closer together.

Applications answers often need a conclusion, such as which group is more consistent or which option has the higher typical value.

Quick methods

Mean
Add values, then divide by how many values there are.
Median
Put values in order and find the middle value. With an even number of values, take the mean of the two middle values.
Range
Largest value minus smallest value.
Quartiles and SIQR
Use the lower and upper quartiles to compare the spread of the middle half of the data.
Scattergraphs
Describe direction, strength and any unusual points before using a line of best fit.
Sample standard deviation
Use s = √(Σ(xᵢ − x̄)² ÷ (n − 1)) when the data is a sample.
Population standard deviation
Use σ = √(Σ(xᵢ − μ)² ÷ N) when the data is the whole population.

Worked examples

Mean and spread

Weekly screen times are 12, 16, 18, 20 and 24 hours. Find the mean and comment on spread.

  1. Add the values: 12 + 16 + 18 + 20 + 24 = 90
  2. There are 5 values, so mean = 90 ÷ 5 = 18
  3. The range is 24 − 12 = 12 hours, so the values are spread across a 12-hour interval.

Answer: The mean is 18 hours; the values have a range of 12 hours.

Watch out: A comment should interpret the statistic, not just repeat the calculation.
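
The arithmetic in this example can be checked with a short Python sketch. The variable names are illustrative only; the data is the list of screen times from the question.

```python
# Weekly screen times (hours) from the worked example.
screen_times = [12, 16, 18, 20, 24]

# Mean: add the values, then divide by how many there are (90 / 5).
mean = sum(screen_times) / len(screen_times)

# Range: largest value minus smallest value (24 - 12).
data_range = max(screen_times) - min(screen_times)

print(f"Mean: {mean} hours")
print(f"Range: {data_range} hours")
```

The code only produces the numbers. The comment on spread still has to be written in words.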

Pie chart calculation

A pie chart shows 35% of 240 pupils travel by bus. How many pupils is this?

  1. 35% = 0.35
  2. 0.35 × 240 = 84

Answer: 84 pupils travel by bus.

Watch out: The percentage label is not the number of pupils. Multiply by the total surveyed.

Compare standard deviation

Two data sets have the same mean. Set A has standard deviation 3 and Set B has standard deviation 9. Which is more spread out?

  1. Standard deviation measures spread.
  2. The larger standard deviation shows more spread.
  3. 9 is larger than 3.

Answer: Set B is more spread out.

Watch out: The mean describes the typical value, not consistency. Use the standard deviation to compare consistency.

Quartiles and SIQR comparison

Class A has median 42 and SIQR 5. Class B has median 42 and SIQR 11. Which class has the more consistent middle half of scores?

  1. Both classes have the same median, so the typical score is similar.
  2. SIQR measures spread in the middle half of the data.
  3. Class A has the smaller SIQR.

Answer: Class A is more consistent because its SIQR is smaller.

Watch out: Check that you have not used the median alone when the question asks about consistency.
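
This example gives the SIQR values directly. As a rough sketch of how an SIQR could be found from raw data, the Python below uses SIQR = (Q3 − Q1) ÷ 2 on the screen-time values from the first worked example. The function names are hypothetical, and the quartile method is an assumption: it splits the ordered data into two halves and leaves out the middle value when the count is odd. Other quartile conventions can give slightly different results.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method.

    Assumption: with an odd number of values, the middle value is
    left out of both halves. Needs at least two values.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

def siqr(values):
    """Semi-interquartile range: half the distance from Q1 to Q3."""
    q1, _, q3 = quartiles(values)
    return (q3 - q1) / 2

screen_times = [12, 16, 18, 20, 24]
q1, q2, q3 = quartiles(screen_times)
print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}, SIQR = {siqr(screen_times)}")
```

For this data, the lower half is 12 and 16, and the upper half is 20 and 24.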

Line of best fit estimate

A scattergraph links study time to score. The line of best fit passes near the points (4 hours, 58 marks) and (8 hours, 74 marks). Estimate the score for 6 hours.

  1. 6 hours is halfway between 4 and 8 hours.
  2. 58 and 74 are 16 marks apart.
  3. Half of 16 is 8, so estimate 58 + 8.

Answer: The estimated score is about 66 marks.

Watch out: Use the line of best fit for an estimate, not one nearby plotted point.
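
The same estimate can be reached by finding the gradient of the line through the two points. The sketch below assumes the line of best fit passes exactly through both points, which is a simplification since the question only says it passes near them. The function name is hypothetical.

```python
def estimate_from_line(x1, y1, x2, y2, x):
    """Estimate y at x on the straight line through (x1, y1) and (x2, y2)."""
    gradient = (y2 - y1) / (x2 - x1)   # marks gained per extra hour
    return y1 + gradient * (x - x1)

# Points read from the line of best fit: (4 hours, 58 marks), (8 hours, 74 marks).
score = estimate_from_line(4, 58, 8, 74, 6)
print(f"Estimated score for 6 hours: {score} marks")
```

Here the gradient is 16 ÷ 4 = 4 marks per hour, so 2 extra hours adds 8 marks to 58. The function would also return a value for inputs such as 12 hours, but that lies outside the plotted range and is less reliable.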

National 5 method

Standard deviation table method

Standard deviation measures how spread out values are from the mean. It is always zero or positive and uses the same units as the original data. A larger standard deviation means the values are more spread out.

Sample standard deviation

s = √( Σ(xᵢ − x̄)² ÷ (n − 1) )

Use sample standard deviation when the data is a sample from a larger population. It divides by n − 1.

Population standard deviation

σ = √( Σ(xᵢ − μ)² ÷ N )

Use population standard deviation when the data set is the whole population. It divides by N.

Worked sample table for 4, 6, 8, 10 and 12

Mean: x̄ = 8

Sample standard deviation table with columns for x, x − x̄ and (x − x̄)²

x     x − x̄    (x − x̄)²
4     −4       16
6     −2       4
8     0        0
10    2        4
12    4        16

Σ(x − x̄)² = 40

Worked sample calculation

  1. Data values: 4, 6, 8, 10, 12.
  2. Mean: x̄ = 8.
  3. Σ(x − x̄)² = 40.
  4. s = √(40 ÷ (5 − 1)) = √10 ≈ 3.16.
  5. The values are typically about 3.16 units from the mean.
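
The table method can be mirrored in a few lines of Python, building one row per data value and then applying the sample formula. This is a minimal sketch for checking working, with illustrative variable names.

```python
from math import sqrt

data = [4, 6, 8, 10, 12]
n = len(data)
mean = sum(data) / n

# One row per value: x, x - mean, (x - mean) squared.
rows = [(x, x - mean, (x - mean) ** 2) for x in data]
for x, difference, squared in rows:
    print(f"{x:>4} {difference:>8.0f} {squared:>10.0f}")

# Total the squared differences, divide by n - 1, then take the square root.
sum_sq = sum(squared for _, _, squared in rows)
s = sqrt(sum_sq / (n - 1))

print(f"Sum of squared differences: {sum_sq:.0f}")
print(f"s = {s:.2f}")
```

Each printed row should match a row of the table, which makes it easier to spot a sign or squaring slip.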

Population comparison

  1. Use the same Σ(x − μ)² = 40.
  2. σ = √(40 ÷ 5) = √8 ≈ 2.83.
  3. The population standard deviation is smaller here because it divides by N instead of n − 1.
  4. If you are unsure which formula to use, check the question: it should say whether the data is a sample or the whole population.
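
As a cross-check on both results, Python's standard `statistics` module provides `stdev`, which divides by n − 1, and `pstdev`, which divides by N. The sketch below applies both to the same five values.

```python
from statistics import pstdev, stdev

data = [4, 6, 8, 10, 12]

sample_sd = stdev(data)        # divides by n - 1
population_sd = pstdev(data)   # divides by N

print(f"Sample standard deviation:     {sample_sd:.2f}")
print(f"Population standard deviation: {population_sd:.2f}")
```

The two values differ only because of the divisor, so the gap between them shrinks as the number of data values grows.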

Practice prompts

  • Complete one missing value in an x − x̄ column.
  • Complete one missing value in an (x − x̄)² column.
  • Calculate Σ(x − mean)² from a table.
  • Choose whether to divide by n − 1 or N.
  • Calculate sample standard deviation from a completed table.
  • Compare two sets using mean and standard deviation.