
Higher Applications of Mathematics

Linear regression in RStudio

Fitting models and reading regression output.

Before you start

  • Know the statistical question you are trying to answer.
  • Check that variables are named clearly and measured in suitable units.
  • Be ready to write an interpretation, not just copy RStudio output.

Key idea

  • This topic focuses on fitting and interpreting a straight-line model. In Higher Applications, RStudio is used as a practical tool to calculate, graph and test ideas from data.
  • A strong answer shows what command was used, what output was produced, and what that output means in context. Use careful language such as 'This suggests...' or 'There is evidence to suggest...'.

Key commands and skills

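  • model <- lm(y ~ x, data = data) fits a straight-line model with y as the response and x as the explanatory variable.
  • names(data) lists the exact column names in the data frame.
  • summary(data) gives a numerical summary of each column.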
  • Use comments in scripts with # to explain your steps.
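
As a minimal sketch of these commands working together, the script below uses cars, a small data set built into R (columns speed and dist). It stands in for your own data frame and is not part of this lesson's context.

```r
# cars ships with R: stopping distance (dist) recorded against speed
col_names <- names(cars)        # check the exact column names before modelling
col_names

summary(cars)                   # quick numerical summary of every column

# Fit a straight-line model: dist is the response, speed is the explanatory variable
model <- lm(dist ~ speed, data = cars)
coefficients(model)             # intercept first, then the slope for speed
```

Each comment records why the step was run, which makes the script easier to explain in a project write-up.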

Technology output practice

Output interpretation preview

Read the simulated output, pick out the key value, then turn it into a written conclusion. This is a learning preview, not a real RStudio environment.

Context

Linear model output

A youth group models total ticket income from the number of tickets sold.

Simulated output

> model <- lm(income ~ tickets, data = sales)
> coefficients(model)
(Intercept)     tickets
     42.10        5.70

Fitted model: income = 42.10 + 5.70 × tickets

Intercept

42.10

The predicted income when tickets is 0. It may not have a useful real-life meaning.

Slope

5.70

For each extra ticket, predicted income increases by about 5.70 pounds.

Prediction

Use with care

Predictions are more reliable inside the range of data used to fit the model.
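
A minimal sketch of making a prediction and checking it against the data range. The sales data frame here is made up for illustration (the lesson does not give the underlying data); its values were chosen so that the fitted coefficients match the simulated output above.

```r
# Made-up data for illustration only: chosen so the fit is 42.10 + 5.70 * tickets
sales <- data.frame(tickets = c(10, 20, 30, 40, 50))
sales$income <- 42.10 + 5.70 * sales$tickets

model <- lm(income ~ tickets, data = sales)

# Predict income for 35 tickets; newdata must use the same column name as the model
new_sale <- data.frame(tickets = 35)
predicted_income <- unname(predict(model, newdata = new_sale))
predicted_income   # 42.10 + 5.70 * 35 = 241.60

# Check whether the new value lies inside the range used to fit the model
ticket_range <- range(sales$tickets)
inside_range <- new_sale$tickets >= ticket_range[1] &
  new_sale$tickets <= ticket_range[2]
inside_range       # TRUE here; FALSE would mean the prediction is an extrapolation
```

With this made-up data, a prediction for 200 tickets would fall outside the range of 10 to 50 and should be reported with extra caution.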

What it means

The slope is the most useful value for explaining how the variables are linked. The intercept should be interpreted only if zero tickets makes sense in context.

What to write

The model predicts that each extra ticket sold increases income by about 5.70 pounds. Predictions should be treated cautiously if the number of tickets is outside the original data range because that would be extrapolation.

Weak answer: The answer is 42.10 because it is first in the table.

Watch out

Check which coefficient matches the question. For rate of change, use the slope beside the explanatory variable.
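
One way to avoid picking the wrong coefficient is to select it by name instead of by position. The sketch below copies the two values from the simulated output into a named vector, which has the same shape as the result of coefficients(model).

```r
# Values copied from the simulated output, stored as a named vector
coefs <- c("(Intercept)" = 42.10, tickets = 5.70)

slope <- coefs["tickets"]          # rate of change: pounds per extra ticket
intercept <- coefs["(Intercept)"]  # predicted income when tickets is 0

slope
```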

Which value tells you the increase in income per extra ticket?


Worked examples

Walkthrough 1

Run the command

A pupil is using RStudio to fit and interpret a straight-line model for a small data set.

  1. Load or identify the data frame.
  2. Check the exact column names with names(data).
  3. Run the key command: model <- lm(y ~ x, data = data)

The output should be checked against the variables and the original question.
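
The three steps can be sketched as below. The data frame study and its columns hours and score are hypothetical stand-ins; in a project you would usually load your own file (for example with read.csv()) instead of typing values in.

```r
# Step 1: identify the data frame (hypothetical values typed in for illustration)
study <- data.frame(
  hours = c(1, 2, 3, 4, 5, 6),
  score = c(52, 55, 61, 60, 68, 71)
)

# Step 2: check the exact column names before using them in a formula
names(study)
has_columns <- all(c("hours", "score") %in% names(study))
has_columns

# Step 3: fit the straight-line model, written as response ~ explanatory
model <- lm(score ~ hours, data = study)
coefficients(model)
```

Checking the names first means a spelling slip is caught before the model is fitted, not after.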

Walkthrough 2

Read the output

RStudio has produced numerical or graphical output.

  1. Find the key value, graph feature or p-value.
  2. Check the unit and variable name.
  3. Avoid copying every line of output into the conclusion.

The gradient (slope) estimates the expected change in y for each 1-unit increase in x.
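
To pick out single values without copying the whole table, the coefficient table inside summary(model) can be indexed by row and column name. This sketch uses women, a data set built into R (columns height and weight), purely as a stand-in for project data.

```r
# women ships with R: weight recorded against height
model <- lm(weight ~ height, data = women)

# The coefficient table has one row per term and named columns
coef_table <- summary(model)$coefficients
colnames(coef_table)   # "Estimate" "Std. Error" "t value" "Pr(>|t|)"

gradient <- coef_table["height", "Estimate"]   # change in weight per 1-unit increase in height
p_value  <- coef_table["height", "Pr(>|t|)"]   # p-value for the gradient

gradient
p_value
```

Only these two values, with their units and variable names, would normally need to appear in the written conclusion.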

Walkthrough 3

Write the interpretation

The result must be used in a project conclusion.

  1. Start with a cautious phrase such as 'This suggests...'.
  2. Refer to the context and variables.
  3. Mention a limitation if the data set is small, biased or observational.

The conclusion should be clear, cautious and linked to evidence.

Watch out

  • Misspelling a data frame or column name.
  • Forgetting brackets or quotation marks in a command.
  • Copying output without explaining what it means.
  • Claiming causation from correlation.
  • Using strong language such as 'proves' when the data only provide evidence for a conclusion.
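
The first mistake in the list is easy to reproduce. In this sketch the made-up data frame sales has a column called tickets, but the formula asks for ticket. tryCatch() is used only so that the script keeps running and the error message can be read.

```r
# Made-up data for illustration only
sales <- data.frame(
  tickets = c(10, 20, 30, 40, 50),
  income  = c(99, 160, 210, 275, 325)
)

# The formula misspells the column name (ticket instead of tickets)
msg <- tryCatch(
  {
    lm(income ~ ticket, data = sales)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
msg   # the message names the object R could not find

# Checking names(sales) first would have caught the mistake
name_found <- "ticket" %in% names(sales)
name_found
```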

Next step

Move into practice

Use the learning notes to read output tables carefully, then try varied summary, correlation, regression and test-output interpretation.
