Summarising Data

Mean, median, mode, range, IQR, standard deviation, standardised scores, skewness

Measures of Average

Measures of average (central tendency) describe the 'typical' value in a dataset. The three main measures are the mean, median, and mode.

Mean

The sum of all values divided by the number of values. Affected by extreme values (outliers).

Mean Formula

x̄ = Σx ÷ n

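For grouped data: x̄ ≈ Σ(f × x) ÷ Σf (where x = class mid-point and f = class frequency). This is an estimate, because the exact values within each class are unknown.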
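As a rough illustration, the two mean formulas above could be sketched in Python as follows. The function names and the sample data are made up for this example.

```python
def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data: sum of (f * x) divided by sum of f."""
    total = sum(f * x for x, f in zip(midpoints, frequencies))
    return total / sum(frequencies)


# Illustrative raw data (not from the source)
print(mean([2, 4, 4, 5, 10]))  # 5.0

# Illustrative classes 0-10, 10-20, 20-30 with mid-points 5, 15, 25
print(grouped_mean([5, 15, 25], [2, 5, 3]))  # 16.0
```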

Median

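The middle value when the data is arranged in order. For n values, the median is at position (n+1)/2. When n is even, this position falls halfway between two values, so the median is the mean of the two middle values. Largely unaffected by outliers.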
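A minimal sketch of finding the median, assuming the usual convention of averaging the two middle values when n is even. The function name and data are illustrative only.

```python
def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 is a whole number
        return ordered[mid]
    # Even n: position (n + 1) / 2 falls between two values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([7, 1, 3, 9, 5]))       # 5
print(median([7, 1, 3, 9, 5, 100]))  # 6.0
```

In the second call the extreme value 100 moves the median only from 5 to 6.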

Mode

The most frequently occurring value. A dataset can have no mode, one mode (unimodal), two modes (bimodal), or more than two (multimodal). Best for qualitative data.
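One way to find the mode or modes is sketched below. It assumes the convention that a dataset has no mode when every distinct value occurs equally often. The function name `modes` and the data are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values ([] if there is no mode)."""
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if all distinct values tie, there is no mode
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([1, 2, 2, 3]))            # [2]       unimodal
print(modes([1, 1, 2, 2, 3]))         # [1, 2]    bimodal
print(modes([1, 2, 3]))               # []        no mode
print(modes(["red", "blue", "red"]))  # ['red']   qualitative data
```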

Which Average to Use?

Mean: use when data is roughly symmetric with no outliers. Median: use when data is skewed or has outliers. Mode: use for qualitative data or when the most common value is needed.
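The effect of an outlier on the choice of average can be seen in a small sketch like this one. The numbers are invented for illustration.

```python
from statistics import median


def mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


# Illustrative data: the second list replaces 27 with an outlier of 200
typical = [20, 22, 23, 25, 27]
with_outlier = [20, 22, 23, 25, 200]

print(mean(typical), median(typical))            # 23.4 23
print(mean(with_outlier), median(with_outlier))  # 58.0 23
```

The mean more than doubles while the median does not move, which is why the median is preferred for skewed data or data with outliers.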

Measures of Spread

Measures of spread describe how widely the values in a dataset are dispersed.

Range

Maximum value − Minimum value. Simple but affected by outliers.

Interquartile Range (IQR)

IQR = Q3 − Q1. The spread of the middle 50% of the data. Not affected by outliers.
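The range and IQR could be computed along these lines. Quartile conventions differ between textbooks and software. This sketch assumes the default method of Python's `statistics.quantiles`, which places Q1 and Q3 at positions (n+1)/4 and 3(n+1)/4. The function names and data are illustrative.

```python
from statistics import quantiles


def data_range(values):
    """Maximum value minus minimum value."""
    return max(values) - min(values)


def iqr(values):
    """Q3 minus Q1, using the default quartile method of statistics.quantiles."""
    q1, _q2, q3 = quantiles(values, n=4)
    return q3 - q1


# Illustrative data: the second list replaces 13 with an outlier of 130
data = [1, 3, 5, 7, 9, 11, 13]
with_outlier = [1, 3, 5, 7, 9, 11, 130]

print(data_range(data), iqr(data))                  # 12 8.0
print(data_range(with_outlier), iqr(with_outlier))  # 129 8.0
```

The outlier changes the range from 12 to 129 but leaves the IQR at 8.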

Standard Deviation (σ)

A measure of the typical distance of the data points from the mean (strictly, the square root of the mean of the squared distances). A larger standard deviation means the data is more spread out.

Standard Deviation Formula

σ = √[ Σ(x − x̄)² ÷ n ]

Or equivalently: σ = √[ (Σx²/n) − x̄² ]

Exam Tip

You will usually be given the standard deviation formula in the exam. The key steps are: (1) find the mean, (2) subtract the mean from each value, (3) square each result, (4) find the average of the squared differences, (5) take the square root.
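The five steps above, together with the equivalent formula, can be sketched as follows. Both functions divide by n, matching the formulas in this section. The function names and data are made up for the example.

```python
from math import sqrt


def standard_deviation(values):
    """Standard deviation (dividing by n), following the five steps."""
    n = len(values)
    mean = sum(values) / n                    # (1) find the mean
    deviations = [x - mean for x in values]   # (2) subtract the mean from each value
    squared = [d ** 2 for d in deviations]    # (3) square each result
    variance = sum(squared) / n               # (4) average the squared differences
    return sqrt(variance)                     # (5) take the square root


def standard_deviation_alt(values):
    """Equivalent form: square root of (mean of the squares minus square of the mean)."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum(x ** 2 for x in values) / n - mean ** 2)


# Illustrative data with mean 5
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(data))      # 2.0
print(standard_deviation_alt(data))  # 2.0
```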

Standardised Scores

A standardised score (z-score) allows comparison of values from different datasets by measuring how many standard deviations a value is from the mean.

Standardised Score (Z-score) Formula

z = (x − x̄) ÷ σ

Where x = individual value, x̄ = mean, σ = standard deviation
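A short sketch of using standardised scores to compare results from two different datasets. The subjects, marks, means, and standard deviations are invented for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# Hypothetical marks: 70 in Maths (mean 60, sd 10), 65 in English (mean 50, sd 5)
maths_z = z_score(70, 60, 10)
english_z = z_score(65, 50, 5)

print(maths_z)    # 1.0
print(english_z)  # 3.0
```

Although the Maths mark is higher, the English mark is further above its own mean relative to the spread, so it is the better performance compared with the rest of the group.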

Skewness

Skewness describes the asymmetry of a distribution.

Positive Skew

The tail is on the right. Most data is clustered on the left. Typically Mean > Median > Mode.

Negative Skew

The tail is on the left. Most data is clustered on the right. Typically Mean < Median < Mode.

Symmetric Distribution

The distribution is balanced about its centre. For a distribution with a single peak, Mean = Median = Mode.

Skewness Measure

Skewness ≈ 3(Mean − Median) ÷ Standard Deviation

Positive result → positive skew. Negative result → negative skew. Zero → symmetric.
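The skewness measure above might be applied like this. The function name and the summary statistics are hypothetical.

```python
def skewness(mean, median, sd):
    """Skewness measure: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


# Illustrative summary statistics
print(skewness(50, 46, 6))  # 2.0   mean above median: positive skew
print(skewness(40, 43, 9))  # -1.0  mean below median: negative skew
print(skewness(30, 30, 4))  # 0.0   mean equals median: symmetric
```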

Exam Tip

You can identify skewness from a box plot: if the median is closer to Q1, the distribution is positively skewed. If closer to Q3, it is negatively skewed.
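The box plot rule in this tip could be written as a small check on the quartiles. This is only a sketch of the rule as stated. The function name and values are illustrative.

```python
def box_plot_skew(q1, median, q3):
    """Classify skew by comparing the median's distance from Q1 and from Q3."""
    lower = median - q1   # width of the lower half of the box
    upper = q3 - median   # width of the upper half of the box
    if lower < upper:
        return "positive skew"   # median closer to Q1
    if lower > upper:
        return "negative skew"   # median closer to Q3
    return "roughly symmetric"


print(box_plot_skew(10, 12, 20))  # positive skew
print(box_plot_skew(10, 18, 20))  # negative skew
print(box_plot_skew(10, 15, 20))  # roughly symmetric
```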
