2

Processing and Representing Data

Frequency tables, histograms, pie charts, stem-and-leaf, box plots, cumulative frequency

Frequency Tables

A frequency table organises raw data into groups, making it easier to identify patterns and calculate statistics.

Frequency

The number of times a value or group of values occurs in a dataset.

Class Width

The difference between the upper and lower boundaries of a class interval. For the class 10 ≤ x < 20, the class width is 20 − 10 = 10.

Mid-point

The middle value of a class interval, found by averaging its lower and upper boundaries. Used when estimating the mean from grouped data. For 10 ≤ x < 20, the mid-point is (10 + 20) ÷ 2 = 15.

Estimated Mean from Grouped Data
Mean ≈ Σ(f × x) ÷ Σf

Where f = frequency, x = mid-point of each class

Exam Tip

When calculating the estimated mean, always use the mid-point of each class interval, not the boundary values. The answer is an estimate because we do not know the exact values within each class.
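
As a rough illustration of the formula Mean ≈ Σ(f × x) ÷ Σf, the sketch below works through a small grouped table. The class intervals and frequencies are invented for the example, and the function name estimated_mean is hypothetical.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [
    (0, 10, 4),
    (10, 20, 7),
    (20, 30, 6),
    (30, 40, 3),
]


def estimated_mean(classes):
    """Estimate the mean as sum(f * x) / sum(f), where x is the class mid-point."""
    total_fx = 0.0
    total_f = 0
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2  # mid-point, not a boundary value
        total_fx += freq * midpoint
        total_f += freq
    return total_fx / total_f


# Mid-points are 5, 15, 25, 35, so sum(f * x) = 20 + 105 + 150 + 105 = 380.
# sum(f) = 20, giving an estimated mean of 380 / 20 = 19.0.
print(estimated_mean(classes))
```

The result is only an estimate: each value is treated as if it sat at the mid-point of its class.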

Histograms

A histogram is used to display continuous data grouped into class intervals. Unlike a bar chart, the area of each bar (not the height) represents the frequency.

Frequency Density

The height of each bar in a histogram. Frequency density = Frequency ÷ Class width.

Frequency Density Formula
Frequency Density = Frequency ÷ Class Width

Frequency = Frequency Density × Class Width

Exam Tip

Always label the y-axis 'Frequency Density' on a histogram, not 'Frequency'. If all class widths are equal, the histogram looks like a bar chart, but the frequency is still represented by the area of each bar.
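
The short sketch below shows one way to compute the bar heights for a histogram with unequal class widths. The data and the function name frequency_densities are made up for illustration.

```python
# Hypothetical grouped data with unequal class widths:
# (lower boundary, upper boundary, frequency).
classes = [
    (0, 10, 5),
    (10, 15, 10),
    (15, 20, 12),
    (20, 40, 8),
]


def frequency_densities(classes):
    """Return frequency density = frequency / class width for each class."""
    return [freq / (upper - lower) for lower, upper, freq in classes]


densities = frequency_densities(classes)

for (lower, upper, freq), density in zip(classes, densities):
    width = upper - lower
    # Bar area (density * width) recovers the frequency of the class.
    print(f"{lower} <= x < {upper}: width {width}, density {density}, area {density * width}")
```

The widest class (20 ≤ x < 40) ends up with the shortest bar even though its frequency is not the smallest, which is why height alone would be misleading.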

Cumulative Frequency

A cumulative frequency diagram allows us to estimate the median, quartiles, and interquartile range from grouped data.

Cumulative Frequency

A running total of frequencies. Plot cumulative frequency against the upper class boundary of each interval.

Reading from a Cumulative Frequency Diagram
  • Median: read off at n/2 on the cumulative frequency axis
  • Lower Quartile (Q1): read off at n/4
  • Upper Quartile (Q3): read off at 3n/4
  • IQR = Q3 − Q1
  • Percentiles: read off at the appropriate fraction of n

Exam Tip

Always plot cumulative frequency at the UPPER class boundary, not the mid-point. Draw a smooth S-shaped curve through the points.
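
The sketch below builds the points to plot and estimates the median and quartiles numerically. It joins the points with straight lines (linear interpolation) as a stand-in for reading off a hand-drawn smooth curve, so answers read from a graph may differ slightly. The data and the function names are hypothetical.

```python
# Hypothetical grouped data: (lower boundary, upper boundary, frequency).
classes = [
    (0, 10, 4),
    (10, 20, 7),
    (20, 30, 6),
    (30, 40, 3),
]


def cumulative_points(classes):
    """Return (boundary, cumulative frequency) points, plotted at UPPER boundaries."""
    points = [(classes[0][0], 0)]  # start at the lowest boundary with a total of 0
    running = 0
    for lower, upper, freq in classes:
        running += freq
        points.append((upper, running))
    return points


def read_off(points, target):
    """Estimate the value at cumulative frequency `target` by linear interpolation."""
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target <= cf1 and cf1 > cf0:
            return x0 + (target - cf0) / (cf1 - cf0) * (x1 - x0)
    raise ValueError("target lies outside the cumulative frequency range")


points = cumulative_points(classes)
n = points[-1][1]                 # total frequency, here 20
median = read_off(points, n / 2)  # read off at n/2
q1 = read_off(points, n / 4)      # read off at n/4
q3 = read_off(points, 3 * n / 4)  # read off at 3n/4
iqr = q3 - q1
print(points, median, q1, q3, iqr)
```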

Box Plots

A box plot (box-and-whisker diagram) displays the five-number summary of a dataset and allows easy comparison between distributions.

Five-Number Summary
  • Minimum value
  • Lower Quartile (Q1)
  • Median (Q2)
  • Upper Quartile (Q3)
  • Maximum value

Outlier

A value that lies more than 1.5 × IQR below Q1 or above Q3. Outliers are plotted as separate crosses (×) beyond the whiskers.
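
A minimal sketch of the 1.5 × IQR rule follows. The dataset is invented, and the quartiles are supplied as inputs rather than computed, because different courses and software use slightly different quartile conventions; here they are taken as the medians of the lower and upper halves of the data. The function names are hypothetical.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) limits beyond which a value counts as an outlier."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def find_outliers(values, q1, q3):
    """Return the values lying more than 1.5 * IQR below Q1 or above Q3."""
    low, high = outlier_fences(q1, q3)
    return [v for v in values if v < low or v > high]


# Hypothetical data, already in order, with quartiles taken as given.
values = [3, 12, 14, 15, 17, 18, 20, 21, 23, 40]
q1, q3 = 14, 21

# IQR = 7, so the limits are 14 - 10.5 = 3.5 and 21 + 10.5 = 31.5.
print(outlier_fences(q1, q3))
print(find_outliers(values, q1, q3))
```

In this example 3 and 40 fall outside the limits, so they would be marked as crosses rather than included in the whiskers.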

Comparing Box Plots

When comparing two box plots, comment on:
  • Median: which group has the higher typical value?
  • IQR: which group has the greater spread in its middle 50% of values?
  • Range: which group has the greater overall spread?
  • Skewness: is each distribution symmetric or skewed?

Stem-and-Leaf Diagrams

A stem-and-leaf diagram retains the original data values while organising them in order. It is useful for small datasets.

Back-to-Back Stem-and-Leaf

Two datasets share the same stem, with leaves going left and right. This allows direct comparison of two distributions.

Exam Tip

Always include a key on a stem-and-leaf diagram. For example: '3 | 4 means 34'. Leaves must be written in order (smallest to largest away from the stem).
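
To illustrate the layout, the sketch below builds a simple stem-and-leaf diagram for a small invented set of two-digit whole numbers, using the tens digit as the stem and the units digit as the leaf. The function names are hypothetical, and the approach assumes non-negative integers.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative whole numbers into {stem: [ordered leaves]}."""
    rows = defaultdict(list)
    for v in sorted(values):  # sorting first keeps each row of leaves in order
        rows[v // 10].append(v % 10)
    return dict(rows)


def format_diagram(rows):
    """Return the diagram as text, one stem per line, followed by a key."""
    lines = []
    for stem in range(min(rows), max(rows) + 1):  # keep stems with no leaves
        leaves = " ".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())
    lines.append("Key: 3 | 4 means 34")
    return "\n".join(lines)


data = [34, 27, 41, 38, 25, 30, 49, 33, 21]
print(format_diagram(stem_and_leaf(data)))
# 2 | 1 5 7
# 3 | 0 3 4 8
# 4 | 1 9
# Key: 3 | 4 means 34
```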

Pie Charts

A pie chart shows proportions as sectors of a circle. The angle of each sector is proportional to the frequency.

Sector Angle Formula
Angle = (Frequency ÷ Total frequency) × 360°
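
As a quick check of the sector angle formula, the sketch below converts a small invented frequency table into angles. The categories, frequencies and the function name sector_angles are hypothetical.

```python
def sector_angles(frequencies):
    """Return {category: angle in degrees} using angle = frequency / total * 360."""
    total = sum(frequencies.values())
    return {label: freq * 360 / total for label, freq in frequencies.items()}


# Hypothetical survey of how 30 students travel to school.
travel = {"Walk": 12, "Bus": 9, "Car": 6, "Cycle": 3}
angles = sector_angles(travel)

print(angles)                # Walk 144, Bus 108, Car 72, Cycle 36 (degrees)
print(sum(angles.values()))  # the angles should total 360
```

Checking that the angles add up to 360° is a useful way to catch arithmetic slips.
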
Comparative Pie Charts

Two pie charts where the area of each circle is proportional to the total frequency it represents. The radius is proportional to the square root of the total frequency.

Comparative Pie Chart Radius
r₂/r₁ = √(n₂/n₁)

Where r = radius, n = total frequency

Exam Tip

For comparative pie charts, you must adjust the radius rather than drawing two pie charts of the same size. The area represents the total frequency, so area ∝ n and therefore radius ∝ √n.
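
The sketch below applies r₂/r₁ = √(n₂/n₁) to find the radius of a second pie chart. The radius and totals are invented for the example, and the function name comparative_radius is hypothetical.

```python
import math


def comparative_radius(r1, n1, n2):
    """Return r2 such that the circle areas are in the ratio n2 : n1."""
    return r1 * math.sqrt(n2 / n1)


# Hypothetical example: the first chart has radius 4 cm and represents 100 people;
# the second chart represents 225 people.
r1, n1, n2 = 4.0, 100, 225
r2 = comparative_radius(r1, n1, n2)

print(r2)               # 4 * sqrt(2.25) = 6.0
print((r2 / r1) ** 2)   # area ratio 2.25, matching n2 / n1
```

Although the second total is 2.25 times the first, the radius only grows by a factor of 1.5, because it is the area that scales with the total.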