Five-Number Summary Overview

The five-number summary is a set of five statistics that provides a compact description of a dataset:

  1. Minimum (Min): Smallest value
  2. First Quartile (Q₁): 25th percentile
  3. Median (Q₂): 50th percentile (middle value)
  4. Third Quartile (Q₃): 75th percentile
  5. Maximum (Max): Largest value

This summary is particularly useful for creating box plots and understanding data distribution.

Five-Number Summary for Ungrouped Data

Calculation Steps

  1. Arrange data in ascending order
  2. Find the minimum value
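  3. Calculate Q₁ as the value at position (n + 1)/4, interpolating between neighboring observations if the position is fractional
  4. Calculate Q₂ (median) as the value at position (n + 1)/2
  5. Calculate Q₃ as the value at position 3(n + 1)/4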
  6. Find the maximum value

Formula

$$\text{Five-Number Summary} = \{\text{Min},\ Q_1,\ Q_2,\ Q_3,\ \text{Max}\}$$
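The calculation steps above can be sketched in Python. This is a minimal illustration, assuming the i(n + 1)/4 position rule with linear interpolation for fractional positions, which is the rule the worked examples in this section use. The function names `value_at_position` and `five_number_summary` are hypothetical helpers introduced here, and other quartile conventions would give slightly different results.

```python
def value_at_position(sorted_data, position):
    """Return the value at a 1-based, possibly fractional, position.

    A fractional position such as 5.25 is resolved by linear interpolation
    between the 5th and 6th observations.
    """
    n = len(sorted_data)
    # Clamp so very small samples cannot index outside the data.
    position = min(max(position, 1), n)
    lower = int(position)              # whole part (1-based index)
    fraction = position - lower        # fractional part
    lower_value = sorted_data[lower - 1]
    if fraction == 0:
        return lower_value
    upper_value = sorted_data[lower]
    return lower_value + fraction * (upper_value - lower_value)


def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the i(n + 1)/4 position rule."""
    if not data:
        raise ValueError("data must contain at least one value")
    ordered = sorted(data)             # step 1: ascending order
    n = len(ordered)
    q1 = value_at_position(ordered, 1 * (n + 1) / 4)
    q2 = value_at_position(ordered, 2 * (n + 1) / 4)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return ordered[0], q1, q2, q3, ordered[-1]


temperatures = [72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102]
print(five_number_summary(temperatures))  # (72, 80, 90, 96, 102)
```

Sorting inside the function means callers do not need to pre-sort, and the clamp keeps the sketch safe for very short lists.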

Example 1: Simple Dataset

Daily temperature (in °F) for 15 days:

72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102

Find the five-number summary.

Solution:

Arranged in ascending order (already sorted):

72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102

n = 15

Minimum: 72

Q₁ (First Quartile):

$$Q_1 = \text{Value of } \left(\frac{1(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (4)^{th} \text{ obs.} = 80$$

Q₂ (Median):

$$Q_2 = \text{Value of } \left(\frac{15+1}{2}\right)^{th} \text{ obs.} = \text{Value of } (8)^{th} \text{ obs.} = 90$$

Q₃ (Third Quartile):

$$Q_3 = \text{Value of } \left(\frac{3(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (12)^{th} \text{ obs.} = 96$$

Maximum: 102

Five-Number Summary: {72, 80, 90, 96, 102}
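As a cross-check, the same result can be reproduced with Python's standard library. The sketch below assumes Python 3.8 or later, where `statistics.quantiles` is available. Its `method="exclusive"` option places the quartiles at positions i(n + 1)/4, matching the solution above, while `method="inclusive"` follows a different convention and is shown only to illustrate that quartile values depend on the rule chosen.

```python
import statistics

temperatures = [72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102]

# "exclusive" uses positions i(n + 1)/4, as in the worked solution.
q1, q2, q3 = statistics.quantiles(temperatures, n=4, method="exclusive")
summary = (min(temperatures), q1, q2, q3, max(temperatures))
print(summary)    # (72, 80.0, 90.0, 96.0, 102)

# "inclusive" is a different convention and yields different quartiles.
inclusive = statistics.quantiles(temperatures, n=4, method="inclusive")
print(inclusive)  # [81.0, 90.0, 95.5]
```

When comparing results across tools, it helps to confirm which quartile convention each one applies before treating a difference as an error.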

Example 2: Even-Sized Dataset (Interpolated Quartiles)

Test scores of 20 students:

45, 52, 58, 62, 68, 70, 72, 75, 78, 80, 82, 85, 88, 90, 92, 95, 98, 100, 105, 112

Find the five-number summary.

Solution:

n = 20 (even)

Minimum: 45

Q₁:

$$Q_1 = \text{Value of } \left(\frac{1(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (5.25)^{th} \text{ obs.}$$

$$= 5^{th} + 0.25(6^{th} - 5^{th}) = 68 + 0.25(70 - 68) = 68.5$$

Q₂ (Median):

$$Q_2 = \text{Value of } \left(\frac{20+1}{2}\right)^{th} \text{ obs.} = \text{Value of } (10.5)^{th} \text{ obs.}$$

$$= 10^{th} + 0.5(11^{th} - 10^{th}) = 80 + 0.5(82 - 80) = 81$$

Q₃:

$$Q_3 = \text{Value of } \left(\frac{3(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (15.75)^{th} \text{ obs.}$$

$$= 15^{th} + 0.75(16^{th} - 15^{th}) = 92 + 0.75(95 - 92) = 94.25$$

Maximum: 112

Five-Number Summary: {45, 68.5, 81, 94.25, 112}
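The three interpolations in this example follow one pattern, which can be written generally. The symbols p, k, d, and x₍ₖ₎ are introduced here for illustration only: p is the fractional position, k its integer part, d its fractional part, and x₍ₖ₎ the k-th observation in ascending order.

$$p = k + d, \quad 0 \le d < 1 \quad \Rightarrow \quad \text{Value at position } p = x_{(k)} + d\left(x_{(k+1)} - x_{(k)}\right)$$

For Q₁, p = 5.25 gives k = 5 and d = 0.25, so the value is 68 + 0.25(70 - 68) = 68.5, as computed above.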

Five-Number Summary for Grouped Data

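For grouped data, estimate Q₁, Q₂, and Q₃ with the grouped-data quartile formula, then take the minimum as the lower bound of the first class and the maximum as the upper bound of the last class. Because the individual observations are not available, all five values are estimates.

$$Q_i = L + \left(\frac{\frac{iN}{4} - CF}{f}\right) \times h, \quad i = 1, 2, 3$$

Here L is the lower bound of the class containing position iN/4, CF is the cumulative frequency of the classes before it, f is the frequency of that class, h is the class width, and N is the total frequency.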

Example: Grouped Data

| Class Interval | Frequency |
|----------------|-----------|
| 10-20          | 4         |
| 20-30          | 8         |
| 30-40          | 12        |
| 40-50          | 10        |
| 50-60          | 6         |

Find the five-number summary.

Solution:

Cumulative Frequencies:

| Class | f  | CF |
|-------|----|----|
| 10-20 | 4  | 4  |
| 20-30 | 8  | 12 |
| 30-40 | 12 | 24 |
| 40-50 | 10 | 34 |
| 50-60 | 6  | 40 |

N = 40

Minimum: 10 (lower bound of first class)

Q₁: Position = N/4 = 10, Class = 20-30

$$Q_1 = 20 + \left(\frac{10 - 4}{8}\right) \times 10 = 27.5$$

Q₂ (Median): Position = N/2 = 20, Class = 30-40

$$Q_2 = 30 + \left(\frac{20 - 12}{12}\right) \times 10 = 36.67$$

Q₃: Position = 3N/4 = 30, Class = 40-50

$$Q_3 = 40 + \left(\frac{30 - 24}{10}\right) \times 10 = 46$$

Maximum: 60 (upper bound of last class)

Five-Number Summary: {10, 27.5, 36.67, 46, 60}
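The grouped-data calculation above can be sketched as a short function. This is an illustration under stated assumptions: classes are supplied as (lower bound, upper bound, frequency) tuples in ascending order, the quantile class is the first one whose cumulative frequency reaches the target position, and values are spread evenly within each class. The name `grouped_quantile` is a hypothetical helper, not part of any library.

```python
def grouped_quantile(classes, fraction):
    """Estimate a quantile from grouped data by linear interpolation.

    classes  -- list of (lower_bound, upper_bound, frequency), ascending
    fraction -- 0.25 for Q1, 0.5 for the median, 0.75 for Q3
    """
    total = sum(freq for _, _, freq in classes)
    target = fraction * total          # position N/4, N/2, or 3N/4
    cumulative = 0                     # CF of the classes before this one
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= target:
            width = upper - lower
            return lower + (target - cumulative) * width / freq
        cumulative += freq
    raise ValueError("could not locate the quantile class")


classes = [(10, 20, 4), (20, 30, 8), (30, 40, 12), (40, 50, 10), (50, 60, 6)]
q1 = grouped_quantile(classes, 0.25)
q2 = grouped_quantile(classes, 0.50)
q3 = grouped_quantile(classes, 0.75)

# Min and max are taken from the outer class bounds, so they are estimates.
summary = (classes[0][0], q1, round(q2, 2), q3, classes[-1][1])
print(summary)  # (10, 27.5, 36.67, 46.0, 60)
```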

Box Plot Visualization

The five-number summary is visualized as a box plot:

     Min         Q₁      Median      Q₃          Max
      |-----------|=========|=========|-----------|
      |  Whisker  |     Box (IQR)     |  Whisker  |
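To see how a specific summary maps onto this layout, the sketch below draws a one-line text box plot scaled to the data. It is only an illustration: `ascii_boxplot` is a hypothetical helper, the marker characters are arbitrary choices, and a plotting library would normally be used for real charts.

```python
def ascii_boxplot(minimum, q1, median, q3, maximum, width=41):
    """Render a one-line text box plot that is `width` characters wide."""
    span = maximum - minimum
    if span <= 0 or width < 5:
        raise ValueError("need maximum > minimum and width >= 5")

    def column(value):
        # Map a data value to a column index between 0 and width - 1.
        return round((value - minimum) / span * (width - 1))

    c_q1, c_med, c_q3 = column(q1), column(median), column(q3)
    chars = []
    for i in range(width):
        if i == 0 or i == width - 1:
            chars.append("|")   # whisker ends (Min and Max)
        elif i == c_med:
            chars.append("M")   # median
        elif i == c_q1:
            chars.append("[")   # left edge of the box (Q1)
        elif i == c_q3:
            chars.append("]")   # right edge of the box (Q3)
        elif c_q1 < i < c_q3:
            chars.append("=")   # inside the box (IQR)
        else:
            chars.append("-")   # whiskers
    return "".join(chars)


# Summary from Example 1; width 31 makes one column equal one degree.
plot = ascii_boxplot(72, 80, 90, 96, 102, width=31)
print(plot)  # |-------[=========M=====]-----|
```

Unlike the schematic, the scaled version makes unequal quarter widths visible: here the left half of the box is wider than the right half.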

Interpretation Guide

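| Component         | Meaning                                           |
|-------------------|---------------------------------------------------|
| Min to Q₁         | Lowest 25% of data (spread of the lowest quarter) |
| Q₁ to Median      | Second 25% of data (25th to 50th percentile)      |
| Median to Q₃      | Third 25% of data (50th to 75th percentile)       |
| Q₃ to Max         | Upper 25% of data (spread of the highest quarter) |
| IQR = Q₃ - Q₁     | Spread of the middle 50% of data (interquartile range) |
| Range = Max - Min | Total data spread                                 |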
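Each row of the guide corresponds to a simple difference between two summary values. A minimal sketch, using a hypothetical `spread_measures` helper and the summary from Example 1:

```python
def spread_measures(minimum, q1, median, q3, maximum):
    """Return the spread of each quarter plus the IQR and the range."""
    return {
        "min_to_q1": q1 - minimum,
        "q1_to_median": median - q1,
        "median_to_q3": q3 - median,
        "q3_to_max": maximum - q3,
        "iqr": q3 - q1,
        "range": maximum - minimum,
    }


measures = spread_measures(72, 80, 90, 96, 102)
print(measures["iqr"], measures["range"])  # 16 30
```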

Statistical Insights

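  • Symmetric Distribution: (Median - Q₁) ≈ (Q₃ - Median) suggests the middle 50% of the data is roughly symmetric
  • Skewed Distribution: (Q₃ - Median) > (Median - Q₁) suggests right (positive) skew; (Q₃ - Median) < (Median - Q₁) suggests left (negative) skew
  • Outlier Detection: Values below Q₁ - 1.5(IQR) or above Q₃ + 1.5(IQR) are flagged as potential outliers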
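These checks can be sketched in a few lines of Python. The helpers `iqr_fences`, `find_outliers`, and `skew_direction` are hypothetical names, quartiles come from `statistics.quantiles` with `method="exclusive"` (the i(n + 1)/4 rule), and the 10% tolerance used to decide "roughly symmetric" is an arbitrary assumption for illustration, not a standard threshold.

```python
import statistics


def iqr_fences(data, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(data, k=1.5):
    """Return the values that fall outside the fences."""
    low, high = iqr_fences(data, k)
    return [x for x in data if x < low or x > high]


def skew_direction(q1, median, q3, tolerance=0.1):
    """Compare the two halves of the box; `tolerance` is a fraction of the IQR."""
    lower_half, upper_half = median - q1, q3 - median
    iqr = q3 - q1
    if iqr == 0 or abs(upper_half - lower_half) <= tolerance * iqr:
        return "roughly symmetric"
    return "right-skewed" if upper_half > lower_half else "left-skewed"


scores = [45, 52, 58, 62, 68, 70, 72, 75, 78, 80,
          82, 85, 88, 90, 92, 95, 98, 100, 105, 112]
print(iqr_fences(scores))                # (29.875, 132.875)
print(find_outliers(scores))             # []
print(skew_direction(68.5, 81, 94.25))   # roughly symmetric
```

For the 20 test scores used earlier in this section, every value lies inside the fences, so none is flagged under the 1.5(IQR) rule.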

Applications

  • Exploratory Data Analysis: Quick dataset overview
  • Comparative Analysis: Comparing multiple datasets
  • Outlier Detection: Identifying unusual values
  • Data Quality Assessment: Understanding data spread
  • Reporting Summary Statistics: Business dashboards
  • Statistical Graphics: Creating box plots