Five-Number Summary Overview
The five-number summary is a set of five statistics that provides a compact description of a dataset:
- Minimum (Min): Smallest value
- First Quartile (Q₁): 25th percentile
- Median (Q₂): 50th percentile (middle value)
- Third Quartile (Q₃): 75th percentile
- Maximum (Max): Largest value
This summary supplies exactly the values needed to draw a box plot and gives a quick view of a distribution's center, spread, and skew.
Five-Number Summary for Ungrouped Data
Calculation Steps
- Arrange data in ascending order
- Find the minimum value
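- Calculate Q₁ as the value at position (n + 1)/4, interpolating between neighboring observations if the position is fractional
- Calculate Q₂ (median) as the value at position (n + 1)/2
- Calculate Q₃ as the value at position 3(n + 1)/4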
- Find the maximum value
Formula
$$\text{Five-Number Summary} = \{\text{Min}, Q_1, Q_2, Q_3, \text{Max}\}$$
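The calculation steps above can be sketched in Python as follows. This is a minimal illustration, assuming the (n + 1)-based position rule with linear interpolation that the worked examples in this section use. The function names `value_at_position` and `five_number_summary` are hypothetical, and other quartile conventions give slightly different values.

```python
def value_at_position(sorted_data, pos):
    """Return the value at 1-based position `pos`, interpolating linearly
    between neighbors when `pos` is fractional."""
    n = len(sorted_data)
    pos = min(max(pos, 1), n)      # clamp so tiny datasets stay in range
    lower = int(pos)               # integer part of the position
    frac = pos - lower             # fractional part of the position
    if lower == n:                 # already at the last observation
        return sorted_data[-1]
    below, above = sorted_data[lower - 1], sorted_data[lower]
    return below + frac * (above - below)


def five_number_summary(data):
    """Five-number summary using the (n + 1)-based quartile positions."""
    if not data:
        raise ValueError("data must not be empty")
    xs = sorted(data)                              # step 1: ascending order
    n = len(xs)
    q1 = value_at_position(xs, (n + 1) / 4)        # first quartile
    q2 = value_at_position(xs, (n + 1) / 2)        # median
    q3 = value_at_position(xs, 3 * (n + 1) / 4)    # third quartile
    return (xs[0], q1, q2, q3, xs[-1])


# Temperature data from Example 1
temps = [72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102]
print(five_number_summary(temps))  # (72, 80.0, 90.0, 96.0, 102)
```

The quartiles come back as floats because the interpolation step always runs, even when the position is a whole number.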
Example 1: Simple Dataset
Daily temperature (in °F) for 15 days:
72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102
Find the five-number summary.
Solution:
Arranged in ascending order (already sorted):
72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102
n = 15
Minimum: 72
Q₁ (First Quartile):
$$Q_1 = \text{Value of } \left(\frac{1(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (4)^{th} \text{ obs.} = 80$$
Q₂ (Median):
$$Q_2 = \text{Value of } \left(\frac{15+1}{2}\right)^{th} \text{ obs.} = \text{Value of } (8)^{th} \text{ obs.} = 90$$
Q₃ (Third Quartile):
$$Q_3 = \text{Value of } \left(\frac{3(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (12)^{th} \text{ obs.} = 96$$
Maximum: 102
Five-Number Summary: {72, 80, 90, 96, 102}
Example 2: Even Number of Observations (Interpolated Quartiles)
Test scores of 20 students:
45, 52, 58, 62, 68, 70, 72, 75, 78, 80, 82, 85, 88, 90, 92, 95, 98, 100, 105, 112
Find the five-number summary.
Solution:
n = 20 (even)
Minimum: 45
Q₁:
$$Q_1 = \text{Value of } \left(\frac{1(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (5.25)^{th} \text{ obs.}$$
$$= 5^{th} + 0.25(6^{th} - 5^{th}) = 68 + 0.25(70 - 68) = 68.5$$
Q₂ (Median):
$$Q_2 = \text{Value of } \left(\frac{20+1}{2}\right)^{th} \text{ obs.} = \text{Value of } (10.5)^{th} \text{ obs.}$$
$$= 10^{th} + 0.5(11^{th} - 10^{th}) = 80 + 0.5(82 - 80) = 81$$
Q₃:
$$Q_3 = \text{Value of } \left(\frac{3(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (15.75)^{th} \text{ obs.}$$
$$= 15^{th} + 0.75(16^{th} - 15^{th}) = 92 + 0.75(95 - 92) = 94.25$$
Maximum: 112
Five-Number Summary: {45, 68.5, 81, 94.25, 112}
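As a cross-check, both worked examples can be reproduced with Python's standard library. The sketch below assumes Python 3.8 or later, where `statistics.quantiles` with `method="exclusive"` places the i-th quartile at position i(n + 1)/4, the same rule used above. The helper name `summary_with_statistics` is hypothetical.

```python
import statistics

temps = [72, 75, 78, 80, 82, 85, 88, 90, 92, 94, 95, 96, 98, 100, 102]
scores = [45, 52, 58, 62, 68, 70, 72, 75, 78, 80,
          82, 85, 88, 90, 92, 95, 98, 100, 105, 112]


def summary_with_statistics(data):
    """Five-number summary built on statistics.quantiles."""
    xs = sorted(data)
    # n=4 asks for the three cut points that split the data into quarters
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="exclusive")
    return (xs[0], q1, q2, q3, xs[-1])


print(summary_with_statistics(temps))   # (72, 80.0, 90.0, 96.0, 102)
print(summary_with_statistics(scores))  # (45, 68.5, 81.0, 94.25, 112)
```

The choice of method matters: `method="inclusive"` bases positions on n - 1 instead and gives Q₁ = 81 rather than 80 for the temperature data, so summaries computed with different tools may not agree exactly.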
Five-Number Summary for Grouped Data
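For grouped data, first calculate Q₁, Q₂, and Q₃ using the grouped-data quartile formulas, then take the lower boundary of the first class as the minimum and the upper boundary of the last class as the maximum. Because the individual observations are not known, all five values are estimates.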
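For reference, the solution below appears to apply the standard linear-interpolation formula for grouped quartiles. The symbols L, CF, f, and h are labels introduced here for illustration, not taken from the original text:

$$Q_i = L + \left(\frac{\frac{iN}{4} - CF}{f}\right) \times h, \quad i = 1, 2, 3$$

Here L is the lower boundary of the quartile class (the first class whose cumulative frequency reaches iN/4), CF is the cumulative frequency of all classes before it, f is the frequency of the quartile class, and h is its width.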
Example: Grouped Data
| Class Interval | Frequency |
|---|---|
| 10-20 | 4 |
| 20-30 | 8 |
| 30-40 | 12 |
| 40-50 | 10 |
| 50-60 | 6 |
Find the five-number summary.
Solution:
Cumulative Frequencies:
| Class | f | CF |
|---|---|---|
| 10-20 | 4 | 4 |
| 20-30 | 8 | 12 |
| 30-40 | 12 | 24 |
| 40-50 | 10 | 34 |
| 50-60 | 6 | 40 |
N = 40
Minimum: 10 (lower bound of first class)
Q₁: Position = N/4 = 10, Class = 20-30
$$Q_1 = 20 + \left(\frac{10 - 4}{8}\right) \times 10 = 27.5$$
Q₂ (Median): Position = N/2 = 20, Class = 30-40
$$Q_2 = 30 + \left(\frac{20 - 12}{12}\right) \times 10 = 36.67$$
Q₃: Position = 3N/4 = 30, Class = 40-50
$$Q_3 = 40 + \left(\frac{30 - 24}{10}\right) \times 10 = 46$$
Maximum: 60 (upper bound of last class)
Five-Number Summary: {10, 27.5, 36.67, 46, 60}
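The grouped calculation above can be sketched in Python. This is an illustrative version, assuming each class is given as a hypothetical `(lower, upper, frequency)` tuple and that values are spread evenly within a class; the function names are not from any library.

```python
def grouped_quartile(classes, i):
    """Estimate quartile i (1, 2, or 3) from (lower, upper, frequency) classes."""
    total = sum(freq for _, _, freq in classes)
    target = i * total / 4            # position iN/4
    cf_before = 0                     # cumulative frequency before the current class
    for lower, upper, freq in classes:
        # The quartile class is the first one whose cumulative frequency reaches the target
        if freq > 0 and cf_before + freq >= target:
            width = upper - lower
            return lower + (target - cf_before) * width / freq
        cf_before += freq
    raise ValueError("could not locate the quartile class")


def grouped_five_number_summary(classes):
    """Min and Max are taken from the outer class boundaries."""
    q1, q2, q3 = (grouped_quartile(classes, i) for i in (1, 2, 3))
    return (classes[0][0], q1, q2, q3, classes[-1][1])


classes = [(10, 20, 4), (20, 30, 8), (30, 40, 12), (40, 50, 10), (50, 60, 6)]
summary = grouped_five_number_summary(classes)
print([round(v, 2) for v in summary])  # [10, 27.5, 36.67, 46.0, 60]
```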
Box Plot Visualization
The five-number summary is visualized as a box plot.
[Figure: box plot on a horizontal axis marked Min, Q₁, Median, Q₃, Max. The box spans Q₁ to Q₃ (the IQR) with a line at the median; one whisker extends from Min to Q₁ and the other from Q₃ to Max.]
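To make the layout concrete, here is a rough text-only rendering in Python. It is a sketch under stated assumptions: the function name `ascii_boxplot` and the characters used (`|` for Min and Max, `#` for the box edges at Q₁ and Q₃, `M` for the median, `=` inside the box, `-` for whiskers) are arbitrary choices for illustration, not a standard.

```python
def ascii_boxplot(summary, width=41):
    """Render (min, q1, median, q3, max) as a single line of text."""
    lo, q1, med, q3, hi = summary
    span = hi - lo
    if span == 0:
        raise ValueError("min and max must differ")

    def col(value):
        # Map a data value to a column index between 0 and width - 1
        return round((value - lo) * (width - 1) / span)

    c_q1, c_med, c_q3 = col(q1), col(med), col(q3)
    chars = []
    for c in range(width):
        if c in (0, width - 1):
            chars.append("|")      # whisker ends: Min and Max
        elif c == c_med:
            chars.append("M")      # median line
        elif c in (c_q1, c_q3):
            chars.append("#")      # box edges: Q1 and Q3
        elif c_q1 < c < c_q3:
            chars.append("=")      # inside the box (IQR)
        else:
            chars.append("-")      # whiskers
    return "".join(chars)


# Summary from Example 1; width 31 gives one column per degree
print(ascii_boxplot((72, 80, 90, 96, 102), width=31))
```

A box that sits off-center on the line, or a median that sits off-center within the box, is the visual cue for skew.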
Interpretation Guide
| Component | Meaning |
|---|---|
| Min to Q₁ | Lowest 25% of data (spread of the lowest quartile) |
| Q₁ to Median | Second 25% of data (25th to 50th percentile) |
| Median to Q₃ | Third 25% of data (50th to 75th percentile) |
| Q₃ to Max | Highest 25% of data (spread of the highest quartile) |
| IQR = Q₃ - Q₁ | Middle 50% of data (interquartile range) |
| Range = Max - Min | Total data spread |
Statistical Insights
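- Symmetric Distribution: If (Median - Q₁) ≈ (Q₃ - Median), the middle 50% of the data is roughly symmetric about the median
- Skewed Distribution: If (Q₃ - Median) > (Median - Q₁), the data leans right-skewed (positive skew); if (Median - Q₁) > (Q₃ - Median), it leans left-skewed (negative skew)
- Outlier Detection: Values below Q₁ - 1.5(IQR) or above Q₃ + 1.5(IQR) are flagged as potential outliers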
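These checks are simple to automate. The sketch below applies them to the test scores and quartiles from Example 2; the function names `tukey_fences`, `find_outliers`, and `quartile_spreads` are hypothetical, and the multiplier k = 1.5 is the conventional default rather than a fixed rule.

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper cutoffs: Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return (q1 - k * iqr, q3 + k * iqr)


def find_outliers(data, q1, q3, k=1.5):
    """Values lying strictly outside the fences."""
    low, high = tukey_fences(q1, q3, k)
    return [x for x in data if x < low or x > high]


def quartile_spreads(q1, median, q3):
    """(Median - Q1, Q3 - Median), compared to judge symmetry or skew."""
    return (median - q1, q3 - median)


scores = [45, 52, 58, 62, 68, 70, 72, 75, 78, 80,
          82, 85, 88, 90, 92, 95, 98, 100, 105, 112]
q1, median, q3 = 68.5, 81, 94.25   # quartiles computed in Example 2

print(tukey_fences(q1, q3))               # (29.875, 132.875)
print(find_outliers(scores, q1, q3))      # []
print(quartile_spreads(q1, median, q3))   # (12.5, 13.25)
```

With these quartiles no score falls outside the fences, and the two spreads (12.5 and 13.25) are close, so the middle half of the scores is roughly symmetric.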
Applications
- Exploratory Data Analysis: Quick dataset overview
- Comparative Analysis: Comparing multiple datasets
- Outlier Detection: Identifying unusual values
- Data Quality Assessment: Understanding data spread
- Reporting Summary Statistics: Business dashboards
- Statistical Graphics: Creating box plots