Quartiles Overview
Quartiles are three values that divide a sorted dataset into four equal parts: Q₁ (first quartile), Q₂ (second quartile, which is the median), and Q₃ (third quartile). Together they describe where the data are located and how widely they are spread.
Quartiles for Ungrouped Data
Formula
The formula for the i-th quartile in ungrouped data is:
$$Q_i = \text{Value of } \left(\frac{i(n+1)}{4}\right)^{th} \text{ observation, } i=1,2,3$$
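where n is the total number of observations and the observations are sorted in ascending order. When the position is not a whole number, interpolate linearly between the two neighbouring observations.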
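The position rule is easy to check numerically. The following is a minimal Python sketch; the function name `quartile_positions` is an illustrative choice, not part of any library, and it returns only the positions, not the quartile values themselves.

```python
def quartile_positions(n):
    """Return the 1-based positions i(n+1)/4 for i = 1, 2, 3."""
    if n < 1:
        raise ValueError("n must be a positive number of observations")
    return [i * (n + 1) / 4 for i in (1, 2, 3)]


print(quartile_positions(15))  # [4.0, 8.0, 12.0]
print(quartile_positions(20))  # [5.25, 10.5, 15.75]
```

A whole-number position points directly at one observation in the sorted data; a fractional position falls between two observations.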
Example 1: Odd Number of Observations
A random sample of 15 patients yielded the following data on the length of stay (in days) in the hospital:
5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12
Find the quartiles.
Solution:
Arrange the data in ascending order:
5, 6, 8, 9, 9, 10, 10, 10, 10, 12, 12, 13, 13, 14, 15
First Quartile (Q₁):
$$Q_1 = \text{Value of } \left(\frac{1(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (4)^{th} \text{ obs.} = 9$$
Thus, roughly 25% of the patients had a length of stay of 9 days or less.
Second Quartile (Q₂) - Median:
$$Q_2 = \text{Value of } \left(\frac{2(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (8)^{th} \text{ obs.} = 10$$
Thus, roughly 50% of the patients had a length of stay of 10 days or less.
Third Quartile (Q₃):
$$Q_3 = \text{Value of } \left(\frac{3(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (12)^{th} \text{ obs.} = 13$$
Thus, roughly 75% of the patients had a length of stay of 13 days or less.
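As a cross-check, Python's standard library provides `statistics.quantiles`, whose `method="exclusive"` option places the cut points at the same i(n+1)/4 positions. This is a sketch assuming Python 3.8 or later; the variable names are illustrative.

```python
import statistics

# Length of stay (days) for the 15 patients in Example 1
stays = [5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12]

# n=4 requests quartiles; the function sorts the data internally
q1, q2, q3 = statistics.quantiles(stays, n=4, method="exclusive")
print(q1, q2, q3)  # 9.0 10.0 13.0
```

Other tools may use a different quantile convention by default, so their results can differ slightly from the hand calculation.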
Example 2: Even Number of Observations with Interpolation
The blood sugar levels (in mg/dl) of a sample of 20 patients are:
75, 89, 72, 78, 87, 85, 73, 75, 97, 87, 84, 76, 73, 79, 99, 86, 83, 76, 78, 73
Find Q₁, Q₂, and Q₃.
Solution:
Arrange the data in ascending order:
72, 73, 73, 73, 75, 75, 76, 76, 78, 78, 79, 83, 84, 85, 86, 87, 87, 89, 97, 99
First Quartile (Q₁):
$$Q_1 = \text{Value of } \left(\frac{1(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (5.25)^{th} \text{ obs.}$$
$$= 5^{th} \text{ obs.} + 0.25(6^{th} \text{ obs.} - 5^{th} \text{ obs.})$$
$$= 75 + 0.25(75 - 75) = 75 \text{ mg/dl}$$
Second Quartile (Q₂):
$$Q_2 = \text{Value of } \left(\frac{2(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (10.5)^{th} \text{ obs.}$$
$$= 10^{th} \text{ obs.} + 0.5(11^{th} \text{ obs.} - 10^{th} \text{ obs.})$$
$$= 78 + 0.5(79 - 78) = 78.5 \text{ mg/dl}$$
Third Quartile (Q₃):
$$Q_3 = \text{Value of } \left(\frac{3(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (15.75)^{th} \text{ obs.}$$
$$= 15^{th} \text{ obs.} + 0.75(16^{th} \text{ obs.} - 15^{th} \text{ obs.})$$
$$= 86 + 0.75(87 - 86) = 86.75 \text{ mg/dl}$$
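The interpolation steps can also be traced in code. The sketch below sorts the raw readings and reports, for each quartile, the fractional position and the two neighbouring observations it interpolates between. `interpolation_steps` is a hypothetical helper name, not a library function.

```python
import math


def interpolation_steps(data, i):
    """Return (position, lower, upper, quartile) for the i-th quartile, i = 1, 2, 3."""
    values = sorted(data)
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 observations for positions to stay in range")
    position = i * (n + 1) / 4        # 1-based, possibly fractional
    k = math.floor(position)          # whole part: the k-th observation
    frac = position - k               # fractional part: interpolation weight
    lower = values[k - 1]             # k-th observation (1-based indexing)
    upper = values[min(k, n - 1)]     # (k+1)-th observation, capped at the last one
    return position, lower, upper, lower + frac * (upper - lower)


# Raw blood sugar readings (mg/dl) from Example 2
blood_sugar = [75, 89, 72, 78, 87, 85, 73, 75, 97, 87,
               84, 76, 73, 79, 99, 86, 83, 76, 78, 73]

for i in (1, 2, 3):
    position, lower, upper, q = interpolation_steps(blood_sugar, i)
    print(f"Q{i}: position {position}, between {lower} and {upper} -> {q} mg/dl")
```

Starting from the raw data rather than a hand-sorted list avoids transcription slips in the sorting step.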
Quartiles for Grouped Data
Formula
For grouped data, quartiles are calculated using:
$$Q_i = L + \left(\frac{\frac{i \cdot N}{4} - CF}{f}\right) \times h$$
where:
- L = Lower class boundary of the quartile class
- N = Total frequency
- CF = Cumulative frequency before the quartile class
- f = Frequency of the quartile class
- h = Class width
- i = 1, 2, 3 (for Q₁, Q₂, Q₃ respectively)
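The grouped-data formula translates directly into a small function. This is a sketch that assumes the quartile class has already been identified; the function and parameter names are illustrative, and the numbers in the sample call are made up purely to show the arithmetic.

```python
def grouped_quartile(lower_boundary, total, cum_before, class_freq, width, i):
    """Compute Q_i = L + ((i*N/4 - CF) / f) * h for a known quartile class."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2, or 3")
    if class_freq <= 0:
        raise ValueError("the quartile class must have a positive frequency")
    return lower_boundary + (i * total / 4 - cum_before) / class_freq * width


# Made-up values: L = 50, N = 40, CF = 5, f = 10, h = 4, i = 1
print(grouped_quartile(50, 40, 5, 10, 4, 1))  # 52.0
```

The formula assumes observations are spread evenly within the quartile class, so the result is an estimate rather than an exact data value.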
Example: Grouped Data
Consider the following distribution of marks:
| Class Interval | Frequency |
|---|---|
| 0-10 | 5 |
| 10-20 | 12 |
| 20-30 | 18 |
| 30-40 | 10 |
| 40-50 | 5 |
Find Q₁, Q₂, and Q₃.
Solution:
Calculate cumulative frequency:
| Class Interval | Frequency | Cumulative Frequency |
|---|---|---|
| 0-10 | 5 | 5 |
| 10-20 | 12 | 17 |
| 20-30 | 18 | 35 |
| 30-40 | 10 | 45 |
| 40-50 | 5 | 50 |
N = 50
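Building the cumulative frequency column and locating each quartile class can be sketched with the standard library. `quartile_class` is a hypothetical helper name; it picks the first class whose cumulative frequency is at least i·N/4.

```python
from bisect import bisect_left
from itertools import accumulate


def quartile_class(freqs, i):
    """Return (class index, target position i*N/4, cumulative frequency before the class)."""
    cum = list(accumulate(freqs))          # running totals of the frequencies
    target = i * cum[-1] / 4               # cum[-1] is the total frequency N
    idx = bisect_left(cum, target)         # first class with cumulative frequency >= target
    cf_before = cum[idx - 1] if idx > 0 else 0
    return idx, target, cf_before


classes = ["0-10", "10-20", "20-30", "30-40", "40-50"]
freqs = [5, 12, 18, 10, 5]

print(list(accumulate(freqs)))  # [5, 17, 35, 45, 50]
for i in (1, 2, 3):
    idx, target, cf_before = quartile_class(freqs, i)
    print(f"Q{i}: position {target}, class {classes[idx]}, CF before = {cf_before}")
```

The returned index, together with the class boundaries and frequencies, supplies every quantity the grouped quartile formula needs.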
First Quartile (Q₁):
Position of Q₁ = N/4 = 50/4 = 12.5
The quartile class is 10-20, the first class whose cumulative frequency (17) is at least 12.5.
$$Q_1 = 10 + \left(\frac{12.5 - 5}{12}\right) \times 10 = 10 + \frac{7.5}{12} \times 10 = 10 + 6.25 = 16.25$$
Second Quartile (Q₂) - Median:
Position of Q₂ = 2N/4 = 50/2 = 25
The quartile class is 20-30, the first class whose cumulative frequency (35) is at least 25.
$$Q_2 = 20 + \left(\frac{25 - 17}{18}\right) \times 10 = 20 + \frac{8}{18} \times 10 = 20 + 4.44 = 24.44$$
Third Quartile (Q₃):
Position of Q₃ = 3N/4 = 150/4 = 37.5
The quartile class is 30-40, the first class whose cumulative frequency (45) is at least 37.5.
$$Q_3 = 30 + \left(\frac{37.5 - 35}{10}\right) \times 10 = 30 + \frac{2.5}{10} \times 10 = 30 + 2.5 = 32.5$$
Interpretation of Quartiles
- Q₁ (First Quartile): about 25% of the data fall at or below this value
- Q₂ (Second Quartile): about 50% of the data fall at or below this value (this is the median)
- Q₃ (Third Quartile): about 75% of the data fall at or below this value
Interquartile Range (IQR)
The interquartile range is the difference between Q₃ and Q₁:
$$IQR = Q_3 - Q_1$$
The IQR represents the spread of the middle 50% of the data and is used to identify outliers.
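A short sketch of how the IQR is commonly used to flag outliers, using the 1.5 × IQR rule. The helper name `iqr_fences` is illustrative, and the multiplier 1.5 is the conventional choice rather than a fixed requirement.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (iqr, lower_fence, upper_fence) for the k x IQR outlier rule."""
    iqr = q3 - q1
    return iqr, q1 - k * iqr, q3 + k * iqr


# Length-of-stay data and quartiles (Q1 = 9, Q3 = 13) from Example 1
stays = [5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12]
iqr, low, high = iqr_fences(9, 13)

# Values outside [low, high] would be flagged as potential outliers
outliers = [x for x in stays if x < low or x > high]
print(iqr, low, high, outliers)  # 4 3.0 19.0 []
```

Here every stay lies between 3 and 19 days, so no observation is flagged.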
When to Use Quartiles
- Understanding data distribution and spread
- Creating box plots and visualizations
- Identifying outliers using the 1.5 × IQR rule
- Comparing datasets
- Reporting percentile-based statistics