Quartiles Overview

Quartiles are three values that divide an ordered dataset into four parts, each holding about a quarter of the observations: Q₁ (first quartile), Q₂ (second quartile, which is the median), and Q₃ (third quartile). They summarize the location and spread of the data.

Quartiles for Ungrouped Data

Formula

The formula for the i-th quartile in ungrouped data is:

$$Q_i = \text{Value of } \left(\frac{i(n+1)}{4}\right)^{th} \text{ observation, } i=1,2,3$$

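where n is the total number of observations and the data are arranged in ascending order. If the position is not a whole number, interpolate linearly between the two neighboring observations.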
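The position rule can be sketched in Python as follows. This is a minimal illustration, not a library implementation: the function name `quartile_ungrouped` and the sample values are made up for this example, and linear interpolation between neighboring observations is assumed whenever the position is fractional.

```python
import math

def quartile_ungrouped(data, i):
    """Return Q_i (i = 1, 2, 3) using the i(n+1)/4 position rule.

    Hypothetical helper for illustration; assumes linear interpolation
    when the position falls between two observations.
    """
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2, or 3")
    if not data:
        raise ValueError("data must not be empty")
    values = sorted(data)
    n = len(values)
    position = i * (n + 1) / 4          # 1-based position in the sorted data
    # For very small samples the position can fall outside 1..n; clamp it.
    if position <= 1:
        return float(values[0])
    if position >= n:
        return float(values[-1])
    lower = math.floor(position)        # whole part of the position
    fraction = position - lower         # fractional part, 0 <= fraction < 1
    low_value = values[lower - 1]       # the lower-th observation
    high_value = values[lower]          # the next observation
    return low_value + fraction * (high_value - low_value)

# Made-up sample with n = 7, so the positions are 2, 4, and 6.
sample = [14, 2, 10, 6, 12, 4, 8]
print([quartile_ungrouped(sample, i) for i in (1, 2, 3)])  # [4.0, 8.0, 12.0]
```

Clamping at the ends is a design choice for this sketch; other conventions for very small samples exist.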

Example 1: Odd Number of Observations

A random sample of 15 patients yielded the following data on the length of stay (in days) in the hospital:

5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12

Find the quartiles.

Solution:

Arrange the data in ascending order:

5, 6, 8, 9, 9, 10, 10, 10, 10, 12, 12, 13, 13, 14, 15

First Quartile (Q₁):

$$Q_1 = \text{Value of } \left(\frac{1(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (4)^{th} \text{ obs.} = 9$$

Thus, about 25% of the patients had a length of stay of at most 9 days.

Second Quartile (Q₂) - Median:

$$Q_2 = \text{Value of } \left(\frac{2(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (8)^{th} \text{ obs.} = 10$$

Thus, about 50% of the patients had a length of stay of at most 10 days.

Third Quartile (Q₃):

$$Q_3 = \text{Value of } \left(\frac{3(15+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (12)^{th} \text{ obs.} = 13$$

Thus, about 75% of the patients had a length of stay of at most 13 days.
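As a cross-check, Python's standard-library `statistics.quantiles` with `method="exclusive"` (its default) uses the same (n + 1)-based positions, so it should reproduce the hand calculation for this example. The variable names below are illustrative.

```python
from statistics import quantiles

# Length of stay (days) for the 15 patients in Example 1.
stays = [5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12]

# n=4 asks for the three cut points that split the data into quarters.
# The function sorts the data internally.
q1, q2, q3 = quantiles(stays, n=4, method="exclusive")
print(q1, q2, q3)  # 9.0 10.0 13.0
```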

Example 2: Even Number of Observations with Interpolation

The blood sugar levels (in mg/dl) of a sample of 20 patients are:

75, 89, 72, 78, 87, 85, 73, 75, 97, 87, 84, 76, 73, 79, 99, 86, 83, 76, 78, 73

Find Q₁, Q₂, and Q₃.

Solution:

Arrange the data in ascending order:

72, 73, 73, 73, 75, 75, 76, 76, 78, 78, 79, 83, 84, 85, 86, 87, 87, 89, 97, 99

First Quartile (Q₁):

$$Q_1 = \text{Value of } \left(\frac{1(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (5.25)^{th} \text{ obs.}$$

$$= 5^{th} \text{ obs.} + 0.25(6^{th} \text{ obs.} - 5^{th} \text{ obs.})$$

$$= 75 + 0.25(75 - 75) = 75 \text{ mg/dl}$$

Second Quartile (Q₂):

$$Q_2 = \text{Value of } \left(\frac{2(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (10.5)^{th} \text{ obs.}$$

$$= 10^{th} \text{ obs.} + 0.5(11^{th} \text{ obs.} - 10^{th} \text{ obs.})$$

$$= 78 + 0.5(79 - 78) = 78.5 \text{ mg/dl}$$

Third Quartile (Q₃):

$$Q_3 = \text{Value of } \left(\frac{3(20+1)}{4}\right)^{th} \text{ obs.} = \text{Value of } (15.75)^{th} \text{ obs.}$$

$$= 15^{th} \text{ obs.} + 0.75(16^{th} \text{ obs.} - 15^{th} \text{ obs.})$$

$$= 86 + 0.75(87 - 86) = 86.75 \text{ mg/dl}$$
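The interpolation step can also be traced in code. The sketch below recomputes the quartiles directly from the raw readings listed in the problem statement, splitting each position into its whole and fractional parts. The names `interpolated_quartile`, `whole`, and `frac` are illustrative and do not come from any library.

```python
# Blood sugar readings (mg/dl) for the 20 patients, as given in the problem.
readings = [75, 89, 72, 78, 87, 85, 73, 75, 97, 87,
            84, 76, 73, 79, 99, 86, 83, 76, 78, 73]

def interpolated_quartile(sorted_values, i):
    """Illustrative helper: Q_i from the i(n+1)/4 rule with linear interpolation.

    Assumes n >= 3 so that every quartile position lies inside the data.
    Returns (whole part of position, fractional part, quartile value).
    """
    n = len(sorted_values)
    # divmod by 4 splits i(n+1)/4 into a whole position and a remainder in quarters.
    whole, remainder = divmod(i * (n + 1), 4)
    frac = remainder / 4
    below = sorted_values[whole - 1]            # the whole-th observation (1-based)
    above = sorted_values[min(whole, n - 1)]    # the next observation, if there is one
    return whole, frac, below + frac * (above - below)

ordered = sorted(readings)
for i in (1, 2, 3):
    whole, frac, value = interpolated_quartile(ordered, i)
    print(f"Q{i}: position {whole + frac}, value {value} mg/dl")
```

Sorting in code rather than by hand avoids transcription slips in the ordered list, which is where manual quartile calculations most often go wrong.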

Quartiles for Grouped Data

Formula

For grouped data, quartiles are calculated using:

$$Q_i = L + \left(\frac{\frac{i \cdot N}{4} - CF}{f}\right) \times h$$

where:

  • L = Lower class boundary of the quartile class (the first class whose cumulative frequency reaches i·N/4)
  • N = Total frequency
  • CF = Cumulative frequency of all classes before the quartile class
  • f = Frequency of the quartile class
  • h = Class width
  • i = 1, 2, 3 (for Q₁, Q₂, Q₃ respectively)
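One way to read the formula, under the usual assumption that observations are spread evenly across the quartile class: the target position i·N/4 lies (i·N/4 − CF) observations into a class that holds f observations, so the quartile sits that fraction of the way across the class width h.

$$Q_i - L = \underbrace{\frac{\frac{i \cdot N}{4} - CF}{f}}_{\text{fraction of the class covered}} \times h, \qquad 0 < \frac{\frac{i \cdot N}{4} - CF}{f} \le 1$$

The bounds on the fraction follow from choosing the quartile class as the first class whose cumulative frequency reaches i·N/4, so Q_i always falls inside that class.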

Example: Grouped Data

Consider the following distribution of marks:

Class Interval    Frequency
0-10              5
10-20             12
20-30             18
30-40             10
40-50             5

Find Q₁, Q₂, and Q₃.

Solution:

Calculate cumulative frequency:

Class Interval    Frequency    Cumulative Frequency
0-10              5            5
10-20             12           17
20-30             18           35
30-40             10           45
40-50             5            50

N = 50

First Quartile (Q₁):

Position of Q₁ = N/4 = 50/4 = 12.5

The quartile class is 10-20, the first class whose cumulative frequency (17) reaches 12.5.

$$Q_1 = 10 + \left(\frac{12.5 - 5}{12}\right) \times 10 = 10 + \frac{7.5}{12} \times 10 = 10 + 6.25 = 16.25$$

Second Quartile (Q₂) - Median:

Position of Q₂ = 2N/4 = 50/2 = 25

The quartile class is 20-30, the first class whose cumulative frequency (35) reaches 25.

$$Q_2 = 20 + \left(\frac{25 - 17}{18}\right) \times 10 = 20 + \frac{8}{18} \times 10 \approx 20 + 4.44 = 24.44$$

Third Quartile (Q₃):

Position of Q₃ = 3N/4 = 150/4 = 37.5

The quartile class is 30-40, the first class whose cumulative frequency (45) reaches 37.5.

$$Q_3 = 30 + \left(\frac{37.5 - 35}{10}\right) \times 10 = 30 + \frac{2.5}{10} \times 10 = 30 + 2.5 = 32.5$$
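The grouped-data calculation above can be sketched in Python. This is a minimal illustration that assumes equal-width classes. The function name `grouped_quartile` and its parameters are hypothetical, chosen for this example.

```python
from itertools import accumulate

def grouped_quartile(lower_bounds, width, freqs, i):
    """Illustrative helper: Q_i for equal-width classes via L + ((iN/4 - CF) / f) * h."""
    total = sum(freqs)                        # N
    target = i * total / 4                    # position iN/4
    cumulative = list(accumulate(freqs))      # running totals per class
    for k, cum in enumerate(cumulative):
        if freqs[k] > 0 and cum >= target:    # first non-empty class reaching the target
            cf_before = cum - freqs[k]        # CF: cumulative frequency before this class
            return lower_bounds[k] + (target - cf_before) / freqs[k] * width
    raise ValueError("no quartile class found")

# Marks distribution from the example: classes 0-10, 10-20, ..., 40-50.
lower_bounds = [0, 10, 20, 30, 40]
freqs = [5, 12, 18, 10, 5]

for i in (1, 2, 3):
    print(f"Q{i} = {grouped_quartile(lower_bounds, 10, freqs, i):.2f}")
# Q1 = 16.25
# Q2 = 24.44
# Q3 = 32.50
```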

Interpretation of Quartiles

  • Q₁ (First Quartile): about 25% of the data fall at or below this value
  • Q₂ (Second Quartile): about 50% of the data fall at or below this value (Q₂ is the median)
  • Q₃ (Third Quartile): about 75% of the data fall at or below this value
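In small samples with repeated values, these percentages hold only approximately. The sketch below reuses the Example 1 data and its quartiles (9, 10, 13) and counts the share of observations strictly below and at or below each quartile. The helper name `shares` is illustrative.

```python
# Example 1 data (length of stay in days) and the quartiles found there.
stays = [5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12]
example_quartiles = {"Q1": 9, "Q2": 10, "Q3": 13}

def shares(data, cutoff):
    """Return (share strictly below cutoff, share at or below cutoff)."""
    n = len(data)
    below = sum(x < cutoff for x in data) / n
    at_or_below = sum(x <= cutoff for x in data) / n
    return below, at_or_below

for name, q in example_quartiles.items():
    below, at_or_below = shares(stays, q)
    print(f"{name} = {q}: {below:.0%} below, {at_or_below:.0%} at or below")
# Q1 = 9: 20% below, 33% at or below
# Q2 = 10: 33% below, 60% at or below
# Q3 = 13: 73% below, 87% at or below
```

In each row the nominal 25%, 50%, or 75% lies between the two shares, which is the sense in which the interpretation holds for discrete data with ties.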

Interquartile Range (IQR)

The interquartile range is the difference between Q₃ and Q₁:

$$IQR = Q_3 - Q_1$$

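The IQR measures the spread of the middle 50% of the data. Because it ignores the lowest and highest quarters, it is resistant to extreme values and is commonly used to flag potential outliers.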
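A short sketch of the IQR and the 1.5 × IQR rule, using the Example 1 quartiles. The fence definitions (Q₁ − 1.5 × IQR and Q₃ + 1.5 × IQR) are the commonly used convention and are an assumption here; the variable names are illustrative.

```python
# Quartiles from Example 1 (length of stay in days) and the data they came from.
q1, q3 = 9, 13
stays = [5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12]

iqr = q3 - q1                    # spread of the middle half of the data
lower_fence = q1 - 1.5 * iqr     # assumed 1.5 x IQR convention
upper_fence = q3 + 1.5 * iqr

# Observations outside the fences are flagged as potential outliers.
outliers = [x for x in stays if x < lower_fence or x > upper_fence]
print(iqr, lower_fence, upper_fence, outliers)  # 4 3.0 19.0 []
```

Here every stay lies between 3 and 19 days, so this rule flags none of them.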

When to Use Quartiles

  • Understanding data distribution and spread
  • Creating box plots and visualizations
  • Identifying outliers using the 1.5 × IQR rule
  • Comparing datasets
  • Reporting percentile-based statistics
