Mean, median, and mode are the three primary measures of central tendency. They each describe the “typical” or “average” value in a dataset, but in different ways. Understanding when to use each is critical for proper data analysis.
Mean (Average)
The mean is the sum of all values divided by the number of values. It’s the most commonly used measure of central tendency but can be affected by extreme values.
Mean for Ungrouped Data
Formula:
Mean = (x₁ + x₂ + ... + xₙ) / n
Mean = Σxᵢ / n
Example:
Dataset: 10, 15, 20, 25, 30
Mean = (10 + 15 + 20 + 25 + 30) / 5 = 100 / 5 = 20
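The same calculation can be checked in a few lines of Python. This is a minimal sketch; the variable names are illustrative, and `statistics.mean` is the standard-library function for the arithmetic mean.

```python
from statistics import mean

data = [10, 15, 20, 25, 30]

# Manual computation: sum of all values divided by the number of values
manual_mean = sum(data) / len(data)

# Standard-library equivalent, used here as a cross-check
library_mean = mean(data)

print(manual_mean, library_mean)
```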
Mean for Grouped Data (Frequency Distribution)
When data is organized into groups with frequencies:
Formula:
Mean = Σ(fᵢ × xᵢ) / Σfᵢ
where:
fᵢ = frequency of class i
xᵢ = midpoint of class i (for continuous data)
Σfᵢ = total number of observations (N)
Example:
| Class | Midpoint (x) | Frequency (f) | f × x |
|---|---|---|---|
| 10-20 | 15 | 5 | 75 |
| 20-30 | 25 | 8 | 200 |
| 30-40 | 35 | 7 | 245 |
| Total | | 20 | 520 |
Mean = 520 / 20 = 26
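Key point for continuous data: Use the class midpoint as the representative value for each group. Because the individual values inside each class are unknown, the grouped mean is an approximation of the true mean.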
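As a rough sketch, the grouped-mean formula can be written as a small Python function. The `(lower, upper, frequency)` tuple layout and the name `grouped_mean` are assumptions made for this illustration, not part of any library.

```python
# Each class is (lower bound, upper bound, frequency); this layout is an assumption
classes = [(10, 20, 5), (20, 30, 8), (30, 40, 7)]

def grouped_mean(classes):
    """Approximate the mean of a frequency table using class midpoints."""
    total_fx = 0.0  # running sum of f * x
    total_f = 0     # running sum of f (total observations N)
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2
        total_fx += freq * midpoint
        total_f += freq
    if total_f == 0:
        raise ValueError("frequency table has no observations")
    return total_fx / total_f

print(grouped_mean(classes))
```

For the three classes shown, the running sums are 520 and 20, so the function reproduces the mean of 26 from the worked example.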
Median (Middle Value)
The median is the middle value when data is arranged in order. It’s robust to outliers and useful for skewed distributions.
Median for Ungrouped Data
Process:
- Arrange data in ascending order
- If odd number of values: median = middle value
- If even number of values: median = average of two middle values
Example (odd n):
Dataset: 10, 15, 20, 25, 30
Median = 20 (middle value, 3rd position)
Example (even n):
Dataset: 10, 15, 20, 25
Median = (15 + 20) / 2 = 17.5 (average of 2nd and 3rd values)
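The odd and even cases can be handled by one short function. The sketch below uses a hypothetical helper named `manual_median` and compares it with the standard library's `statistics.median`.

```python
from statistics import median

def manual_median(values):
    """Return the middle value, or the average of the two middle values."""
    if not values:
        raise ValueError("median of empty data is undefined")
    ordered = sorted(values)  # step 1: arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]   # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even n: average of the two middle values

print(manual_median([10, 15, 20, 25, 30]))
print(manual_median([10, 15, 20, 25]))
```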
Median for Grouped Data
Formula:
Median = L + ((N/2 - CF) / f) × h
where:
L = lower boundary of median class
N = total frequency
CF = cumulative frequency before median class
f = frequency of median class
h = class width
Steps:
- Find cumulative frequencies
- Locate the median class (the first class whose cumulative frequency is ≥ N/2)
- Apply formula
Example:
| Class | Frequency | Cumulative |
|---|---|---|
| 10-20 | 5 | 5 |
| 20-30 | 8 | 13 |
| 30-40 | 7 | 20 |
| Total | 20 | |
Median class location: N/2 = 20/2 = 10
Median class is 20-30 (cumulative frequency 13 ≥ 10)
Median = 20 + ((10 - 5) / 8) × 10
Median = 20 + (5/8) × 10
Median = 20 + 6.25 = 26.25
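The three steps (accumulate frequencies, locate the median class, apply the formula) might be sketched in Python as follows. The `(lower, upper, frequency)` tuple layout and the function name `grouped_median` are assumptions for illustration, and the classes are assumed to be listed in ascending order.

```python
# Each class is (lower boundary, upper boundary, frequency); this layout is an assumption
classes = [(10, 20, 5), (20, 30, 8), (30, 40, 7)]

def grouped_median(classes):
    """Estimate the median of a frequency table by interpolating within the median class."""
    n = sum(freq for _, _, freq in classes)
    if n == 0:
        raise ValueError("frequency table has no observations")
    half = n / 2
    cumulative = 0  # CF: cumulative frequency before the current class
    for lower, upper, freq in classes:
        # The median class is the first class whose cumulative frequency reaches N/2
        if cumulative + freq >= half:
            width = upper - lower  # h: class width
            return lower + ((half - cumulative) / freq) * width
        cumulative += freq

print(grouped_median(classes))
```

With the table above, the loop stops at the 20-30 class with CF = 5, giving 20 + (5/8) × 10 = 26.25.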
Mode (Most Frequent Value)
The mode is the value that appears most frequently. For nominal (unordered categorical) data, it is the only one of the three measures that can be computed.
Mode for Ungrouped Data
Process: Simply identify the value that appears most often.
Example:
Dataset: 10, 15, 15, 20, 20, 20, 25
Mode = 20 (appears 3 times)
Types:
- Unimodal: One mode
- Bimodal: Two modes
- Multimodal: Multiple modes
- No mode: All values appear with equal frequency
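A minimal sketch of these cases in Python, using `collections.Counter` to count occurrences. The helper name `find_modes` is hypothetical, and returning an empty list when all values are tied is a convention chosen here to match the "no mode" case above.

```python
from collections import Counter

def find_modes(values):
    """Return all most-frequent values, or [] when every value is equally frequent."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    modes = sorted(v for v, c in counts.items() if c == highest)
    # If several distinct values all share the same count, report "no mode"
    if len(modes) == len(counts) and len(counts) > 1:
        return []
    return modes

print(find_modes([10, 15, 15, 20, 20, 20, 25]))  # unimodal
print(find_modes([1, 1, 2, 2, 3]))               # bimodal
print(find_modes([1, 2, 3]))                     # no mode
```

Note that the standard library's `statistics.multimode` follows a different convention: when all values are tied, it returns every value rather than an empty list.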
Mode for Grouped Data
For grouped data, we identify the modal class (class with highest frequency).
Formula (Approximate):
Mode = L + ((f₁ - f₀) / (2f₁ - f₀ - f₂)) × h
where:
L = lower boundary of modal class
f₁ = frequency of modal class
f₀ = frequency of class before modal class
f₂ = frequency of class after modal class
h = class width
Example (using the frequency table above):
Modal class: 20-30 (frequency 8 is the highest)
L = 20, f₁ = 8, f₀ = 5, f₂ = 7, h = 10
Mode = 20 + ((8 - 5) / (2×8 - 5 - 7)) × 10
Mode = 20 + (3 / 4) × 10
Mode = 27.5
Note: For grouped data, if only the modal class is known and the neighboring frequencies are not available, use the midpoint of the modal class as a rough estimate of the mode.
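The grouped-mode formula can be sketched in Python for a frequency table with classes 10-20, 20-30, and 30-40 and frequencies 5, 8, and 7. Several choices here are assumptions for illustration: the `(lower, upper, frequency)` layout, the name `grouped_mode`, equal class widths, treating a missing neighbor class as frequency 0, and falling back to the class midpoint when the denominator is zero.

```python
# Each class is (lower boundary, upper boundary, frequency); this layout is an assumption
classes = [(10, 20, 5), (20, 30, 8), (30, 40, 7)]

def grouped_mode(classes):
    """Estimate the mode of a frequency table with equal-width classes."""
    freqs = [freq for _, _, freq in classes]
    i = freqs.index(max(freqs))  # modal class: first class with the highest frequency
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before (0 if none, an assumption)
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after (0 if none, an assumption)
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        # Formula is undefined; fall back to the midpoint of the modal class
        return (lower + upper) / 2
    return lower + ((f1 - f0) / denominator) * (upper - lower)

print(grouped_mode(classes))
```

For this table the modal class is 20-30 with f₁ = 8, f₀ = 5, and f₂ = 7, so the estimate is 20 + (3/4) × 10 = 27.5.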
Comparing Mean, Median, Mode
| Characteristic | Mean | Median | Mode |
|---|---|---|---|
| Definition | Average | Middle value | Most frequent |
| Ungrouped data | Simple to calculate | Position-based | Count-based |
| Grouped data | Uses frequencies | Cumulative formula | Highest frequency class |
| Affected by outliers | Yes (sensitive) | No (robust) | No |
| Good for | Normal distributions | Skewed distributions | Categorical data |
| Best use | General-purpose summary | Data with skew or outliers | Most common category or value |
When to Use Each
Use Mean when:
- Data is approximately normally distributed
- No extreme outliers
- You need mathematical properties (algebra, further analysis)
Use Median when:
- Data is skewed or has outliers
- You want the “typical” value for highly variable data
- You are working with income, home prices, or other right-skewed data
Use Mode when:
- Data is categorical (colors, brands, preferences)
- You need to identify the most popular or most common value
- The distribution has several distinct peaks (multimodal), so a single mean or median would hide them
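To see why outliers favor the median, the short sketch below adds one extreme value to a small dataset and compares how far each measure moves. The outlier value of 1000 is hypothetical and chosen only for illustration.

```python
from statistics import mean, median

data = [10, 15, 20, 25, 30]
with_outlier = data + [1000]  # hypothetical extreme value added for illustration

# How far each measure moves when the outlier is added
mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print(f"mean:   {mean(data)} -> {mean(with_outlier):.2f}")
print(f"median: {median(data)} -> {median(with_outlier)}")
```

The mean jumps from 20 to roughly 183, while the median only moves from 20 to 22.5.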
Relationship: Mean vs Median vs Mode
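Symmetrical, unimodal distribution: Mean = Median = Mode
Right-skewed (positive skew): typically Mean > Median > Mode (the long right tail pulls the mean to the right)
Left-skewed (negative skew): typically Mean < Median < Mode (the long left tail pulls the mean to the left)
These orderings are rules of thumb for unimodal distributions and can fail for unusual or multimodal data.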
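The ordering can be checked on a small made-up dataset. The values below are invented for illustration, and the left-skewed set is simply the mirror image of the right-skewed one.

```python
from statistics import mean, median, mode

# Invented right-skewed data: most values are small, with a long tail to the right
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]
# Mirror image of the same data, which is left-skewed
left_skewed = [20 - x for x in right_skewed]

for name, values in [("right-skewed", right_skewed), ("left-skewed", left_skewed)]:
    print(name, "mean:", mean(values), "median:", median(values), "mode:", mode(values))
```

For the right-skewed set the mean, median, and mode are 5, 3, and 2; for the mirrored set they are 15, 17, and 18.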
Quick Reference
Ungrouped Data Formulas
Mean = Σx / n
Median = middle value(s) when sorted
Mode = most frequent value
Grouped Data Formulas
Mean = Σ(f × x) / Σf
Median = L + ((N/2 - CF) / f) × h
Mode = L + ((f₁ - f₀) / (2f₁ - f₀ - f₂)) × h