Empirical Rule Overview

The empirical rule, also known as the 68-95-99.7 rule, states that for a normal (bell-shaped) distribution:

  • About 68% of the data falls within 1 standard deviation (σ) of the mean (μ)
  • About 95% of the data falls within 2 standard deviations of the mean
  • About 99.7% of the data falls within 3 standard deviations of the mean

This rule applies only to approximately normal distributions.
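
The 68-95-99.7 figures are rounded values of normal-distribution probabilities. As a minimal sketch, the snippet below recomputes them with Python's standard-library `statistics.NormalDist`; the helper name `coverage_within` is made up for illustration.

```python
from statistics import NormalDist


def coverage_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    # Area under the standard normal curve between -k and +k.
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage_within(k):.4f}")
```

The unrounded values are roughly 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.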

The 68-95-99.7 Rule

Formulas

$$\text{68\% of data: } \mu \pm 1\sigma$$

$$\text{95\% of data: } \mu \pm 2\sigma$$

$$\text{99.7\% of data: } \mu \pm 3\sigma$$

Visual Representation

                                     |
                                   __|__
                                  /     \
          |--------|--------|--------|--------|--------|--------|
        μ-3σ     μ-2σ      μ-σ       μ       μ+σ     μ+2σ     μ+3σ

68%:                        |-----------------|
95%:               |-----------------------------------|
99.7%:    |-----------------------------------------------------|

Empirical Rule for Ungrouped Data

Example 1: Test Scores

A class of 200 students took a test with:

  • Mean (μ) = 75
  • Standard Deviation (σ) = 8

Estimate the number of students in each range.

Solution:

Within 1 Standard Deviation (μ ± σ):

Range: 75 ± 8 = [67, 83]

Expected percentage: 68%

Expected count: 200 × 0.68 = 136 students

Approximately 136 students scored between 67 and 83.

Within 2 Standard Deviations (μ ± 2σ):

Range: 75 ± 16 = [59, 91]

Expected percentage: 95%

Expected count: 200 × 0.95 = 190 students

Approximately 190 students scored between 59 and 91.

Within 3 Standard Deviations (μ ± 3σ):

Range: 75 ± 24 = [51, 99]

Expected percentage: 99.7%

Expected count: 200 × 0.997 ≈ 199 students

Approximately 199 students scored between 51 and 99.
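
The arithmetic in Example 1 can be reproduced with a short sketch. The function name `expected_counts` is hypothetical, and the 0.68, 0.95, and 0.997 shares are the rounded rule values, not exact probabilities.

```python
def expected_counts(n, mu, sigma):
    """Return (k, low, high, expected_count) for k = 1, 2, 3 standard deviations."""
    rule = {1: 0.68, 2: 0.95, 3: 0.997}  # rounded empirical-rule shares
    rows = []
    for k, share in rule.items():
        low, high = mu - k * sigma, mu + k * sigma
        rows.append((k, low, high, n * share))
    return rows


# Values from Example 1: 200 students, mean 75, standard deviation 8.
for k, low, high, count in expected_counts(200, 75, 8):
    print(f"mu ± {k} sigma: [{low}, {high}] -> about {count:.1f} students")
```

The last count comes out as 199.4, which the example rounds to 199 students.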

Example 2: Heights Distribution

Heights of adult males are normally distributed with:

  • Mean (μ) = 70 inches
  • Standard Deviation (σ) = 2.5 inches

Find the percentage of men with heights in various ranges.

Solution:

Height between 65 and 75 inches (μ ± 2σ):

Range: 70 ± 5 = [65, 75]

Percentage: 95%

Height between 67.5 and 72.5 inches (μ ± 1σ):

Range: 70 ± 2.5 = [67.5, 72.5]

Percentage: 68%

Height less than 67.5 inches:

This is below μ - 1σ

Percentage: (100% - 68%) / 2 = 16%

Height between 72.5 and 75 inches:

From μ + 1σ to μ + 2σ

Percentage: (95% - 68%) / 2 = 13.5%
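
As a rough cross-check of the last two answers, the sketch below computes the same shares from the normal curve itself, assuming heights follow exactly the stated normal distribution. The helper `share_between` is an illustrative name, not part of any library.

```python
from statistics import NormalDist

# Parameters from Example 2: mean 70 inches, standard deviation 2.5 inches.
heights = NormalDist(mu=70, sigma=2.5)


def share_between(low, high):
    """Share of the distribution between two heights."""
    return heights.cdf(high) - heights.cdf(low)


below_67_5 = heights.cdf(67.5)                 # rule of thumb: 16%
between_72_5_and_75 = share_between(72.5, 75)  # rule of thumb: 13.5%

print(f"below 67.5 in: {below_67_5:.1%}")
print(f"72.5 to 75 in: {between_72_5_and_75:.1%}")
```

The exact shares are about 15.9% and 13.6%, so the symmetry shortcuts in the example are accurate to within a fraction of a percentage point.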

Empirical Rule for Grouped Data

For grouped data that is approximately normally distributed:

  1. Calculate the mean (μ) and standard deviation (σ)
  2. Determine the class intervals corresponding to μ ± 1σ, μ ± 2σ, μ ± 3σ
  3. Count the frequency in each interval
  4. Compare with expected percentages
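
Step 1 can be sketched as follows, under the common assumption that every observation in a class sits at the class midpoint. The function name `grouped_mean_sd` is hypothetical; the frequency table is the one used in the example that follows.

```python
from math import sqrt


def grouped_mean_sd(classes):
    """classes: list of (lower, upper, frequency). Returns (n, mean, sd)."""
    n = sum(freq for _, _, freq in classes)
    # Assumption: each class is represented by its midpoint.
    mids = [((lower + upper) / 2, freq) for lower, upper, freq in classes]
    mean = sum(mid * freq for mid, freq in mids) / n
    variance = sum(freq * (mid - mean) ** 2 for mid, freq in mids) / n  # population variance
    return n, mean, sqrt(variance)


table = [(50, 60, 3), (60, 70, 12), (70, 80, 25), (80, 90, 32),
         (90, 100, 22), (100, 110, 5), (110, 120, 1)]
n, mean, sd = grouped_mean_sd(table)
print(n, round(mean, 1), round(sd, 1))
```

For this table the midpoint estimates are about 82.7 and 12.1, so the μ = 80 and σ = 10 quoted in the example are best read as given round values, not as figures computed from the table.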

Example: Grouped Data

Class Interval    Frequency
50-60             3
60-70             12
70-80             25
80-90             32
90-100            22
100-110           5
110-120           1

N = 100. Take μ = 80 and σ = 10 as given.

Verify the empirical rule.

Solution:

Within 1σ (70-90):

Frequency: 25 + 32 = 57

Percentage: 57% (about 11 points below the expected 68%)

Within 2σ (60-100):

Frequency: 12 + 25 + 32 + 22 = 91

Percentage: 91% (close to 95%)

Within 3σ (50-110):

Frequency: 3 + 12 + 25 + 32 + 22 + 5 = 99

Percentage: 99% (close to 99.7%)

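The 2σ and 3σ proportions are close to the rule's values, while the 1σ proportion (57%) falls noticeably short of 68%. The data is therefore only roughly consistent with a normal distribution; a check like this suggests approximate normality but does not confirm it.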
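
The counting in this example can be sketched as follows. The function `share_within` is a hypothetical helper, and it assumes the μ ± kσ limits fall exactly on class boundaries, as they do here; a class that straddles a limit would simply be left out of the count.

```python
def share_within(classes, mu, sigma, k):
    """Share of observations in classes lying entirely inside mu ± k*sigma."""
    low, high = mu - k * sigma, mu + k * sigma
    total = sum(freq for _, _, freq in classes)
    inside = sum(freq for lower, upper, freq in classes
                 if lower >= low and upper <= high)
    return inside / total


table = [(50, 60, 3), (60, 70, 12), (70, 80, 25), (80, 90, 32),
         (90, 100, 22), (100, 110, 5), (110, 120, 1)]
expected = {1: 0.68, 2: 0.95, 3: 0.997}  # rounded empirical-rule shares

for k in (1, 2, 3):
    observed = share_within(table, mu=80, sigma=10, k=k)
    print(f"within {k} sigma: observed {observed:.0%}, expected {expected[k]:.1%}")
```

The observed shares are 57%, 91%, and 99%, matching the hand count above.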

Detailed Distribution Breakdown

Range               Percentage   Count (if n=100)
μ - 3σ to μ - 2σ    2.35%        2-3
μ - 2σ to μ - 1σ    13.5%        13-14
μ - 1σ to μ         34%          34
μ to μ + 1σ         34%          34
μ + 1σ to μ + 2σ    13.5%        13-14
μ + 2σ to μ + 3σ    2.35%        2-3

The outer bands follow from the rule by symmetry: (99.7% - 95%) / 2 = 2.35%.
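
A band-by-band breakdown can be derived from the three rounded rule figures by subtraction and symmetry. The sketch below does that arithmetic; `band_shares` is an illustrative name. Note that deriving from the rounded figures gives 2.35% for the outermost bands, whereas the exact normal value is about 2.14%, so published tables may quote a slightly different number.

```python
def band_shares(within_1=68.0, within_2=95.0, within_3=99.7):
    """Percentage of data in each band on ONE side of the mean."""
    return {
        "mu to mu+1sigma": within_1 / 2,
        "mu+1sigma to mu+2sigma": (within_2 - within_1) / 2,
        "mu+2sigma to mu+3sigma": (within_3 - within_2) / 2,
        "beyond mu+3sigma": (100.0 - within_3) / 2,
    }


n = 100  # sample size used for the expected counts
for band, pct in band_shares().items():
    print(f"{band}: {pct:.2f}% -> about {n * pct / 100:.2f} of {n} observations")
```

By symmetry the same percentages apply to the matching bands below the mean.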

Outlier Detection Using Empirical Rule

Values beyond μ ± 3σ can be considered extreme outliers:

  • Beyond μ + 3σ: Extremely high values (0.15% chance)
  • Beyond μ - 3σ: Extremely low values (0.15% chance)

This provides a rule-of-thumb for identifying unusual observations.
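
A minimal sketch of this rule of thumb, assuming μ and σ are already known (here the values from Example 1). The function name `flag_outliers` and the score list are hypothetical.

```python
def flag_outliers(values, mu, sigma, k=3.0):
    """Return the values lying strictly more than k standard deviations from the mean."""
    return [x for x in values if abs(x - mu) > k * sigma]


# Hypothetical scores checked against mu = 75, sigma = 8, i.e. the range [51, 99].
scores = [72, 80, 50, 101, 99, 51, 45]
print(flag_outliers(scores, mu=75, sigma=8))
```

Values exactly on the μ ± 3σ limits (51 and 99 here) are not flagged, since the comparison is strict. If μ and σ are instead estimated from a small sample that contains the outlier, the outlier inflates σ and may escape detection.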

Conditions for Using the Empirical Rule

  • Data must be approximately normally distributed
  • Data should be roughly bell-shaped
  • Mean and median should be approximately equal
  • Data should not be heavily skewed
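
These conditions can be screened informally before applying the rule. The sketch below compares the mean with the median and computes a moment-based skewness coefficient; `normality_screen` is a hypothetical helper and is not a formal normality test.

```python
from statistics import mean, median, pstdev


def normality_screen(data):
    """Return mean, median, and population skewness for a quick visual check."""
    m = mean(data)
    sd = pstdev(data)
    if sd == 0:
        raise ValueError("all values are identical; skewness is undefined")
    # Third standardized moment: near 0 for symmetric data.
    skew = sum((x - m) ** 3 for x in data) / (len(data) * sd ** 3)
    return {"mean": m, "median": median(data), "skewness": skew}


symmetric = [2, 4, 4, 5, 5, 5, 6, 6, 8]   # hypothetical, symmetric around 5
skewed = [1, 1, 2, 2, 3, 20]              # hypothetical, long right tail

print(normality_screen(symmetric))
print(normality_screen(skewed))
```

A mean well above the median together with a large positive skewness, as in the second list, suggests the empirical rule should not be relied on for that data.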

Applications

  • Quality Control: Identifying defects outside acceptable ranges
  • Risk Assessment: Determining probability of extreme events
  • Academic Performance: Grading and benchmarking
  • Manufacturing: Process capability analysis
  • Investment Analysis: Portfolio risk assessment
  • Medical Testing: Reference ranges for health metrics

Limitation

The empirical rule is specifically for normal distributions. For non-normal distributions, use Chebyshev’s theorem instead, which provides more conservative estimates valid for any distribution.
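
Chebyshev's theorem guarantees that at least 1 - 1/k² of the data lies within k standard deviations of the mean, for any k > 1. The sketch below compares that guarantee with the empirical-rule figures; `chebyshev_lower_bound` is an illustrative name.

```python
def chebyshev_lower_bound(k):
    """Minimum share of data within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k ** 2


empirical = {2: 0.95, 3: 0.997}  # rounded empirical-rule shares for normal data
for k, normal_share in empirical.items():
    bound = chebyshev_lower_bound(k)
    print(f"k={k}: Chebyshev guarantees at least {bound:.1%}, "
          f"empirical rule gives about {normal_share:.1%}")
```

For k = 2 and k = 3 the guarantees are 75% and about 88.9%, noticeably weaker than 95% and 99.7%, which is the price of holding for any distribution.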