Variance and standard deviation measure how spread out data is around the mean. Variance is measured in squared units, while standard deviation (the square root of variance) is in the original units, making it more interpretable.
Variance
Variance measures the average squared deviation from the mean.
Variance for Ungrouped Data
Population Variance:
σ² = Σ(xᵢ - μ)² / N
Sample Variance:
s² = Σ(xᵢ - x̄)² / (n - 1)
Example:
Dataset: 10, 15, 20, 25, 30
Mean = 20
Deviations squared: (-10)² = 100, (-5)² = 25, (0)² = 0, (5)² = 25, (10)² = 100
Sum = 250
Sample variance: 250 / (5 - 1) = 62.5
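The worked example above can be reproduced step by step in a few lines of Python. This is a minimal sketch; the variable names are illustrative and not part of the original text.

```python
data = [10, 15, 20, 25, 30]

n = len(data)
mean = sum(data) / n                            # 20.0
squared_devs = [(x - mean) ** 2 for x in data]  # [100.0, 25.0, 0.0, 25.0, 100.0]
sum_of_squares = sum(squared_devs)              # 250.0

population_variance = sum_of_squares / n        # divide by N
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1

print(population_variance)  # 50.0
print(sample_variance)      # 62.5
```

The only difference between the two results is the divisor: the same sum of squares, 250, is divided by 5 for the population formula and by 4 for the sample formula.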
Variance for Grouped Data
Formula:
s² = Σ(fᵢ(xᵢ - x̄)²) / (n - 1)
where:
fᵢ = frequency of class i
xᵢ = midpoint of class i
x̄ = mean of the grouped data, Σ(fᵢxᵢ) / n
n = total frequency, Σfᵢ
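As a rough illustration of the grouped formula, the sketch below computes the sample variance from class midpoints and frequencies. The function name `grouped_sample_variance` and the three-class frequency table are made up for this example; they do not come from the original text.

```python
def grouped_sample_variance(midpoints, frequencies):
    """Sample variance from class midpoints x_i and frequencies f_i."""
    n = sum(frequencies)  # total frequency
    mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n
    weighted_squares = sum(
        f * (x - mean) ** 2 for f, x in zip(frequencies, midpoints)
    )
    return weighted_squares / (n - 1)


# Hypothetical frequency table: three classes with midpoints 10, 20, 30
midpoints = [10, 20, 30]
frequencies = [2, 6, 2]

result = grouped_sample_variance(midpoints, frequencies)
print(round(result, 2))  # 44.44
```

Here n = 10 and the grouped mean is 20, so the weighted sum of squares is 2(100) + 6(0) + 2(100) = 400, and 400 / 9 ≈ 44.44. Because every value in a class is represented by its midpoint, the result approximates the variance of the underlying raw data.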
Standard Deviation
Standard deviation is the square root of variance. It’s in the same units as the original data.
Standard Deviation for Ungrouped Data
σ = √(Σ(xᵢ - μ)² / N) [Population]
s = √(Σ(xᵢ - x̄)² / (n - 1)) [Sample]
Example (from above): s = √62.5 ≈ 7.91
Interpretation: Values typically deviate from the mean by about 7.91 units.
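For a quick check of these numbers, Python's standard `statistics` module provides both versions: `stdev` uses the sample formula (n - 1) and `pstdev` uses the population formula (N). A minimal sketch using the dataset from the example:

```python
import statistics

data = [10, 15, 20, 25, 30]

sample_sd = statistics.stdev(data)       # square root of 250 / 4
population_sd = statistics.pstdev(data)  # square root of 250 / 5

print(round(sample_sd, 2))      # 7.91
print(round(population_sd, 2))  # 7.07
```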
Standard Deviation for Grouped Data
s = √(Σ(fᵢ(xᵢ - x̄)²) / (n - 1))
Why (n-1) for Sample Variance?
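Using (n-1) instead of n is called Bessel’s correction. It makes the sample variance an unbiased estimate of the population variance. Dividing by n instead tends to underestimate the population variance, because deviations are measured from the sample mean x̄, which always sits at least as close to the sample's own values as the true mean μ does.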
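One way to see the bias without any randomness is to enumerate every possible sample from a tiny population and average the two estimators. The sketch below treats the five values from the earlier example as the whole population and draws all ordered samples of size 2 with replacement; that setup is an assumption chosen to keep the arithmetic exact.

```python
from itertools import product

population = [10, 15, 20, 25, 30]
N = len(population)
mu = sum(population) / N
population_variance = sum((x - mu) ** 2 for x in population) / N


def variance(sample, divisor):
    """Sum of squared deviations from the sample mean, over a chosen divisor."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample) / divisor


n = 2
samples = list(product(population, repeat=n))  # all 25 ordered samples

mean_dividing_by_n = sum(variance(s, n) for s in samples) / len(samples)
mean_dividing_by_n_minus_1 = sum(variance(s, n - 1) for s in samples) / len(samples)

print(population_variance)         # 50.0
print(mean_dividing_by_n)          # 25.0
print(mean_dividing_by_n_minus_1)  # 50.0
```

Averaged over all possible samples, dividing by n gives 25, half the true variance of 50, while dividing by n - 1 recovers 50 exactly.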
Interpretation
- Small SD: Values clustered near mean (consistent data)
- Large SD: Values spread far from mean (variable data)
Example:
- Dataset A: 95, 100, 105 (SD ≈ 5)
- Dataset B: 50, 100, 150 (SD ≈ 50)
Dataset B is much more variable.
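The comparison can be checked directly. This short sketch assumes the quoted SDs are sample standard deviations (n - 1 divisor), which matches the values given.

```python
import statistics

dataset_a = [95, 100, 105]
dataset_b = [50, 100, 150]

sd_a = statistics.stdev(dataset_a)
sd_b = statistics.stdev(dataset_b)

print(sd_a)  # 5.0
print(sd_b)  # 50.0
```

Both datasets have the same mean of 100, so the tenfold difference in SD reflects spread alone.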
68-95-99.7 Rule (for Normal Distribution)
- ~68% of data within ±1 SD of the mean
- ~95% of data within ±2 SD of the mean
- ~99.7% of data within ±3 SD of the mean
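These percentages come from the normal distribution's cumulative distribution function. As a sketch, they can be recovered with `statistics.NormalDist` from the standard library (Python 3.8 or later); the dictionary name `shares` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

# Probability mass within k standard deviations of the mean
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within ±{k} SD: {share:.4f}")
# within ±1 SD: 0.6827
# within ±2 SD: 0.9545
# within ±3 SD: 0.9973
```

The rule applies only to data that is approximately normal; skewed or heavy-tailed data can have quite different proportions within each band.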
When to Use
✅ Use variance/SD when:
- Quantifying data spread/variability
- Comparing consistency across groups
- Running statistical tests (many rely on the SD: z-tests assume it is known, t-tests estimate it from the sample)
Related Articles
Alternative Dispersion Measures:
- Interquartile Range (IQR) - Robust quartile-based measure
- Mean Absolute Deviation - Average absolute deviation
- Coefficient of Variation - Relative variability
Related Concepts:
- Mean, Median, and Mode - Measures of center
- Skewness - Distribution asymmetry
- Outlier Detection - Identifying unusual values
Applications:
- Position Measures and Quantiles - Understanding data division
- Descriptive Statistics Complete Guide - Comprehensive overview