Variance and standard deviation measure how spread out data is around the mean. Variance is measured in squared units, while standard deviation (the square root of variance) is in the original units, making it more interpretable.

Variance

Variance measures the average squared deviation from the mean.

Variance for Ungrouped Data

Population Variance:

σ² = Σ(xᵢ - μ)² / N

Sample Variance:

s² = Σ(xᵢ - x̄)² / (n - 1)

Example:

Dataset: 10, 15, 20, 25, 30
Mean = 20

Squared deviations: (-10)² = 100, (-5)² = 25, (0)² = 0, (5)² = 25, (10)² = 100
Sum of squared deviations = 250

Sample variance: 250 / (5-1) = 62.5
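The same arithmetic can be reproduced in a few lines of Python. This is a minimal sketch using only the standard library; the variable names are illustrative and not part of the original example.

```python
import statistics

data = [10, 15, 20, 25, 30]

# Step 1: the mean
mean = sum(data) / len(data)

# Step 2: squared deviation of every value from the mean
squared_deviations = [(x - mean) ** 2 for x in data]
sum_sq = sum(squared_deviations)

# Step 3: divide by N for a population, by n - 1 for a sample
population_variance = sum_sq / len(data)
sample_variance = sum_sq / (len(data) - 1)

print(mean)                 # 20.0
print(sum_sq)               # 250.0
print(population_variance)  # 50.0
print(sample_variance)      # 62.5

# Cross-check against the standard library
assert statistics.pvariance(data) == population_variance
assert statistics.variance(data) == sample_variance
```

Note that the same dataset gives 50 when treated as a whole population and 62.5 when treated as a sample, so it matters which formula is applied.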

Variance for Grouped Data

Formula:

s² = Σ(fᵢ(xᵢ - x̄)²) / (n - 1)

where:
fᵢ = frequency
xᵢ = class midpoint
x̄ = mean of the grouped data, Σ(fᵢxᵢ) / n
n = total frequency, Σfᵢ
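As a rough illustration of the grouped formula, the sketch below uses a small made-up frequency table (the midpoints and frequencies are hypothetical, chosen only to keep the arithmetic easy to follow). Because each class is represented by its midpoint, the result approximates the variance of the underlying raw data.

```python
# Hypothetical frequency table: class midpoints and their frequencies
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

# n is the total frequency
n = sum(frequencies)

# Grouped mean: sum of f * x divided by n
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / n

# Numerator of the formula: sum of f * (x - mean)^2
weighted_sq = sum(
    f * (x - grouped_mean) ** 2 for f, x in zip(frequencies, midpoints)
)

# Sample variance and standard deviation for grouped data
grouped_variance = weighted_sq / (n - 1)
grouped_sd = grouped_variance ** 0.5

print(n)                           # 10
print(grouped_mean)                # 16.0
print(weighted_sq)                 # 490.0
print(round(grouped_variance, 2))  # 54.44
```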

Standard Deviation

Standard deviation is the square root of variance. It’s in the same units as the original data.

Standard Deviation for Ungrouped Data

σ = √(Σ(xᵢ - μ)² / N)      [Population]
s = √(Σ(xᵢ - x̄)² / (n - 1))  [Sample]

Example (from above): s = √62.5 ≈ 7.91

Interpretation: Values typically deviate from the mean by about 7.91 units.
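A short sketch of the same calculation in Python, assuming the standard library's statistics module is acceptable for the task. It takes the square root of each variance and checks the result against the built-in helpers.

```python
import math
import statistics

data = [10, 15, 20, 25, 30]

# Standard deviation is the square root of the variance
sample_sd = math.sqrt(statistics.variance(data))       # divides by n - 1
population_sd = math.sqrt(statistics.pvariance(data))  # divides by N

print(round(sample_sd, 2))      # 7.91
print(round(population_sd, 2))  # 7.07

# The library also offers direct helpers for both versions
assert math.isclose(sample_sd, statistics.stdev(data))
assert math.isclose(population_sd, statistics.pstdev(data))
```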

Standard Deviation for Grouped Data

s = √(Σ(fᵢ(xᵢ - x̄)²) / (n - 1))

Why (n-1) for Sample Variance?

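Using (n-1) instead of n is called Bessel’s correction. It makes the sample variance an unbiased estimate of the population variance. Dividing by n tends to underestimate the population variance, because deviations are measured from the sample mean x̄ rather than the true mean μ, and x̄ is by construction the value that makes the sum of squared deviations as small as possible.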
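One way to see the effect is a small simulation. The sketch below is an illustration under stated assumptions, not a proof: it assumes a normal population with mean 0 and SD 2 (so the true variance is 4), draws many samples of size 5, and averages the two versions of the variance. The constants and variable names are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

TRUE_VARIANCE = 4.0  # population is Normal(mean=0, sd=2)
SAMPLE_SIZE = 5
TRIALS = 10_000

divide_by_n = []
divide_by_n_minus_1 = []
for _ in range(TRIALS):
    sample = [random.gauss(0, 2) for _ in range(SAMPLE_SIZE)]
    divide_by_n.append(statistics.pvariance(sample))         # denominator n
    divide_by_n_minus_1.append(statistics.variance(sample))  # denominator n - 1

mean_biased = statistics.fmean(divide_by_n)
mean_unbiased = statistics.fmean(divide_by_n_minus_1)

print(f"average with n:     {mean_biased:.2f}")
print(f"average with n - 1: {mean_unbiased:.2f}")
print(f"true variance:      {TRUE_VARIANCE:.2f}")
```

With n = 5, the version that divides by n should average close to (n - 1)/n × 4 = 3.2, while the corrected version should average close to the true value of 4.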


Interpretation

  • Small SD: Values clustered near mean (consistent data)
  • Large SD: Values spread far from mean (variable data)

Example:

  • Dataset A: 95, 100, 105 (SD ≈ 5)
  • Dataset B: 50, 100, 150 (SD ≈ 50)

Both datasets have a mean of 100, but Dataset B is much more variable: its SD is ten times larger.
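The two figures can be checked quickly with the sample standard deviation (the n - 1 version) from Python's statistics module. This is only a verification sketch; the variable names are illustrative.

```python
import statistics

dataset_a = [95, 100, 105]
dataset_b = [50, 100, 150]

# Same mean, very different spread
mean_a = statistics.mean(dataset_a)
mean_b = statistics.mean(dataset_b)
sd_a = statistics.stdev(dataset_a)
sd_b = statistics.stdev(dataset_b)

print(mean_a, mean_b)  # 100 100
print(sd_a, sd_b)      # 5.0 50.0
```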

68-95-99.7 Rule (for Normal Distribution)

  • ~68% of data within ±1 SD of the mean
  • ~95% of data within ±2 SD of the mean
  • ~99.7% of data within ±3 SD of the mean
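These percentages follow from the normal distribution's cumulative distribution function. As a minimal sketch, assuming Python 3.8 or later (where statistics.NormalDist is available), the share of data within k standard deviations of the mean can be computed directly:

```python
from statistics import NormalDist

# Standard normal: mean 0, SD 1, so "k SDs from the mean" is just k
standard_normal = NormalDist(mu=0, sigma=1)

coverage = {}
for k in (1, 2, 3):
    # Probability of landing between -k and +k
    coverage[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within ±{k} SD: {coverage[k]:.4f}")

# within ±1 SD: 0.6827
# within ±2 SD: 0.9545
# within ±3 SD: 0.9973
```

The rule applies only to data that is approximately normal; skewed or heavy-tailed data can have quite different proportions.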

When to Use

Use variance/SD when:

  • Quantifying data spread/variability
  • Comparing consistency across groups
  • Running statistical tests (a z-test assumes a known population SD; a t-test estimates it from the sample)
