Variance and standard deviation for grouped data

Let $(x_i,f_i)$, $i=1,2, \cdots , n$, be the observed frequency distribution, where $x_i$ is the $i$-th distinct value (or class mid-value) and $f_i$ is its frequency.

Formula

Sample variance

The sample variance of $X$ is denoted by $s_x^2$ and is given by

$s_x^2 =\dfrac{1}{N-1}\sum_{i=1}^{n}f_i(x_i -\overline{x})^2$

OR

$s_x^2 =\dfrac{1}{N-1}\bigg(\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N}\bigg)$

where,

  • $N=\sum_{i=1}^n f_i$ is the total number of observations,
  • $\overline{x}$ is the sample mean.
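
The two forms are algebraically equivalent. A short derivation, assuming the grouped-data mean $\overline{x}=\frac{1}{N}\sum_{i=1}^n f_ix_i$ (so that $\sum_{i=1}^n f_ix_i = N\overline{x}$), runs as follows:

$$ \begin{aligned} \sum_{i=1}^{n}f_i(x_i -\overline{x})^2 &=\sum_{i=1}^{n}f_ix_i^2-2\overline{x}\sum_{i=1}^{n}f_ix_i+\overline{x}^2\sum_{i=1}^{n}f_i\\ &=\sum_{i=1}^{n}f_ix_i^2-2N\overline{x}^2+N\overline{x}^2\\ &=\sum_{i=1}^{n}f_ix_i^2-N\overline{x}^2\\ &=\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N} \end{aligned} $$

Dividing both sides by $N-1$ gives the second form of $s_x^2$.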

Sample standard deviation

The sample standard deviation of $X$ is the positive square root of the sample variance:

$s_x =\sqrt{s_x^2}$
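
The formulas above translate fairly directly into code. The following is a minimal Python sketch rather than a reference implementation: the function names `grouped_variance`, `grouped_variance_shortcut`, and `grouped_std` are made up for illustration, and the tiny data set in the usage lines is invented to keep the arithmetic easy to check by hand.

```python
import math

def grouped_variance(values, freqs):
    """Sample variance from the definitional form: sum f_i (x_i - mean)^2 / (N - 1)."""
    N = sum(freqs)  # total number of observations
    mean = sum(f * x for x, f in zip(values, freqs)) / N
    return sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / (N - 1)

def grouped_variance_shortcut(values, freqs):
    """Same quantity from the computational form using sum f_i x_i and sum f_i x_i^2."""
    N = sum(freqs)
    sum_fx = sum(f * x for x, f in zip(values, freqs))
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))
    return (sum_fx2 - sum_fx ** 2 / N) / (N - 1)

def grouped_std(values, freqs):
    """Sample standard deviation: positive square root of the sample variance."""
    return math.sqrt(grouped_variance(values, freqs))

# Invented illustrative data: the value 1 occurs once, 2 twice, 3 once.
values = [1, 2, 3]
freqs = [1, 2, 1]
print(grouped_variance(values, freqs))
print(grouped_variance_shortcut(values, freqs))
print(grouped_std(values, freqs))
```

For this invented data, $N=4$ and $\overline{x}=2$, so both variance functions should return $2/3$ up to floating-point rounding.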

Example 1

The following table shows the frequency distribution of the daily number of car accidents at a particular crossroad during the month of April.

| No. of car accidents ($x$) | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| No. of days ($f$) | 9 | 11 | 6 | 3 | 1 |

Calculate the variance and standard deviation of the number of car accidents.

Solution

| $x_i$ | $f_i$ | $f_ix_i$ | $f_ix_i^2$ |
|---|---|---|---|
| 2 | 9 | 18 | 36 |
| 3 | 11 | 33 | 99 |
| 4 | 6 | 24 | 96 |
| 5 | 3 | 15 | 75 |
| 6 | 1 | 6 | 36 |
| Total | 30 | 96 | 342 |
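
The working columns and their totals can be reproduced with a few lines of Python. This is only a sketch for checking the table; the variable names (`fx`, `fx2`, `sum_fx`, `sum_fx2`) are chosen here for illustration.

```python
# Data from Example 1: number of accidents (x) and number of days (f).
x = [2, 3, 4, 5, 6]
f = [9, 11, 6, 3, 1]

fx = [fi * xi for xi, fi in zip(x, f)]        # column f_i * x_i
fx2 = [fi * xi ** 2 for xi, fi in zip(x, f)]  # column f_i * x_i^2

N = sum(f)          # total number of observations
sum_fx = sum(fx)    # total of the f_i * x_i column
sum_fx2 = sum(fx2)  # total of the f_i * x_i^2 column
print(N, sum_fx, sum_fx2)  # 30 96 342
```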

Sample mean

The sample mean of $X$ is

$$ \begin{aligned} \overline{x} &=\frac{1}{N}\sum_{i=1}^n f_ix_i\\ &=\frac{96}{30}\\ &=3.2\text{ accidents } \end{aligned} $$

The average number of car accidents per day is $3.2$.

Sample variance

The sample variance of $X$ is

$$ \begin{aligned} s_x^2 &=\dfrac{1}{N-1}\bigg(\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N}\bigg)\\ &=\dfrac{1}{29}\bigg(342-\frac{(96)^2}{30}\bigg)\\ &=\dfrac{1}{29}\big(342-\frac{9216}{30}\big)\\ &=\dfrac{1}{29}\big(342-307.2\big)\\ &= \frac{34.8}{29}\\ &=1.2 \end{aligned} $$

Sample standard deviation

The standard deviation is the positive square root of the variance.

The sample standard deviation is

$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{1.2}\\ &=1.0954 \text{ accidents } \end{aligned} $$

Thus the standard deviation of the number of car accidents is $1.0954$ accidents.
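
As a cross-check, the frequency table can be expanded into the 30 individual daily counts and passed to Python's standard `statistics` module, whose `variance` and `stdev` functions use the same $N-1$ divisor. This is a sketch for verification only; the variable name `observations` is an assumption made here.

```python
import statistics

# Data from Example 1: number of accidents (x) and number of days (f).
x = [2, 3, 4, 5, 6]
f = [9, 11, 6, 3, 1]

# Expand the frequency table into the 30 individual daily counts.
observations = [xi for xi, fi in zip(x, f) for _ in range(fi)]

mean = statistics.mean(observations)
variance = statistics.variance(observations)  # sample variance, divisor N - 1
std_dev = statistics.stdev(observations)      # sample standard deviation

print(round(mean, 4), round(variance, 4), round(std_dev, 4))  # 3.2 1.2 1.0954
```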

Example 2

The table below shows the total number of man-days lost to sickness during one week’s operation of a small chemical plant.

| Days Lost | 1-3 | 4-6 | 7-9 | 10-12 | 13-15 |
|---|---|---|---|---|---|
| Frequency | 8 | 7 | 10 | 9 | 6 |

Calculate the variance and standard deviation of the number of lost days.

Solution

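| Class Interval | Class Boundaries | Mid-value ($x_i$) | Freq ($f_i$) | $f_ix_i$ | $f_ix_i^2$ |
|---|---|---|---|---|---|
| 1-3 | 0.5-3.5 | 2 | 8 | 16 | 32 |
| 4-6 | 3.5-6.5 | 5 | 7 | 35 | 175 |
| 7-9 | 6.5-9.5 | 8 | 10 | 80 | 640 |
| 10-12 | 9.5-12.5 | 11 | 9 | 99 | 1089 |
| 13-15 | 12.5-15.5 | 14 | 6 | 84 | 1176 |
| Total | | | 40 | 314 | 3112 |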
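
For class-interval data, each class is represented by its mid-value before the totals are formed. A minimal Python sketch of that step follows; the names `classes`, `sum_fx`, and `sum_fx2` are illustrative. The mid-value is taken as the average of the class limits, which gives the same result as averaging the class boundaries.

```python
# Class limits from Example 2 (days lost) and their frequencies.
classes = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15)]
f = [8, 7, 10, 9, 6]

# Mid-value of each class: average of the lower and upper class limits.
x = [(lo + hi) / 2 for lo, hi in classes]

N = sum(f)                                           # total frequency
sum_fx = sum(fi * xi for xi, fi in zip(x, f))        # total of f_i * x_i
sum_fx2 = sum(fi * xi ** 2 for xi, fi in zip(x, f))  # total of f_i * x_i^2

print(x)                   # [2.0, 5.0, 8.0, 11.0, 14.0]
print(N, sum_fx, sum_fx2)  # 40 314.0 3112.0
```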

Sample mean

The sample mean of $X$ is

$$ \begin{aligned} \overline{x} &=\frac{1}{N}\sum_{i=1}^n f_ix_i\\ &=\frac{314}{40}\\ &=7.85\text{ days } \end{aligned} $$

The average number of man-days lost is $7.85$ days.

Sample variance

The sample variance of $X$ is

$$ \begin{aligned} s_x^2 &=\dfrac{1}{N-1}\bigg(\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N}\bigg)\\ &=\dfrac{1}{39}\bigg(3112-\frac{(314)^2}{40}\bigg)\\ &=\dfrac{1}{39}\big(3112-\frac{98596}{40}\big)\\ &=\dfrac{1}{39}\big(3112-2464.9\big)\\ &= \frac{647.1}{39}\\ &=16.5923 \end{aligned} $$

Sample standard deviation

The standard deviation is the positive square root of the variance.

The sample standard deviation is

$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{16.5923}\\ &=4.0734 \text{ days } \end{aligned} $$

Thus the standard deviation of the number of man-days lost is $4.0734$ days.
