Variance and standard deviation for grouped data
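Let $(x_i,f_i), i=1,2, \cdots , n$
be the observed frequency distribution, where $x_i$ is the $i$-th observed value (or the mid-value of the $i$-th class) and $f_i$ is its frequency.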
Formula
Sample variance
The sample variance of $X$ is denoted by $s_x^2$ and is given by
$s_x^2 =\dfrac{1}{N-1}\sum_{i=1}^{n}f_i(x_i -\overline{x})^2$
OR
$s_x^2 =\dfrac{1}{N-1}\bigg(\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N}\bigg)$
where
$N=\sum_{i=1}^n f_i$
is the total number of observations and
$\overline{x}=\dfrac{1}{N}\sum_{i=1}^n f_ix_i$
is the sample mean.
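The two forms of the variance formula are equivalent. A short derivation, using only $N=\sum_{i=1}^n f_i$ and $\overline{x}=\frac{1}{N}\sum_{i=1}^n f_ix_i$, expands the squared deviations:

$$ \begin{aligned} \sum_{i=1}^{n}f_i(x_i -\overline{x})^2 &=\sum_{i=1}^{n}f_ix_i^2-2\overline{x}\sum_{i=1}^{n}f_ix_i+\overline{x}^2\sum_{i=1}^{n}f_i\\ &=\sum_{i=1}^{n}f_ix_i^2-2N\overline{x}^2+N\overline{x}^2\\ &=\sum_{i=1}^{n}f_ix_i^2-N\overline{x}^2\\ &=\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N} \end{aligned} $$

Dividing both sides by $N-1$ gives the second form. It is usually the more convenient one for hand calculation because it needs only the column totals $\sum f_ix_i$ and $\sum f_ix_i^2$.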
Sample standard deviation
The sample standard deviation of $X$ is defined as the positive square root of the sample variance and is given by
$s_x =\sqrt{s_x^2}$
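The formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation; the function name `grouped_sample_stats` and its argument names are hypothetical and chosen here only for readability. It uses the second (computational) form of the variance formula.

```python
import math

def grouped_sample_stats(values, freqs):
    """Return (mean, variance, std_dev) for a frequency distribution.

    values: the x_i (observed values or class mid-values)
    freqs:  the matching frequencies f_i
    Note: this helper is an illustrative assumption, not part of any library.
    """
    if len(values) != len(freqs):
        raise ValueError("values and freqs must have the same length")
    N = sum(freqs)                      # total number of observations
    if N < 2:
        raise ValueError("need at least two observations for N - 1")
    sum_fx = sum(f * x for x, f in zip(values, freqs))       # sum of f_i * x_i
    sum_fx2 = sum(f * x * x for x, f in zip(values, freqs))  # sum of f_i * x_i^2
    mean = sum_fx / N
    variance = (sum_fx2 - sum_fx ** 2 / N) / (N - 1)
    return mean, variance, math.sqrt(variance)
```

The divisor $N-1$ makes this the sample variance; a population variance would divide by $N$ instead.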
Example 1
The following table shows the frequency distribution of the daily number of car accidents at a particular crossroad during the month of April.
No. of car accidents ($x$) | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|
No. of days ($f$) | 9 | 11 | 6 | 3 | 1 |
Calculate the variance and standard deviation of the number of car accidents.
Solution
$x_i$ | $f_i$ | $f_ix_i$ | $f_ix_i^2$ |
---|---|---|---|
2 | 9 | 18 | 36 |
3 | 11 | 33 | 99 |
4 | 6 | 24 | 96 |
5 | 3 | 15 | 75 |
6 | 1 | 6 | 36 |
Total | 30 | 96 | 342 |
Sample mean
The sample mean of $X$ is
$$ \begin{aligned} \overline{x} &=\frac{1}{N}\sum_{i=1}^n f_ix_i\\ &=\frac{96}{30}\\ &=3.2\text{ accidents } \end{aligned} $$
The average number of car accidents per day is $3.2$.
Sample variance
The sample variance of $X$ is
$$ \begin{aligned} s_x^2 &=\dfrac{1}{N-1}\bigg(\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N}\bigg)\\ &=\dfrac{1}{29}\bigg(342-\frac{(96)^2}{30}\bigg)\\ &=\dfrac{1}{29}\bigg(342-\frac{9216}{30}\bigg)\\ &=\dfrac{1}{29}\big(342-307.2\big)\\ &= \frac{34.8}{29}\\ &=1.2 \end{aligned} $$
Sample standard deviation
The standard deviation is the positive square root of the variance.
The sample standard deviation is
$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{1.2}\\ &=1.0954 \text{ accidents } \end{aligned} $$
Thus the standard deviation of the number of car accidents is $1.0954$ accidents.
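As a cross-check, the same figures can be reproduced with a few lines of Python. This sketch uses the first (deviation) form of the variance formula rather than the computational form used in the hand calculation, so agreement between the two is a useful sanity check. The variable names are illustrative only.

```python
import math

# Data from Example 1: number of accidents (x) and number of days (f)
x = [2, 3, 4, 5, 6]
f = [9, 11, 6, 3, 1]

N = sum(f)                                           # 30 days
mean = sum(fi * xi for xi, fi in zip(x, f)) / N      # 96 / 30
# Deviation form: sum of f_i * (x_i - mean)^2, divided by N - 1
variance = sum(fi * (xi - mean) ** 2 for xi, fi in zip(x, f)) / (N - 1)
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
# prints: 3.2 1.2 1.0954
```

The output matches the sample mean of $3.2$, the sample variance of $1.2$ and the standard deviation of $1.0954$ obtained by hand.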
Example 2
The table below shows the total number of man-days lost to sickness during one week’s operation of a small chemical plant.
Days Lost | 1-3 | 4-6 | 7-9 | 10-12 | 13-15 |
---|---|---|---|---|---|
Frequency | 8 | 7 | 10 | 9 | 6 |
Calculate the variance and standard deviation of the number of lost days.
Solution
Class Interval | Class Boundaries | Mid-value ($x_i$) | Freq ($f_i$) | $f_ix_i$ | $f_ix_i^2$ |
---|---|---|---|---|---|
1-3 | 0.5-3.5 | 2 | 8 | 16 | 32 |
4-6 | 3.5-6.5 | 5 | 7 | 35 | 175 |
7-9 | 6.5-9.5 | 8 | 10 | 80 | 640 |
10-12 | 9.5-12.5 | 11 | 9 | 99 | 1089 |
13-15 | 12.5-15.5 | 14 | 6 | 84 | 1176 |
Total | | | 40 | 314 | 3112 |
Sample mean
The sample mean of $X$ is
$$ \begin{aligned} \overline{x} &=\frac{1}{N}\sum_{i=1}^n f_ix_i\\ &=\frac{314}{40}\\ &=7.85\text{ days } \end{aligned} $$
The average number of man-days lost is $7.85$ days.
Sample variance
The sample variance of $X$ is
$$ \begin{aligned} s_x^2 &=\dfrac{1}{N-1}\bigg(\sum_{i=1}^{n}f_ix_i^2-\frac{\big(\sum_{i=1}^n f_ix_i\big)^2}{N}\bigg)\\ &=\dfrac{1}{39}\bigg(3112-\frac{(314)^2}{40}\bigg)\\ &=\dfrac{1}{39}\big(3112-\frac{98596}{40}\big)\\ &=\dfrac{1}{39}\big(3112-2464.9\big)\\ &= \frac{647.1}{39}\\ &=16.5923 \end{aligned} $$
Sample standard deviation
The standard deviation is the positive square root of the variance.
The sample standard deviation is
$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{16.5923}\\ &=4.0734 \text{ days } \end{aligned} $$
Thus the standard deviation of the number of man-days lost is $4.0734$ days.
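For class-interval data, the only extra step is replacing each class by its mid-value. The following Python sketch, with illustrative variable names, derives the mid-values from the class limits of Example 2 and then applies the computational form of the variance formula.

```python
import math

# Data from Example 2: class limits (lower, upper) and frequencies
classes = [(1, 3), (4, 6), (7, 9), (10, 12), (13, 15)]
freqs = [8, 7, 10, 9, 6]

# Mid-value of each class: average of its lower and upper limits
midpoints = [(lo + hi) / 2 for lo, hi in classes]

N = sum(freqs)                                               # 40
sum_fx = sum(f * x for x, f in zip(midpoints, freqs))        # 314
sum_fx2 = sum(f * x * x for x, f in zip(midpoints, freqs))   # 3112

mean = sum_fx / N
variance = (sum_fx2 - sum_fx ** 2 / N) / (N - 1)
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 4), round(std_dev, 4))
# prints: 7.85 16.5923 4.0734
```

Because every observation in a class is treated as if it sat at the mid-value, the result is an approximation to the variance of the underlying raw data, which the table does not record.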