## Empirical Rule for ungrouped data

The empirical rule is a rule of thumb that applies to bell-shaped (symmetric) distributions. It can be stated as follows:

- about $68$% of the data will fall within one standard deviation of the mean,
- about $95$% of the data will fall within two standard deviations of the mean,
- about $99.7$% of the data will fall within three standard deviations of the mean.
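The three percentages come from the normal distribution. As a rough check, assuming the data are exactly normal, the probability of falling within $k$ standard deviations of the mean is $\operatorname{erf}(k/\sqrt{2})$. The helper name `normal_coverage` below is only illustrative.

```python
import math


def normal_coverage(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints roughly 0.6827, 0.9545 and 0.9973
    print(f"within {k} sd: {normal_coverage(k):.4f}")
```

The rule rounds these values to $68$%, $95$% and $99.7$%.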

## Formula

Let $x_1,x_2,\cdots, x_n$ be $n$ sample observations. If the distribution of $x$ is approximately bell-shaped and symmetric, then

about $68$% of the data falls in $\overline{x}\pm 1 s_x$

about $95$% of the data falls in $\overline{x}\pm 2 s_x$

about $99.7$% of the data falls in $\overline{x}\pm 3 s_x$

where

$$\overline{x}=\dfrac{1}{n}\sum_{i=1}^{n}x_i$$

is the sample mean and

$$s_x =\sqrt{\dfrac{1}{n-1}\bigg(\sum_{i=1}^{n}x_i^2-\dfrac{\big(\sum_{i=1}^n x_i\big)^2}{n}\bigg)}$$

is the sample standard deviation.
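The formulas above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. The function names `sample_mean_sd` and `empirical_intervals` are hypothetical, and the standard deviation is computed from the sum and the sum of squares, as in the formula for $s_x$.

```python
import math


def sample_mean_sd(xs):
    """Return (mean, sample standard deviation) using the sum-of-squares formula."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    total = sum(xs)                      # sum of x_i
    total_sq = sum(x * x for x in xs)    # sum of x_i^2
    mean = total / n
    variance = (total_sq - total ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)


def empirical_intervals(xs):
    """Return {k: (mean - k*sd, mean + k*sd)} for k = 1, 2, 3."""
    mean, sd = sample_mean_sd(xs)
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}
```

Calling `empirical_intervals(data)` on a list of numbers gives the three intervals in which about $68$%, $95$% and $99.7$% of the values are expected to fall.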

## Example

The following data give the hourly wage rates (in dollars) of 10 employees of a company.

20, 21, 24, 25, 18, 22, 24, 22, 20, 22.

Check the empirical rule for the given data.

### Solution

| | $x_i$ | $x_i^2$ |
|---|---|---|
| | 20 | 400 |
| | 21 | 441 |
| | 24 | 576 |
| | 25 | 625 |
| | 18 | 324 |
| | 22 | 484 |
| | 24 | 576 |
| | 22 | 484 |
| | 20 | 400 |
| | 22 | 484 |
| Total | 218 | 4794 |

**Sample mean**

The sample mean of $X$ is

```
$$
\begin{aligned}
\overline{x} &=\frac{1}{n}\sum_{i=1}^n x_i\\
&=\frac{218}{10}\\
&=21.8\text{ dollars}
\end{aligned}
$$
```

The average hourly wage rate is $21.8$ dollars.

**Sample variance**

The sample variance of $X$ is

```
$$
\begin{aligned}
s_x^2 &=\dfrac{1}{n-1}\bigg(\sum_{i=1}^{n}x_i^2-\frac{\big(\sum_{i=1}^n x_i\big)^2}{n}\bigg)\\
&=\dfrac{1}{9}\bigg(4794-\frac{(218)^2}{10}\bigg)\\
&=\dfrac{1}{9}\big(4794-\frac{47524}{10}\big)\\
&=\dfrac{1}{9}\big(4794-4752.4\big)\\
&= \frac{41.6}{9}\\
&=4.6222
\end{aligned}
$$
```

**Sample standard deviation**

The standard deviation is the positive square root of the variance.

The sample standard deviation is

```
$$
\begin{aligned}
s_x &=\sqrt{s_x^2}\\
&=\sqrt{4.6222}\\
&=2.1499 \text{ dollars}
\end{aligned}
$$
```

Thus, the standard deviation of the hourly wage rate is $2.1499$ dollars.
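The hand calculation can be cross-checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the $n-1$ divisor. This is only a quick sanity check; the variable names are illustrative.

```python
import statistics

wages = [20, 21, 24, 25, 18, 22, 24, 22, 20, 22]

mean = statistics.mean(wages)          # sample mean
variance = statistics.variance(wages)  # sample variance (divides by n - 1)
sd = statistics.stdev(wages)           # sample standard deviation

print(round(mean, 4), round(variance, 4), round(sd, 4))
# prints: 21.8 4.6222 2.1499
```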

**Empirical Rule**

About $68$% of the data is expected to fall in $\overline{x}\pm 1 s_x$, i.e., in $21.8\pm 1\times 2.1499$, which is the interval $(19.6501, 23.9499)$. Here $6$ of the $10$ observations ($20, 21, 22, 22, 20, 22$) lie in this interval, i.e., $60$% of the data values.

About $95$% of the data is expected to fall in $\overline{x}\pm 2 s_x$, i.e., in $21.8\pm 2\times 2.1499$, which is the interval $(17.5002, 26.0998)$. All $10$ observations lie in this interval, i.e., $100$% of the data values.

About $99.7$% of the data is expected to fall in $\overline{x}\pm 3 s_x$, i.e., in $21.8\pm 3\times 2.1499$, which is the interval $(15.3503, 28.2497)$. Again, all $10$ observations lie in this interval, i.e., $100$% of the data values.

With only $10$ observations, the observed percentages ($60$%, $100$%, $100$%) agree only roughly with the empirical rule.
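One way to carry out the check is to count how many observations actually fall inside each interval and compare that share with the percentage the rule suggests. The sketch below assumes the wage data from this example. The helper `coverage` is a hypothetical name, and the interval endpoints are treated as inclusive.

```python
import statistics

wages = [20, 21, 24, 25, 18, 22, 24, 22, 20, 22]
mean = statistics.mean(wages)
sd = statistics.stdev(wages)  # sample standard deviation (n - 1 divisor)


def coverage(k):
    """Fraction of observations within k sample standard deviations of the mean."""
    lo, hi = mean - k * sd, mean + k * sd
    inside = sum(lo <= x <= hi for x in wages)
    return inside / len(wages)


for k, expected in ((1, 0.68), (2, 0.95), (3, 0.997)):
    print(f"k={k}: observed {coverage(k):.0%}, rule suggests about {expected:.1%}")
```

For this data set the observed shares are $60$%, $100$% and $100$%, so the agreement with the rule is only approximate, which is not surprising for a sample of $10$ values.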