Outliers for ungrouped data

An outlier is any observation that lies more than $1.5(IQR)$ below the first quartile ($Q_1$) or more than $1.5(IQR)$ above the third quartile ($Q_3$).

Formula

$x$ is an outlier if $x$ is below $Q_1 -1.5\times IQR$ or above $Q_3+1.5\times IQR$,

where,

  • $Q_1$ is the first quartile,
  • $Q_3$ is the third quartile,
  • $IQR = Q_3-Q_1$ is the inter-quartile range.

$Q_i =$ Value of $\bigg(\dfrac{i(n+1)}{4}\bigg)^{th}$ observation, $i=1,2,3$

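where $n$ is the total number of observations and the observations are arranged in ascending order. If the position is not a whole number, the value is found by linear interpolation between the two neighbouring observations.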
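The position rule above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function name `quartile` is hypothetical, and the refusal to handle fewer than three observations is an assumption, since the text does not say what to do when the position falls outside the data.

```python
import math

def quartile(data, i):
    """Return Q_i (i = 1, 2, 3) as the value of the (i(n+1)/4)-th observation."""
    values = sorted(data)              # the rule applies to ascending order
    n = len(values)
    if n < 3:
        # Assumption: with fewer than 3 values the position can fall outside 1..n.
        raise ValueError("need at least 3 observations")
    pos = i * (n + 1) / 4              # 1-based position, possibly fractional
    whole = math.floor(pos)            # whole part of the position
    frac = pos - whole                 # fractional part of the position
    base = values[whole - 1]           # value of the whole-th observation
    if frac == 0:
        return base
    # Linear interpolation between the whole-th and (whole+1)-th observations.
    return base + frac * (values[whole] - base)

sample = [1, 2, 3, 4, 5, 6, 7, 8]
print(quartile(sample, 1))  # 2.25 (position 2.25)
print(quartile(sample, 3))  # 6.75 (position 6.75)
```

The small `sample` list is made up purely to show a fractional position; it is not data from the examples.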

Example 1

The following data give the hourly wages (in dollars) of a sample of 15 workers in a company:

20, 21, 24, 23, 25, 12, 22, 34, 24, 22, 20, 22, 19, 22, 23.

Check whether any outlier exists in the data.

Solution

The formula for $i^{th}$ quartile is

$Q_i =$ Value of $\bigg(\dfrac{i(n+1)}{4}\bigg)^{th}$ observation, $i=1,2,3$

where $n$ is the total number of observations.

Arrange the data in ascending order

12, 19, 20, 20, 21, 22, 22, 22, 22, 23, 23, 24, 24, 25, 34

First Quartile $Q_1$

The first quartile $Q_1$ can be computed as follows:

$$ \begin{aligned} Q_1 &=\text{Value of }\bigg(\dfrac{1(n+1)}{4}\bigg)^{th} \text{ observation}\\ &=\text{Value of }\bigg(\dfrac{1(15+1)}{4}\bigg)^{th} \text{ observation}\\ &= \text{ Value of }\big(4\big)^{th} \text{ observation}\\ &=20 \end{aligned} $$

Third Quartile $Q_3$

The third quartile $Q_3$ can be computed as follows:

$$ \begin{aligned} Q_3 &=\text{Value of }\bigg(\dfrac{3(n+1)}{4}\bigg)^{th} \text{ observation}\\ &=\text{Value of }\bigg(\dfrac{3(15+1)}{4}\bigg)^{th} \text{ observation}\\ &= \text{Value of }\big(12\big)^{th} \text{ observation}\\ &=24 \end{aligned} $$

Inter-quartile range

$$ \begin{aligned} IQR & = Q_3 - Q_1\\ &= 24 - 20\\ & = 4. \end{aligned} $$

$Q_1-1.5\times IQR = 20 - 6 = 14$ and $Q_3+1.5\times IQR = 24 + 6 = 30$.

The observation $12$ is less than $14$ and the observation $34$ is greater than $30$.

Thus the outliers are $12, 34$.
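As a cross-check, the same result can be reproduced with Python's standard library. `statistics.quantiles` with `method="exclusive"` places the quartiles at positions $i(n+1)/4$ of the sorted data, which matches the formula used here. The helper name `iqr_outliers` and its parameter `k` are hypothetical, and the list holds the 15 sorted wages from the solution.

```python
from statistics import quantiles

def iqr_outliers(data, k=1.5):
    """Return (lower_limit, upper_limit, outliers) for the k * IQR rule."""
    # method="exclusive" puts Q_i at position i(n+1)/4 in the sorted data.
    q1, _, q3 = quantiles(data, n=4, method="exclusive")
    iqr = q3 - q1
    lower_limit = q1 - k * iqr
    upper_limit = q3 + k * iqr
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return lower_limit, upper_limit, outliers

wages = [12, 19, 20, 20, 21, 22, 22, 22, 22, 23, 23, 24, 24, 25, 34]
print(iqr_outliers(wages))  # (14.0, 30.0, [12, 34])
```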

Example 2

The blood sugar levels (in mg/dl) of a sample of 20 patients admitted to a hospital are as follows:

75, 89, 72, 78, 87, 85, 73, 75, 97, 87, 84, 76, 73, 79, 99, 86, 83, 76, 78, 73.

Check whether any outlier exists in the data.

Solution

The formula for $i^{th}$ quartile is

$Q_i =$ Value of $\bigg(\dfrac{i(n+1)}{4}\bigg)^{th}$ observation, $i=1,2,3$

where $n$ is the total number of observations.

Arrange the data in ascending order

72, 73, 73, 73, 75, 75, 76, 76, 78, 78, 79, 83, 84, 85, 86, 87, 87, 89, 97, 99

First Quartile $Q_1$

The first quartile $Q_1$ can be computed as follows:

$$ \begin{aligned} Q_1 &=\text{Value of }\bigg(\dfrac{1(n+1)}{4}\bigg)^{th} \text{ observation}\\ &=\text{Value of }\bigg(\dfrac{1(20+1)}{4}\bigg)^{th} \text{ observation}\\ &= \text{ Value of }\big(5.25\big)^{th} \text{ observation}\\ &= \text{Value of }\big(5\big)^{th} \text{ obs.}+0.25 \big(\text{Value of } \big(6\big)^{th}\text{ obs.}-\text{Value of }\big(5\big)^{th} \text{ obs.}\big)\\ &=75+0.25\big(75 -75\big)\\ &=75 \end{aligned} $$

Third Quartile $Q_3$

The third quartile $Q_3$ can be computed as follows:

$$ \begin{aligned} Q_3 &=\text{Value of }\bigg(\dfrac{3(n+1)}{4}\bigg)^{th} \text{ observation}\\ &=\text{Value of }\bigg(\dfrac{3(20+1)}{4}\bigg)^{th} \text{ observation}\\ &= \text{Value of }\big(15.75\big)^{th} \text{ observation}\\ &= \text{Value of }\big(15\big)^{th} \text{ obs.}+0.75 \big(\text{Value of } \big(16\big)^{th}\text{ obs.}-\text{Value of }\big(15\big)^{th} \text{ obs.}\big)\\ &=86+0.75\big(87 -86\big)\\ &=86.75 \end{aligned} $$

Inter-quartile range

$$ \begin{aligned} IQR & = Q_3 - Q_1\\ &= 86.75 - 75\\ & = 11.75. \end{aligned} $$

$Q_1-1.5\times IQR = 75 - 17.625 = 57.375$ and $Q_3+1.5\times IQR = 86.75 + 17.625 = 104.375$.

All the observations lie between these two limits.

So there are no outliers in the data.
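The fractional positions in this example can also be traced in code. The sketch below splits each position into its whole and fractional parts and interpolates, using the 20 readings as listed in the problem statement. The helper name `value_at` is hypothetical.

```python
def value_at(sorted_values, position):
    """Interpolate the value at a 1-based, possibly fractional position."""
    whole = int(position)              # e.g. 5 for position 5.25
    frac = position - whole            # e.g. 0.25 for position 5.25
    base = sorted_values[whole - 1]    # value of the whole-th observation
    if frac == 0:
        return base
    return base + frac * (sorted_values[whole] - base)

readings = [75, 89, 72, 78, 87, 85, 73, 75, 97, 87,
            84, 76, 73, 79, 99, 86, 83, 76, 78, 73]
ordered = sorted(readings)
n = len(ordered)

q1 = value_at(ordered, 1 * (n + 1) / 4)   # position 5.25
q3 = value_at(ordered, 3 * (n + 1) / 4)   # position 15.75
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr

outliers = [x for x in ordered if x < lower_limit or x > upper_limit]
print(outliers)  # []
```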
