## Bowley's Coefficient of Skewness for grouped data

Skewness measures the asymmetry of a distribution; the word literally means “lack of symmetry”. It indicates whether the data values are concentrated more on the lower or the higher side of the central value.

For a symmetric distribution, the two quartiles $Q_1$ and $Q_3$ are equidistant from the median ($Q_2$). That is, for a symmetric distribution, $Q_3 - Q_2 = Q_2 - Q_1$.

If the distribution is not symmetric (i.e., skewed), the distance $Q_3-Q_2$ is not equal to the distance $Q_2-Q_1$. That is, for an asymmetric distribution, $Q_3-Q_2\neq Q_2-Q_1$.

The absolute measure of skewness is $(Q_3-Q_2)-(Q_2-Q_1)= Q_3+Q_1-2Q_2$.

## Formula

Bowley's coefficient of skewness is the relative measure of skewness. It is denoted by $S_b$ and is defined as

`$S_b = \dfrac{Q_3+Q_1 - 2Q_2}{Q_3 -Q_1}$`

The formula for the $i^{th}$ quartile of grouped data is

`$$ \begin{aligned} Q_i=l + \bigg(\frac{\frac{iN}{4} - F_<}{f}\bigg)\times h; \quad i=1,2,3 \end{aligned} $$`

where,

- $l :$ the lower limit (class boundary) of the $i^{th}$ quartile class, i.e., the first class whose cumulative frequency is greater than or equal to $\frac{iN}{4}$
- $N=\sum f :$ total number of observations
- $f :$ frequency of the $i^{th}$ quartile class
- $F_< :$ cumulative frequency of the class preceding the $i^{th}$ quartile class
- $h :$ the class width
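The quartile formula above can be sketched in Python. This is a minimal illustration, assuming equal-width classes given by their lower boundaries; the function name `grouped_quartile` and the sample classes are hypothetical and not part of the original text.

```python
from itertools import accumulate

def grouped_quartile(i, lower_bounds, freqs, width):
    """Return Q_i (i = 1, 2, 3) for equal-width classes by linear interpolation."""
    n = sum(freqs)                          # N, total number of observations
    target = i * n / 4                      # position iN/4
    cum = list(accumulate(freqs))           # cumulative frequencies
    # Quartile class: first class whose cumulative frequency reaches the target.
    k = next(j for j, c in enumerate(cum) if c >= target)
    prev_cum = cum[k - 1] if k > 0 else 0   # F_<, cumulative frequency before class k
    return lower_bounds[k] + (target - prev_cum) / freqs[k] * width

# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with frequencies 4, 6, 8, 2
bounds = [0, 10, 20, 30]
freqs = [4, 6, 8, 2]
for i in (1, 2, 3):
    print(f"Q{i} = {grouped_quartile(i, bounds, freqs, 10):.4f}")
```

Passing the lower class boundaries rather than the stated class limits matters when the classes are of the inclusive type.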

## Types of Skewness

- If $S_b<0$, i.e., $Q_3-Q_2<Q_2-Q_1$, then the distribution is **negatively skewed**.
- If $S_b=0$, i.e., $Q_3-Q_2=Q_2-Q_1$, then the distribution is **symmetric** or **not skewed**.
- If $S_b>0$, i.e., $Q_3-Q_2>Q_2-Q_1$, then the distribution is **positively skewed**.
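The three cases can be expressed as a small Python sketch. The function names `bowley_skewness` and `skew_type`, the tolerance `tol`, and the sample quartiles are assumptions made for illustration only.

```python
def bowley_skewness(q1, q2, q3):
    """Bowley's coefficient S_b = (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    if q3 == q1:
        raise ValueError("Q3 equals Q1, so the coefficient is undefined")
    return (q3 + q1 - 2 * q2) / (q3 - q1)

def skew_type(s_b, tol=1e-12):
    """Classify the distribution from the sign of S_b (tol guards against rounding)."""
    if s_b > tol:
        return "positively skewed"
    if s_b < -tol:
        return "negatively skewed"
    return "symmetric"

# Hypothetical quartiles: the median lies closer to Q1 than to Q3
s_b = bowley_skewness(10, 14, 22)
print(round(s_b, 4), skew_type(s_b))
```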

## Example 1

The following table gives the number of children in 80 families in a village.

No. of children | 0 | 1 | 2 | 3 | 4 | 5
---|---|---|---|---|---|---
No. of families | 12 | 23 | 16 | 9 | 10 | 10

Find Bowley's coefficient of skewness.

### Solution

$x_i$ | $f_i$ | $cf$
---|---|---
0 | 12 | 12
1 | 23 | 35
2 | 16 | 51
3 | 9 | 60
4 | 10 | 70
5 | 10 | 80
Total | 80 |
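The cumulative frequency column is a running total of the frequencies. A short Python sketch of that step, using the data from the table (the variable names are chosen here for illustration):

```python
from itertools import accumulate

children = [0, 1, 2, 3, 4, 5]        # x_i
families = [12, 23, 16, 9, 10, 10]   # f_i

# Running total of the frequencies gives the cf column.
cum_freq = list(accumulate(families))

for x, f, cf in zip(children, families, cum_freq):
    print(x, f, cf)
print("Total", sum(families))
```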

**Quartiles**

The formula for the $i^{th}$ quartile is

$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$

where $N$ is the total number of observations.
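For a discrete frequency table, locating the $\frac{iN}{4}$-th value amounts to finding the first value whose cumulative frequency is greater than or equal to that position. The sketch below is one possible way to do this in Python; the function name `discrete_quartile` and the small table are made up for illustration.

```python
from bisect import bisect_left
from itertools import accumulate

def discrete_quartile(i, values, freqs):
    """Return the value holding the (iN/4)-th position in a discrete frequency table."""
    cum = list(accumulate(freqs))        # cumulative frequencies
    target = i * cum[-1] / 4             # position iN/4, with N = cum[-1]
    # bisect_left gives the first index whose cumulative frequency is >= target,
    # so a cumulative frequency exactly equal to the target counts as reaching it.
    return values[bisect_left(cum, target)]

# Hypothetical table: values 1..4 with frequencies 5, 5, 6, 4 (N = 20)
values = [1, 2, 3, 4]
freqs = [5, 5, 6, 4]
print([discrete_quartile(i, values, freqs) for i in (1, 2, 3)])
```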

**First Quartile $Q_1$**

`$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(80)}{4}\bigg)^{th}\text{ value}\\ &=\big(20\big)^{th}\text{ value} \end{aligned} $$`

The cumulative frequency just greater than or equal to $20$ is $35$. The corresponding value of $X$ is the $1^{st}$ quartile. That is, $Q_1 = 1$ child.

Thus, $25$ % of the families have $1$ child or fewer.

**Second Quartile $Q_2$**

`$$ \begin{aligned} Q_{2} &=\bigg(\dfrac{2(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{2(80)}{4}\bigg)^{th}\text{ value}\\ &=\big(40\big)^{th}\text{ value} \end{aligned} $$`

The cumulative frequency just greater than or equal to $40$ is $51$. The corresponding value of $X$ is the $2^{nd}$ quartile. That is, $Q_2 = 2$ children.

Thus, $50$ % of the families have $2$ children or fewer.

**Third Quartile $Q_3$**

`$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(80)}{4}\bigg)^{th}\text{ value}\\ &=\big(60\big)^{th}\text{ value} \end{aligned} $$`

The cumulative frequency just greater than or equal to $60$ is $60$. The corresponding value of $X$ is the $3^{rd}$ quartile. That is, $Q_3 = 3$ children.

Thus, $75$ % of the families have $3$ children or fewer.

**Bowley's Coefficient of Skewness**

The coefficient of skewness based on quartiles is
`$$ \begin{aligned} S_b &= \frac{Q_3+Q_1 - 2Q_2}{Q_3 -Q_1}\\ &= \frac{3 + 1 - 2*2}{3 - 1}\\ &=\frac{0}{2}\\ &= 0 \end{aligned} $$`

As the coefficient of skewness $S_b$ is **equal to zero** (i.e., $S_b = 0$), the distribution is **symmetric** according to this quartile-based measure.

## Example 2

The following table gives the frequency distribution of the waiting times of 65 persons at a ticket counter to buy a movie ticket.

Waiting time (in minutes) | 0-6 | 7-13 | 14-20 | 21-27 | 28-34
---|---|---|---|---|---
Frequency | 5 | 12 | 18 | 20 | 10

Compute Bowley's coefficient of skewness.

### Solution

The classes are inclusive. To convert them to the exclusive type, subtract 0.5 from the lower limit and add 0.5 to the upper limit of each class.

Class Interval | Class Boundaries | $f_i$ | $cf$
---|---|---|---
0-6 | -0.5-6.5 | 5 | 5
7-13 | 6.5-13.5 | 12 | 17
14-20 | 13.5-20.5 | 18 | 35
21-27 | 20.5-27.5 | 20 | 55
28-34 | 27.5-34.5 | 10 | 65
Total | | 65 |
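The conversion to class boundaries and the cumulative frequencies can be reproduced with a few lines of Python. This is a sketch under the assumption that the correction is half the gap between consecutive classes (here $1/2 = 0.5$); the variable names are illustrative.

```python
from itertools import accumulate

limits = [(0, 6), (7, 13), (14, 20), (21, 27), (28, 34)]   # inclusive class limits
freqs = [5, 12, 18, 20, 10]                                 # f_i

# Gap between the upper limit of one class and the lower limit of the next.
gap = limits[1][0] - limits[0][1]
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi in limits]

width = boundaries[0][1] - boundaries[0][0]   # class width h
cum_freq = list(accumulate(freqs))            # cf column

for (lo, hi), f, cf in zip(boundaries, freqs, cum_freq):
    print(f"{lo} to {hi} | {f} | {cf}")
print("h =", width)
```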

**Quartiles**

The position of the $i^{th}$ quartile is

$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$

where $N$ is the total number of observations.

**First Quartile $Q_1$**

`$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(65)}{4}\bigg)^{th}\text{ value}\\ &=\big(16.25\big)^{th}\text{ value} \end{aligned} $$`

The cumulative frequency just greater than or equal to $16.25$ is $17$. The corresponding class $6.5-13.5$ is the $1^{st}$ quartile class.

Thus

- $l = 6.5$, the lower limit of the $1^{st}$ quartile class
- $N=65$, total number of observations
- $f =12$, frequency of the $1^{st}$ quartile class
- $F_< = 5$, cumulative frequency of the class previous to $1^{st}$ quartile class
- $h =7$, the class width

The first quartile $Q_1$ can be computed as follows:

`$$ \begin{aligned} Q_1 &= l + \bigg(\frac{\frac{1(N)}{4} - F_<}{f}\bigg)\times h\\ &= 6.5 + \bigg(\frac{\frac{1*65}{4} - 5}{12}\bigg)\times 7\\ &= 6.5 + \bigg(\frac{16.25 - 5}{12}\bigg)\times 7\\ &= 6.5 + \big(0.9375\big)\times 7\\ &= 6.5 + 6.5625\\ &= 13.0625 \text{ minutes} \end{aligned} $$`

Thus, $25$ % of the persons waited $13.0625$ minutes or less at the ticket counter.

**Second Quartile $Q_2$**

`$$ \begin{aligned} Q_{2} &=\bigg(\dfrac{2(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{2(65)}{4}\bigg)^{th}\text{ value}\\ &=\big(32.5\big)^{th}\text{ value} \end{aligned} $$`

The cumulative frequency just greater than or equal to $32.5$ is $35$. The corresponding class $13.5-20.5$ is the $2^{nd}$ quartile class.

Thus

- $l = 13.5$, the lower limit of the $2^{nd}$ quartile class
- $N=65$, total number of observations
- $f =18$, frequency of the $2^{nd}$ quartile class
- $F_< = 17$, cumulative frequency of the class previous to $2^{nd}$ quartile class
- $h =7$, the class width

The second quartile $Q_2$ can be computed as follows:

`$$ \begin{aligned} Q_2 &= l + \bigg(\frac{\frac{2(N)}{4} - F_<}{f}\bigg)\times h\\ &= 13.5 + \bigg(\frac{\frac{2*65}{4} - 17}{18}\bigg)\times 7\\ &= 13.5 + \bigg(\frac{32.5 - 17}{18}\bigg)\times 7\\ &= 13.5 + \big(0.8611\big)\times 7\\ &= 13.5 + 6.0278\\ &= 19.5278 \text{ minutes} \end{aligned} $$`

Thus, $50$ % of the persons waited $19.5278$ minutes or less at the ticket counter.

**Third Quartile $Q_3$**

`$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(65)}{4}\bigg)^{th}\text{ value}\\ &=\big(48.75\big)^{th}\text{ value} \end{aligned} $$`

The cumulative frequency just greater than or equal to $48.75$ is $55$. The corresponding class $20.5-27.5$ is the $3^{rd}$ quartile class.

Thus

- $l = 20.5$, the lower limit of the $3^{rd}$ quartile class
- $N=65$, total number of observations
- $f =20$, frequency of the $3^{rd}$ quartile class
- $F_< = 35$, cumulative frequency of the class previous to $3^{rd}$ quartile class
- $h =7$, the class width

The third quartile $Q_3$ can be computed as follows:

`$$ \begin{aligned} Q_3 &= l + \bigg(\frac{\frac{3(N)}{4} - F_<}{f}\bigg)\times h\\ &= 20.5 + \bigg(\frac{\frac{3*65}{4} - 35}{20}\bigg)\times 7\\ &= 20.5 + \bigg(\frac{48.75 - 35}{20}\bigg)\times 7\\ &= 20.5 + \big(0.6875\big)\times 7\\ &= 20.5 + 4.8125\\ &= 25.3125 \text{ minutes} \end{aligned} $$`

Thus, $75$ % of the persons waited $25.3125$ minutes or less at the ticket counter.

**Bowley's Coefficient of Skewness**

The coefficient of skewness based on quartiles is
`$$ \begin{aligned} S_b &= \frac{Q_3+Q_1 - 2Q_2}{Q_3 -Q_1}\\ &= \frac{25.3125 + 13.0625 - 2*19.5278}{25.3125 - 13.0625}\\ &=\frac{-0.6806}{12.25}\\ &= -0.0556 \end{aligned} $$`

As the coefficient of skewness $S_b$ is **less than zero** (i.e., $S_b < 0$), the distribution is **negatively skewed**.
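The hand computation for this example can be checked with a short Python sketch. It uses the class boundaries and frequencies from the solution table; the helper name `grouped_quartile` is an assumption introduced here, not part of the original solution.

```python
from itertools import accumulate

def grouped_quartile(i, lower_bounds, freqs, width):
    """Q_i = l + ((iN/4 - F_<) / f) * h for equal-width classes."""
    target = i * sum(freqs) / 4
    cum = list(accumulate(freqs))
    # Quartile class: first class whose cumulative frequency reaches iN/4.
    k = next(j for j, c in enumerate(cum) if c >= target)
    prev_cum = cum[k - 1] if k > 0 else 0
    return lower_bounds[k] + (target - prev_cum) / freqs[k] * width

lower_bounds = [-0.5, 6.5, 13.5, 20.5, 27.5]   # lower class boundaries
freqs = [5, 12, 18, 20, 10]                    # frequencies, N = 65

q1, q2, q3 = (grouped_quartile(i, lower_bounds, freqs, 7) for i in (1, 2, 3))
s_b = (q3 + q1 - 2 * q2) / (q3 - q1)
print(f"Q1={q1:.4f}, Q2={q2:.4f}, Q3={q3:.4f}, S_b={s_b:.4f}")
# prints: Q1=13.0625, Q2=19.5278, Q3=25.3125, S_b=-0.0556
```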