Bowley’s Coefficient of Skewness for grouped data
Skewness means “lack of symmetry”. A measure of skewness indicates whether the data values are concentrated more on the lower side or on the higher side of the central value.
For a symmetric distribution, the two quartiles $Q_1$ and $Q_3$ are equidistant from the median (i.e., $Q_2$). That is, for a symmetric distribution $Q_3 - Q_2 = Q_2 - Q_1$.
If the distribution is not symmetric (i.e., skewed), then the distance $Q_3-Q_2$ is not equal to the distance $Q_2-Q_1$. That is, for an asymmetric distribution $Q_3-Q_2\neq Q_2-Q_1$.
The absolute measure of skewness is $(Q_3-Q_2)-(Q_2-Q_1)= Q_3+Q_1-2Q_2$.
Formula
Bowley’s coefficient of skewness is the relative measure of skewness. It is denoted by $S_b$ and is defined as
$S_b = \dfrac{Q_3+Q_1 - 2Q_2}{Q_3 -Q_1}$
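The definition translates directly into a small function. The following is a minimal sketch; the function name `bowley_skewness` and the sample quartiles are illustrative assumptions, not part of the original text.

```python
def bowley_skewness(q1: float, q2: float, q3: float) -> float:
    """Return Bowley's coefficient (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    iqr = q3 - q1
    if iqr == 0:
        # The ratio is undefined when the interquartile range is zero.
        raise ValueError("Q3 and Q1 are equal; the coefficient is undefined")
    return (q3 + q1 - 2 * q2) / iqr


# Made-up quartiles for illustration only (not taken from a data set).
print(bowley_skewness(10, 20, 30))  # median midway between Q1 and Q3
print(bowley_skewness(10, 12, 30))  # median closer to Q1 than to Q3
```

The guard against a zero interquartile range is a design choice of this sketch: the formula itself gives no value in that case.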
The formula for $i^{th}$ quartile is
$$ \begin{aligned} Q_i=l + \bigg(\frac{\frac{iN}{4} - F_<}{f}\bigg)\times h; \quad i=1,2,3 \end{aligned} $$
where,
- $l :$ the lower limit of the $i^{th}$ quartile class
- $N=\sum f :$ total number of observations
- $f :$ frequency of the $i^{th}$ quartile class
- $F_< :$ cumulative frequency of the class previous to $i^{th}$ quartile class
- $h :$ the class width
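As a rough sketch, the interpolation formula above can be written as a function of the five quantities just listed. The name `grouped_quartile`, the parameter name `cf_prev` (standing for $F_<$), and the sample numbers are assumptions made for illustration.

```python
def grouped_quartile(i: int, l: float, n: float, f: float,
                     cf_prev: float, h: float) -> float:
    """i-th quartile for grouped data: l + ((i*n/4 - cf_prev) / f) * h."""
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2 or 3")
    if f <= 0:
        raise ValueError("the quartile class must have a positive frequency")
    return l + ((i * n / 4 - cf_prev) / f) * h


# Hypothetical quartile class: lower limit 10, width 5, frequency 8,
# with 6 observations in the classes before it and 40 observations in total.
print(grouped_quartile(1, l=10, n=40, f=8, cf_prev=6, h=5))
```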
Types of Skewness
- If $S_b<0$, i.e., $Q_3-Q_2<Q_2-Q_1$, then the distribution is negatively skewed.
- If $S_b=0$, i.e., $Q_3-Q_2=Q_2-Q_1$, then the distribution is symmetric (not skewed).
- If $S_b>0$, i.e., $Q_3-Q_2>Q_2-Q_1$, then the distribution is positively skewed.
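The link between the sign of $S_b$ and the two quartile distances can be seen by rewriting the formula. In the short derivation below, $a$ and $b$ are shorthand introduced here for those distances.

$$ \begin{aligned} a &= Q_3-Q_2, \quad b = Q_2-Q_1\\ S_b &= \frac{Q_3+Q_1-2Q_2}{Q_3-Q_1} = \frac{a-b}{a+b} \end{aligned} $$

Since $a \geq 0$, $b \geq 0$ and $a+b>0$, the sign of $S_b$ is the sign of $a-b$, and $-1 \leq S_b \leq 1$.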
Example 1
The following table gives the number of children of 80 families in a village
No. of children | 0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|---|
No. of families | 12 | 23 | 16 | 9 | 10 | 10 |
Find Bowley’s coefficient of skewness.
Solution
$x_i$ | $f_i$ | $cf$ |
---|---|---|
0 | 12 | 12 |
1 | 23 | 35 |
2 | 16 | 51 |
3 | 9 | 60 |
4 | 10 | 70 |
5 | 10 | 80 |
Total | 80 | |
Quartiles
The formula for $i^{th}$ quartile is
$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$
where $N$ is the total number of observations.
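For a discrete frequency table, the rule used in the steps that follow is to take the value whose cumulative frequency is the first to reach $iN/4$. A minimal sketch of that rule is given here; the function name `discrete_quartile` is an assumption, and the data are the children and family counts from the table above.

```python
from itertools import accumulate


def discrete_quartile(i, values, freqs):
    """Return the value whose cumulative frequency first reaches i*N/4."""
    n = sum(freqs)
    position = i * n / 4
    # Walk through the running totals until one is >= the target position.
    for value, cf in zip(values, accumulate(freqs)):
        if cf >= position:
            return value
    raise ValueError("position exceeds the total frequency")


children = [0, 1, 2, 3, 4, 5]
families = [12, 23, 16, 9, 10, 10]
for i in (1, 2, 3):
    print(f"Q{i} =", discrete_quartile(i, children, families))
```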
First Quartile $Q_1$
$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(80)}{4}\bigg)^{th}\text{ value}\\ &=\big(20\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $20$ is $35$. The corresponding value of $X$ is the $1^{st}$ quartile. That is, $Q_1 =1$ child.
Thus, at least $25$ % of the families have $1$ or fewer children.
Second Quartile $Q_2$
$$ \begin{aligned} Q_{2} &=\bigg(\dfrac{2(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{2(80)}{4}\bigg)^{th}\text{ value}\\ &=\big(40\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $40$ is $51$. The corresponding value of $X$ is the $2^{nd}$ quartile. That is, $Q_2 =2$ children.
Thus, at least $50$ % of the families have $2$ or fewer children.
Third Quartile $Q_3$
$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(80)}{4}\bigg)^{th}\text{ value}\\ &=\big(60\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $60$ is $60$. The corresponding value of $X$ is the $3^{rd}$ quartile. That is, $Q_3 =3$ children.
Thus, $75$ % of the families have $3$ or fewer children.
Bowley’s Coefficient of Skewness
The coefficient of skewness based on quartiles is
$$ \begin{aligned} S_b &= \frac{Q_3+Q_1 - 2Q_2}{Q_3 -Q_1}\\ &= \frac{3 + 1 - 2\times 2}{3 - 1}\\ &=\frac{0}{2}\\ &= 0 \end{aligned} $$
As the coefficient of skewness $S_b$ is equal to zero (i.e., $S_b = 0$), the distribution is symmetric according to this measure: $Q_1$ and $Q_3$ lie at the same distance from the median.
Example 2
The following table gives the frequency distribution of waiting time of 65 persons at a ticket counter to buy a movie ticket.
Waiting time (in minutes) | 0-6 | 7-13 | 14-20 | 21-27 | 28-34 |
---|---|---|---|---|---|
Frequency | 5 | 12 | 18 | 20 | 10 |
Compute Bowley’s coefficient of skewness.
Solution
The classes are inclusive. To convert them to the exclusive type, subtract 0.5 from the lower limit and add 0.5 to the upper limit of each class. The resulting class boundaries are continuous, and each class has width 7.
Class Interval | Class Boundaries | $f_i$ | $cf$ |
---|---|---|---|
0-6 | -0.5-6.5 | 5 | 5 |
7-13 | 6.5-13.5 | 12 | 17 |
14-20 | 13.5-20.5 | 18 | 35 |
21-27 | 20.5-27.5 | 20 | 55 |
28-34 | 27.5-34.5 | 10 | 65 |
Total | | 65 | |
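The conversion from inclusive class limits to class boundaries can be sketched as follows. The function name `class_boundaries` and its `gap` parameter (the distance between one class's upper limit and the next class's lower limit, here 1) are assumptions introduced for illustration.

```python
def class_boundaries(limits, gap=1.0):
    """Convert inclusive (lower, upper) class limits to class boundaries.

    Half of the gap between consecutive classes is subtracted from each
    lower limit and added to each upper limit.
    """
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


limits = [(0, 6), (7, 13), (14, 20), (21, 27), (28, 34)]
for lower, upper in class_boundaries(limits):
    # The last column is the class width h.
    print(lower, upper, upper - lower)
```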
Quartiles
The formula for $i^{th}$ quartile is
$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$
where $N$ is the total number of observations.
First Quartile $Q_1$
$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(65)}{4}\bigg)^{th}\text{ value}\\ &=\big(16.25\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $16.25$ is $17$. The corresponding class $6.5-13.5$ is the $1^{st}$ quartile class.
Thus
- $l = 6.5$, the lower limit of the $1^{st}$ quartile class
- $N=65$, total number of observations
- $f =12$, frequency of the $1^{st}$ quartile class
- $F_< = 5$, cumulative frequency of the class previous to $1^{st}$ quartile class
- $h =7$, the class width
The first quartile $Q_1$ can be computed as follows:
$$ \begin{aligned} Q_1 &= l + \bigg(\frac{\frac{1(N)}{4} - F_<}{f}\bigg)\times h\\ &= 6.5 + \bigg(\frac{\frac{1*65}{4} - 5}{12}\bigg)\times 7\\ &= 6.5 + \bigg(\frac{16.25 - 5}{12}\bigg)\times 7\\ &= 6.5 + \big(0.9375\big)\times 7\\ &= 6.5 + 6.5625\\ &= 13.0625 \text{ minutes} \end{aligned} $$
Thus, $25$ % of the persons waited $13.0625$ minutes or less at the ticket counter.
Second Quartile $Q_2$
$$ \begin{aligned} Q_{2} &=\bigg(\dfrac{2(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{2(65)}{4}\bigg)^{th}\text{ value}\\ &=\big(32.5\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $32.5$ is $35$. The corresponding class $13.5-20.5$ is the $2^{nd}$ quartile class.
Thus
- $l = 13.5$, the lower limit of the $2^{nd}$ quartile class
- $N=65$, total number of observations
- $f =18$, frequency of the $2^{nd}$ quartile class
- $F_< = 17$, cumulative frequency of the class previous to $2^{nd}$ quartile class
- $h =7$, the class width
The second quartile $Q_2$ can be computed as follows:
$$ \begin{aligned} Q_2 &= l + \bigg(\frac{\frac{2(N)}{4} - F_<}{f}\bigg)\times h\\ &= 13.5 + \bigg(\frac{\frac{2*65}{4} - 17}{18}\bigg)\times 7\\ &= 13.5 + \bigg(\frac{32.5 - 17}{18}\bigg)\times 7\\ &= 13.5 + \big(0.8611\big)\times 7\\ &= 13.5 + 6.0278\\ &= 19.5278 \text{ minutes} \end{aligned} $$
Thus, $50$ % of the persons waited $19.5278$ minutes or less at the ticket counter.
Third Quartile $Q_3$
$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(65)}{4}\bigg)^{th}\text{ value}\\ &=\big(48.75\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $48.75$ is $55$. The corresponding class $20.5-27.5$ is the $3^{rd}$ quartile class.
Thus
- $l = 20.5$, the lower limit of the $3^{rd}$ quartile class
- $N=65$, total number of observations
- $f =20$, frequency of the $3^{rd}$ quartile class
- $F_< = 35$, cumulative frequency of the class previous to $3^{rd}$ quartile class
- $h =7$, the class width
The third quartile $Q_3$ can be computed as follows:
$$ \begin{aligned} Q_3 &= l + \bigg(\frac{\frac{3(N)}{4} - F_<}{f}\bigg)\times h\\ &= 20.5 + \bigg(\frac{\frac{3*65}{4} - 35}{20}\bigg)\times 7\\ &= 20.5 + \bigg(\frac{48.75 - 35}{20}\bigg)\times 7\\ &= 20.5 + \big(0.6875\big)\times 7\\ &= 20.5 + 4.8125\\ &= 25.3125 \text{ minutes} \end{aligned} $$
Thus, $75$ % of the persons waited $25.3125$ minutes or less at the ticket counter.
Bowley’s Coefficient of Skewness
The coefficient of skewness based on quartiles is
$$ \begin{aligned} S_b &= \frac{Q_3+Q_1 - 2Q_2}{Q_3 -Q_1}\\ &= \frac{25.3125 + 13.0625 - 2*19.5278}{25.3125 - 13.0625}\\ &=\frac{-0.6806}{12.25}\\ &= -0.0556 \end{aligned} $$
As the coefficient of skewness $S_b$ is less than zero (i.e., $S_b < 0$), the distribution is negatively skewed.
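The whole calculation can be checked numerically. The sketch below is one possible way to do it, not a definitive implementation: the function name `grouped_quartiles` is an assumption, and the class boundaries and frequencies are those of the solution table above.

```python
from itertools import accumulate


def grouped_quartiles(boundaries, freqs):
    """Return (Q1, Q2, Q3) by linear interpolation within each quartile class."""
    n = sum(freqs)
    cum = list(accumulate(freqs))
    quartiles = []
    for i in (1, 2, 3):
        target = i * n / 4
        # Quartile class: first class whose cumulative frequency reaches the target.
        k = next(idx for idx, cf in enumerate(cum) if cf >= target)
        lower, upper = boundaries[k]
        cf_prev = cum[k - 1] if k > 0 else 0
        quartiles.append(lower + (target - cf_prev) / freqs[k] * (upper - lower))
    return tuple(quartiles)


boundaries = [(-0.5, 6.5), (6.5, 13.5), (13.5, 20.5), (20.5, 27.5), (27.5, 34.5)]
freqs = [5, 12, 18, 20, 10]  # frequencies from the solution table

q1, q2, q3 = grouped_quartiles(boundaries, freqs)
s_b = (q3 + q1 - 2 * q2) / (q3 - q1)
print(round(q1, 4), round(q2, 4), round(q3, 4), round(s_b, 4))
```

Because the class width is taken from each pair of boundaries, the same function would also handle classes of unequal width.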