Karl Pearson coefficient of skewness for ungrouped data
Let $x_i, i=1,2, \cdots , n$ be $n$ observations.
Formula
Karl Pearson’s coefficient of skewness is given by
$s_k =\dfrac{3(Mean-Median)}{sd}=\dfrac{3(\overline{x}-M)}{s_x}$
where,
- $\overline{x}$ is the sample mean,
- $M$ is the median,
- $s_x$ is the sample standard deviation.
Sample mean
The sample mean $\overline{x}$ is given by
$$ \begin{eqnarray*} \overline{x}& =\frac{1}{n}\sum_{i=1}^{n}x_i \end{eqnarray*} $$
Sample Median
Arrange the data in ascending order of magnitude.
The sample median $M$ is given by
$$ \begin{equation*} M= \left\{ \begin{array}{ll} \text{value of }\big(\frac{n+1}{2}\big)^{th}\text{ obs.}, & \hbox{if $n$ is odd;} \\ \text{average of }\big(\frac{n}{2}\big)^{th}\text{ and }\big(\frac{n}{2}+1\big)^{th} \text{ obs.}, & \hbox{if $n$ is even.} \end{array} \right. \end{equation*} $$
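The positional rule above can be sketched in Python. This is only an illustration under stated assumptions: the function name `sample_median` is hypothetical and does not come from the original text.

```python
def sample_median(values):
    """Median by position (hypothetical helper, for illustration only)."""
    ordered = sorted(values)  # arrange the data in ascending order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of empty data is undefined")
    mid = n // 2
    if n % 2 == 1:
        # n odd: the ((n + 1) / 2)-th observation, which is index mid (0-based)
        return ordered[mid]
    # n even: average of the (n/2)-th and (n/2 + 1)-th observations
    return (ordered[mid - 1] + ordered[mid]) / 2
```

For instance, `sample_median([3, 1, 2])` picks the middle value of the sorted data, while `sample_median([4, 1, 3, 2])` averages the two middle values.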
Sample Standard deviation
The sample standard deviation is given by
$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{\dfrac{1}{n-1}\bigg(\sum_{i=1}^{n}x_i^2-\frac{\big(\sum_{i=1}^n x_i\big)^2}{n}\bigg)} \end{aligned} $$
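Putting the three ingredients together, the coefficient could be computed along the following lines. This is a minimal sketch, not a reference implementation: the function name `pearson_skewness` is an assumption made for illustration, and the standard deviation uses the same shortcut formula as above (sum of squares minus squared sum over $n$, divided by $n-1$).

```python
import math
import statistics


def pearson_skewness(values):
    """3 * (mean - median) / sample sd (hypothetical helper name)."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    sum_x = sum(values)                    # sum of x_i
    sum_x2 = sum(x * x for x in values)    # sum of x_i^2
    mean = sum_x / n
    median = statistics.median(values)     # middle value, or mean of the two middle values
    # shortcut formula for the sample variance (divisor n - 1)
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    sd = math.sqrt(variance)
    if sd == 0:
        raise ValueError("all observations are equal; skewness is undefined")
    return 3 * (mean - median) / sd
```

For perfectly symmetric data such as `[1, 2, 3, 4, 5]` the mean equals the median, so the function returns zero. Note that the shortcut formula is convenient for hand calculation but can lose precision in floating point when the values are large relative to their spread.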
Example 1
The ages (in years) of 6 randomly selected students from a class are
22,25,24,23,24,20.
Find Karl Pearson’s coefficient of skewness.
Solution
| | $x_i$ | $x_i^2$ |
|---|---|---|
| | 22 | 484 |
| | 25 | 625 |
| | 24 | 576 |
| | 23 | 529 |
| | 24 | 576 |
| | 20 | 400 |
| Total | 138 | 3190 |
Sample mean
The sample mean of $X$ is
$$ \begin{aligned} \overline{x} &=\frac{1}{n}\sum_{i=1}^n x_i\\ &=\frac{138}{6}\\ &=23\text{ years} \end{aligned} $$
The average age of the students is $23$ years.
Sample Median
The data in ascending order of magnitude is $20, 22, 23, 24, 24, 25$.
Here $n = 6$ which is even.
Sample median = average of $(\frac{n}{2})^{th}$ and $(\frac{n}{2}+1)^{th}$ observations.
Thus the median age of students is
$$ \begin{aligned} M &= \frac{\big(\frac{6}{2}\big)^{th}\text{Obs.} +\big(\frac{6}{2}+1\big)^{th}\text{Obs.}}{2}\\ &= \frac{\big(3\big)^{th}\text{Obs.} +\big(4\big)^{th}\text{Obs.}}{2}\\ &=\frac{23 +24}{2} \\ &= 23.5 \text{ years}. \end{aligned} $$
The median age of students is $M=23.5$ years.
Sample variance
Sample variance of $X$ is
$$ \begin{aligned} s_x^2 &=\dfrac{1}{n-1}\bigg(\sum_{i=1}^{n}x_i^2-\frac{\big(\sum_{i=1}^n x_i\big)^2}{n}\bigg)\\ &=\dfrac{1}{5}\bigg(3190-\frac{(138)^2}{6}\bigg)\\ &=\dfrac{1}{5}\big(3190-\frac{19044}{6}\big)\\ &=\dfrac{1}{5}\big(3190-3174\big)\\ &= \frac{16}{5}\\ &=3.2 \end{aligned} $$
Sample standard deviation
The standard deviation is the positive square root of the variance.
The sample standard deviation is
$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{3.2}\\ &=1.7889 \text{ years} \end{aligned} $$
Thus the standard deviation of the ages of the students is $1.7889$ years.
Karl Pearson’s coefficient of skewness
Karl Pearson’s coefficient of skewness is
$$ \begin{aligned} s_k &=\frac{3(Mean-Median)}{sd}\\ &=\frac{3\times(23-23.5)}{1.7889}\\ &= -0.8385 \end{aligned} $$
As the value of $s_k < 0$, the data is $\text{negatively skewed}$.
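The arithmetic in this example can be double-checked with a few lines of Python that mirror the hand calculation step by step. The variable names are chosen here for illustration only.

```python
import math

ages = [22, 25, 24, 23, 24, 20]
n = len(ages)

sum_x = sum(ages)                    # column total of x_i: 138
sum_x2 = sum(x * x for x in ages)    # column total of x_i^2: 3190

mean = sum_x / n                     # 23.0

ordered = sorted(ages)               # 20, 22, 23, 24, 24, 25
# n is even: average the 3rd and 4th observations
median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2   # 23.5

# shortcut formula for the sample variance, divisor n - 1
variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
sd = math.sqrt(variance)

skew = 3 * (mean - median) / sd
print(round(sd, 4), round(skew, 4))  # 1.7889 -0.8385
```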
Example 2
A random sample of 11 patients yielded the following data on the length of stay (in days) in the hospital.
12,9,10,15,10,14,7,10,8,11,15
Find Karl Pearson’s coefficient of skewness.
Solution
| | $x_i$ | $x_i^2$ |
|---|---|---|
| | 12 | 144 |
| | 9 | 81 |
| | 10 | 100 |
| | 15 | 225 |
| | 10 | 100 |
| | 14 | 196 |
| | 7 | 49 |
| | 10 | 100 |
| | 8 | 64 |
| | 11 | 121 |
| | 15 | 225 |
| Total | 121 | 1405 |
Sample mean
The sample mean of $X$ is
$$ \begin{aligned} \overline{x} &=\frac{1}{n}\sum_{i=1}^n x_i\\ &=\frac{121}{11}\\ &=11\text{ days} \end{aligned} $$
The average length of stay in the hospital is $11$ days.
Sample Median
Here $n = 11$, which is odd.
The data in ascending order of magnitude is $7, 8, 9, 10, 10, 10, 11, 12, 14, 15, 15$.
Sample median = value of the $(\frac{n+1}{2})^{th}$ observation.
That is
$$ \begin{aligned} M &= \text{value of }\bigg(\frac{n+1}{2}\bigg)^{th}\text{ obs.}\\ &= \text{value of }\bigg(\frac{11+1}{2}\bigg)^{th}\text{ obs.}\\ &= \text{value of } \big(6\big)^{th}\text{Obs.}\\ &=10 \text{ days} \end{aligned} $$
The median length of stay in the hospital is $M=10$ days.
Sample variance
Sample variance of $X$ is
$$ \begin{aligned} s_x^2 &=\dfrac{1}{n-1}\bigg(\sum_{i=1}^{n}x_i^2-\frac{\big(\sum_{i=1}^n x_i\big)^2}{n}\bigg)\\ &=\dfrac{1}{10}\bigg(1405-\frac{(121)^2}{11}\bigg)\\ &=\dfrac{1}{10}\big(1405-\frac{14641}{11}\big)\\ &=\dfrac{1}{10}\big(1405-1331\big)\\ &= \frac{74}{10}\\ &=7.4 \end{aligned} $$
Sample standard deviation
The standard deviation is the positive square root of the variance.
The sample standard deviation is
$$ \begin{aligned} s_x &=\sqrt{s_x^2}\\ &=\sqrt{7.4}\\ &=2.7203 \text{ days} \end{aligned} $$
Thus the standard deviation of the length of stay in the hospital is $2.7203$ days.
Karl Pearson’s coefficient of skewness
Karl Pearson’s coefficient of skewness is
$$ \begin{aligned} s_k &=\frac{3(Mean-Median)}{sd}\\ &=\frac{3\times(11-10)}{2.7203}\\ &= 1.1028 \end{aligned} $$
As the value of $s_k > 0$, the data is $\text{positively skewed}$.
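As a cross-check, the same result can be reproduced with Python's standard `statistics` module instead of the shortcut formula. `statistics.stdev` uses the $n-1$ divisor, so it matches the sample standard deviation used here. The variable names are illustrative.

```python
import statistics

stays = [12, 9, 10, 15, 10, 14, 7, 10, 8, 11, 15]

mean = statistics.mean(stays)       # 11 days
median = statistics.median(stays)   # 10 days (6th of the 11 sorted values)
sd = statistics.stdev(stays)        # sample standard deviation, divisor n - 1

skew = 3 * (mean - median) / sd
print(round(sd, 4), round(skew, 4))  # 2.7203 1.1028
```

The positive value agrees with the hand calculation: the mean lies above the median, pulled up by the longer stays of 14 and 15 days.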