Percentiles for grouped data
Percentiles are the values that divide the whole distribution into one hundred equal parts. There are 99 of them, namely $P_1, P_2, \cdots, P_{99}$. Here $P_1$ is the first percentile, $P_2$ is the second percentile, and so on.
Formula
For a discrete frequency distribution, the formula for the $i^{th}$ percentile is
$P_i =\bigg(\dfrac{i(N)}{100}\bigg)^{th}$ value, $i=1,2,\cdots, 99$
where,
- $N$ is the total number of observations.
For a continuous frequency distribution, the formula for the $i^{th}$ percentile is
$P_i=l + \bigg(\dfrac{\frac{iN}{100} - F_<}{f}\bigg)\times h; \quad i=1,2,\cdots,99$
where,
- $l :$ the lower limit of the $i^{th}$ percentile class
- $N=\sum f :$ total number of observations
- $f :$ frequency of the $i^{th}$ percentile class
- $F_< :$ cumulative frequency of the class previous to $i^{th}$ percentile class
- $h :$ the class width
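The continuous formula can be sketched in Python as follows. The function name `grouped_percentile` and the sample data are illustrative assumptions, not part of the original text. The sketch assumes contiguous classes of equal width, given by their lower boundaries in ascending order.

```python
def grouped_percentile(lower_bounds, freqs, width, i):
    """Return P_i for a continuous frequency distribution.

    lower_bounds: lower class boundaries, in ascending order
    freqs:        class frequencies, in the same order
    width:        common class width h
    """
    n = sum(freqs)
    target = i * n / 100        # iN/100
    cum_before = 0              # F_<: cumulative frequency before the current class
    for l, f in zip(lower_bounds, freqs):
        # The percentile class is the first one whose cumulative
        # frequency is greater than or equal to iN/100.
        if f > 0 and cum_before + f >= target:
            return l + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("i must satisfy 0 < i <= 100")


# Hypothetical data: classes 0-10, 10-20, 20-30 with frequencies 2, 6, 2.
print(grouped_percentile([0, 10, 20], [2, 6, 2], 10, 50))  # 15.0
```

Here $N = 10$ and $\frac{50N}{100} = 5$, so the percentile class is 10-20 with $F_< = 2$ and $f = 6$, giving $10 + \frac{5 - 2}{6}\times 10 = 15$.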
Example 1
A librarian keeps the records about the amount of time spent (in minutes) in a library by college students. Data is as follows:
| Time spent | 30 | 32 | 35 | 38 | 40 |
|---|---|---|---|---|---|
| No. of students | 8 | 12 | 20 | 10 | 5 |
Calculate $P_{15}$ and $P_{40}$.
Solution
| $x_i$ | $f_i$ | $cf$ |
|---|---|---|
| 30 | 8 | 8 |
| 32 | 12 | 20 |
| 35 | 20 | 40 |
| 38 | 10 | 50 |
| 40 | 5 | 55 |
| Total | 55 | |
The formula for $i^{th}$ percentile is
$P_i =\bigg(\dfrac{i(N)}{100}\bigg)^{th}$ value, $i=1,2,\cdots, 99$
where $N$ is the total number of observations.
Fifteenth percentile $P_{15}$
$$ \begin{aligned} P_{15} &=\bigg(\dfrac{15(N)}{100}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{15(55)}{100}\bigg)^{th}\text{ value}\\ &=\big(8.25\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $8.25$ is $20$. The corresponding value of $X$ is the $15^{th}$ percentile. That is, $P_{15} =32$ minutes.
Fortieth percentile $P_{40}$
$$ \begin{aligned} P_{40} &=\bigg(\dfrac{40(N)}{100}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{40(55)}{100}\bigg)^{th}\text{ value}\\ &=\big(22\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $22$ is $40$. The corresponding value of $X$ is the $40^{th}$ percentile. That is, $P_{40} =35$ minutes.
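The lookup used in this example can be checked with a short Python sketch. The function name `discrete_percentile` is an illustrative assumption; the data are the times and student counts from the table above, and the values are assumed to be sorted in ascending order.

```python
from itertools import accumulate


def discrete_percentile(values, freqs, i):
    """Return P_i for a discrete frequency distribution."""
    n = sum(freqs)
    position = i * n / 100      # the (iN/100)-th value
    # Walk the cumulative frequencies and stop at the first one
    # that is greater than or equal to the required position.
    for value, cf in zip(values, accumulate(freqs)):
        if cf >= position:
            return value
    raise ValueError("i must satisfy 0 < i <= 100")


times = [30, 32, 35, 38, 40]
students = [8, 12, 20, 10, 5]

print(discrete_percentile(times, students, 15))  # 32
print(discrete_percentile(times, students, 40))  # 35
```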
Example 2
The following table gives a frequency distribution of weight (in pounds) of 57 children at a day care center.
| Weight | 10-19 | 20-29 | 30-39 | 40-49 | 50-59 | 60-69 | 70-79 |
|---|---|---|---|---|---|---|---|
| No. of children | 5 | 19 | 10 | 13 | 4 | 4 | 2 |
Calculate
a. the maximum weight of the lower 30 % of the children,
b. the minimum weight of the upper 30 % of the children,
c. the limits for the weight of the middle 40 % of the children.
Solution
Let $X$ denote the weight of children at a day care center.
Here the classes are inclusive. To convert them to exclusive classes, subtract 0.5 from the lower limit and add 0.5 to the upper limit of each class.
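As a minimal sketch of this conversion, the helper below (the name `class_boundaries` and the `gap` parameter are illustrative assumptions) shifts each limit by half the gap between consecutive classes, which is 1 for this data.

```python
def class_boundaries(limits, gap=1):
    """Convert inclusive class limits to exclusive class boundaries.

    gap is the distance between one class's upper limit and the next
    class's lower limit (1 here, since 19 is followed by 20).
    """
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


limits = [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59), (60, 69), (70, 79)]
for (lo, hi), (lb, ub) in zip(limits, class_boundaries(limits)):
    print(f"{lo}-{hi} -> {lb}-{ub}")
```

Each resulting class has width $h = 10$, and the upper boundary of one class equals the lower boundary of the next.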
| Class Interval | Class Boundaries | $f_i$ | $cf$ |
|---|---|---|---|
| 10-19 | 9.5-19.5 | 5 | 5 |
| 20-29 | 19.5-29.5 | 19 | 24 |
| 30-39 | 29.5-39.5 | 10 | 34 |
| 40-49 | 39.5-49.5 | 13 | 47 |
| 50-59 | 49.5-59.5 | 4 | 51 |
| 60-69 | 59.5-69.5 | 4 | 55 |
| 70-79 | 69.5-79.5 | 2 | 57 |
| Total | | 57 | |
a. The maximum weight of the lower $30$ % of the children is $P_{30}$.
The formula for $i^{th}$ percentile is
$P_i =\bigg(\dfrac{i(N)}{100}\bigg)^{th}$ value, $i=1,2,\cdots, 99$
where $N$ is the total number of observations.
$$ \begin{aligned} P_{30} &=\bigg(\dfrac{30(N)}{100}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{30(57)}{100}\bigg)^{th}\text{ value}\\ &=\big(17.1\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $17.1$ is $24$, so the corresponding class $19.5-29.5$ is the $30^{th}$ percentile class.
Thus
- $l = 19.5$, the lower limit of the $30^{th}$ percentile class
- $N=57$, total number of observations
- $f =19$, frequency of the $30^{th}$ percentile class
- $F_< = 5$, cumulative frequency of the class previous to $30^{th}$ percentile class
- $h =10$, the class width
The thirtieth percentile $P_{30}$ can be computed as follows:
$$ \begin{aligned} P_{30} &= l + \bigg(\frac{\frac{30(N)}{100} - F_<}{f}\bigg)\times h\\ &= 19.5 + \bigg(\frac{\frac{30\times 57}{100} - 5}{19}\bigg)\times 10\\ &= 19.5 + \bigg(\frac{17.1 - 5}{19}\bigg)\times 10\\ &= 19.5 + \big(0.6368\big)\times 10\\ &= 19.5 + 6.3684\\ &= 25.8684 \text{ pounds} \end{aligned} $$
The maximum weight of the lower $30$ % of the children is $P_{30}= 25.8684$ pounds.
b. The minimum weight of the upper $30$ % of the children is $P_{70}$.
$$ \begin{aligned} P_{70} &=\bigg(\dfrac{70(N)}{100}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{70(57)}{100}\bigg)^{th}\text{ value}\\ &=\big(39.9\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $39.9$ is $47$, so the corresponding class $39.5-49.5$ is the $70^{th}$ percentile class.
Thus
- $l = 39.5$, the lower limit of the $70^{th}$ percentile class
- $N=57$, total number of observations
- $f =13$, frequency of the $70^{th}$ percentile class
- $F_< = 34$, cumulative frequency of the class previous to $70^{th}$ percentile class
- $h =10$, the class width
The seventieth percentile $P_{70}$ can be computed as follows:
$$ \begin{aligned} P_{70} &= l + \bigg(\frac{\frac{70(N)}{100} - F_<}{f}\bigg)\times h\\ &= 39.5 + \bigg(\frac{\frac{70\times 57}{100} - 34}{13}\bigg)\times 10\\ &= 39.5 + \bigg(\frac{39.9 - 34}{13}\bigg)\times 10\\ &= 39.5 + \big(0.4538\big)\times 10\\ &= 39.5 + 4.5385\\ &= 44.0385 \text{ pounds} \end{aligned} $$
The minimum weight of the upper $30$ % of the children is $P_{70}= 44.0385$ pounds.
c. The limits for the weight of the middle $40$ % of the children are $P_{30}$ and $P_{70}$.
Thus the limits for the weight of the middle 40 % of the children are $P_{30} = 25.8684$ pounds and $P_{70}= 44.0385$ pounds.
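As a sanity check on part c, the sketch below estimates how many children weigh less than a given value and confirms that about 40 % of them fall between the two limits. The helper `count_below` is an illustrative assumption. It relies on the same premise as the percentile formula, namely that observations are spread evenly within each class, and it uses the lower class boundaries with $h = 10$.

```python
def count_below(x, lower_bounds, freqs, width):
    """Estimate the number of observations below x in grouped data."""
    total = 0.0
    for l, f in zip(lower_bounds, freqs):
        if x >= l + width:
            total += f                      # the whole class lies below x
        elif x > l:
            total += (x - l) / width * f    # only part of the class lies below x
    return total


lower_bounds = [9.5, 19.5, 29.5, 39.5, 49.5, 59.5, 69.5]
freqs = [5, 19, 10, 13, 4, 4, 2]
n = sum(freqs)

p30, p70 = 25.8684, 44.0385
between = count_below(p70, lower_bounds, freqs, 10) - count_below(p30, lower_bounds, freqs, 10)
print(round(100 * between / n, 1))  # 40.0
```

The estimated counts below $P_{30}$ and $P_{70}$ are approximately $17.1$ and $39.9$, matching $\frac{30N}{100}$ and $\frac{70N}{100}$ from the solution.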