Summary statistics for grouped data
Summary statistics summarize and provide information about the sample data. They comprise the minimum value of the data, the first quartile ($Q_1$), the median (i.e., $Q_2$), the mean ($\overline{x}$), the third quartile ($Q_3$) and the maximum value of the data.
Summary statistics include
- minimum value ($\min$),
- first quartile ($Q_1$),
- $\text{median }$ ($Q_2$),
- sample mean ($\overline{x}$),
- third quartile ($Q_3$),
- maximum value ($\max$).
Formula
$\min$, $Q_1$, $\text{median}$, $\overline{x}$, $Q_3$ and $\max$
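The mean of $X$ is denoted by $\overline{x}$ and, for a frequency distribution, is given by
$\overline{x} =\dfrac{1}{N}\sum_{i=1}^{n}f_ix_i$
where $x_i$ is the $i^{th}$ value (or class midpoint), $f_i$ is its frequency, $n$ is the number of distinct values (or classes) and $N=\sum_{i=1}^{n}f_i$ is the total number of observations.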
Quartiles
The formula for $i^{th}$ quartile is
$Q_i =$ Value of $\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ observation, $i=1,2,3$
where $N$ is the total number of observations.
Example 1
A librarian keeps the records about the amount of time spent (in minutes) in a library by college students. Data is as follows:
Time spent | 30 | 32 | 35 | 38 | 40 |
---|---|---|---|---|---|
No. of students | 8 | 12 | 20 | 10 | 5 |
Compute summary statistics for the above frequency distribution.
Solution
$x_i$ | $f_i$ | $f_i*x_i$ | $cf$ |
---|---|---|---|
30 | 8 | 240 | 8 |
32 | 12 | 384 | 20 |
35 | 20 | 700 | 40 |
38 | 10 | 380 | 50 |
40 | 5 | 200 | 55 |
Total | 55 | 1904 | |
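The table above can be reproduced with a few lines of Python. This is only a sketch: the variable names (`times`, `freqs`, `fx`, `cf`) are illustrative choices, not part of the original solution.

```python
from itertools import accumulate

# Data from Example 1: time spent (minutes) and number of students.
times = [30, 32, 35, 38, 40]
freqs = [8, 12, 20, 10, 5]

# Column f_i * x_i and the cumulative frequency column cf.
fx = [f * x for f, x in zip(freqs, times)]
cf = list(accumulate(freqs))

for x, f, p, c in zip(times, freqs, fx, cf):
    print(f"{x} | {f} | {p} | {c} |")
print(f"Total | {sum(freqs)} | {sum(fx)} | |")
```

The last cumulative frequency equals $N$, which is a quick check that the column was built correctly.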
Minimum Value
The minimum amount of time spent in the library by college students is $\min = 30$ minutes.
Maximum Value
The maximum amount of time spent in the library by college students is $\max = 40$ minutes.
Sample mean
The sample mean of $X$ is
$$ \begin{aligned} \overline{x} &=\frac{1}{N}\sum_{i=1}^n f_ix_i\\ &=\frac{1904}{55}\\ &=34.6182 \text{ minutes} \end{aligned} $$
The average amount of time spent in the library by college students is $34.6182$ minutes.
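As a cross-check of the arithmetic, the frequency-weighted mean can be computed directly. The helper name `grouped_mean` is hypothetical and chosen only for this illustration.

```python
def grouped_mean(values, freqs):
    """Mean of a frequency distribution: sum(f_i * x_i) / N."""
    n_total = sum(freqs)  # N, the total number of observations
    return sum(f * x for f, x in zip(freqs, values)) / n_total

mean_time = grouped_mean([30, 32, 35, 38, 40], [8, 12, 20, 10, 5])
print(round(mean_time, 4))  # 34.6182
```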
Quartiles
The formula for $i^{th}$ quartile is
$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$
where $N$ is the total number of observations.
First Quartile $Q_1$
$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(55)}{4}\bigg)^{th}\text{ value}\\ &=\big(13.75\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $13.75$ is $20$. The corresponding value of $X$ is the $1^{st}$ quartile. That is, $Q_1 =32$ minutes.
Median $M$
$$ \begin{aligned} M &=\bigg(\dfrac{N}{2}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{55}{2}\bigg)^{th}\text{ value}\\ &=\big(27.5\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $27.5$ is $40$. The corresponding value of $X$ is the median. That is, $M =35$ minutes.
Third Quartile $Q_3$
$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(55)}{4}\bigg)^{th}\text{ value}\\ &=\big(41.25\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $41.25$ is $50$. The corresponding value of $X$ is the $3^{rd}$ quartile. That is, $Q_3 =38$ minutes.
Thus the summary statistics for the amount of time spent in the library by college students are
$\min = 30$ minutes, $Q_1 = 32$ minutes, $\text{median }=35$ minutes, $\overline{x}=34.6182$ minutes, $Q_3=38$ minutes and $\max = 40$ minutes.
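The quartile lookup used in this example (find the first cumulative frequency that is greater than or equal to $iN/4$, then read off the corresponding value) can be sketched as follows. The function name `quartile` is an assumption made for illustration, and the values are assumed to be sorted in ascending order.

```python
from bisect import bisect_left
from itertools import accumulate

def quartile(values, freqs, i):
    """Q_i by the (i*N/4)-th value rule for a discrete frequency table."""
    cf = list(accumulate(freqs))   # cumulative frequencies
    position = i * cf[-1] / 4      # cf[-1] is N
    # bisect_left gives the first index whose cf is >= position.
    return values[bisect_left(cf, position)]

times = [30, 32, 35, 38, 40]
freqs = [8, 12, 20, 10, 5]
q1, median, q3 = (quartile(times, freqs, i) for i in (1, 2, 3))
print(min(times), q1, median, q3, max(times))  # 30 32 35 38 40
```

Other conventions for sample quartiles exist and can give slightly different answers; the sketch follows only the rule applied in this example.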
Example 2
The following table gives the distribution of weight (in pounds) of 100 newborn babies at a certain hospital in 2012.
Weight (in pounds) | 3-5 | 5-7 | 7-9 | 9-11 | 11-13 |
---|---|---|---|---|---|
No. of babies | 10 | 30 | 28 | 18 | 14 |
Compute summary statistics for the above frequency distribution.
Solution
Class Interval | $x_i$ | $f_i$ | $f_i*x_i$ | $cf$ |
---|---|---|---|---|
3-5 | 4 | 10 | 40 | 10 |
5-7 | 6 | 30 | 180 | 40 |
7-9 | 8 | 28 | 224 | 68 |
9-11 | 10 | 18 | 180 | 86 |
11-13 | 12 | 14 | 168 | 100 |
Total | | 100 | 792 | |
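In the table, $x_i$ is the midpoint of each class interval. A minimal sketch of how the columns could be derived is shown below; representing the classes as `(lower, upper)` tuples is an assumption made for this illustration.

```python
from itertools import accumulate

# Class intervals from Example 2 as (lower limit, upper limit) pairs.
classes = [(3, 5), (5, 7), (7, 9), (9, 11), (11, 13)]
freqs = [10, 30, 28, 18, 14]

# x_i is the class midpoint; cf is the cumulative frequency.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
fx = [f * x for f, x in zip(freqs, midpoints)]
cf = list(accumulate(freqs))

mean_weight = sum(fx) / sum(freqs)
print(midpoints, cf, round(mean_weight, 2))
```

Using midpoints treats every observation in a class as if it sat at the center of that class, so the resulting mean is an approximation of the mean of the raw data.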
Minimum Value
The minimum weight of newborn babies, taken as the lower limit of the first class, is $\min = 3 \text{ pounds}$.
Maximum Value
The maximum weight of newborn babies, taken as the upper limit of the last class, is $\max = 13 \text{ pounds}$.
Sample mean
The sample mean of $X$ is
$$ \begin{aligned} \overline{x} &=\frac{1}{N}\sum_{i=1}^n f_ix_i\\ &=\frac{792}{100}\\ &=7.92\text{ pounds} \end{aligned} $$
The average weight of newborn babies is $7.92$ pounds.
Quartiles
The formula for $i^{th}$ quartile is
$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$
where $N$ is the total number of observations.
First Quartile $Q_1$
$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(100)}{4}\bigg)^{th}\text{ value}\\ &=\big(25\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $25$ is $40$. The corresponding class $5-7$ is the $1^{st}$ quartile class.
Thus
- $l = 5$, the lower limit of the $1^{st}$ quartile class
- $N=100$, total number of observations
- $f =30$, frequency of the $1^{st}$ quartile class
- $F_< = 10$, cumulative frequency of the class previous to $1^{st}$ quartile class
- $h =2$, the class width
The first quartile $Q_1$ can be computed as follows:
$$ \begin{aligned} Q_1 &= l + \bigg(\frac{\frac{1(N)}{4} - F_<}{f}\bigg)\times h\\ &= 5 + \bigg(\frac{\frac{1*100}{4} - 10}{30}\bigg)\times 2\\ &= 5 + \bigg(\frac{25 - 10}{30}\bigg)\times 2\\ &= 5 + \big(0.5\big)\times 2\\ &= 5 + 1\\ &= 6 \text{ pounds} \end{aligned} $$
Thus, $25$% of the newborn babies weigh less than or equal to $6$ pounds.
Median
$$ \begin{aligned} M &=\bigg(\dfrac{N}{2}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{100}{2}\bigg)^{th}\text{ value}\\ &=\big(50\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $50$ is $68$. The corresponding class $7-9$ is the median class.
Thus
- $l = 7$, the lower limit of the median class
- $N=100$, total number of observations
- $f =28$, frequency of the median class
- $F_< = 40$, cumulative frequency of the class previous to median class
- $h =2$, the class width
The median $M$ can be computed as follows:
$$ \begin{aligned} M &= l + \bigg(\frac{\frac{N}{2} - F_<}{f}\bigg)\times h\\ &= 7 + \bigg(\frac{\frac{100}{2} - 40}{28}\bigg)\times 2\\ &= 7 + \bigg(\frac{50 - 40}{28}\bigg)\times 2\\ &= 7 + \big(0.3571\big)\times 2\\ &= 7 + 0.7143\\ &= 7.7143 \text{ pounds} \end{aligned} $$
Thus, $50$% of the newborn babies weigh less than or equal to $7.7143$ pounds.
Third Quartile $Q_3$
$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(100)}{4}\bigg)^{th}\text{ value}\\ &=\big(75\big)^{th}\text{ value} \end{aligned} $$
The cumulative frequency just greater than or equal to $75$ is $86$. The corresponding class $9-11$ is the $3^{rd}$ quartile class.
Thus
- $l = 9$, the lower limit of the $3^{rd}$ quartile class
- $N=100$, total number of observations
- $f =18$, frequency of the $3^{rd}$ quartile class
- $F_< = 68$, cumulative frequency of the class previous to $3^{rd}$ quartile class
- $h =2$, the class width
The third quartile $Q_3$ can be computed as follows:
$$ \begin{aligned} Q_3 &= l + \bigg(\frac{\frac{3(N)}{4} - F_<}{f}\bigg)\times h\\ &= 9 + \bigg(\frac{\frac{3*100}{4} - 68}{18}\bigg)\times 2\\ &= 9 + \bigg(\frac{75 - 68}{18}\bigg)\times 2\\ &= 9 + \big(0.3889\big)\times 2\\ &= 9 + 0.7778\\ &= 9.7778 \text{ pounds} \end{aligned} $$
Thus, $75$% of the newborn babies weigh less than or equal to $9.7778$ pounds.
Thus the summary statistics of the weight of newborn babies are
$\min = 3$ pounds, $Q_1 = 6$ pounds, $\text{median }=7.7143$ pounds, $\overline{x}=7.92$ pounds, $Q_3=9.7778$ pounds and $\max = 13$ pounds.
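The interpolation formula $Q_i = l + \big(\frac{iN/4 - F_<}{f}\big)\times h$ used for $Q_1$, the median and $Q_3$ can be collected into one small function. This is a sketch under the assumptions of this example (equal class width $h$, classes in ascending order); the function and parameter names are hypothetical.

```python
from itertools import accumulate

def grouped_quartile(lower_limits, freqs, width, i):
    """Q_i = l + ((i*N/4 - F_<) / f) * h for equal-width class intervals."""
    cf = list(accumulate(freqs))
    position = i * cf[-1] / 4  # (i*N/4)-th value
    # Quartile class: first class whose cumulative frequency >= position.
    k = next(j for j, c in enumerate(cf) if c >= position)
    below = cf[k - 1] if k > 0 else 0  # F_<, cf of the previous class
    return lower_limits[k] + (position - below) / freqs[k] * width

lower_limits = [3, 5, 7, 9, 11]
freqs = [10, 30, 28, 18, 14]
q1, median, q3 = (grouped_quartile(lower_limits, freqs, 2, i) for i in (1, 2, 3))
print(round(q1, 4), round(median, 4), round(q3, 4))  # 6.0 7.7143 9.7778
```

The interpolation assumes that observations are spread evenly within the quartile class, which is the same assumption the hand calculation makes.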