Chebyshev’s Inequality

Let $X$ be a random variable with mean $\mu$ and finite variance $\sigma^2$. Then for any real constant $k>0$, $$ \begin{aligned} P[|X-\mu| \geq k\sigma] \leq \frac{1}{k^2}\quad \text{ and }\quad P[|X-\mu|< k\sigma] \geq 1-\frac{1}{k^2} \end{aligned} $$ OR

If $\mu$ and $\sigma$ are the mean and the standard deviation of a random variable $X,$ then for any positive constant $k$, the probability is at least $1-\dfrac{1}{k^2}$ that $X$ will take on a value within $k$ standard deviations of the mean.

Formula

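Chebyshev’s inequality does not give the exact probability that the random variable $X$ lies within $k$ standard deviations of its mean; it bounds that probability from below, and bounds the probability of the complementary event from above: $$ \begin{aligned} P[|X-\mu| \geq k\sigma] \leq \frac{1}{k^2}\quad \text{ and }\quad P[|X-\mu|< k\sigma] \geq 1-\frac{1}{k^2} \end{aligned} $$ where,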

  • $\mu$ is the mean of $X$,
  • $\sigma$ is the standard deviation of $X$,
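  • $k$ is a real constant greater than 0 (the bound is informative only when $k>1$, since for $k\leq 1$ it says only that a probability is at most 1).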
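As a quick illustration of the formula, the following minimal Python sketch evaluates both bounds for a few values of $k$. The function name `chebyshev_bounds` is a hypothetical helper introduced here for illustration, not part of any library.

```python
def chebyshev_bounds(k):
    """Return (upper bound on P[|X-mu| >= k*sigma],
               lower bound on P[|X-mu| <  k*sigma])."""
    if k <= 0:
        raise ValueError("k must be a positive real number")
    # 1/k^2 exceeds 1 when k < 1; cap it, since a probability is never above 1.
    tail = min(1.0, 1.0 / k**2)
    return tail, 1.0 - tail


for k in (1.5, 2, 3):
    tail, within = chebyshev_bounds(k)
    print(f"k = {k}: P(outside) <= {tail:.4f}, P(within) >= {within:.4f}")
```

The cap at 1 only matters for $k<1$, where the inequality carries no information.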

Example 1

The ages of the members of a gym have a mean of 45 years and a standard deviation of 11 years. What can you conclude about the percentage of gym members aged between 28.5 and 61.5 years?

Solution

Given that $\mu=45$ and $\sigma = 11$.

The probability that a gym member’s age is between $28.5$ and $61.5$ is $$ \begin{aligned} P(28.5 < X < 61.5) &= P(28.5-45<X-\mu< 61.5-45)\\ &= P(-16.5<X-\mu< 16.5)\\ &=P\big(|X-\mu|< 16.5\big) \end{aligned} $$ Comparing this with Chebyshev’s inequality, we get

$$ \begin{aligned} & k\sigma = 16.5\\ \Rightarrow & k =\frac{16.5}{\sigma}\\ \Rightarrow & k =\frac{16.5}{11}\\ \Rightarrow & k =1.5 \end{aligned} $$

Therefore, by Chebyshev’s inequality,

$$ \begin{aligned} P(28.5 < X < 61.5) &=P\big(|X-\mu|< 16.5\big)\\ &\geq 1-\frac{1}{k^2}\\ &= 1-\frac{1}{1.5^2}\\ &\approx 1-0.4444\\ &= 0.5556 \end{aligned} $$ Thus, at least $55.56\%$ of the gym members are aged between $28.5$ and $61.5$ years.
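The steps of this solution (find the half-width of the interval, divide by $\sigma$ to get $k$, then apply the bound) can be sketched in Python. The helper `chebyshev_interval_bound` is a hypothetical name used only for this illustration, and it assumes the interval is symmetric about the mean, as it is in this example.

```python
def chebyshev_interval_bound(mu, sigma, low, high):
    """Return (k, lower bound on P(low < X < high)) for an interval
    that is symmetric about the mean mu."""
    half_width = high - mu
    if half_width <= 0 or abs((mu - low) - half_width) > 1e-9:
        raise ValueError("interval must be symmetric about the mean")
    k = half_width / sigma                  # number of standard deviations
    bound = max(0.0, 1.0 - 1.0 / k**2)      # Chebyshev lower bound
    return k, bound


k, bound = chebyshev_interval_bound(mu=45, sigma=11, low=28.5, high=61.5)
print(k, round(bound, 4))   # 1.5 0.5556
```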

Example 2

The daily production of electric motors at a certain factory averaged 120, with a standard deviation of 10.

a. What can be said about the fraction of days on which the production level falls between 100 and 140?

b. Find the shortest interval certain to contain at least 90% of the daily production levels.

Solution

Given that $\mu=120$ and $\sigma = 10$.

a. The probability that the production level falls between $100$ and $140$ is $$ \begin{aligned} P(100 < X < 140) &= P(100-120<X-\mu< 140-120)\\ &= P(-20<X-\mu< 20)\\ &=P\big(|X-\mu|< 20\big) \end{aligned} $$ Comparing this with Chebyshev’s inequality, we get $$ \begin{aligned} & k\sigma = 20\\ \Rightarrow & k =\frac{20}{\sigma}\\ \Rightarrow & k =\frac{20}{10}\\ \Rightarrow & k =2 \end{aligned} $$ Therefore, by Chebyshev’s inequality,

$$ \begin{aligned} P(100 < X < 140) &=P\big(|X-\mu|< 20\big)\\ &\geq 1-\frac{1}{k^2}\\ &= 1-\frac{1}{2^2}\\ &= 1-0.25\\ &= 0.75 \end{aligned} $$ Thus, the production level falls between $100$ and $140$ on at least $75\%$ of days.
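Chebyshev’s bound holds for any distribution with finite variance, so for a particular distribution it is usually conservative. The sketch below simulates daily production under the assumption, made here purely for illustration and not stated in the problem, that output is normally distributed with mean 120 and standard deviation 10, and compares the simulated fraction of days between 100 and 140 with the guaranteed 75%.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma, k = 120, 10, 2

# Assumption for illustration only: production is normally distributed.
samples = [random.gauss(mu, sigma) for _ in range(100_000)]

# Fraction of simulated days with |X - mu| < k * sigma, i.e. 100 < X < 140.
inside = sum(abs(x - mu) < k * sigma for x in samples) / len(samples)
chebyshev = 1 - 1 / k**2

print(f"simulated fraction within {k} standard deviations: {inside:.3f}")
print(f"Chebyshev lower bound: {chebyshev:.3f}")
```

Under the normal assumption the simulated fraction comes out well above the bound, which is consistent with the inequality: it gives a guarantee, not an estimate.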

b. We want the smallest $k$ for which Chebyshev’s inequality guarantees that the interval $(\mu-k\sigma,\ \mu+k\sigma)$ contains at least 90% of the daily production levels.

By Chebyshev’s inequality, $$ \begin{aligned} P(|X-120|<10k)\geq 1-\frac{1}{k^2} \end{aligned} $$ so it suffices to set the lower bound equal to $0.9$:

$$ \begin{aligned} & 1-\frac{1}{k^2}=0.9 \\ \Rightarrow & \frac{1}{k^2} = 0.1\\ \Rightarrow & k^2 = 10\\ \Rightarrow & k = \sqrt{10}\\ \Rightarrow & k \approx 3.16 \end{aligned} $$

Substituting this value of $k$ into Chebyshev’s inequality,

$$ \begin{aligned} & P(|X-120|<10\times 3.16)\geq 0.9\\ \Rightarrow & P(|X-120|<31.6)\geq 0.9\\ \Rightarrow & P(-31.6<X-120<31.6)\geq 0.9\\ \Rightarrow & P(-31.6+120<X<31.6+120)\geq 0.9\\ \Rightarrow & P(88.4<X<151.6)\geq 0.9 \end{aligned} $$ Thus, the shortest interval that Chebyshev’s inequality guarantees to contain at least 90% of the daily production levels is approximately $(88.4,151.6)$.
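Part b inverts the bound: given a required coverage $p$, solve $1-\dfrac{1}{k^2}=p$ for $k$ and build the interval $\mu \pm k\sigma$. A minimal Python sketch of that calculation follows; `chebyshev_interval` is a hypothetical helper name introduced here for illustration.

```python
import math


def chebyshev_interval(mu, sigma, coverage):
    """Return the interval (mu - k*sigma, mu + k*sigma) that Chebyshev's
    inequality guarantees to hold at least `coverage` of the probability."""
    if not 0 < coverage < 1:
        raise ValueError("coverage must lie strictly between 0 and 1")
    # Solve 1 - 1/k^2 = coverage for k.
    k = 1 / math.sqrt(1 - coverage)
    return mu - k * sigma, mu + k * sigma


low, high = chebyshev_interval(mu=120, sigma=10, coverage=0.9)
print(f"({low:.1f}, {high:.1f})")   # (88.4, 151.6)
```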
