Hypergeometric Experiment

A hypergeometric experiment is an experiment which satisfies each of the following conditions:

  • The population or set to be sampled consists of $N$ individuals, objects, or elements (a finite population).
  • Each object can be characterized as either “defective” or “non-defective”, and there are $M$ defectives in the population.
  • A sample of $n$ individuals is drawn in such a way that each subset of size $n$ is equally likely to be chosen.
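The three conditions above can be sketched as a simulation. This is a minimal illustration, not part of the original tutorial: the population size $N=20$, number of defectives $M=5$, and sample size $n=5$ are arbitrary example values, and `random.sample` is used because it makes every size-$n$ subset equally likely, which is exactly the third condition.

```python
import random

# Illustrative hypergeometric experiment: a finite population of N = 20 units,
# M = 5 of which are "defective"; n = 5 units are drawn without replacement.
N, M, n = 20, 5, 5
population = ["defective"] * M + ["non-defective"] * (N - M)

random.seed(1)  # fixed seed so the draw is reproducible
sample = random.sample(population, n)  # every size-n subset is equally likely
x = sample.count("defective")          # X = number of defectives in the sample
print(f"Observed X = {x} defectives in a sample of {n}")
```

Repeating the draw many times and tabulating the observed values of $X$ would approximate the probability mass function derived below.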

Hypergeometric Distribution

Suppose we have a hypergeometric experiment. That is, suppose there are $N$ units in the population and $M$ out of $N$ are defective, so $N-M$ units are non-defective.

Let $X$ denote the number of defective units in a completely random sample of size $n$ drawn from a population consisting of $N$ units.

The total number of ways of selecting $n$ units out of $N$ is $\binom{N}{n}$. Out of the $M$ defective units, $x$ defective units can be selected in $\binom{M}{x}$ ways, and out of the $N-M$ non-defective units the remaining $n-x$ units can be selected in $\binom{N-M}{n-x}$ ways.

Hence, the probability of selecting $x$ defective units in a random sample of $n$ units out of $N$ is $$P(X=x)=\frac{\text{Favourable Cases}}{\text{Total Cases}}$$

$$P(X=x)=\frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}},\quad x=0,1,2,\ldots,n.$$

The above distribution is called the hypergeometric distribution.

Notation: $X\sim H(n, M, N)$.
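The pmf above translates directly into code. The following sketch uses Python's standard-library `math.comb` for the binomial coefficients; the helper name `hypergeom_pmf` and the example values $n=5$, $M=5$, $N=20$ are illustrative, not from the original tutorial.

```python
from math import comb

def hypergeom_pmf(x, n, M, N):
    """P(X = x) for X ~ H(n, M, N): probability of x defectives in a sample
    of n units drawn without replacement from N units containing M defectives."""
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Example: probability of exactly 2 defectives when 5 units are sampled
# from 20 units of which 5 are defective.
p2 = hypergeom_pmf(2, n=5, M=5, N=20)
print(round(p2, 4))  # 4550/15504 ≈ 0.2935
```

Note that `math.comb(a, b)` returns 0 when `b > a`, so the pmf is automatically 0 for impossible values of $x$.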

Graph of Hypergeometric Distribution H(5,5,20)

The following graph shows the probability mass function of the hypergeometric distribution $H(5,5,20)$.
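For readers without the graph at hand, the pmf of $H(5,5,20)$ can be tabulated directly. This sketch prints each probability with a crude text bar; the layout is illustrative only.

```python
from math import comb

# Tabulate the pmf of H(n=5, M=5, N=20), the distribution shown in the graph.
N, M, n = 20, 5, 5
pmf = {x: comb(M, x) * comb(N - M, n - x) / comb(N, n) for x in range(n + 1)}
for x, p in pmf.items():
    print(f"P(X={x}) = {p:.4f}  " + "#" * int(50 * p))  # crude text bar chart
```

The mass is concentrated at small $x$: with only 5 defectives among 20 units, drawing many defectives in a sample of 5 is unlikely.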

Key Features of Hypergeometric Distribution

  • Suppose there are $N$ units in the population. These $N$ units are classified as $M$ successes and the remaining $N-M$ failures.
  • Out of the $N$ units, $n$ units are selected at random without replacement.
  • $X$ is the number of successes in the sample.

Mean of Hypergeometric Distribution

The expected value of a hypergeometric random variable is $E(X)=\dfrac{Mn}{N}$.

Proof

The expected value of a hypergeometric random variable is
\begin{eqnarray*}
E(X) &=& \sum_{x=0}^n x\frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}\\
&=& 0+ \sum_{x=1}^n x\frac{\frac{M!}{x!(M-x)!}\binom{N-M}{n-x}}{\frac{N!}{n!(N-n)!}}\\
&=& \sum_{x=1}^n \frac{\frac{M(M-1)!}{(x-1)!(M-x)!}\binom{N-M}{n-x}}{\frac{N(N-1)!}{n(n-1)!(N-n)!}}\\
&=& \frac{Mn}{N}\sum_{x=1}^n\frac{\binom{M-1}{x-1}\binom{N-M}{n-x}}{\binom{N-1}{n-1}}.
\end{eqnarray*}
Let $x-1=y$, so for $x=1$, $y=0$ and for $x=n$, $y=n-1$. Writing $n^\prime = n-1$, we get
\begin{eqnarray*}
\mu_1^\prime &=& \frac{Mn}{N}\sum_{y=0}^{n-1}\frac{\binom{M-1}{y}\binom{N-M}{n-y-1}}{\binom{N-1}{n-1}} \\
&=& \frac{Mn}{N}\sum_{y=0}^{n^\prime}\frac{\binom{M-1}{y}\binom{N-M}{n^\prime-y}}{\binom{N-1}{n^\prime}} \\
&=& \frac{Mn}{N}\times 1,
\end{eqnarray*}
since $\sum_{y=0}^{n^\prime}\binom{M-1}{y}\binom{N-M}{n^\prime-y} = \binom{N-1}{n^\prime}$ by the Vandermonde identity. Hence, mean $= E(X)=\dfrac{Mn}{N}$.
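The closed-form mean can be checked numerically by summing $x\,P(X=x)$ over the support. The parameter set $N=20$, $M=5$, $n=5$ below is an arbitrary illustration.

```python
from math import comb

# Check E(X) = Mn/N by summing x * P(X = x) directly for one
# illustrative parameter set (N = 20, M = 5, n = 5).
N, M, n = 20, 5, 5
mean = sum(x * comb(M, x) * comb(N - M, n - x) / comb(N, n)
           for x in range(n + 1))
print(mean, M * n / N)  # both are ≈ 1.25 for these parameters
```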

Variance of Hypergeometric Distribution

The variance of a hypergeometric random variable is $V(X)=\dfrac{Mn(N-M)(N-n)}{N^2(N-1)}$.

Proof

The variance of random variable X is given by

$$V(X)=E(X^2)-[E(X)]^2.$$

Let us find the expected value of X(X1).

\begin{eqnarray*}
E[X(X-1)] &=& \sum_{x=0}^n x(x-1)\frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}\\
&=& 0+0+ \sum_{x=2}^n x(x-1)\frac{\frac{M!}{x!(M-x)!}\binom{N-M}{n-x}}{\frac{N!}{n!(N-n)!}}\\
&=& \sum_{x=2}^n \frac{\frac{M(M-1)(M-2)!}{(x-2)!(M-x)!}\binom{N-M}{n-x}}{\frac{N(N-1)(N-2)!}{n(n-1)(n-2)!(N-n)!}}\\
&=& \frac{M(M-1)n(n-1)}{N(N-1)}\sum_{x=2}^n\frac{\binom{M-2}{x-2}\binom{N-M}{n-x}}{\binom{N-2}{n-2}}.
\end{eqnarray*}
Let $x-2=y$, so for $x=2$, $y=0$ and for $x=n$, $y=n-2$. Writing $n^\prime = n-2$, we get
\begin{eqnarray*}
E[X(X-1)] &=& \frac{M(M-1)n(n-1)}{N(N-1)}\sum_{y=0}^{n-2}\frac{\binom{M-2}{y}\binom{N-M}{n-y-2}}{\binom{N-2}{n-2}} \\
&=& \frac{M(M-1)n(n-1)}{N(N-1)}\sum_{y=0}^{n^\prime}\frac{\binom{M-2}{y}\binom{N-M}{n^\prime-y}}{\binom{N-2}{n^\prime}} \\
&=& \frac{M(M-1)n(n-1)}{N(N-1)}\times 1\\
&=& \frac{M(M-1)n(n-1)}{N(N-1)},
\end{eqnarray*}
since the sum equals $1$ by the Vandermonde identity.

\begin{eqnarray*}
\mu_2^\prime &=& E[X(X-1)]+E(X) \\
&=& \frac{M(M-1)n(n-1)}{N(N-1)}+ \frac{Mn}{N}.
\end{eqnarray*}
Hence,
\begin{eqnarray*}
\text{Variance } = \mu_2 &=& \mu_2^\prime -(\mu_1^\prime)^2 \\
&=& \frac{M(M-1)n(n-1)}{N(N-1)}+ \frac{Mn}{N}- \frac{M^2n^2}{N^2} \\
&=& \frac{Mn(N-M)(N-n)}{N^2(N-1)}.
\end{eqnarray*}
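As with the mean, the variance formula can be verified against the definition $V(X)=E(X^2)-[E(X)]^2$ for a concrete parameter set; $N=20$, $M=5$, $n=5$ below is again illustrative.

```python
from math import comb

# Check V(X) = Mn(N-M)(N-n) / (N^2 (N-1)) against the definition
# V(X) = E(X^2) - [E(X)]^2 for one illustrative parameter set.
N, M, n = 20, 5, 5
pmf = [comb(M, x) * comb(N - M, n - x) / comb(N, n) for x in range(n + 1)]
ex  = sum(x * p for x, p in enumerate(pmf))          # E(X)
ex2 = sum(x * x * p for x, p in enumerate(pmf))      # E(X^2)
var_direct  = ex2 - ex ** 2
var_formula = M * n * (N - M) * (N - n) / (N ** 2 * (N - 1))
print(var_direct, var_formula)  # the two values agree
```

The factor $(N-n)/(N-1)$ in the formula is the finite-population correction: it shrinks the variance relative to sampling with replacement.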

Binomial as a limiting case of Hypergeometric distribution

In the hypergeometric distribution, if $N\to\infty$ and $\frac{M}{N}\to p$, then the hypergeometric distribution tends to the binomial distribution.

Proof

\begin{eqnarray*}
P(X=x) &=& \lim_{N\to\infty} \frac{\binom{M}{x}\binom{N-M}{n-x}}{\binom{N}{n}}\\
&=& \lim_{N\to\infty} \frac{\bigg[\frac{M(M-1)\cdots (M-x+1)}{x!}\bigg]\bigg[\frac{(N-M)(N-M-1)\cdots (N-M-n+x+1)}{(n-x)!}\bigg]}{\frac{N(N-1)\cdots (N-n+1)}{n!}}.
\end{eqnarray*}
Dividing each factor in the numerator and denominator by $N$, we get
\begin{eqnarray*}
P(X=x) &=& \lim_{N\to\infty} \frac{n!}{x!(n-x)!}\frac{\frac{M}{N}\left(\frac{M}{N}-\frac{1}{N}\right)\cdots\left(\frac{M}{N}-\frac{x-1}{N}\right)\left(1-\frac{M}{N}\right)\left(1-\frac{M}{N}-\frac{1}{N}\right)\cdots\left(1-\frac{M}{N}-\frac{n-x-1}{N}\right)}{1\left(1-\frac{1}{N}\right)\cdots\left(1-\frac{n-1}{N}\right)}\\
&=& \binom{n}{x}\frac{p(p-0)\cdots(p-0)(1-p)(1-p-0)\cdots(1-p-0)}{1(1-0)\cdots(1-0)}\qquad \bigg(\because \frac{M}{N}\to p\bigg)\\
&=& \binom{n}{x}p^x(1-p)^{n-x},\quad x=0,1,2,\ldots,n;\ 0<p<1,
\end{eqnarray*}
which is the probability mass function of the binomial distribution.
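The convergence can be seen numerically by holding $n$ and $p=M/N$ fixed while $N$ grows. The values $n=5$, $p=0.25$, $x=2$ below are an arbitrary illustration.

```python
from math import comb

# Illustrate the limit: hold n = 5 and p = M/N = 0.25 fixed and let N grow;
# the hypergeometric pmf approaches the binomial pmf of Bin(n, p).
n, p, x = 5, 0.25, 2
binom = comb(n, x) * p**x * (1 - p)**(n - x)
for N in (20, 200, 2000, 20000):
    M = int(p * N)  # keep the defective proportion fixed at p
    hyper = comb(M, x) * comb(N - M, n - x) / comb(N, n)
    print(f"N={N:6d}: hypergeometric={hyper:.6f}, binomial={binom:.6f}")
```

Intuitively, when $N$ is huge relative to $n$, removing a few units barely changes the defective proportion, so sampling without replacement behaves like sampling with replacement.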

Hope this tutorial helps you understand the hypergeometric distribution and the various results related to it.
