## Covariance between X and Y

Let $(x_i, y_i), i=1,2, \cdots , n$ be $n$ pairs of observations.

Covariance measures the joint variability of two variables and indicates the direction of the relationship between them. A positive covariance indicates that the two variables tend to move in the same direction, whereas a negative covariance indicates that they tend to move in opposite directions.

## Formula

The sample covariance between $x$ and $y$ is denoted by $Cov(x,y)$ or $s_{xy}$ and is defined as

$Cov(x,y) =s_{xy}=\dfrac{1}{n-1}\sum_{i=1}^{n} (x_i-\overline{x})(y_i-\overline{y})$

or, equivalently,

$s_{xy} = \dfrac{1}{n-1}\bigg(\sum xy - \dfrac{(\sum x)(\sum y)}{n}\bigg)$

where,

• $\overline{x}$ is the sample mean of $x$,
• $\overline{y}$ is the sample mean of $y$.
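
The two forms of the formula agree. A short derivation, using only $\sum x_i = n\overline{x}$ and $\sum y_i = n\overline{y}$ (all sums run over $i=1,\cdots,n$), is sketched here:

$$
\begin{aligned}
\sum (x_i-\overline{x})(y_i-\overline{y}) &= \sum x_i y_i - \overline{y}\sum x_i - \overline{x}\sum y_i + n\,\overline{x}\,\overline{y}\\
&= \sum x_i y_i - n\,\overline{x}\,\overline{y} - n\,\overline{x}\,\overline{y} + n\,\overline{x}\,\overline{y}\\
&= \sum x_i y_i - n\,\overline{x}\,\overline{y}\\
&= \sum x_i y_i - \dfrac{(\sum x_i)(\sum y_i)}{n}.
\end{aligned}
$$

Dividing both sides by $n-1$ gives the second form of $s_{xy}$.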

## Sample mean of $x$

$\overline{x} =\dfrac{1}{n}\sum_{i=1}^{n}x_i$

## Sample mean of $y$

$\overline{y} =\dfrac{1}{n}\sum_{i=1}^{n}y_i$
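
The definitions above can be sketched in a few lines of plain Python. The function name `sample_covariance` and the toy data are illustrative assumptions, not part of the original text; the function follows the deviation form of the formula with the $n-1$ divisor.

```python
def sample_covariance(xs, ys):
    """Sample covariance s_xy using the deviation form and an n - 1 divisor."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two pairs of observations are needed")
    x_bar = sum(xs) / n  # sample mean of x
    y_bar = sum(ys) / n  # sample mean of y
    # Sum of products of deviations from the means, divided by n - 1
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical toy data: y rises with x, then y falls as x rises
print(sample_covariance([1, 2, 3], [2, 4, 6]))  # 2.0
print(sample_covariance([1, 2, 3], [6, 4, 2]))  # -2.0
```

The sign of the result matches the interpretation given earlier: positive when the variables move together, negative when they move in opposite directions.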

## Example 1

A study was conducted to analyze the relationship between advertising expenditure and sales. The following data were recorded:

| Advertising, X (in \$) | 20 | 24 | 30 | 32 | 35 |
| --- | --- | --- | --- | --- | --- |
| Sales, Y (in \$) | 310 | 340 | 400 | 420 | 490 |

Compute the covariance between advertising expenditure and sales.

### Solution

Let $x$ denote the advertising expenditure and $y$ denote the sales.

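| | $x$ | $y$ | $x^2$ | $y^2$ | $xy$ |
| --- | --- | --- | --- | --- | --- |
| 1 | 20 | 310 | 400 | 96100 | 6200 |
| 2 | 24 | 340 | 576 | 115600 | 8160 |
| 3 | 30 | 400 | 900 | 160000 | 12000 |
| 4 | 32 | 420 | 1024 | 176400 | 13440 |
| 5 | 35 | 490 | 1225 | 240100 | 17150 |
| Total | 141 | 1960 | 4125 | 788200 | 56950 |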

The sample covariance between $x$ and $y$ is

$$
\begin{aligned}
s_{xy} & = \frac{1}{n-1}\bigg(\sum xy - \frac{(\sum x)(\sum y)}{n}\bigg)\\
& = \frac{1}{5-1}\bigg(56950-\frac{(141)(1960)}{5}\bigg)\\
&= \frac{1}{4}\bigg(56950-\frac{276360}{5}\bigg)\\
&= \frac{1}{4}\bigg(56950-55272\bigg)\\
&= \frac{1678}{4}\\
&= 419.5.
\end{aligned}
$$

The covariance between advertising expenditure and sales is $419.5$. Since the covariance is positive, there is a positive relationship between advertising expenditure and sales. That is, the two variables move in the same direction.
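
The arithmetic of Example 1 can be checked with a short script. This is only a sketch: the variable names are assumptions chosen for illustration, and the computation uses the column totals $\sum x$, $\sum y$ and $\sum xy$ from the solution table.

```python
# Data from Example 1
advertising = [20, 24, 30, 32, 35]   # x
sales = [310, 340, 400, 420, 490]    # y

n = len(advertising)
sum_x = sum(advertising)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(advertising, sales))

# Shortcut form: (sum xy - (sum x)(sum y) / n) / (n - 1)
s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

print(sum_x, sum_y, sum_xy)  # 141 1960 56950
print(s_xy)                  # 419.5
```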

## Example 2

A study of the amount of rainfall and the quantity of air pollution removed produced the following data:

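| Daily Rainfall (0.01 cm) | 4.3 | 4.5 | 5.9 | 5.6 | 6.1 | 5.2 | 3.8 | 2.1 | 7.5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Particulate Removed ($\mu g/m^3$) | 126 | 121 | 116 | 118 | 114 | 118 | 132 | 141 | 108 |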

Calculate the covariance between daily rainfall and particulate removed.

### Solution

Let $x$ denote the daily rainfall (0.01 cm) and $y$ denote the particulate removed ($\mu g/m^3$).

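| | $x$ | $y$ | $x^2$ | $y^2$ | $xy$ |
| --- | --- | --- | --- | --- | --- |
| 1 | 4.3 | 126 | 18.49 | 15876 | 541.8 |
| 2 | 4.5 | 121 | 20.25 | 14641 | 544.5 |
| 3 | 5.9 | 116 | 34.81 | 13456 | 684.4 |
| 4 | 5.6 | 118 | 31.36 | 13924 | 660.8 |
| 5 | 6.1 | 114 | 37.21 | 12996 | 695.4 |
| 6 | 5.2 | 118 | 27.04 | 13924 | 613.6 |
| 7 | 3.8 | 132 | 14.44 | 17424 | 501.6 |
| 8 | 2.1 | 141 | 4.41 | 19881 | 296.1 |
| 9 | 7.5 | 108 | 56.25 | 11664 | 810.0 |
| Total | 45.0 | 1094 | 244.26 | 133786 | 5348.2 |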

The sample covariance between $x$ and $y$ is

$$
\begin{aligned}
s_{xy} & = \frac{1}{n-1}\bigg(\sum xy - \frac{(\sum x)(\sum y)}{n}\bigg)\\
& = \frac{1}{9-1}\bigg(5348.2-\frac{(45)(1094)}{9}\bigg)\\
&= \frac{1}{8}\bigg(5348.2-\frac{49230}{9}\bigg)\\
&= \frac{1}{8}\bigg(5348.2-5470\bigg)\\
&= \frac{-121.8}{8}\\
&= -15.225.
\end{aligned}
$$

The covariance between daily rainfall and particulate removed is $-15.225$. Since the covariance is negative, there is a negative relationship between daily rainfall and particulate removed. That is, the two variables move in opposite directions.
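
As a cross-check of Example 2, the sketch below computes the covariance both ways: from deviations about the means and from the shortcut form used in the solution. The variable names are illustrative assumptions. Because the rainfall values are decimals, floating-point results are compared with a tolerance rather than for exact equality.

```python
import math

# Data from Example 2
rainfall = [4.3, 4.5, 5.9, 5.6, 6.1, 5.2, 3.8, 2.1, 7.5]     # x, in 0.01 cm
particulate = [126, 121, 116, 118, 114, 118, 132, 141, 108]  # y

n = len(rainfall)
x_bar = sum(rainfall) / n
y_bar = sum(particulate) / n

# Deviation form: sum of (x - x_bar)(y - y_bar), divided by n - 1
s_xy_dev = sum(
    (x - x_bar) * (y - y_bar) for x, y in zip(rainfall, particulate)
) / (n - 1)

# Shortcut form: (sum xy - (sum x)(sum y) / n) / (n - 1)
sum_xy = sum(x * y for x, y in zip(rainfall, particulate))
s_xy_short = (sum_xy - sum(rainfall) * sum(particulate) / n) / (n - 1)

# Both values should be approximately -15.225, up to rounding error
print(round(s_xy_dev, 3), round(s_xy_short, 3))
print(math.isclose(s_xy_dev, s_xy_short))
```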