## Simple linear regression from sum and sum of squares

Let `$(x_i, y_i),\ i=1,2, \cdots , n$` be $n$ pairs of observations.

The simple linear regression model of $Y$ on $X$ is

$$y_i=\beta_0 + \beta_1x_i +e_i$$

where

- $y$ is the dependent variable,
- $x$ is the independent variable,
- $\beta_0$ is the intercept,
- $\beta_1$ is the slope,
- $e$ is the error term.

## Formula

By the method of least squares, the regression coefficients $\beta_0$ (intercept) and $\beta_1$ (slope) can be estimated as

`$\hat{\beta}_1 = \dfrac{n \sum xy - (\sum x)(\sum y)}{n(\sum x^2) -(\sum x)^2}$`

`$\hat{\beta}_0=\overline{y}-\hat{\beta}_1\overline{x}$`

where

- `$\overline{x}=\dfrac{1}{n}\sum_{i=1}^n x_i$` is the sample mean of $X$,
- `$\overline{y}=\dfrac{1}{n}\sum_{i=1}^n y_i$` is the sample mean of $Y$,
- $n$ is the number of data points.
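The two formulas above need only the sums, not the raw data. A minimal Python sketch of that idea follows; the function name `fit_from_sums` and the small data set are illustrative assumptions, not part of the original text.

```python
def fit_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (intercept, slope) of the least-squares line from summary sums.

    Hypothetical helper: the name and signature are chosen for illustration.
    """
    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # Happens when every x value is the same, so no slope can be fitted.
        raise ValueError("all x values are identical; slope is undefined")
    slope = (n * sum_xy - sum_x * sum_y) / denominator
    # intercept = y_bar - slope * x_bar
    intercept = sum_y / n - slope * sum_x / n
    return intercept, slope


# Made-up data lying exactly on y = 1 + 2x, so the fit should recover it.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_from_sums(
    n=len(xs),
    sum_x=sum(xs),
    sum_y=sum(ys),
    sum_x2=sum(x * x for x in xs),
    sum_xy=sum(x * y for x, y in zip(xs, ys)),
)
print(intercept, slope)
```

Note that $\sum y^2$ is not needed for the coefficients; it only matters for quantities such as the correlation coefficient or the residual variance.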

## Example

For $n = 9$ pairs of observations of daily rainfall $x$ and particulate removed $y$, it is given that $\sum x= 45$, $\sum y = 1094$, $\sum x^2 = 244.26$, $\sum y^2 = 133786$ and $\sum xy =5348.2$.

a. Find the equation of the regression line to predict the particulate removed from the amount of daily rainfall.

b. Estimate the amount of particulate removed when the daily rainfall is $x=4.8$ units.

### Solution

Let $x$ denote the daily rainfall and $y$ denote the particulate removed.

Let the simple linear regression model of $Y$ on $X$ be

$$y=\beta_0 + \beta_1x +e$$

By the method of least squares, the estimates of $\beta_1$ and $\beta_0$ are respectively
`$$ \begin{aligned} \hat{\beta}_1 & = \frac{n \sum xy - (\sum x)(\sum y)}{n(\sum x^2) -(\sum x)^2} \end{aligned} $$`

and

`$$ \begin{aligned} \hat{\beta}_0&=\overline{y}-\hat{\beta}_1\overline{x} \end{aligned} $$`

The sample mean of $x$ is
`$$ \begin{aligned} \overline{x}&=\frac{1}{n} \sum_{i=1}^n x_i\\ &=\frac{45}{9}\\ &=5 \end{aligned} $$`

The sample mean of $y$ is
`$$ \begin{aligned} \overline{y}&=\frac{1}{n} \sum_{i=1}^n y_i\\ &=\frac{1094}{9}\\ &=121.5556 \end{aligned} $$`

With $n = 9$, the estimate of the slope $\beta_1$ is
`$$ \begin{aligned} \hat{\beta}_1 & = \frac{n \sum xy - (\sum x)(\sum y)}{n(\sum x^2) -(\sum x)^2}\\ & = \frac{9\times 5348.2-(45)(1094)}{9\times (244.26)-(45)^2}\\ &= \frac{-1096.2}{173.34}\\ &= -6.324. \end{aligned} $$`

The estimate of the intercept $\beta_0$ is

`$$ \begin{aligned} \hat{\beta}_0&=\overline{y}-\hat{\beta}_1\overline{x}\\ &=121.5556-(-6.324)\times 5\\ &=153.1756. \end{aligned} $$`

The fitted simple linear regression model to predict particulate removed from daily rainfall is
`$$ \begin{aligned} \hat{y} &= 153.1756 - 6.324\,x \end{aligned} $$`

The estimate of the amount of particulate removed when the daily rainfall is $4.8$ (0.01 cm) is

`$$ \begin{aligned} \hat{y}&=153.1756 + (-6.324)\times 4.8\\ &= 122.8204\quad \mu g/m^3 \end{aligned} $$`
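The arithmetic in this solution can be checked with a few lines of Python. This is only a verification sketch: it takes $n = 9$ from the divisor used for the sample means above, and the variable names are illustrative. Because the worked solution rounds the slope to three decimals before continuing, the unrounded values printed here may differ from it in the last decimal place or two.

```python
# Summary statistics from the example; n = 9 is taken from the solution's means.
n = 9
sum_x, sum_y = 45.0, 1094.0
sum_x2, sum_xy = 244.26, 5348.2

# Least-squares slope and intercept from the sums.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - slope * sum_x / n

# Predicted particulate removed at a daily rainfall of x = 4.8.
prediction = intercept + slope * 4.8

print(round(slope, 3))       # slope estimate
print(round(intercept, 2))   # intercept estimate
print(round(prediction, 2))  # predicted y at x = 4.8
```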