Simple linear regression from sum and sum of squares

Let $(x_i, y_i), i=1,2, \cdots , n$ be $n$ pairs of observations.

The simple linear regression model of $Y$ on $X$ is

$$y_i=\beta_0 + \beta_1x_i +e_i$$ where,

  • $y$ is a dependent variable,
  • $x$ is an independent variable,
  • $\beta_0$ is an intercept,
  • $\beta_1$ is the slope,
  • $e$ is the error term.

Formula

The simple linear regression model parameters $\beta_0$ and $\beta_1$ can be estimated using the method of least squares.

The regression coefficient $\beta_1$ (slope) can be estimated as

$\hat{\beta}_1 = \frac{Cov(x,y)}{V(x)}=\dfrac{s_{xy}}{s_x^2}=r\dfrac{s_y}{s_x}$

The regression coefficient $\beta_0$ (intercept) can be estimated as

$\hat{\beta}_0=\overline{y}-\hat{\beta}_1\overline{x}$

where,

  • $\overline{x}=\dfrac{1}{n}\sum_{i=1}^n x_i$ is the sample mean of $X$,
  • $\overline{y}=\dfrac{1}{n}\sum_{i=1}^n y_i$ is the sample mean of $Y$,
  • $V(x) = s_x^2$ is the sample variance of $X$,
  • $V(y) = s_y^2$ is the sample variance of $Y$,
  • $Cov(x,y) = s_{xy}$ is the sample covariance between $X$ and $Y$,
  • $r=\dfrac{Cov(x,y)}{\sqrt{V(x)V(y)}}$ is the correlation coefficient between $X$ and $Y$,
  • $n$ is the number of data points.
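
When only the sums and sums of squares are available, the same estimates can be written without the individual observations. Expanding $s_{xy}$ and $s_x^2$ (the common divisor, $n$ or $n-1$, cancels in the ratio) gives

$$\hat{\beta}_1 = \dfrac{n\sum x_iy_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2}, \qquad \hat{\beta}_0 = \dfrac{\sum y_i - \hat{\beta}_1\sum x_i}{n}$$

A minimal Python sketch of this calculation follows. The function name `fit_from_sums` and the five data points are illustrative assumptions, not part of the original formulas.

```python
def fit_from_sums(n, sum_x, sum_y, sum_xx, sum_xy):
    """Return (intercept, slope) of the least-squares line from summary sums.

    Hypothetical helper: n is the number of points, sum_xx is the sum of
    x_i squared, and sum_xy is the sum of the products x_i * y_i.
    """
    # Denominator equals n^2 times the variance of x (with divisor n).
    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are equal; the slope is undefined")
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Made-up data, used only to illustrate the arithmetic.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
sum_x = sum(xs)                              # 15
sum_y = sum(ys)                              # 20
sum_xx = sum(x * x for x in xs)              # 55
sum_xy = sum(x * y for x, y in zip(xs, ys))  # 66

b0, b1 = fit_from_sums(n, sum_x, sum_y, sum_xx, sum_xy)
print(f"slope = {b1:.2f}, intercept = {b0:.2f}")  # slope = 0.60, intercept = 2.20
```

Here $\hat{\beta}_1 = (5 \cdot 66 - 15 \cdot 20)/(5 \cdot 55 - 15^2) = 30/50 = 0.6$ and $\hat{\beta}_0 = (20 - 0.6 \cdot 15)/5 = 2.2$, so the fitted line is $\hat{y} = 2.2 + 0.6x$.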
