Let’s consider a random variable Y that is related to the independent variable x through the equation
Y = α + βx + ε,

where ε is a random variable with E(ε) = 0 and Var(ε) = σ².

The condition E(ε) = 0 implies that E(Y) = α + βx, i.e. that this equation describes the true regression line. It is thus assumed that our variable obeys the model above (Walpole et al., p. 391).
In real life we work with samples, which means that we estimate the parameters of the regression line from the sample values at hand. The resulting line is called the fitted regression line and is given by

ŷ = a + bx,

where a and b are the least squares estimates of α and β:

b = Sxy / Sxx,  a = ȳ − b·x̄,

with Sxx = Σ(xᵢ − x̄)² and Sxy = Σ(xᵢ − x̄)(yᵢ − ȳ).
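As a quick illustration, the least squares estimates a and b can be computed directly from these formulas. The data below are made up purely for illustration:

```python
# Least squares estimates for the fitted regression line y-hat = a + b*x.
# The sample data here are made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sxx = sum of (x_i - x_bar)^2,  Sxy = sum of (x_i - x_bar)(y_i - y_bar)
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx        # slope estimate of beta
a = y_bar - b * x_bar  # intercept estimate of alpha

print(round(a, 10), round(b, 10))
```

For this sample, Sxx = 10 and Sxy = 6, so the fitted line is ŷ = 2.2 + 0.6x.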
We want to prove that an unbiased estimator of the variance σ² is

s² = SSE / (n − 2),  where SSE = Σ(yᵢ − ŷᵢ)²

is the sum of squared residuals. In other words, we want to prove that

E(SSE) = (n − 2)σ².
First, we want to calculate

E(Syy), where Syy = Σ(Yᵢ − Ȳ)².

By using the model equation and its sample average,

Yᵢ = α + βxᵢ + εᵢ,  Ȳ = α + βx̄ + ε̄,

we can write

Yᵢ − Ȳ = β(xᵢ − x̄) + (εᵢ − ε̄),

and therefore

Syy = β²·Sxx + 2β·Σ(xᵢ − x̄)(εᵢ − ε̄) + Σ(εᵢ − ε̄)².

Now let's focus on the second term on the right-hand side of the above equation. It is easy to show that its expected value equals 0: since the xᵢ are fixed, E[Σ(xᵢ − x̄)(εᵢ − ε̄)] = Σ(xᵢ − x̄)·E(εᵢ − ε̄) = 0, because E(ε) = 0. For the third term, assuming the εᵢ are independent with common variance σ², we have E[Σ(εᵢ − ε̄)²] = E[Σεᵢ² − n·ε̄²] = nσ² − σ² = (n − 1)σ². Hence

E(Syy) = β²·Sxx + (n − 1)σ².
Next we relate SSE to Syy. It is easy to show that calculating the parameters of the fitted regression line with the least squares method implies that the same equation holds for the mean values of x and y as well (see Walpole et al., p. 395, last line):

ȳ = a + b·x̄.

Therefore yᵢ − ŷᵢ = (yᵢ − ȳ) − b(xᵢ − x̄), and

SSE = Σ(yᵢ − ŷᵢ)² = Syy − 2b·Sxy + b²·Sxx = Syy − b·Sxy,

where the last step uses b = Sxy / Sxx. Taking expectations, E(SSE) = E(Syy) − E(b·Sxy).
By using the properties of the variance and the formula b = Sxy / Sxx for the estimated parameter (so that b·Sxy = b²·Sxx), we get

E(b·Sxy) = Sxx·E(b²) = Sxx·[Var(b) + (E(b))²] = Sxx·[Var(b) + β²],

since b is an unbiased estimator of β, i.e. E(b) = β.
Finally, by replacing the formula for the variance of b, Var(b) = σ² / Sxx, we get

E(b·Sxy) = Sxx·(σ²/Sxx + β²) = σ² + β²·Sxx,

and therefore

E(SSE) = E(Syy) − E(b·Sxy) = (n − 1)σ² + β²·Sxx − σ² − β²·Sxx = (n − 2)σ²,

which proves that s² = SSE/(n − 2) is an unbiased estimator of σ².
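As a sanity check (not part of the proof), the result E(SSE/(n − 2)) = σ² can be verified numerically. The sketch below simulates the model Y = α + βx + ε many times, with made-up parameter values, and averages s² = SSE/(n − 2) over the replications; the average should land close to the true σ²:

```python
import random

random.seed(0)

# Made-up true parameters for the simulation.
alpha, beta, sigma2 = 2.0, 0.5, 1.0
x = [float(i) for i in range(1, 11)]  # fixed design points, n = 10
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

reps = 20_000
total = 0.0
for _ in range(reps):
    # Generate y_i = alpha + beta*x_i + eps_i with eps_i ~ N(0, sigma2).
    y = [alpha + beta * xi + random.gauss(0.0, sigma2 ** 0.5) for xi in x]
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx        # least squares slope
    a = y_bar - b * x_bar  # least squares intercept
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    total += sse / (n - 2)  # s^2 for this replication

s2_mean = total / reps
print(s2_mean)  # should be close to sigma2 = 1.0
```

Dividing by n instead of n − 2 would make the average come out noticeably below σ², which is exactly the bias the proof above rules out.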
- Walpole R.E., Myers R.H., Myers S.L., Ye K. Probability & Statistics for Engineers & Scientists, Eighth Edition. Pearson Prentice Hall, 2007. ISBN 0-13-187711-9.
- Wenge Guo. Chapter 1 Simple Linear Regression (Part 2), page 7.