The analysis-of-variance (ANOVA) approach, whose main purpose is to assess the quality of the estimated regression, is based on the so-called partitioning of sums of squares [Walpole et al., p. 415].
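Writing $y_i$ for the $i$-th observed value, $\hat{y}_i$ for the corresponding fitted value and $\bar{y}$ for the mean of the observations (the notation assumed throughout this note), the partition formula reads

$$\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}=\sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)^{2}+\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2}.\tag{1.1}$$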
In short form, the above formula is written as
SST = SSR + SSE
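Term by term, in the order in which they appear in (1.1),

$$\mathrm{SST}=\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2},\qquad
\mathrm{SSR}=\sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)^{2},\qquad
\mathrm{SSE}=\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^{2},$$

where SST is the total sum of squares, SSR the regression sum of squares and SSE the error (residual) sum of squares.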
The purpose of this short article is to prove the above formula, the so-called partition of sums of squares, for the case of a regression involving a single independent variable x.
First of all, we expand the left-hand side of formula (1.1),
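adding and subtracting the fitted value $\hat{y}_i$ inside each squared term:

$$\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}=\sum_{i=1}^{n}\left[\left(\hat{y}_i-\bar{y}\right)+\varepsilon_i\right]^{2}=\sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)^{2}+2\sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)\varepsilon_i+\sum_{i=1}^{n}\varepsilon_i^{2},\tag{1.2}$$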
where residuals are indicated by the epsilon character, $\varepsilon_i = y_i - \hat{y}_i$. Recall that the formula for the fitted regression line (involving a single independent variable x) is
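$$\hat{y}=a+b\,x,\tag{1.3}$$

where $a$ and $b$ are the estimated intercept and slope, respectively.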
Parameters a and b are estimated by the so-called method of least squares, i.e. by minimizing the error sum of squares SSE; this means that the derivatives of SSE with respect to a and b are both set to 0.
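With

$$\mathrm{SSE}=\sum_{i=1}^{n}\varepsilon_i^{2}=\sum_{i=1}^{n}\left(y_i-a-b\,x_i\right)^{2},\tag{1.4}$$

the two conditions are

$$\frac{\partial\,\mathrm{SSE}}{\partial a}=-2\sum_{i=1}^{n}\left(y_i-a-b\,x_i\right)=0,\tag{1.5}$$

$$\frac{\partial\,\mathrm{SSE}}{\partial b}=-2\sum_{i=1}^{n}x_i\left(y_i-a-b\,x_i\right)=0.\tag{1.6}$$

Equations (1.5) and (1.6) are known as the normal equations of the least-squares fit.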
By substituting the fitted line (1.3), $\hat{y}_i=a+b\,x_i$, into the cross term, equation (1.2) can be rewritten as:
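$$\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}=\sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)^{2}+2\left(a-\bar{y}\right)\sum_{i=1}^{n}\varepsilon_i+2\,b\sum_{i=1}^{n}x_i\,\varepsilon_i+\sum_{i=1}^{n}\varepsilon_i^{2}.\tag{1.7}$$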
Recalling equation (1.6) and the definition of the residuals, $\varepsilon_i=y_i-a-b\,x_i$, we have:
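$$\sum_{i=1}^{n}x_i\,\varepsilon_i=\sum_{i=1}^{n}x_i\left(y_i-a-b\,x_i\right)=0.$$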
Moreover, from equation (1.5), the sum of the residuals is exactly zero:
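$$\sum_{i=1}^{n}\varepsilon_i=\sum_{i=1}^{n}\left(y_i-a-b\,x_i\right)=0.$$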
We replace those values in (1.7), the two cross terms vanish, and we get
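$$\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^{2}=\sum_{i=1}^{n}\left(\hat{y}_i-\bar{y}\right)^{2}+\sum_{i=1}^{n}\varepsilon_i^{2}=\mathrm{SSR}+\mathrm{SSE},$$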
which is exactly what we wanted to prove.
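The identity can also be checked numerically. The sketch below is a minimal example (assuming NumPy is available; the data values are arbitrary and serve only to illustrate the result): it fits the least-squares line, computes SST, SSR and SSE, and verifies that SST equals SSR plus SSE, with the sums in (1.5) and (1.6) coming out as zero up to floating-point error.

```python
import numpy as np

# Synthetic data (arbitrary values, used only to illustrate the identity)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 6.9])

# Least-squares fit of the line y_hat = a + b * x
b, a = np.polyfit(x, y, 1)              # polyfit returns [slope, intercept] for degree 1
y_hat = a + b * x                       # fitted values
eps = y - y_hat                         # residuals (the epsilon_i of the text)

sst = np.sum((y - y.mean()) ** 2)       # total sum of squares
ssr = np.sum((y_hat - y.mean()) ** 2)   # regression sum of squares
sse = np.sum(eps ** 2)                  # error (residual) sum of squares

print(f"SST       = {sst:.6f}")
print(f"SSR + SSE = {ssr + sse:.6f}")
print(f"sum of residuals       = {eps.sum():.2e}")        # ~0, cf. (1.5)
print(f"sum of x_i * epsilon_i = {(x * eps).sum():.2e}")  # ~0, cf. (1.6)
assert np.isclose(sst, ssr + sse)
```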
References
- Walpole R.E., Myers R.H., Myers S.L., Ye K. Probability & Statistics for Engineers and Scientists, Eighth Edition. Pearson Prentice Hall, 2007. ISBN 0-13-187711-9.