Why do they divide the sum of the residuals by n-(k+1)?
Can someone explain this?
Well, k represents the no. of independent variables, which is what we divide by when calculating MSR… but we also estimate one intercept, so the error term is left with n - (k + 1) degrees of freedom… Dividing the sum of squared errors by n - (k + 1) gives the mean squared error… Once you have MSR and MSE, you can calculate the F-stat (= MSR/MSE), and the higher it is the better, as we would always want MSR to be high and MSE to be low… Makes sense?
That’s because your example is presumably a multiple regression (i.e. Y is explained by several independent X variables).
To get the mean squared error, you divide the sum of squared residuals (SSE) by its degrees of freedom. n-(k+1) is the degrees of freedom for a multiple regression, while in a _simple_ regression the df is simply n-2.
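If it helps, here’s a rough numpy sketch of the whole breakdown (the data and variable names are made up, purely to show where n-(k+1) enters):

```python
import numpy as np

rng = np.random.default_rng(42)
n, k = 30, 2                                  # n observations, k independent variables

X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # intercept column + k regressors
y = X @ np.array([0.5, 1.2, -0.8]) + rng.normal(size=n)     # made-up "true" relationship

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)            # fitted intercept and slopes
y_hat = X @ beta_hat

SSE = np.sum((y - y_hat) ** 2)                # residual (error) sum of squares
RSS = np.sum((y_hat - y.mean()) ** 2)         # regression sum of squares

MSE = SSE / (n - (k + 1))                     # divide by its df: n - (k + 1)
MSR = RSS / k                                 # divide by its df: k
F = MSR / MSE                                 # F-stat: higher means the regression explains more

print(MSE, MSR, F)
```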
The explanation’s a bit long (sorry), but it’ll give you a clear understanding.
Suppose that you have a sample of n random variables – returns, prices, whatever – and you compute the mean of that sample. If you were to take another sample of size n from the same population, then to get the same sample mean that you calculated, n – 1 of those (new) variables can be any values at all – they can vary freely – but the last one cannot vary freely; once the first n – 1 values are chosen, the last value is forced on us to ensure that we get the same sample mean. So, by having calculated the sample mean, we have lost one degree of freedom in our sample of random variables.
Suppose now that we have computed the sample mean and the sample standard deviation. Now, if we were to take another sample of size n from the same population, n – 2 of those (new) variables can vary freely, but the last two values are forced on us to ensure that we get the same sample mean and sample standard deviation. By having calculated two values – mean and standard deviation – we have lost two degrees of freedom.
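A tiny numerical illustration of that idea (hypothetical numbers, numpy just for convenience): once the sample mean has been computed, the last value of a new sample is forced.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10
sample = rng.normal(size=n)
target_mean = sample.mean()                 # the statistic we've already computed

# New sample: the first n - 1 values can be anything at all...
new = rng.normal(size=n)
# ...but the last value is forced if the new sample must have the same mean.
new[-1] = n * target_mean - new[:-1].sum()

print(np.isclose(new.mean(), target_mean))  # True: computing the mean cost one df
# Requiring the same sample standard deviation as well would pin down a second
# value in the same way: two statistics computed, two degrees of freedom lost.
```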
Similarly, if we’re doing a simple (linear) regression, with n data points (x₁, y₁), (x₂, y₂), . . . , (xₙ, yₙ), every time we calculate a regression coefficient we lose one degree of freedom. So if we calculate a slope and an intercept, we have lost two degrees of freedom: given a new set of n data points with the same x values but different y values, n – 2 of the y’s are free to vary, but the last two are constrained to ensure that we get the same slope and intercept.
If we’re doing a multiple (linear) regression, then we’ll compute one intercept and k slopes; that’s k + 1 values we’ve computed, so that’s k + 1 degrees of freedom we’ve lost. If we choose a new set of n data points, the first n – (k + 1) can vary freely, but the last k + 1 values are forced on us to ensure that we get the same intercept and k slopes.
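And here’s the regression version of the same demonstration, again a rough sketch with made-up data: keep X fixed, require the same fitted intercept and slopes, let the first n – (k + 1) y-values vary freely, and the last k + 1 y-values are forced on us by the normal equations XᵀXβ = Xᵀy.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 20, 3                                   # n observations, k independent variables

X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])   # intercept + k slopes
y = X @ np.array([1.0, 2.0, -0.5, 0.3]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)              # original fit

# Pick new values for the first n - (k + 1) responses; they can vary freely.
y_new = y.copy()
y_new[: n - (k + 1)] = rng.normal(size=n - (k + 1))

# The last k + 1 responses are then forced by the normal equations.
c = X.T @ X @ beta_hat                          # fixed right-hand side XᵀXβ̂
free, last = X[: n - (k + 1)], X[n - (k + 1):]
y_new[n - (k + 1):] = np.linalg.solve(last.T, c - free.T @ y_new[: n - (k + 1)])

beta_check, *_ = np.linalg.lstsq(X, y_new, rcond=None)
print(np.allclose(beta_hat, beta_check))        # True: same intercept and k slopes
```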
Brah…
Thank you so much for this. I completely understand this now. This is an incredible explanation.
My pleasure.
Then my mission was a success.
You flatter me.
back atcha.