What does RSS really mean?

Hi everyone, I have read many times that RSS is the regression sum of squares, or the variation in y that is explained by the factors in the model. How can the variation between the mean of the Yi and the explained Y (RSS) tell us anything about the fit of the model? And why do we use the mean of the Yi as the baseline for the comparison?

Please speak as you might to a young child, or a golden retriever. Thank you!

Your regression model attempts to predict y. This is Yhat, your predicted value.

The difference between Yhat and the mean is the part explained by the regression formula (the Regression Sum of Squares).

The difference between the actual observed value (Yi) and your predicted value (Yhat) is unexplained. Your model predicts that the observed value Yi should be Yhat given the regression, so why isn’t it? The difference is error.
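To make those two pieces concrete, here is a quick numpy sketch (the data are made up purely for illustration): it fits a line, then splits the total variation around the mean into the explained part and the error part.

```python
import numpy as np

# Hypothetical data: y roughly linear in x (made-up numbers)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Least-squares fit: y_hat = b0 + b1*x
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x
y_bar = y.mean()

tss = np.sum((y - y_bar) ** 2)         # total variation around the mean
reg_ss = np.sum((y_hat - y_bar) ** 2)  # explained by the regression
sse = np.sum((y - y_hat) ** 2)         # unexplained: the error

# For OLS with an intercept, total = explained + unexplained
print(tss, reg_ss + sse)
```

The mean is the baseline because it is the best prediction you could make with no model at all; the regression sum of squares measures how much of the spread around that naive baseline the fitted line accounts for.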

There are a few charts on Google that may help the idea click. I’m still wrapping my head around it myself, so I can’t explain it much better just yet.

It is a measure of the discrepancy between the data and an estimation model; a small RSS indicates a tight fit of the model to the data. It is the difference between Y and the explained Y, not the mean. If the linear function is very close to the actual data, you have a high correlation and a high R-squared, meaning the regression explains the data well. You would then have a small RSS, meaning the error from the model is low.
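Here's a small sketch of that link between a tight fit, a small error sum of squares, and a high R-squared (again with made-up numbers); it also checks the fact that, for simple linear regression, R-squared equals the squared correlation between x and y:

```python
import numpy as np

# Made-up data lying very close to a straight line
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.9])

b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

sse = np.sum((y - y_hat) ** 2)         # error sum of squares: small for a tight fit
tss = np.sum((y - y.mean()) ** 2)
r_squared = 1 - sse / tss              # near 1 when the error is small

# In simple (one-predictor) regression, R^2 = (correlation of x and y)^2
r = np.corrcoef(x, y)[0, 1]
print(round(r_squared, 4), round(r ** 2, 4))
```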

That would be the SSR, not the RSS.

The RSS thing can be confusing.

The CFA text uses RSS for the REGRESSION sum of squares and SSE for the sum of squared errors. Under this convention, we want a high RSS for good model fit.

Other websites use RSS for the RESIDUAL sum of squares (which is the error term, the same thing as SSE). We want the error part to be small, so under that definition a small RSS is what you're after.

Use the table to avoid confusion. For me, SSR and RSS are the same thing.

I meant to comment the other day, but I couldn’t remember my account password and was locked out.

It’s a fine point, but you should know that Y minus Y-hat is, in fact, dealing with a conditional mean. In other words, Y-hat is our estimate of the average Y-value, conditional on the values of the independent variables: E(Y| X ) where X represents the particular combination of IV values and E(Y) is the mean or expected value of Y at those settings. Our estimated regression is an estimate of the true regression line, which is a line of true means (so we have an estimated line of true means).
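One way to see the "line of means" idea is to simulate data with several observations at each x setting and compare the group averages to the fitted values. This is a toy simulation (true line, noise level, and seed are all invented for the demo):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical true line of means: E(Y|X=x) = 2 + 3x
x_levels = np.array([1.0, 2.0, 3.0])
x = np.repeat(x_levels, 200)                # 200 observations per x setting
y = 2 + 3 * x + rng.normal(0, 1, x.size)    # noise around each true mean

b1, b0 = np.polyfit(x, y, 1)

for xv in x_levels:
    sample_mean = y[x == xv].mean()         # average Y at this x setting
    fitted = b0 + b1 * xv                   # y_hat: estimated E(Y|X=xv)
    print(xv, round(sample_mean, 2), round(fitted, 2))
```

At each x setting, the fitted value lands close to the sample average of the Y's observed there, which is exactly the sense in which y_hat estimates a conditional mean.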

Keep in mind that linear refers to the parameters. The model can include nonlinearities between the IVs and the DV. Here is an example: E(Y) = b0 + b1*x + b2*(x^2). The point is that the linear correlation can be low while the model's R-squared is high.
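A quick demonstration of that last point, using invented data from a symmetric parabola: the Pearson correlation between x and y is near zero, yet a model that is linear in the parameters (with an x^2 term) fits almost perfectly.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-3, 3, 100)
y = x ** 2 + rng.normal(0, 0.3, x.size)  # symmetric parabola: no linear trend

# Linear (Pearson) correlation between x and y is near zero
r = np.corrcoef(x, y)[0, 1]

# But E(Y) = b0 + b1*x + b2*x^2 is linear in b0, b1, b2 and fits well
b2, b1, b0 = np.polyfit(x, y, 2)
y_hat = b0 + b1 * x + b2 * x ** 2
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(round(r, 3), round(r2, 3))  # r near 0, r2 near 1
```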