In the Wiley guides, Ex 5: R2 vs Adj R2, it states just below the blue box that the total variation in the indep variable, SST, is the same in both the 2 indep variable and 3 indep variable regression models? I would think that the more indep variables you add, the more likely R2 would increase, and therefore total variation would also increase, correct?
SST is the Total Sum of Squares, right? So it is the variation of the dependent variable (Y). The dependent variable is what we want to explain using other variables (the independent X variables), so the part of Y's variation that the X variables explain in conjunction is called RSS (Regression Sum of Squares). The remaining portion is the error or residual, also called the unexplained portion, SSE (Sum of Squared Errors).
Summing up,
SST = RSS + SSE
SST is given; it does not change with the number of independent variables in the regression. It is RSS (and with it SSE) that changes with the number of independent variables.
R2 = RSS / SST, so R2 is higher when the independent variables (in conjunction) are adequate to explain the variation of the dependent variable. This is called a “good fit”. So, since SST is given, R2 increases when RSS increases and hence SSE decreases.
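To make this concrete, here is a rough sketch in plain Python. It is not from the Wiley guide: the data are simulated, and the names `solve`, `fit_sums`, `x1`, `x2`, `x3` are made up for illustration. It fits the same Y on two and then three independent variables by ordinary least squares with an intercept, and compares SST, RSS and SSE.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_sums(y, xs):
    """OLS of y on an intercept plus the columns in xs; returns (SST, RSS, SSE)."""
    n = len(y)
    rows = [[1.0] + [x[i] for x in xs] for i in range(n)]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    XtX = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    Xty = [sum(r[a] * y[i] for i, r in enumerate(rows)) for a in range(k)]
    beta = solve(XtX, Xty)
    fitted = [sum(bj * rj for bj, rj in zip(beta, r)) for r in rows]
    y_bar = sum(y) / n
    sst = sum((yi - y_bar) ** 2 for yi in y)                 # depends on Y only
    rss = sum((fi - y_bar) ** 2 for fi in fitted)            # explained part
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained part
    return sst, rss, sse

# Simulated data (an assumption for illustration): Y depends on x1 and x2 only.
random.seed(0)
n = 50
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.gauss(0, 1) for _ in range(n)]
x3 = [random.gauss(0, 1) for _ in range(n)]
y = [1.0 + 2.0 * a - 1.0 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

sst2, rss2, sse2 = fit_sums(y, [x1, x2])
sst3, rss3, sse3 = fit_sums(y, [x1, x2, x3])
r2_two, r2_three = rss2 / sst2, rss3 / sst3

print("2 variables: SST, RSS, SSE, R2 =", sst2, rss2, sse2, r2_two)
print("3 variables: SST, RSS, SSE, R2 =", sst3, rss3, sse3, r2_three)
```

SST comes out identical in both runs because it is computed from Y alone. Only the split between RSS and SSE moves, and with it R2.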
Hope this helps.
It does. Thank you, Harraogath. So basically the variation in Y stays the same, as does the variation in X. What can vary when you add more variables is the explained amount, along with the unexplained amount… the total will not change.
Good. Y has a given variation, and so does X; however, remember that X is a “bag” of variables. So we are saying that the variation of this bag of variables must try to fit the variation of Y.
Correct.
The total variation (SST, or the variation of Y) is given and fixed for whichever variable you set as the dependent variable, because it is just VAR(Y)*n (taking VAR(Y) as the population variance; with the sample variance it is VAR(Y)*(n-1)).
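A quick way to check this link between SST and the variance of Y, using Python's standard `statistics` module. The five Y values are made up for illustration. Note that no X variable appears anywhere in the calculation.

```python
import statistics

# Hypothetical observations of the dependent variable Y
y = [3.0, 5.0, 4.0, 8.0, 10.0]
n = len(y)
y_bar = sum(y) / n

# SST computed directly as the sum of squared deviations from the mean
sst = sum((yi - y_bar) ** 2 for yi in y)

# Same quantity recovered from the variance of Y
sst_from_pvar = statistics.pvariance(y) * n        # population variance times n
sst_from_svar = statistics.variance(y) * (n - 1)   # sample variance times (n - 1)

print(sst, sst_from_pvar, sst_from_svar)
```

All three numbers agree, which is why SST cannot change when independent variables are added or removed.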