Detecting Multicollinearity

Hey,

Do we assume multicollinearity is present when:

  • the F-test is statistically significant and _none_ of the individual coefficients is significantly different from zero

  • or when the F-test is statistically significant and _at least one_ of the individual coefficients is not significantly different from zero

I remember having read both.

Thanks

The Consequences of Multicollinearity:

  • The estimated regression coefficients are consistent and unbiased, but unreliable and imprecise.

  • The standard errors are inflated, so the t-stats of the coefficients are artificially small and cannot reject the null hypothesis.

Test for Multicollinearity:

High R2 and a significant F-statistic even though the t-statistics on the estimated slope coefficients are not significant

Solution : Omit one or more of the independent variables
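If you want to see that pattern with actual numbers, here’s a quick Python sketch (the simulated data and the `ols_stats` helper are my own illustration, not from the curriculum):

```python
# Simulating the classic multicollinearity signature: a large F-statistic
# alongside small individual t-statistics. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.standard_normal(n)
x2 = x1 + 0.001 * rng.standard_normal(n)   # x2 is almost a copy of x1
y = x1 + x2 + rng.standard_normal(n)

def ols_stats(y, X):
    """Coefficients, standard errors, R^2, and F-stat for OLS of y on X
    (an intercept is added inside this helper)."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    k = Xc.shape[1] - 1                     # number of slope coefficients
    dof = len(y) - Xc.shape[1]
    sigma2 = resid @ resid / dof            # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xc.T @ Xc)))
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    f_stat = (r2 / k) / ((1 - r2) / dof)
    return beta, se, r2, f_stat

beta, se, r2, f_stat = ols_stats(y, np.column_stack([x1, x2]))
t_stats = beta[1:] / se[1:]
print(f"R2={r2:.3f}  F={f_stat:.1f}  t-stats={t_stats.round(2)}")
```

With two nearly identical regressors, the joint fit is excellent (high R2, F far above any 5% critical value), yet the individual t-stats typically come out small because the near-perfect correlation inflates both standard errors, which is exactly the signature described above.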

Anyone else?

Despite what the book says, this isn’t really a “test”– it’s an informal eyeballing (i.e., there is no hypothesis being tested and no measure of reliability for your conclusion).

The “solution” isn’t the only way to handle multicollinearity (and isn’t always appropriate), and they fail to mention that multicollinearity isn’t necessarily an issue. For the purposes of the exam, though, it seems they want you to remember their “test” and their suggested solution.

To the OP:

The first statement is a pretty safe bet-- if you have a significant F-test and none of the t-tests are significant for the individual coefficients, there is likely a problematic degree of multicollinearity (assuming you want to interpret coefficients).

The second statement in your post isn’t necessarily true (and it’s not even a special case that I’m proposing–it’s quite common you could have this scenario without MC being an issue). A significant F-test with one non-significant t-test does not guarantee that MC is a problem-- it could just mean that the variable isn’t a useful predictor of the DV.
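To make that second scenario concrete, here’s a small simulation (my own sketch, not from any text) where a t-test is non-significant simply because the variable is irrelevant, with no collinearity at all:

```python
# A significant F-test plus a non-significant t-test, without any
# multicollinearity: x2 just doesn't belong in the model.
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.standard_normal(n)
x2 = rng.standard_normal(n)          # independent of x1, unrelated to y
y = x1 + rng.standard_normal(n)      # only x1 drives y

r = np.corrcoef(x1, x2)[0, 1]        # near zero: no collinearity here

# plain OLS of y on (1, x1, x2)
Xc = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
resid = y - Xc @ beta
dof = n - 3
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xc.T @ Xc)))
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
f_stat = (r2 / 2) / ((1 - r2) / dof)
print(f"corr(x1,x2)={r:.3f}  F={f_stat:.1f}  t(x2)={beta[2]/se[2]:.2f}")
```

The F-test is driven entirely by x1; x2’s t-stat tends to be small, and the near-zero correlation between the regressors tells you multicollinearity isn’t the reason.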

However, for real-life purposes, you could have a significant F-test and some t-tests that are significant, and multicollinearity could still be an issue. Why? Only the collinear variables have inflated variances (deflated t-tests). For example, a regression of Y on X1, X2, X3, and X4 can have severe multicollinearity due to a high pairwise correlation between X3 and X4. Yet the F-test could be significant, and the t-tests for X1 and X2 could be normal (not deflated) and significant, with X3 and X4 having nonsignificant t-tests due to their collinearity.

If you were doing an analysis, you would most likely look at something called a Variance Inflation Factor (VIF). (You would also look at the signs of the estimated coefficients to see if they fit with known theory or prior research; if they don’t, this could indicate problematic MC.) The VIF tells you how much larger a coefficient’s variance is due to that variable’s relationship with all the other variables in the model. In the context of our example above, a VIF of 5 on X3’s coefficient would mean that b3’s variance is 5 times larger due to how much X3 is related to the group of X1, X2, and X4.
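If you want to compute it yourself, a VIF is just 1/(1 - R2) from an auxiliary regression of one X on the others. A minimal numpy sketch (variable names and the simulated X3/X4 correlation are my own assumptions, mirroring the example above):

```python
# Compute VIFs via the auxiliary-regression definition: VIF_j = 1/(1 - R_j^2),
# where R_j^2 comes from regressing column j on all the other columns.
import numpy as np

def vif(X, j):
    """VIF for column j of X (an intercept is added to the auxiliary fit)."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    Z = np.column_stack([np.ones(len(y)), others])
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(2)
n = 200
x1, x2, x4 = rng.standard_normal((3, n))
x3 = x4 + 0.1 * rng.standard_normal(n)   # x3 and x4 are highly correlated
X = np.column_stack([x1, x2, x3, x4])
print([round(vif(X, j), 1) for j in range(4)])
```

A common rule of thumb flags VIFs above 5 or 10 as problematic; here only x3 and x4 should come out large, matching the point that only the collinear variables get inflated variances.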

Hope this helps answer your question– just keep in mind what you want to know is more appropriately addressed by going beyond the curriculum. They aren’t going to test you outside of their framework, so keep that in mind when answering questions.

Can’t stress this enough. For the multicollinearity questions I have seen, they tend to put it on a tee for you. They will ask if a data set shows signs of multicollinearity or heteroskedasticity and show you a Breusch-Pagan test and a separate correlation of the variables. They don’t seem to want to pull a trick on these questions, as the material isn’t easy. This section seems to be straightforward; just understand these tests, what they calculate, and how to interpret them.

Thanks a lot!

Glad to help.