In one regression, the F-stat is 157 and the regression has 3 independent variables. However, the t-tests show that not all coefficients are significantly different from 0. I thought a high F-statistic indicated that all independent variables are significantly different from 0? Or can it still be high even though not all coefficients are significantly different from 0?
The F-test has a null of b1=b2=…=bi=0 (excluding the intercept), while the alternative is that at least one bi is not equal to zero. If you have a significant F-test, it can only tell you that at least one of those coefficients is nonzero (in other words, as a group, is there any benefit from using these variables to predict the DV?-- if yes, you could use more tests to determine which variables are significant, if that were needed).
You certainly can have a significant F-test with only one significant t-test.
It’s important to note that the t-test and F-test ask different questions, and this should be kept in mind when interpreting results (when dealing with multicollinearity, for example, it makes it clear that there is no conflict between the F-test and the t-tests).
As I mentioned, the F-test asks, “As a group, is there something valuable among the tested variables?” (and recall that if yes, we can only conclude that at least one of the coefficients is nonzero)
The t-test asks, “After accounting for the other variables in the model, and holding all else constant, is this one coefficient nonzero?” You can also view this question as, “If I have the other variables to explain the DV, does this one variable do anything additional to predict the DV (holding all else constant)?”
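If it helps to see the two questions side by side, here’s a minimal sketch in Python (simulated data with made-up numbers, statsmodels assumed available), where only one of three predictors truly matters:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(size=n)  # only x1 truly drives the DV

X = sm.add_constant(np.column_stack([x1, x2, x3]))
fit = sm.OLS(y, X).fit()

# F-test: "as a group, do these variables help predict the DV?"
print(fit.fvalue, fit.f_pvalue)
# t-tests: "given the others, does this one coefficient appear nonzero?"
print(fit.tvalues, fit.pvalues)
```

With this setup you should typically see a strongly significant F-test but only the x1 t-test significant-- the situation described in the original question.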
Hope this helps!
You can have a significant F-test and zero significant t-tests.
A non-significant t-test does not mean that the coefficients are zero, just that you are not confident enough to say they are different from zero. You could be borderline, for example you could be 85% sure each one is different from zero, but that’s not enough to reject the null hypothesis. So you are 85% sure on each coefficient. It’s unlikely that they are all not different from zero. And that’s what the F-test tests. It’s saying, what are the chances that all these borderline t-tests are actually zero.
Yes sir, you can have a significant F-statistic with only one significant independent variable.
The rule is that at least one independent variable must be significant for the F-stat to be significant.
Agreed, this is what I was alluding to with multicollinearity appearing to have contradictory results (when, in fact, they are logical).
Respectfully, this is not what it means. It means that the data don’t provide (statistically) compelling evidence to suggest that the null can be rejected (i.e. the evidence only says you can’t disagree with the null). But, this doesn’t mean you’re “not confident enough”.
This isn’t a correct understanding of confidence levels and terminology (and how they relate to hypothesis testing). There is no concept of “sureness” when it comes to confidence levels.
Against the alternative that at least 1 of the coefficients is nonzero (this is really the “tested” hypothesis).
Again, respectfully, that’s not at all what the F-test is testing. The F-test answers the question, “Do these variables, taken all together, significantly improve our ability to explain the variation in the DV?” It doesn’t answer the question in the probabilistic manner that you described.
I would caution against this “rule”, because multicollinearity can produce a significant F-test with the appearance of no significant t-tests. You wouldn’t need at least one significant t-test to get a significant F-test, because these tests are providing answers to different questions.
Semantics. The inputs into the t-test say that, given the null hypothesis is true, these results would’ve occurred by chance at least X% of the time. A high X that doesn’t clear the level needed to reject the null hypothesis is still evidence that the null hypothesis is false. When you have 5 independent variables, each with results that would’ve occurred less than 15% of the time by chance alone, it would be unlikely that all 5 of them occurred by chance at the same time.
Sure, because words don’t have meaning-- especially when they’re used in a specific context. The “semantics” response implies sloppiness, misunderstanding, or some combination of these.
It sounds like you’re attempting to explain a p-value. This is close, but it needs a slight adjustment. You could say, “given that the null hypothesis is true, these results, or more extreme results, would be seen x% of the time.” There is certainly a difference between what you have said and what I have said.
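To make the “or more extreme” part concrete, here’s a tiny Python sketch (hypothetical numbers, scipy assumed available) of a two-sided p-value for a t-statistic:

```python
from scipy import stats

t_stat = 1.7  # hypothetical t-statistic for one coefficient
df = 96       # hypothetical residual degrees of freedom

# P(a result at least this extreme in either direction | Ho is true)
p_value = 2 * stats.t.sf(abs(t_stat), df)
print(p_value)  # roughly 0.09 here
```

The sf() call gives the upper-tail probability, and doubling it covers both directions-- which is exactly the “these results, or more extreme results” wording.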
It’s more conventionally described as there being evidence (sufficient evidence) in favor of the alternative hypothesis (reject Ho, but not that “Ho is false”).
Looking at your statement alone, the logic isn’t bad, but it’s not really related to the questions answered by the F-test or t-test.
Sorry, but there seems to be some disagreement here. Tickersu says a significant F-stat requires at least one bi not equal to zero, and Air writes that you can have a significant F-test and zero significant t-tests. What is the truth? Is it only in the case of multicollinearity that you can have a significant F-stat and no significant t-tests?
You can have a significant F-statistic and no significant t-statistics for the individual slopes.
tickersu didn’t say anything to contradict this.
Allow me to try clarifying what I said and meant. What I said is that when you conduct this type of F-test, the null and alternative hypotheses are as follows:
Ho: b1=b2=…=bi=0 (all coefficients are not different from zero/ the group of independent variables is not statistically significant for predicting the DV)
Ha: at least 1 bi is not equal to zero (if you reject Ho, this is the only conclusion that you can make from this test of hypothesis-- as a group, the terms are statistically useful in the prediction of the DV. However, we only know that at least one bi is not equal to zero).
Notice that I’m not saying what the t-tests must be, because the F-test isn’t addressing that question. It’s very important to understand what question each type of test answers.
The F-test answers the question of group significance, while a t-test answers the question of individual significance (holding all else constant and assuming all other terms are in the model).
Let’s pretend we have X1 and X2, and the respective betas are b1 and b2. Assume that this is the correct model:
E(y) = b0 + b1x1 + b2x2
The F-test would be:
Ho: b1=b2=0
Ha: at least 1 of the tested betas not equal to zero
Suppose we reject Ho. This means that we have sufficient evidence (at some chosen significance level) to conclude that at least one of the coefficients is nonzero (at least one of those variables is statistically significant for predicting the DV).
Now, let’s imagine that X1 and X2 have a sufficiently high pairwise correlation to cause each t-test to appear nonsignificant. The null and alternative for these t-tests, generically (practical interpretation in parentheses):
Ho: bi = 0 (assuming all else is held constant, and that all other variables remain in the model, xi is not statistically useful for predicting the DV).
Ha: bi not equal to zero (assuming all else is held constant, and that all other variables remain in the model, xi is statistically useful for predicting the DV).
Now, what would this mean in our example where we have a significant F-test, but multicollinearity has caused the t-tests to appear nonsignificant? Is this contradictory?
What it means:
- the F-test told us that at least one of the variables, X1, X2, or both, is useful for predicting the DV. Note, though, that it didn’t tell us which variable(s)-- it only said you need at least one from the group.
- the t-tests each said, “When testing the null of b2=0: if X1 is in the model, and all else is held constant, X2 does not contribute in a statistically significant manner to predicting the DV,” and “When testing the null of b1=0: if X2 is in the model, and all else is held constant, X1 does not contribute in a statistically significant manner to predicting the DV.”
This should make sense since we already said that X1 and X2 exhibit a high pairwise correlation (in this example, a fair indication of collinearity). If these variables contribute a lot of the same information to predicting the DV, then do we really need both of them? According to (either of) the t-tests, we don’t, and this is congruent with the F-test saying that we need at least one of the terms/variables.
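A short simulation of this exact setup may help (a Python sketch with made-up numbers, statsmodels assumed available):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 60
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.05, size=n)  # X2 is nearly a copy of X1
y = x1 + x2 + rng.normal(size=n)          # both truly contribute

X = sm.add_constant(np.column_stack([x1, x2]))
fit = sm.OLS(y, X).fit()

print(fit.f_pvalue)     # tiny: as a group, X1 and X2 clearly help
print(fit.pvalues[1:])  # typically both large: given one X, the other adds little
```

The group is clearly useful (significant F-test), but once either variable is in the model, the other has almost nothing extra to offer (nonsignificant t-tests).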
Are the F-test and t-test results contradictory?
- Absolutely not. Remember, the F-test answers the question of group or joint significance, while the t-test asks, “if we have the other variables, does this one add anything extra (all else constant)?” If it still seems contradictory, spend some time with it to see that it boils down to two different questions that aren’t contradictory.
Hopefully, this example shows you that the F-test alternative hypothesis is “at least one of the coefficients is nonzero” and that you can have a significant F-test with no significant t-tests.
Let me know if anything is still unclear.
Edited to (hopefully) add clarity.
Thanks. Very helpful.
Glad to hear it.