This is perhaps a basic question. Can someone please help me understand what the following statements mean? If independent variables are lagged values of the dependent variable, serial correlation renders them invalid. If independent variables are not lagged values then they are valid. What does “independent variables are lagged values of the dependent variable” mean? Can you give me an example? Why does serial correlation render the independent variables invalid if they are lagged?
Linear regression: Y = a + b*X + noise, where X is the independent variable and Y is the dependent variable. If you use the lagged value of Y as X, i.e. X_t = Y_{t-1}, you get Y(t) = a + b*Y(t-1) + noise(t). If that one lag does not capture all of the serial correlation in Y(t), the remainder shows up as serial correlation in noise(t), which violates the assumptions of linear regression. Does that answer your question?
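For anyone who wants to see the lagged regression in action, here is a minimal sketch in Python with NumPy. It simulates Y(t) = a + b*Y(t-1) + noise(t) with white-noise errors, fits it by ordinary least squares, and checks the lag-1 autocorrelation of the residuals. The function names, the parameter values (a = 1.0, b = 0.5) and the sample size are my own assumptions for illustration, not anything from the curriculum.

```python
import numpy as np

def simulate_ar1(a, b, n, rng):
    """Simulate Y(t) = a + b*Y(t-1) + noise(t) with white-noise errors."""
    y = np.empty(n)
    y[0] = a / (1.0 - b)  # start at the long-run mean (assumes |b| < 1)
    noise = rng.standard_normal(n)
    for t in range(1, n):
        y[t] = a + b * y[t - 1] + noise[t]
    return y

def fit_lagged_regression(y):
    """OLS of Y(t) on a constant and Y(t-1); returns (a_hat, b_hat, residuals)."""
    x = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    target = y[1:]
    coef, *_ = np.linalg.lstsq(x, target, rcond=None)
    residuals = target - x @ coef
    return float(coef[0]), float(coef[1]), residuals

def lag1_autocorr(e):
    """Sample lag-1 autocorrelation of a series."""
    e = e - e.mean()
    return float(np.sum(e[1:] * e[:-1]) / np.sum(e * e))

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
y = simulate_ar1(a=1.0, b=0.5, n=20000, rng=rng)
a_hat, b_hat, resid = fit_lagged_regression(y)
resid_rho = lag1_autocorr(resid)
print(a_hat, b_hat, resid_rho)
```

When the errors really are white noise, the estimates should land close to the true a and b and the residual autocorrelation should be close to zero. A residual autocorrelation clearly away from zero would be the warning sign that the model is not capturing all of the serial correlation.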
Thanks Maratikus! Got it. So the dependent variable from a previous period is used as the independent variable for the regression equation - that's why it's invalid. If the independent variable was not a lagged value, the independent variable is valid but there is still an effect on the regression, with the standard errors being understated due to serial correlation.
Geez…I don’t know what it means either. What is an invalid independent variable? What’s wrong with Y(t) = a+b*Y(t-1) + noise(t) (which is just a plain vanilla AR(1) model).
Sorry Joey - new to Level 2 Quant. Confused as one early section (model misspecification) says use of lagged variables is a no no especially if there is serial correlation but a later section on autoregressive model shows that AR model uses lagged variables and is used if there is serial correlation.
JoeyDVivre Wrote:
-------------------------------------------------------
> Geez…I don’t know what it means either. What is an invalid independent variable?
> What’s wrong with Y(t) = a+b*Y(t-1) + noise(t) (which is just a plain vanilla AR(1) model).

You are right, Joey. It is an AR(1) model. However, there can still be problems when serial correlation of the residuals is present. Anyway … what’s important is to test whether the model is specified properly before using it.
I am also confused with that, sejaldavar. In one of the first sections, it says that a model misspecification is to have a lagged variable. Then you get to the section on autoregression and lagged variables are obviously incorporated. Did I miss a large concept here? Can anyone clear this up? Thank you in advance.
Another L2 newbie here. It is talking about the validity of a linear regression model here, which is a different thing from time series. If you have a lagged variable as an independent variable, the error term would become serially correlated. That violates the assumptions of linear regression. So it is an invalid model.
IMO, it is not a model misspecification when the regression has a lagged dependent variable as an independent variable, PROVIDED the error terms are not serially correlated, i.e. autocorrelated (DW test). For an AR model to be valid, it needs to be free from serial correlation of the residuals.
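The "PROVIDED" condition can be checked with a small simulation. The sketch below fits the same lagged regression twice: once when the errors are white noise and once when the errors follow e(t) = rho*e(t-1) + u(t). The helper names `simulate` and `ols_slope`, the values b = 0.5 and rho = 0.5, and the sample size are assumptions chosen for illustration only.

```python
import numpy as np

def simulate(b, rho, n, rng):
    """Y(t) = b*Y(t-1) + e(t), where e(t) = rho*e(t-1) + u(t) and u is white noise.

    rho = 0 gives plain white-noise errors; rho != 0 gives serially correlated errors.
    """
    u = rng.standard_normal(n)
    e = np.zeros(n)
    y = np.zeros(n)
    for t in range(1, n):
        e[t] = rho * e[t - 1] + u[t]
        y[t] = b * y[t - 1] + e[t]
    return y

def ols_slope(y):
    """Slope from an OLS regression of Y(t) on a constant and Y(t-1)."""
    x = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(x, y[1:], rcond=None)
    return float(coef[1])

rng = np.random.default_rng(42)  # fixed seed so the run is repeatable
b_true = 0.5

slope_white = ols_slope(simulate(b_true, rho=0.0, n=20000, rng=rng))
slope_serial = ols_slope(simulate(b_true, rho=0.5, n=20000, rng=rng))

# With white-noise errors the slope should sit near b_true.
# With serially correlated errors it drifts toward the lag-1 autocorrelation
# of Y, which for this setup is (b + rho) / (1 + b*rho) = 0.8 in large samples.
print(slope_white, slope_serial)
```

In the first case the slope estimate should be close to the true 0.5. In the second case it stays well above 0.5 no matter how much data you add, so the problem is a biased coefficient and not only standard errors that are too small.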
I don’t know what they are saying but a regression model isn’t valid or invalid. It might be misspecified or improperly estimated. Now if you have an assumption that errors are independent then fitting an autoregressive term would be a misspecification.
I think it is much easier to look at linear regression and time series as 2 different problems. An example is that linear regression is about relationship between 2 or more time series. And time series is about relationship between variable and its lagged self.
Another level 2 newbie reopening this thread… (10 years later)
What I’ve got is that using a lagged variable is bad if the error terms are serially correlated; and that serial correlation of the error term is one of the three key things that could go wrong (std error too small, etc).
What I’m missing is the connection of how a lagged independent and a serially correlated error term are particularly bad in combination?
Any rationale or insights much appreciated! Cheers.
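One way to see the connection is a short worked derivation. Here Y_t and \varepsilon_t stand for Y(t) and noise(t) from earlier in the thread. The error process \varepsilon_t = \rho \varepsilon_{t-1} + u_t, with u_t white noise, is an assumption made for illustration, as is stationarity (|b| < 1, |\rho| < 1).

```latex
\begin{aligned}
Y_t &= a + b\,Y_{t-1} + \varepsilon_t,
  \qquad \varepsilon_t = \rho\,\varepsilon_{t-1} + u_t \\[4pt]
Y_{t-1} &= a + b\,Y_{t-2} + \varepsilon_{t-1}
  \quad\Rightarrow\quad Y_{t-1} \text{ contains } \varepsilon_{t-1} \\[4pt]
\operatorname{Cov}(Y_{t-1}, \varepsilon_t)
  &= \operatorname{Cov}(Y_{t-1},\, \rho\,\varepsilon_{t-1} + u_t)
   = \rho\,\operatorname{Cov}(Y_{t-1}, \varepsilon_{t-1})
   = \frac{\rho\,\sigma_\varepsilon^2}{1 - b\rho} \\[4pt]
\operatorname{plim}\ \hat{b}
  &= b + \frac{\operatorname{Cov}(Y_{t-1}, \varepsilon_t)}{\operatorname{Var}(Y_{t-1})}
  \;\neq\; b \quad \text{when } \rho \neq 0
\end{aligned}
```

So the lagged dependent variable is correlated with the current error whenever \rho is not zero, and the slope estimate itself is biased and inconsistent. With an independent variable X_t that is not a lagged Y and is unrelated to the errors, Cov(X_t, \varepsilon_t) = 0 still holds under serial correlation, so the coefficient stays consistent and only the standard errors are wrong. That is the sense in which the two problems are worse in combination.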