First Difference Regression

does anyone know how to relate differenced variables to original variables in regression? i'm running an autoregression on rates, R(T+1) = b * R(T), but if i take the first difference it becomes R(T+1) - R(T) = B * (R(T) - R(T-1)). i want to know the relationship between b and B. i read somewhere that B = b - 1, but i don't know if that's true, and how do i prove it? thanks!

If B = b - 1, then R(T+1) = b * R(T) + (1 - b) * R(T-1), which is not what you want. As far as I recall, rates are normally I(1), though you'd want to test, so you should probably use the second equation. As a quick sanity check, I fit both models to U.S. 10-year Treasury yields and got b = 0.99 and B = 0.1, confirming it's not necessarily the case that B = b - 1. You may be mixing this up with the Dickey-Fuller test. To test for a unit root, you subtract R(T) from both sides of the first model and get R(T+1) - R(T) = (b - 1) * R(T), which is slightly different from your model. If b - 1 is not statistically different from zero (using a Dickey-Fuller table, not a standard t-table), then there is a unit root.
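To make that concrete, here's a minimal numpy sketch of the no-constant Dickey-Fuller regression described above. The simulated series, seed, and sample sizes are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def df_tstat(r):
    """t-statistic on rho in the no-constant DF regression:
    R(t+1) - R(t) = rho * R(t) + e, where rho = b - 1.
    Under the unit-root null, compare against Dickey-Fuller critical
    values (about -1.95 at 5% for this variant), not a standard t-table."""
    dr = np.diff(r)          # left-hand side: R(t+1) - R(t)
    x = r[:-1]               # regressor: R(t)
    rho = (x @ dr) / (x @ x)
    resid = dr - rho * x
    s2 = resid @ resid / (len(dr) - 1)
    return rho / np.sqrt(s2 / (x @ x))

# A pure random walk should fail to reject the unit root ...
walk = np.cumsum(rng.normal(size=2000))
# ... while a stationary AR(1) with b = 0.8 should reject it strongly
ar = np.zeros(2000)
for t in range(1, 2000):
    ar[t] = 0.8 * ar[t - 1] + rng.normal()

print(round(df_tstat(walk), 2), round(df_tstat(ar), 2))
```

In practice you'd compare the statistic against Dickey-Fuller critical values, or just use `statsmodels.tsa.stattools.adfuller`, which also handles lag augmentation and deterministic terms.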

Basically E(B)=E(b), but B has a higher standard error. It does look like you’re getting confused with the DF test, as jmh says.

thanks guys! so i'm pretty confused now. i'm trying to estimate mean reversion speeds for rates and volatilities using an AR(1) model. i regress the equation R(T+1) = b * R(T) and get the coefficient b from the regression output. then i take 1 - b to get the daily mean reversion speed. (i do 1 - b because when i simplify the black-karasinski equation to make it an AR(1) process, the mean reversion parameter in BK equals 1 - b.) then i annualize this mean reversion speed by either multiplying by 260 or multiplying by sqrt(260). the problem is, a mean reversion of around 4% (or somewhere around there) is used by most people in interest rate models. however, for me to get anything close to 4% annualized, i'd need a very high daily regression coefficient, very close to 1, and in that case it probably wouldn't pass stationarity tests. the way i'm doing it, i end up with mean reversions of more than 60%. any suggestions? thanks a lot!

Wouldn't your autoregression equation have a constant in it? A true mean reversion model would have something like:

R(t+1) = R(t) + b * [R(t) - mu]

where mu is the long-term mean. This basically says that next period's rate is last period's rate, adjusted upwards or downwards by some fraction of the rate's distance from the long-term mean. If b > 0, there's momentum; if b < 0, there's mean reversion. Expanding gives

R(t+1) = (1 + b) * R(t) - b * mu

so if |1 + b| > 1 the regression is unstable, and if 1 + b = 1 (i.e. b = 0) the series has a unit root and is basically a random walk. Now I think I see where you might get something that looks like (1 - b). If we assume that b < 0, which would be one-period mean reversion, and define c = -b, you get

R(t+1) = (1 - c) * R(t) + c * mu

Then you solve for c, given that regression slope = 1 - c and regression intercept = c * mu, or:

c = 1 - (regression slope)
mu = (regression intercept) / c
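One way to sanity-check that algebra is to simulate a series with known c and mu and confirm the regression recovers them. All the numbers here (c = 0.1, mu = 5.0, the noise level, the seed) are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a mean-reverting series with known parameters:
# R(t+1) = (1 - c) * R(t) + c * mu + noise, with c = 0.1, mu = 5.0
c_true, mu_true, n = 0.1, 5.0, 50_000
r = np.empty(n)
r[0] = mu_true
for t in range(1, n):
    r[t] = (1 - c_true) * r[t - 1] + c_true * mu_true + 0.05 * rng.normal()

# OLS with an intercept: R(t+1) = intercept + slope * R(t)
X = np.column_stack([np.ones(n - 1), r[:-1]])
intercept, slope = np.linalg.lstsq(X, r[1:], rcond=None)[0]

c_hat = 1 - slope            # mean-reversion speed per step
mu_hat = intercept / c_hat   # long-run mean
print(c_hat, mu_hat)
```

With this much data the estimates should come back close to 0.1 and 5.0; with realistic sample sizes and coefficients near 1 they will be much noisier, which is the point of the discussion below.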

thanks a lot bchad! you're right, i forgot to include the constant. so basically, this is the Black-Karasinski process:

d log(R) = a * (mu - log(R)) dt + theta * dz

where a is the mean reversion rate. if i simplify that equation i get:

log(R(T+1)) = (1 - a) * log(R(T)) + a * mu + theta * dz

i modeled this as an AR(1), and the regression coefficient i got was (1 - a), so i would subtract it from 1 to solve for a. then, since my rates were daily, i multiplied by 260 to annualize a. however, now i'm finding out that my data was not stationary, so i couldn't do it like that. i think because the regression coefficients were really close to 1, my a became really close to 0, and when i multiplied it by 260 i got something in the 5% range for rates and something in the 20% range for volatilities. what i was trying to do was take differences and then do the same thing with the differenced data, but my coefficients were now coming out small (which would mean very high mean reversion after i subtract from 1, i.e. 98% daily mean reversion). then i thought that if B = 1 - b in the equation i wrote above, it would be 1 - (1 - b), so that would just be b and i could multiply the coefficient directly by 260. but now i guess i can't do that. any suggestions? appreciate the advice! i even tried de
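Here's a sketch of that estimation on simulated Black-Karasinski data, with the time step dt = 1/260 written out explicitly so the annualization is just (1 - slope) * 260. The parameter values and seed are invented. Note that with a true annual speed of only 4%, the daily slope sits around 0.9998; near a unit root, the OLS slope also tends to be biased downward in finite samples, which pushes the implied mean-reversion speed upward, one plausible reason daily estimates come out far above 4%:

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler discretisation of Black-Karasinski, x = log R, dt = 1/260:
# x(t+1) = (1 - a*dt) * x(t) + a*dt*mu + theta*sqrt(dt)*Z
a_true, mu, theta, dt = 0.04, np.log(0.05), 0.2, 1 / 260
n = 10 * 260                       # ten years of daily observations
x = np.empty(n)
x[0] = mu
for t in range(1, n):
    x[t] = (1 - a_true * dt) * x[t - 1] + a_true * dt * mu \
           + theta * np.sqrt(dt) * rng.normal()

# AR(1) regression with an intercept
X = np.column_stack([np.ones(n - 1), x[:-1]])
intercept, slope = np.linalg.lstsq(X, x[1:], rcond=None)[0]
a_hat = (1 - slope) * 260          # annualised mean-reversion speed
print(slope, a_hat)
```

Even with ten years of daily data, the standard error of the slope, multiplied by 260, is of the same order as a_true itself, so the annualised estimate is dominated by noise and small-sample bias.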

If you are doing daily data, I would expect that rate coefficients are really close to 1 and therefore potentially nonstationary at that frequency. It is possible that a higher frequency nonstationary process is dominated in the long term by a lower frequency stationary process, but that you wouldn’t be able to pick this up using daily data, because the long term mean reversion is just too small for the amount of daily noise. Essentially this means that you can’t tell where rates are going to be in a week, because it’s essentially a random walk from day-to-day, but you might be able to make some guess about where they are in a year, because the slower process dominates in the long term. It’s a bit like waves on the seashore. You won’t be able to predict how far the next wave goes, because it’s a random walk at the frequency of minutes, but you know that in 6 hours, the tide will be out (or in).

First run a DF test. If R(t) is stationary, you can run the first regression. If not, you have to run a cointegration test; you can only run your first model if the series are cointegrated. If not, then you can run the second regression and extract R(t) from that formula. does that help?
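If it comes to the cointegration route, here's a minimal sketch of the Engle-Granger two-step idea on two simulated series driven by a common random walk. The names and parameters are made up, and a real test would compare against Engle-Granger critical values (or just use `statsmodels.tsa.stattools.coint`):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3000

# Two series driven by a common random walk Z, so both are I(1),
# but y - x is stationary: they are cointegrated.
z = np.cumsum(rng.normal(size=n))
x = z + rng.normal(size=n)
y = z + rng.normal(size=n)

# Engle-Granger step 1: regress y on x, keep the residuals
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# Step 2: unit-root test on the residuals (no-constant DF regression);
# a strongly negative statistic suggests cointegration. The cutoffs
# come from Engle-Granger tables, not standard t-tables.
dr = np.diff(resid)
lag = resid[:-1]
rho = (lag @ dr) / (lag @ lag)
s2 = (dr - rho * lag) @ (dr - rho * lag) / (len(dr) - 1)
tstat = rho / np.sqrt(s2 / (lag @ lag))
print(round(beta[1], 3), round(tstat, 1))
```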

maratikus, that’s the procedure, but what would the series be cointegrated with? it looks like it’s a univariate time series being modeled.

bchad, well you could do some sort of error-correction framework if you know the underlying relationship (like with your waves example: tide goes in, tide goes out, as O'Reilly says). However, what do you do if you don't know the underlying relationship and you're trying to estimate some kind of long-memory mean-reversion equation?

thanks for the advice guys! i think i will try computing annual averages and then seeing what happens

bchadwick Wrote: ------------------------------------------------------- > maratikus, that’s the procedure, but what would > the series be cointegrated with? it looks like > it’s a univariate time series being modeled. The original series with the lagged series. Though I think they should be co-integrated as long as returns are stationary. Then b has to equal 1 (because R(T)-R(T-1) is stationary) and the absolute value of B has to be below 1 (because returns are stationary). There is no relationship between b and B.

Interesting. I never thought about testing to see if a series might be cointegrated with its lagged self. How would you interpret the causality of that? With two separate series, X and Y, you can say that cointegration means that both X and Y respond to a common random walk process, Z So Z -> X and Z -> Y. But I’m having trouble wrapping my head around the idea of a process where Z(t-1) -> Y(t-1) and Z(t-1) -> Y(t) I confess that it is a little easier to imagine it now that I’ve written it out in this post, but you probably can still shed light on how you interpret it.

I was taught that cointegration is with two contemporaneous time series. Assume, R(T) is I(1) and d(R(T)) is I(0). By definition, there must be a linear combination of R(T) and R(T-1) that is I(0) (ie. there must be some “cointegrating” relationship) since d(R(T)) is one possible linear combination of R(T) and R(T-1), where b=1, and it is I(0). Hence, you don’t really obtain any new information by saying R(T) is cointegrated with R(T-1). Testing for I(0) is sufficient and normally comes before any cointegration tests.
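A quick numerical check of this point: simulate an I(1) series and apply a no-constant DF t-statistic to the level and to the first difference, i.e. the b = 1 linear combination of R(T) and R(T-1). Simulated data and seed are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(4)
r = np.cumsum(rng.normal(size=5000))   # R(T) is I(1)

def df_tstat(s):
    """No-constant DF regression t-stat; strongly negative => stationary."""
    ds, lag = np.diff(s), s[:-1]
    rho = (lag @ ds) / (lag @ lag)
    e = ds - rho * lag
    return rho / np.sqrt((e @ e / (len(ds) - 1)) / (lag @ lag))

# The level keeps its unit root; the b = 1 combination (the first
# difference) is the stationary one, as the argument above says.
combo_b1 = r[1:] - 1.0 * r[:-1]
print(round(df_tstat(r), 1), round(df_tstat(combo_b1), 1))
```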

jmh530 Wrote: ------------------------------------------------------- > I was taught that cointegration is with two > contemporaneous time series. > > Assume, R(T) is I(1) and d(R(T)) is I(0). By > definition, there must be a linear combination of > R(T) and R(T-1) that is I(0) (ie. there must be > some “cointegrating” relationship) since d(R(T)) > is one possible linear combination of R(T) and > R(T-1), where b=1, and it is I(0). Hence, you > don’t really obtain any new information by saying > R(T) is cointegrated with R(T-1). Testing for I(0) > is sufficient and normally comes before any > cointegration tests. That’s exactly what I was thinking about.

bchadwick Wrote: ------------------------------------------------------- > Interesting. I never thought about testing to see > if a series might be cointegrated with its lagged > self. > > How would you interpret the causality of that? > With two separate series, X and Y, you can say > that cointegration means that both X and Y respond > to a common random walk process, Z > > So Z -> X and Z -> Y. > > > But I’m having trouble wrapping my head around the > idea of a process where > > Z(t-1) -> Y(t-1) and Z(t-1) -> Y(t) > > > I confess that it is a little easier to imagine it > now that I’ve written it out in this post, but you > probably can still shed light on how you interpret > it. I’m not quite sure what you mean by causality. I will try to apply my understanding of co-integration to your example. The way I understand co-integration: Two processes are co-integrated if both of them are non-stationary and there is a linear combination of them that is stationary. In your example, let’s assume non-stationary process Z(t-1)=Y(t-1) and stationary differences epsilon(t)=Y(t)-Y(t-1). Then both Y(t) and Y(t-1) are non-stationary and there is a linear combination Y(t)-Y(t-1) that is stationary since it’s equal to epsilon(t) which is a stationary process. We know that Y(t) and Y(t-1) are co-integrated because both of them are non-stationary and there is a linear combination of them that is stationary. Does that help?

OK, so I guess it really just shows that if a series Y(t) is non-stationary, differencing it might make it stationary (or not, since there are lots of ways to be non-stationary). I normally just difference (when a series is clearly trending) and test the differences for stationarity. I never stopped to think that this was equivalent to testing to see if it was cointegrated with its lagged self. That’s fine, and useful… I just thought there might be another interpretation that I was missing. For me, I tend to start with a model in my head that is informed by some causal ideas of what leads to what. When I think about cointegration, I’m almost always thinking “these two series are both influenced/reacting to/caused by something else which is nonstationary… let’s try and subtract out the nonstationary process and see if we can find a relationship with what’s left over.”

bchadwick, I was talking about statistical tests and parameter values. You suggest a smart way of building predictive models. I’m curious whether you build your models to directly benefit from spread trading or use predictability of the spread to improve forecasts?

I actually haven't been doing spread trading very actively these days, although I do like the thought process that goes into it, because I think it's pretty important to compartmentalize what you know and to know the limits of your insights. For example, I definitely spend time thinking about both macro and micro factors affecting a company (though, since I'm more of a macro type, I don't follow individual companies very often in my professional life - this might change). My framework is that the macro factors drive decisions about systematic risk, and the micro factors affect the idiosyncratic risk components. So I like spread trades in theory because they allow you to use information and insights about companies and their relative positions, and quantitative methods are extremely useful for separating out the portion of risk that your insights actually apply to.

I don't think this fully answers your question. I haven't really used cointegration models all that much, which might be why it's hard to answer squarely. Part of the issue is that the more mathematically sophisticated I get with stuff, the less I actually trust the models, because it's just too easy to slip in assumptions that are completely off base or can turn on a dime. But I do like trying to think through the models as rigorously as I can, even if I know they don't fully apply, because there are often insights about price and/or economic behavior that come out of it. Ultimately I am looking for simple systems that deliver decent returns most of the time without enormous tail risks, plus little insights (often qualitative, but quantitative as well) that most people don't think about, which help me spot extra juicy opportunities quickly.
The insights and opportunities don’t come all that often, but when they do, it’s usually a heuristic thing informed by some observation that came out when I was thinking through the math or the economics or the behavioral stuff in some other model. I also have to teach this stuff, occasionally, as I did for some Level II candidates. So cointegration is really something I “know about” more than something I “do.” That could change, though.

maratikus - To your point, most of the literature focuses on the latter (as far as 'complete' models go) rather than the former, though my sense is that it takes a certain level of infrastructure to prove viability in practice. Has this been your experience?