The variance of the residual term is constant for all observations.

“The variance of the residual term is constant for all observations.” I don’t understand this sentence in the correlation and regression reading; I can understand the other assumptions. My question is: the residual term is the difference between the actual observation and the line, so shouldn’t there be only one number for the variance of the residual term across all observations? Why the word “constant”? Sorry, I’m not a native speaker of English. Thank you for your help.

Take any subperiod and check the variance of the error terms. If it’s statistically different from any other subperiod, then the variance is not constant, i.e., the errors are not homoskedastic, and the usual OLS standard errors (and the tests built on them) can’t be trusted.
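The subperiod check described above can be sketched in a few lines of Python (all the numbers here are my own toy example, not from the curriculum): build data whose error spread grows with x, fit a line by ordinary least squares, then compare the residual variance in the first and second halves of the sample.

```python
# Toy sketch (made-up numbers): simulate a regression whose error spread
# grows with x, then compare residual variance across two subperiods.
import random
import statistics

random.seed(42)
n = 200
xs = [i / 10 for i in range(n)]
# error standard deviation grows with x -> heteroskedastic by construction
ys = [2.0 + 0.5 * x + random.gauss(0, 0.2 + 0.1 * x) for x in xs]

# ordinary least squares intercept and slope (closed form)
mx = statistics.mean(xs)
my = statistics.mean(ys)
b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
    (x - mx) ** 2 for x in xs
)
b0 = my - b1 * mx

resid = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# variance of residuals in the first and second half of the sample
v1 = statistics.variance(resid[: n // 2])
v2 = statistics.variance(resid[n // 2 :])
print(v1, v2)  # second-half variance comes out noticeably larger
```

If the two halves gave similar variances instead, that would be consistent with homoskedasticity (at least along the time/x dimension you split on).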

i.e., time to run a Breusch-Pagan test; if heteroskedasticity does exist… use robust standard errors to correct for it.
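For the curious, the Breusch-Pagan idea is simple enough to hand-roll (in practice you’d use a stats package; this is just a sketch with made-up data): regress the squared residuals on x and form the LM statistic n·R², which is chi-square with 1 degree of freedom under homoskedasticity.

```python
# Hand-rolled sketch of the Breusch-Pagan test on simulated data.
import random
import statistics

def ols(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - b1 * mx, b1

random.seed(0)
n = 500
xs = [random.uniform(0, 10) for _ in range(n)]
# heteroskedastic by construction: error sd proportional to x
ys = [1.0 + 0.8 * x + random.gauss(0, 0.3 * x) for x in xs]

b0, b1 = ols(xs, ys)
resid2 = [(y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys)]

# auxiliary regression: squared residuals on x
a0, a1 = ols(xs, resid2)
m2 = statistics.mean(resid2)
ss_tot = sum((e2 - m2) ** 2 for e2 in resid2)
ss_res = sum((e2 - (a0 + a1 * x)) ** 2 for e2, x in zip(resid2, xs))
r2 = 1 - ss_res / ss_tot
lm = n * r2
print(lm)  # compare to 3.84, the 5% chi-square(1) critical value
```

With one explanatory variable in the auxiliary regression there is 1 degree of freedom; an LM statistic above 3.84 rejects homoskedasticity at the 5% level, which is the cue to switch to robust (White-corrected) standard errors.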

Thanks. That’s clear.

Just to add to what the others said … remember that EVERY data point has an error term.

oooh… is this part of econ for level 2?!

June2010 wrote:
> there should be only one number for the variance of the residual term for all observations.
Each data point can be above or below the regression line; the difference constitutes the error term. So the variance of all these error terms from all the data points should be constant, or we are in for heteroskedasticity.

> So the variance of all these error-terms from all the data-points should be constant or we are in for heteroskadisticty. just to pick on you, sure the variance is constant, by definition. The variance of all the error-terms from all the data-points is surely constant, so what’s your point?

Variance “SHOULD” be constant - not “IS” constant. You have a set of sample data points (X, Y) which you used to derive your single linear regression equation. Now substitute a point’s X back into the equation - you get a different Y-value (NewY), because of the process of making the regression equation. Y - NewY = your error term. Sum of Square of those error terms should be minimum (for all the points combined) and also should be a constant for all the different X,Y points in your sample (individually) - if not, the line is not a good fit to the points.

> Sum of Square of those error terms should be minimum (for all the points combined)
Almost correct. In order to get a good fit, this error should be small, not necessarily minimum. You could have the minimum sum of squared errors and the fit could still be bad; you want it small relative to total variation.
> and also should be a constant for all the different X,Y points in your sample (individually)
This is not correct. Every actual point is set apart from the estimated point by a different amount; it is not constant. What “constant” means is that the variance of any string of y’s should be constant, as I indicated earlier. Feel free to disagree.
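The “minimum vs. small” distinction is easy to demonstrate with a toy example (my own numbers, not from any curriculum): the OLS line always attains the minimum sum of squared errors, yet when noise dominates the signal that minimum is still large relative to total variation, so R² is low and the fit is poor.

```python
# Toy sketch: OLS minimizes SSE, but the minimum can still be a bad fit.
import random
import statistics

random.seed(1)
n = 100
xs = [random.uniform(0, 10) for _ in range(n)]
ys = [0.2 * x + random.gauss(0, 3.0) for x in xs]  # weak signal, big noise

mx, my = statistics.mean(xs), statistics.mean(ys)
b1 = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
    (x - mx) ** 2 for x in xs
)
b0 = my - b1 * mx

def sse(a, b):
    """Sum of squared errors for the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

ols_sse = sse(b0, b1)
worse = sse(b0, b1 + 0.1)  # any nearby line has a larger SSE

ss_tot = sum((y - my) ** 2 for y in ys)
r2 = 1 - ols_sse / ss_tot
print(ols_sse < worse, r2)  # SSE is minimal, yet R^2 is low
```

So “minimum” is guaranteed by the least-squares machinery regardless of how noisy the data are; whether that minimum is *small relative to total variation* is exactly what R² measures.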

What’s the confusion here? That’s what we wanted to convey in our earlier posts. When the real data points are plotted and fed to the regression software, the software tries to plot the ‘best-fit’ line, called the regression line, which, as you stated, should have small error. And the “constant” here refers to the variance of the error term, not to the difference between the actual data points and the points generated by the regression line. Some data points can be way above the best-fit line, some can be above but close, some can be way below, and so on. If the variance of those errors is not constant, then we are heteroskedastic. This is what I can remember from the last time I did L2; I still have to break open the L2-09 books.

"Sum of Square of those error terms should be minimum (for all the points combined) " looks like a description of least squares to me. Agree that the second statement is pretty clumsy. The variance of the residuals being constant si one of those niggly regression assumptions that is almost surely never true.

June2010 - If you tell me your email address, I can send you a picture that explains this. (I am a visual learner)

Leveltwo, can you send me that picture? Pauls234 at yahoo.com Thanks!

Looking at a picture of heteroskedasticity in a regression is the right way to go. There is one in the Schweser notes that clearly explains this issue.

Paul - Sent.