Assumptions of the Multiple Linear Regression Model

The model is based on six assumptions. When these assumptions hold, the regression estimators are unbiased, efficient, and consistent. Unbiased means that the expected value of the estimator equals the true value of the parameter. Efficient means that the estimator has a smaller variance than any other unbiased estimator. Consistent means that the bias and variance of the estimator approach zero as the sample size grows.

1. The relationship between the dependent variable, Y, and the independent variables, X1, X2, …, Xk, is linear.
2. The independent variables are not random, and no exact linear relationship exists between two or more of the independent variables.
3. The expected value of the error term, conditional on the independent variables, is 0.
4. The variance of the error term is constant for all observations, i.e., the errors are homoskedastic.
5. The error term is uncorrelated across observations (i.e., no serial correlation).
6. The error term is normally distributed.

It is important to note that a linear regression cannot be estimated when an exact linear relationship exists between two or more independent variables. When two or more independent variables are highly correlated, although not exactly linearly related, the result is a multicollinearity problem. Even if an independent variable is random, as long as it is uncorrelated with the error term, the regression results remain reliable.
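The point about exact versus near linear relationships can be illustrated numerically. Below is a minimal sketch (the variable names and data are made up for illustration): when one regressor is an exact linear function of another, the design matrix loses rank and the OLS normal equations have no unique solution; with a near-exact relationship, estimation goes through but is multicollinear.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x1 = rng.normal(size=n)
x2 = 2 * x1 + 3  # exact linear function of x1 -> perfect collinearity

# Design matrix with an intercept column
X = np.column_stack([np.ones(n), x1, x2])

# The third column is a linear combination of the first two,
# so X has rank 2, not 3, and X'X is singular: OLS can't be estimated.
print(np.linalg.matrix_rank(X))  # 2

# A near-exact (highly correlated but not perfect) relationship:
# the rank is full and OLS is computable, but estimates are unstable.
x2_noisy = 2 * x1 + 3 + rng.normal(scale=0.01, size=n)
Xn = np.column_stack([np.ones(n), x1, x2_noisy])
print(np.linalg.matrix_rank(Xn))  # 3
```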


Did you ask this question and answer it yourself? I guess that's okay. A little odd, maybe. Anyway, if all you want is unbiased, efficient, and consistent, then you can relax some of those assumptions. You can relax the requirement that the independent variables be non-random; indeed, most of the time you run a regression, you will have random independent variables. You can also relax the normality of the error term if you only need unbiasedness and consistency. In fact, the OLS estimators have all kinds of great properties whenever the error distribution has mean 0 and finite variance (efficiency, however, does depend on the distribution).
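The claim above is easy to check by simulation. Here is a hedged sketch (the parameter values and sample sizes are arbitrary choices): with a random regressor and a non-normal (uniform, mean-0) error, the average OLS slope estimate over many repetitions sits very close to the true slope, which is what unbiasedness predicts.

```python
import numpy as np

rng = np.random.default_rng(42)
true_beta = 1.5
n, reps = 50, 2000
estimates = []

for _ in range(reps):
    x = rng.normal(size=n)               # random independent variable
    eps = rng.uniform(-1, 1, size=n)     # non-normal error, mean 0, finite variance
    y = 0.5 + true_beta * x + eps
    X = np.column_stack([np.ones(n), x])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    estimates.append(beta_hat[1])

# The average slope estimate should be close to the true value of 1.5,
# even though X is random and the errors are not normal.
print(np.mean(estimates))
```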