Point of using squares rather than absolute value

Ok so this may be more of a math theory question, but it’s something that bugs me.

What is the advantage of using the sum of squares and then taking the root, as opposed to just using the standard error? I’m reading about using the RMSE and I don’t see why this would be used instead of the seemingly simpler option.

What are you talking about? Variance and standard deviation?

Sort of.

The Time Series reading tells me that the root mean square error is how we compare the accuracy of different forecasting models for out-of-sample predictions (another side question… is RMSE used for in-sample predictions as well, and if not, why not? It doesn’t say in my notes…). My question is why we average the squared errors and take the root of that, rather than simply take the average of the absolute values of the errors. I understand that they would give slightly different values, but I don’t understand why using squares is more desirable.

I suppose it relates to variance as well. To find the standard deviation, why do we take the mean of the squared differences and then square root it, rather than just take the average of the absolute differences? I feel like this was explained briefly in quant level I, but I can’t remember and I’m having trouble finding it. Does it have something to do with reducing the impact of outliers?
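For concreteness, these are the two measures being compared, written in the usual notation (xᵢ for the observations, x̄ for their mean) — just a reference sketch, not something taken from the reading:

```latex
% Mean absolute deviation vs. (population) standard deviation
\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\bigl|x_i - \bar{x}\bigr|
\qquad\text{vs.}\qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(x_i - \bar{x}\bigr)^{2}}
```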

If you look at the graph of y = |x|, you’ll see a sharp corner at (0, 0). You cannot calculate the derivative of a function at a sharp corner. Statisticians love to calculate derivatives.

The graph of y = x² has no sharp corners; you can calculate the derivative everywhere. Statisticians love to calculate derivatives.
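To make the corner point concrete, here is the standard calculus behind it (nothing curriculum-specific, just the derivatives of the two penalty functions):

```latex
% x^2 is differentiable everywhere; |x| is not differentiable at x = 0
\frac{d}{dx}\,x^{2} = 2x \quad \text{for all } x,
\qquad
\frac{d}{dx}\,\lvert x\rvert =
\begin{cases}
-1, & x < 0 \\
+1, & x > 0 \\
\text{undefined}, & x = 0 .
\end{cases}
```

The jump from −1 to +1 at zero is exactly the sharp corner, and it is what makes squared errors so much more convenient to work with analytically.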

In the L1 results thread I noticed a lot of people thanking you for helping them get through the material. Now I understand 🙂

Thanks again.

My pleasure.

One way to calculate variance is to use a sum-of-squares formula that takes into account both the likelihood of occurrence and the severity of each outcome. The root mean squared error is a frequently used measure of the differences between the values predicted by a model and the values actually observed from the environment being modelled.
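For reference, the usual definition, with yₜ the observed value and ŷₜ the model’s forecast over n (out-of-sample) periods — standard notation, not quoted from the reading:

```latex
% Root mean squared error over n forecasts
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\bigl(y_t - \hat{y}_t\bigr)^{2}}
```

Because the errors are squared before being averaged, large errors are penalized more heavily than small ones, which is part of why it behaves differently from the mean absolute error.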