Data Mining or Time Period Bias

"In a recent meeting, the team discussed how using the last two years of historical data for oil-related industries generated relationships between factors that had not existed in the past. One member of the team stated…"

What bias are we dealing with?

The answer says time-period bias, which I get, because the result is sensitive to the time period chosen. But couldn’t it be data mining too? I mean, they found a relationship that didn’t exist before.

In this particular example, I feel the difference between data mining and time-period bias is very subtle.

It doesn’t matter how many times I do this question, I keep choosing the wrong answer.

Can someone please clarify it for me?

What’s the definition of data mining?

It’s when you find a relationship among variables that didn’t exist before.

Is that the definition in the Level III curriculum?

  • Data-mining bias arises from repeatedly searching a dataset until a statistically significant pattern emerges. It is almost inevitable that some relationship will appear. Such patterns cannot be expected to have predictive value. Lack of an explicit economic rationale for a variable’s usefulness is a warning sign of a data-mining problem: no story, no future. Of course, the analyst must be wary of inventing the story after discovering the relationship and bear in mind that correlation does not imply causation.
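To make "repeatedly searching a dataset until a statistically significant pattern emerges" concrete, here is a minimal simulation sketch. Everything in it is hypothetical: the function names, the 200 candidate factors, the 24 observations, and the cutoff `r_critical=0.40` (an assumed, approximate 5% two-sided threshold for a correlation over 24 observations) are illustrative choices, not anything from the curriculum. All series are pure noise, so any "significant" factor is spurious by construction.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious_hits(n_factors=200, n_obs=24, r_critical=0.40, seed=0):
    """Test many pure-noise 'factors' against pure-noise 'returns'.

    r_critical=0.40 is an assumed, approximate 5% two-sided cutoff for
    |r| with 24 observations; it is illustrative only.
    """
    rng = random.Random(seed)
    returns = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits = 0
    for _ in range(n_factors):
        # Each candidate factor is independent noise: no real relationship.
        factor = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(factor, returns)) > r_critical:
            hits += 1
    return hits


if __name__ == "__main__":
    hits = count_spurious_hits()
    print(f"{hits} of 200 random factors look 'significant'")
```

The point of the sketch is the loop: the analyst keeps trying factors until something clears the bar. With enough tries, something almost always does, even though nothing real is there.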

But do you see my point? The problem states that they found a relationship that didn’t exist before.

I do.

Do you see my point?

(Hint: 13 December 2001 was the first day I had been married longer than I had been single.)

So I guess, for time-period bias, they need to specify the time frame?

Not necessarily.

But for data mining they need to say something about “repeatedly searching a dataset until a statistically significant pattern emerges.”
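By contrast, time-period bias involves no repeated searching: one relationship, tested once, gives a different answer depending on the window. A rough sketch under stated assumptions follows. The regime change is built into the simulated data on purpose, and the 96 "older" and 24 "recent" observations only loosely mirror the two-year window in the oil example. The function names and parameters are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid extra dependencies."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def window_correlations(n_old=96, n_recent=24, seed=1):
    """Return (early, recent, full-sample) correlations for one factor.

    Assumption for illustration: returns are unrelated to the factor in
    the older period and track it closely in the recent period.
    """
    rng = random.Random(seed)
    factor = [rng.gauss(0, 1) for _ in range(n_old + n_recent)]
    # Older regime: returns are independent noise.
    returns = [rng.gauss(0, 1) for _ in range(n_old)]
    # Recent regime: returns follow the factor plus a little noise.
    returns += [f + rng.gauss(0, 0.2) for f in factor[n_old:]]
    early = pearson(factor[:n_old], returns[:n_old])
    recent = pearson(factor[n_old:], returns[n_old:])
    full = pearson(factor, returns)
    return early, recent, full


if __name__ == "__main__":
    early, recent, full = window_correlations()
    print(f"early window:  {early:+.2f}")
    print(f"recent window: {recent:+.2f}")
    print(f"full sample:   {full:+.2f}")
```

An analyst who looks only at the recent window sees a strong relationship, while one who looks at the older window sees essentially none. Only one factor was ever tested, so this is sensitivity to the period chosen rather than data mining.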
