Thursday, October 19, 2006

The Curse of Data Mining

Most data miners are familiar with the Curse of Dimensionality: for many algorithms, computation time grows exponentially with the number of dimensions. The statistical version is that the amount of data needed to carry out a low-variance inference also grows exponentially with the dimension.
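
To make the exponential growth concrete, here is a minimal sketch. It assumes a simple grid view of the problem: each dimension is split into a fixed number of bins, and we want some minimum number of observations in every cell. The function names and the choice of 10 bins per axis are illustrative assumptions, not part of any particular method.

```python
def cells_needed(bins_per_axis: int, dims: int) -> int:
    """Number of grid cells when each of `dims` axes is split into `bins_per_axis` bins."""
    return bins_per_axis ** dims


def samples_needed(bins_per_axis: int, dims: int, per_cell: int) -> int:
    """Observations required to place `per_cell` points in every grid cell."""
    return per_cell * cells_needed(bins_per_axis, dims)


if __name__ == "__main__":
    # Illustrative settings (assumed): 10 bins per axis, 5 points per cell.
    for d in (1, 2, 5, 10):
        print(d, cells_needed(10, d), samples_needed(10, d, 5))
```

Each added dimension multiplies both the cell count and the required sample size by the number of bins per axis, which is the exponential dependence described above.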

In data mining it is often said that 80% of a data miner's time is spent on data preparation. I think the nature of data mining suggests that it will be very hard to break this barrier. In some sense, it is the curse of data mining.
