Closet Data Miners
There is a tendency, especially among some statisticians and econometricians, to be condescending about "data mining". This attitude is well known and understandable, and it can be explained as ignorance. It is the job of data miners to educate them and bring them into the 21st century. Recently, I noticed that even people involved in "machine learning" tend to look down on data mining. Now that is a surprise, especially given the history of failures associated with AI. Paradoxically, the leaders in different fields tend to be the most open to data mining, partly because they are not shy about admitting that all scientific inquiry involves some sort of data mining activity. In some sense, all scientists are closet data miners.
When one of our papers was rejected at IEEE ICDM 2006, one of the reviewers came down on "association rule mining" like a ton of bricks. He (or she) made the preposterous claim that association rule mining cannot be used for classification because it is based on frequency, and that we should look at other machine-learning-based algorithms instead! Hello, what about classification algorithms that use Bayes' rule (e.g., Naive Bayes)? Aren't those based on frequency? Data mining is unashamedly an inductive task, and it has to be based on counting.
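To make the point concrete, here is a minimal sketch of a Naive Bayes classifier built from nothing but frequency counts. The toy data, feature names, and `predict` helper are all made up for illustration; the point is simply that both the prior and the likelihoods are relative frequencies.

```python
from collections import Counter, defaultdict

# Toy training data: (features, label) pairs. Entirely made-up for illustration.
data = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
]

label_counts = Counter(label for _, label in data)
feature_counts = defaultdict(Counter)  # (feature, label) -> counts of values
for features, label in data:
    for feat, value in features.items():
        feature_counts[(feat, label)][value] += 1

def predict(features):
    """Pick the label maximizing P(label) * prod P(value | label), all from counts."""
    best_label, best_score = None, 0.0
    for label, n in label_counts.items():
        score = n / len(data)  # prior: a relative frequency
        for feat, value in features.items():
            # likelihood: another relative frequency (no smoothing, for brevity)
            score *= feature_counts[(feat, label)][value] / n
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(predict({"outlook": "sunny", "windy": "no"}))  # -> play
```

Every quantity in `predict` is a count divided by another count, which is exactly the sense in which Bayes-rule classifiers are "based on frequency".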
I like to think of the association rule mining framework as an "algorithm design pattern", similar in spirit to dynamic programming, whose key insight is that "subproblems of optimal problems must be optimal". In association rule mining, the key insight is the anti-monotone condition: "subsets of interesting sets must be interesting". How we go about using this insight is up to us. Who would have thought that the Maximum A Posteriori (MAP) estimate for a Markov Random Field could be derived using a max-flow min-cut algorithm? Similarly, one of the most fundamental advances in modern statistics is probably Gibbs sampling, and that appeared in an IEEE PAMI paper on image restoration. It seems that for some machine learners the world is bookended by PAC estimates and the kernel trick.
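The anti-monotone insight is what makes Apriori-style level-wise search work: a candidate itemset of size k can be pruned the moment any of its (k-1)-subsets is infrequent. Here is a minimal sketch with made-up toy transactions; the variable names and `min_support` threshold are assumptions for illustration.

```python
from itertools import combinations

# Toy transactions, made-up for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
]
min_support = 2  # "frequent" means appearing in at least 2 transactions

def support(itemset):
    return sum(1 for t in transactions if itemset <= t)

# Level-wise search: candidates of size k are built only from frequent
# sets of size k-1, because subsets of frequent sets must be frequent.
items = sorted({i for t in transactions for i in t})
frequent = [frozenset([i]) for i in items if support(frozenset([i])) >= min_support]
all_frequent = list(frequent)
k = 2
while frequent:
    candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
    # anti-monotone pruning: discard any candidate with an infrequent (k-1)-subset
    candidates = {c for c in candidates
                  if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
    frequent = [c for c in candidates if support(c) >= min_support]
    all_frequent.extend(frequent)
    k += 1
```

The pruning line is the whole design pattern in one place: it never counts the support of a candidate whose subsets have not already survived, which is what keeps the search space tractable.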

1 Comment:
Oh, I think that reviewer is clueless. There are many papers that use association rule mining / frequent pattern mining to extract features for classification. I guess he couldn't see what those extracted features would be, and he also didn't know much about the theory of induction (Bayesian or otherwise) anyway.
Unfortunately, data mining is not a very theory-driven field...