Efficient Discovery of Error-Tolerant Frequent Itemsets in High Dimensions

December 23, 2014

We present a generalization of frequent itemsets that allows for errors in the itemset definition. We motivate the problem and present an efficient algorithm that identifies error-tolerant frequent clusters of items in transactional data (customer purchase data, web browsing data, text, etc.). The algorithm exploits the sparseness of the underlying data to find large groups of items that are correlated over database records (rows).
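To make the idea of tolerating errors concrete, the following is a minimal sketch, not the algorithm from the paper. It assumes one possible reading of "error-tolerant": a row supports an itemset if it is missing at most a fraction eps of the itemset's items. The function name tolerant_support, the parameter eps, and the sample rows are all hypothetical, and the brute-force scan makes no use of sparseness.

```python
from typing import Iterable, List, Set


def tolerant_support(rows: List[Set[str]], itemset: Iterable[str], eps: float) -> float:
    """Fraction of rows missing at most eps * |itemset| of the items.

    Assumed definition for illustration only; with eps = 0 this reduces to
    the ordinary (exact) support of a frequent itemset.
    """
    items = frozenset(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    if not rows:
        return 0.0
    # A row counts as a supporter if the number of absent items is within budget.
    budget = eps * len(items)
    hits = sum(1 for row in rows if len(items - row) <= budget)
    return hits / len(rows)


# Hypothetical purchase records: each row is the set of items in one transaction.
rows = [
    {"bread", "milk", "eggs", "butter"},
    {"bread", "milk", "eggs"},
    {"bread", "milk", "butter"},
    {"milk", "eggs", "butter"},
    {"bread", "jam"},
]
itemset = {"bread", "milk", "eggs", "butter"}

exact = tolerant_support(rows, itemset, eps=0.0)      # only the first row matches
tolerant = tolerant_support(rows, itemset, eps=0.25)  # one missing item allowed
```

Under these assumptions, only one of the five rows contains all four items, while four of the five contain at least three of them, so a group of items that looks infrequent under exact matching becomes frequent once a small error is tolerated.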
