Computer Science (计算机科学), 2007
Closed Itemset Mining Algorithm Based on Index Array and Compound Frequent Itemset Tree
Abstract:
The set of frequent closed itemsets exactly determines the complete set of frequent itemsets and is usually much smaller than the latter. CROP is an efficient algorithm for mining frequent closed itemsets based on the compound frequent itemset tree (CFIST). However, the CFIST contains many redundant candidate nodes, and generating these non-closed nodes, checking their closure, and pruning them leads to high computational cost. This paper proposes CROP-Index, an improved algorithm for mining frequent closed itemsets. First, the "index array" is introduced to discover itemsets that always appear together. Then, a bitmap-based algorithm for computing the index array is presented. Next, frequent items are merged into the initial nodes of the CFIST according to the heuristic information provided by the index array. Finally, some new properties that avoid the generation of redundant nodes are proposed, so that the improved algorithm generates only closed nodes. As a result, unnecessary operations are avoided and the search space is reduced to some extent. The experimental results show that the proposed algorithm is efficient.
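
The abstract does not spell out how the index array is defined or computed, so the following Python sketch is only an illustration under stated assumptions, not the paper's algorithm. It assumes that the index array records, for each frequent item, the other frequent items that occur in every transaction containing it, and that each item's occurrences are stored as an integer bitmap with one bit per transaction. The function names `build_bitmaps` and `index_array` and the sample data are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List


def build_bitmaps(transactions: Iterable[Iterable[Hashable]],
                  min_support: int) -> Dict[Hashable, int]:
    """Map each frequent item to an int bitmap; bit t is set when
    transaction t contains the item (assumed representation)."""
    bitmaps: Dict[Hashable, int] = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    # Keep only items whose support (number of set bits) meets the threshold.
    return {item: bits for item, bits in bitmaps.items()
            if bin(bits).count("1") >= min_support}


def index_array(bitmaps: Dict[Hashable, int]) -> Dict[Hashable, List[Hashable]]:
    """For each frequent item, list the other frequent items that appear in
    every transaction containing it (assumed meaning of "index array")."""
    result: Dict[Hashable, List[Hashable]] = {}
    for item, bits in bitmaps.items():
        # `other` always accompanies `item` when item's bitmap is a subset
        # of other's bitmap, i.e. bits & other_bits == bits.
        result[item] = sorted(
            other for other, other_bits in bitmaps.items()
            if other != item and bits & other_bits == bits
        )
    return result


if __name__ == "__main__":
    # Hypothetical toy database of four transactions.
    db = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "b", "d"]]
    print(index_array(build_bitmaps(db, min_support=2)))
```

Under these assumptions, an item and the items listed for it could be merged into a single initial node, since extending the item by any of them does not change its support. How CROP-Index actually performs this merge is not described in the abstract.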