|
IMPLEMENTATION OF KEA-KEYPHRASE EXTRACTION ALGORITHM BY USING BISECTING K-MEANS CLUSTERING TECHNIQUE FOR LARGE AND DYNAMIC DATA SET
Keywords: KEA, K-means, Bisecting K-means, Clusters, Key Phrase Extraction
Abstract: In most traditional document clustering techniques, the total number of clusters is not known in advance, and the cluster that contains the target information cannot be determined because no semantic description is associated with each cluster. To solve this problem, this work proposes a new clustering algorithm based on the Kea [1] key phrase extraction algorithm, which uses machine learning techniques to return several key phrases from the source documents. In this work, documents are grouped into clusters as in Bisecting K-means, but the number of clusters is determined automatically by the algorithm, using heuristics over the extracted key phrases. This makes it easy to extract test documents from massive quantities of resources.
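The idea in the abstract can be illustrated with a minimal sketch, under stated assumptions. The abstract does not specify the heuristics used, so the stopping rule below (stop splitting a cluster once all of its documents share at least one key phrase) is a hypothetical stand-in, not the authors' method. Key phrases are assumed to have been extracted already (for example by Kea); the function names `two_means`, `shared_phrases`, and `cluster_by_keyphrases` are illustrative only. Unlike classical Bisecting K-means, which splits until a preset k is reached, this variant keeps bisecting until every cluster passes the rule, so the number of clusters is not fixed in advance.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(vectors, idx, iters=20):
    """Split the documents in idx into two groups with plain 2-means.

    Seeds are chosen deterministically as a far-apart pair of documents.
    Either returned list may be empty if the cluster cannot be split.
    """
    first = idx[0]
    b = max(idx, key=lambda i: sq_dist(vectors[i], vectors[first]))
    a = max(idx, key=lambda i: sq_dist(vectors[i], vectors[b]))
    c0, c1 = vectors[a], vectors[b]
    left, right = [], []
    for _ in range(iters):
        new_left = [i for i in idx
                    if sq_dist(vectors[i], c0) <= sq_dist(vectors[i], c1)]
        new_right = [i for i in idx if i not in new_left]
        converged = (new_left, new_right) == (left, right)
        left, right = new_left, new_right
        if converged or not left or not right:
            break
        c0 = mean([vectors[i] for i in left])
        c1 = mean([vectors[i] for i in right])
    return left, right


def shared_phrases(docs, idx):
    """Key phrases common to every document in idx (hypothetical heuristic)."""
    return set.intersection(*(docs[i] for i in idx))


def cluster_by_keyphrases(docs):
    """Bisect clusters until each one shares at least one key phrase.

    docs is a list of sets of key phrases, assumed to be extracted already.
    Returns a list of clusters, each a list of document indices.
    """
    vocab = sorted(set().union(*docs))
    # Binary bag-of-key-phrases vector for each document.
    vectors = [[1.0 if p in d else 0.0 for p in vocab] for d in docs]
    pending, done = [list(range(len(docs)))], []
    while pending:
        idx = pending.pop()
        if len(idx) <= 1 or shared_phrases(docs, idx):
            done.append(idx)
            continue
        left, right = two_means(vectors, idx)
        if not left or not right:  # identical vectors: cannot be split
            done.append(idx)
            continue
        pending += [left, right]
    return done


docs = [
    {"neural network", "back propagation"},
    {"neural network", "deep learning"},
    {"stock market", "interest rate"},
    {"stock market", "portfolio"},
]
clusters = cluster_by_keyphrases(docs)
print(sorted(sorted(c) for c in clusters))  # [[0, 1], [2, 3]]
```

In this toy input the two clusters end up labelled by their shared key phrases ("neural network" and "stock market"), which is one way a cluster could be given the semantic description that the abstract says traditional clustering lacks.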
|