Document Clustering in Web Search Engine

Keywords: document clustering, k-means, fast k-means algorithm

Abstract:

As the number of web pages grows, it becomes more difficult to find relevant documents through information retrieval engines; clustering helps by grouping relevant documents together. The main purpose of clustering techniques is to partition a set of entities into different groups, called clusters, whose members are similar to one another. As the name suggests, representative-based clustering techniques use some form of representation for each cluster, so every group has a member that represents it. Representatives serve mainly to reduce the cost of the algorithm, and they also make the process easier to understand. The most popular clustering technique is the k-means algorithm, but it has several disadvantages: it runs slowly and does not scale to large databases. We therefore use the fast greedy k-means algorithm, which overcomes these drawbacks of k-means and is both accurate and efficient. We also introduce an efficient method to compute the distortion for this algorithm.
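
For illustration only, the following minimal Python sketch shows plain k-means together with a distortion measure. It is not the fast greedy k-means variant the abstract refers to, and it is not the authors' method for computing distortion. The abstract does not define distortion, so the sketch assumes the common definition: the sum of squared Euclidean distances from each point to its nearest centroid. The function names and the toy document vectors are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda j: squared_distance(point, centroids[j]))


def distortion(points, centroids):
    """Assumed definition: sum of squared distances to the nearest centroid."""
    return sum(squared_distance(p, centroids[nearest(p, centroids)])
               for p in points)


def kmeans(points, k, iterations=100, seed=0):
    """Plain (Lloyd-style) k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct input points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest representative.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


# Toy term-weight vectors standing in for four documents (hypothetical data).
docs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
centroids, labels = kmeans(docs, k=2)
print(labels, distortion(docs, centroids))
```

Each centroid here plays the role of the cluster representative described in the abstract. Every iteration compares all points against all centroids, which is the cost that makes plain k-means slow on large collections.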
