International Journal of Computer Trends and Technology 2012

Document Clustering in Web Search Engine

A.S.N.Chakravarthy, Deepthi.S, K.Satyatej, Sk.Nizmi, S.Sindhura

Keywords: document clustering, k-means, fast k-means algorithm

Abstract:

As the number of web pages grows, it becomes harder to find relevant documents through information retrieval engines. Clustering addresses this by grouping related documents together. The main purpose of clustering techniques is to partition a set of entities into groups, called clusters, whose members should be similar to one another. As the name suggests, representative-based clustering techniques use some form of representation for each cluster, so every group has a member that represents it. Using representatives reduces the cost of the algorithm and makes the process easier to understand. The most popular clustering technique is the k-means algorithm, but it has notable disadvantages: it runs slowly and is not applicable to large databases. The fast greedy k-means algorithm is therefore used; it overcomes these drawbacks of k-means and is both accurate and efficient. We also introduce an efficient method to compute the distortion for this algorithm.
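For orientation, the following is a minimal sketch of the standard (Lloyd's) k-means procedure and of distortion, taken here to mean the sum of squared distances from each point to its assigned cluster center. It is not the fast greedy k-means algorithm or the distortion method proposed in the paper, whose details the abstract does not give. The function names (`squared_distance`, `assign`, `distortion`, `kmeans`), the random initialization, and the representation of documents as numeric vectors are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def distortion(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(squared_distance(p, centers[l]) for p, l in zip(points, labels))


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's k-means (assumed baseline, not the paper's variant)."""
    rng = random.Random(seed)
    # Initialize centers with k distinct points chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = assign(points, centers)
    for _ in range(iterations):
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
        # Assignment step: stop once no point changes cluster.
        new_labels = assign(points, centers)
        if new_labels == labels:
            break
        labels = new_labels
    return centers, labels, distortion(points, centers, labels)
```

Calling `kmeans(vectors, k)` on a list of document feature vectors returns the centers, the cluster label of each document, and the final distortion. Every iteration compares each point with every center, which is the cost that makes the plain algorithm slow on large collections.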
