Non-hierarchic document clustering

Keywords: information retrieval, document clustering, clustering, algorithms, cluster analysis, automatic classification, genetic algorithms

Abstract:

Cluster analysis, or automatic classification, is a multivariate statistical technique that seeks to identify groups, or clusters, of similar objects in a multi-dimensional space. There have been many attempts over the years to use such procedures for the organisation of document databases, so that documents with large numbers of index terms in common are grouped together. In this paper, we consider the use of a genetic algorithm (GA) for document clustering. GAs are a class of non-deterministic algorithms that derive from Darwinian theories of evolution. They provide good, though not necessarily optimal, solutions to combinatorial optimisation problems in which the number of possible solutions is far too great for all of them to be explored in a reasonable time by a deterministic algorithm. One such problem is non-hierarchic clustering, in which the clustering method seeks to partition a set of objects into a set of non-overlapping groups so as to maximise some external criterion of goodness of clustering, typically the extent to which the within-cluster inter-object similarities are maximised and the between-cluster similarities minimised.
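
To make the idea concrete, the following is a minimal sketch of how a GA might search for a non-hierarchic partition of documents. It is not the method used in the paper: the encoding (one cluster label per document), the Jaccard similarity over index-term sets, the fitness function (mean within-cluster similarity minus mean between-cluster similarity), the one-point crossover, and every parameter value are assumptions chosen only for illustration.

```python
import random


def jaccard(a, b):
    """Similarity of two documents represented as sets of index terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def fitness(assignment, docs):
    """Assumed criterion: mean within-cluster minus mean between-cluster similarity."""
    within, between = [], []
    for i in range(len(docs)):
        for j in range(i + 1, len(docs)):
            sim = jaccard(docs[i], docs[j])
            if assignment[i] == assignment[j]:
                within.append(sim)
            else:
                between.append(sim)
    return mean(within) - mean(between)


def ga_cluster(docs, k, pop_size=30, generations=60, mutation_rate=0.1, seed=0):
    """Evolve a list of cluster labels (one per document) with a simple GA."""
    rng = random.Random(seed)
    n = len(docs)  # the one-point crossover below needs n >= 2
    # Initial population: random partitions into at most k groups.
    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda a: fitness(a, docs), reverse=True)
        parents = ranked[: pop_size // 2]  # truncation selection
        children = [ranked[0][:]]          # elitism: keep the best unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)      # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Mutation: occasionally reassign a document to a random cluster.
            child = [rng.randrange(k) if rng.random() < mutation_rate else label
                     for label in child]
            children.append(child)
        population = children
    return max(population, key=lambda a: fitness(a, docs))


# Hypothetical toy collection: two groups of documents with disjoint vocabularies.
docs = [
    {"gene", "mutation", "selection"},
    {"gene", "mutation", "crossover"},
    {"gene", "selection", "crossover"},
    {"index", "query", "ranking"},
    {"index", "query", "thesaurus"},
    {"index", "ranking", "thesaurus"},
]
best = ga_cluster(docs, k=2)
print(best, fitness(best, docs))
```

Because the search is stochastic, the returned partition is a good candidate rather than a guaranteed optimum, which matches the abstract's description of GAs. On this toy collection the partition that separates the two vocabularies scores 0.5 under the assumed fitness, and no assignment can score higher.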
