Journal of Nanjing Normal University (Natural Science Edition), 2010

Research on the Selection of Initial Cluster Centers

pp. 161-165

YANG Tianxia (杨天霞), WANG Zhihe (王治和), WANG Hua (王华), WANG Lingyun (王凌云)

Keywords: k-means, sequential pattern, Huffman tree, clustering, initial centers

Abstract:

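This paper studies the problem of re-clustering and re-mining a sequence database using frequent sequential patterns that have already been discovered. To address the instability of clustering results caused by the random selection of initial centers in existing k-means algorithms, it proposes an initial-center selection algorithm based on the Huffman idea: k-SPAM (k-means algorithm of sequence pattern mining based on the Huffman method). The algorithm reduces, to some extent, the likelihood of being trapped in a local optimum, and it computes the similarity between sequences with efficient AND and OR operations, which can greatly improve execution efficiency.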
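
The abstract does not give the details of the algorithm, so the following is only an illustrative sketch of the two ideas it names, under stated assumptions. It assumes each sequence is encoded as an integer bitmap whose bits mark which frequent patterns the sequence contains, that similarity is the ratio of the bit counts of the AND and the OR of two bitmaps (a Jaccard-style measure), and that "Huffman idea" means repeatedly merging the two most similar groups, much as a Huffman tree merges its two lowest-weight nodes, until k groups remain. The function names `similarity`, `medoid`, and `huffman_style_centers` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def similarity(a: int, b: int) -> float:
    """Assumed measure: popcount(a AND b) / popcount(a OR b)."""
    union = a | b
    if union == 0:
        return 1.0  # two empty bitmaps are treated as identical
    return bin(a & b).count("1") / bin(union).count("1")


def medoid(members: list[int], bitmaps: list[int]) -> int:
    """Index of the member with the highest total similarity to its group."""
    return max(
        members,
        key=lambda m: sum(similarity(bitmaps[m], bitmaps[o]) for o in members),
    )


def huffman_style_centers(bitmaps: list[int], k: int) -> list[int]:
    """Pick k initial centers by Huffman-like bottom-up merging (a sketch).

    Returns the indices of the chosen center sequences, in ascending order.
    """
    if not 1 <= k <= len(bitmaps):
        raise ValueError("k must be between 1 and the number of sequences")

    # Start with every sequence in its own group.
    groups = [[i] for i in range(len(bitmaps))]
    while len(groups) > k:
        # Represent each group by the bitmap of its medoid.
        reps = [bitmaps[medoid(g, bitmaps)] for g in groups]
        # Merge the two groups whose representatives are most similar.
        i, j = max(
            combinations(range(len(groups)), 2),
            key=lambda p: similarity(reps[p[0]], reps[p[1]]),
        )
        merged = groups[i] + groups[j]
        groups = [g for n, g in enumerate(groups) if n not in (i, j)]
        groups.append(merged)

    return sorted(medoid(g, bitmaps) for g in groups)


if __name__ == "__main__":
    # Four toy sequences over four hypothetical frequent patterns.
    data = [0b1100, 0b1101, 0b0011, 0b0111]
    print(huffman_style_centers(data, k=2))
```

Because the merge order depends only on the data, the centers returned by this sketch are deterministic, which is the property the abstract contrasts with random initialization. The returned indices would then seed an ordinary k-means style iteration over the same bitmaps.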
