Acta Automatica Sinica, 2011

Fast Kernel Density Estimate Theorem and Scaling Up Graph-Based Relaxed Clustering Method

DOI: 10.3724/SP.J.1004.2011.01422, PP. 1422-1434

Qian Peng-Jiang, Wang Shi-Tong, Deng Zhao-Hong

Keywords: kernel density estimation, large-scale data sets, clustering, sampled subset


Abstract:

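This paper first proves the fast kernel density estimate (FKDE) theorem: the error between the Gaussian kernel density estimate (KDE) computed on a sampled subset and the KDE of the original data set depends on the sample size and the kernel parameter, but not on the total number of data points. It then shows that, with a Gaussian kernel, the objective function of the graph-based relaxed clustering (GRC) algorithm can be decomposed into the form "weighted sum of Parzen windows + quadratic entropy", so that GRC can be viewed as a kernel density estimation problem. Building on this KDE approximation strategy, the paper proposes a method for large-scale graph-based relaxed clustering: scaling up GRC by KDE approximation (SUGRC-KDEA). Compared with previous work, the advantage of this method is that it offers a simpler and easier-to-implement way to apply GRC to large data sets.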
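To illustrate the idea behind the FKDE statement, the following minimal Python sketch compares a Gaussian KDE built on a full data set with one built on a random subset. It is an empirical illustration only, not the paper's proof or the SUGRC-KDEA algorithm. The function names `gaussian_kde` and `subset_kde_error`, the one-dimensional synthetic data, the bandwidth `h = 0.3`, and the mean-squared-gap error measure are all assumptions made for this example.

```python
import numpy as np


def gaussian_kde(queries, data, h):
    """Gaussian Parzen-window density estimate at each 1-D query point."""
    # Pairwise scaled differences, shape (n_queries, n_data).
    diffs = (queries[:, None] - data[None, :]) / h
    kernel_sum = np.exp(-0.5 * diffs ** 2).sum(axis=1)
    return kernel_sum / (len(data) * h * np.sqrt(2.0 * np.pi))


def subset_kde_error(data, m, h, queries, rng):
    """Mean squared gap between the full-data KDE and the KDE
    built from a random subset of size m (drawn without replacement)."""
    subset = rng.choice(data, size=m, replace=False)
    full = gaussian_kde(queries, data, h)
    approx = gaussian_kde(queries, subset, h)
    return float(np.mean((full - approx) ** 2))


# Hypothetical synthetic data: a two-component Gaussian mixture.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 5000), rng.normal(1.0, 1.0, 5000)])
queries = np.linspace(-5.0, 5.0, 200)

# Report the gap for a few subset sizes at a fixed bandwidth.
for m in (50, 200, 800):
    err = subset_kde_error(data, m, h=0.3, queries=queries, rng=rng)
    print(f"subset size {m:4d}: mean squared gap = {err:.3e}")
```

In a run of this kind one would expect the gap to shrink as the subset size `m` grows. Changing the total number of points while holding `m` and `h` fixed should have little effect, which is consistent with what the theorem states.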

