- 2017
Cost-sensitive feature selection via manifold learning
Abstract: To obtain a feature subset with low misclassification cost, we define a cost distance between samples and incorporate it into an existing feature selection framework. Combining manifold learning with cost-sensitive feature selection yields a new method, cost-sensitive feature selection via manifold learning (CFSM). Most previous cost-sensitive feature selection algorithms select features one at a time and consider only the correlation between the features and the misclassification cost. CFSM uses both that correlation and the discriminative information inherent in the data, which improves feature selection performance. Experiments on six real-world datasets show that CFSM outperforms existing related algorithms.