- 2018
A Gene Selection Method Based on Neighborhood Rough Sets and Fish Swarm Intelligence
Abstract:
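Gene expression data are high-dimensional, small-sample, and uncertain. To handle such data, this paper combines neighborhood granulation, which tolerates fuzziness, with the fish swarm intelligence algorithm, which has global search capability, and proposes a gene selection method based on neighborhood rough sets and fish swarm intelligence. First, neighborhood rough sets are used to granulate the gene data into neighborhood granules. Second, an uncertainty evaluation function based on neighborhood classification accuracy is proposed to evaluate the uncertainty of the neighborhood granules and to identify key genes. Fish swarm intelligence is then incorporated to design a gene selection algorithm that selects a small number of key genes with strong classification ability. Finally, gene selection is performed on two cancer gene datasets, and an SVM classifier is used to run classification experiments on the selected key gene subsets. The experimental results show that the gene subsets obtained by this method have low redundancy and good classification performance.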
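The granulation and evaluation steps can be illustrated with a minimal sketch. The abstract does not give the exact form of the uncertainty evaluation function, so the code below substitutes the standard neighborhood dependency from the neighborhood rough set literature: the fraction of samples whose neighborhood granule contains only samples of the same class. The function names, the Euclidean distance, the radius `delta`, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
from math import sqrt

def neighborhood(samples, i, genes, delta):
    """Return indices of samples within distance delta of sample i,
    measured only on the selected genes (the neighborhood granule)."""
    xi = samples[i]
    granule = []
    for j, xj in enumerate(samples):
        dist = sqrt(sum((xi[g] - xj[g]) ** 2 for g in genes))
        if dist <= delta:
            granule.append(j)
    return granule

def neighborhood_dependency(samples, labels, genes, delta):
    """Stand-in evaluation function (assumption): share of samples whose
    granule is pure, i.e. every neighbor carries the same class label."""
    if not genes:
        return 0.0
    consistent = 0
    for i in range(len(samples)):
        granule = neighborhood(samples, i, genes, delta)
        if all(labels[j] == labels[i] for j in granule):
            consistent += 1
    return consistent / len(samples)

# Hypothetical toy data: 6 samples, 2 genes, expression scaled to [0, 1].
samples = [
    [0.10, 0.50],
    [0.15, 0.10],
    [0.20, 0.90],
    [0.80, 0.45],
    [0.85, 0.95],
    [0.90, 0.05],
]
labels = [0, 0, 0, 1, 1, 1]

# Gene 0 separates the two classes; gene 1 mixes them.
score_gene0 = neighborhood_dependency(samples, labels, [0], delta=0.2)
score_gene1 = neighborhood_dependency(samples, labels, [1], delta=0.2)
print(score_gene0, score_gene1)
```

In a full pipeline, a score of this kind would serve as the fitness that the fish swarm search tries to maximize over candidate gene subsets, with the selected subset then passed to an SVM classifier.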