PLOS ONE, 2014

A Graph-Theoretic Approach for Identifying Non-Redundant and Relevant Gene Markers from Microarray Data Using Multiobjective Binary PSO

DOI: 10.1371/journal.pone.0090949


Abstract:

The purpose of feature selection is to identify the relevant and non-redundant features in a dataset. In this article, the feature selection problem is formulated as a graph-theoretic problem in which a feature-dissimilarity graph is constructed from the data matrix. The nodes represent features and the edges represent the dissimilarity between them. Nodes are weighted by feature relevance, and edges are weighted by the dissimilarity between the features they connect. The problem of finding relevant and non-redundant features is then mapped to the problem of finding the densest subgraph. We propose a multiobjective particle swarm optimization (PSO)-based algorithm that simultaneously optimizes the average node weight and the average edge weight of the candidate subgraph. The proposed algorithm is applied to identify relevant and non-redundant disease-related genes from microarray gene expression data, and its performance is compared with that of several existing feature selection techniques on different real-life microarray gene expression datasets.
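The graph construction and the two objectives described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes absolute Pearson correlation with the class labels as the relevance (node-weight) measure and 1 minus absolute pairwise correlation as the dissimilarity (edge-weight) measure; the paper's exact measures may differ. A multiobjective binary PSO would then search over binary selection masks, maximizing both objectives to approximate the densest subgraph.

```python
import numpy as np

def build_graph(X, y):
    """Build the feature-dissimilarity graph from a data matrix X
    (samples x features) and class labels y.

    Node weights: |Pearson correlation| of each feature with the labels
    (an assumed relevance measure for illustration).
    Edge weights: 1 - |Pearson correlation| between feature pairs
    (an assumed dissimilarity measure for illustration)."""
    n_features = X.shape[1]
    node_w = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    corr = np.corrcoef(X, rowvar=False)
    edge_w = 1.0 - np.abs(corr)
    np.fill_diagonal(edge_w, 0.0)  # no self-loops
    return node_w, edge_w

def objectives(mask, node_w, edge_w):
    """The two objectives optimized simultaneously: average node weight
    (relevance) and average edge weight (non-redundancy) of the subgraph
    induced by the binary selection mask. Both are maximized."""
    idx = np.flatnonzero(mask)
    k = len(idx)
    if k == 0:
        return 0.0, 0.0
    avg_node = node_w[idx].mean()
    if k < 2:
        return avg_node, 0.0
    sub = edge_w[np.ix_(idx, idx)]
    # average over the k*(k-1)/2 edges of the induced subgraph
    avg_edge = sub[np.triu_indices(k, 1)].mean()
    return avg_node, avg_edge
```

Each PSO particle encodes one binary mask; the swarm maintains a Pareto archive of non-dominated masks with respect to these two objective values.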

References

[1]  Kohavi R, John G (1997) Wrappers for feature subset selection. Artificial Intelligence 97: 273–324. doi: 10.1016/s0004-3702(97)00043-x
[2]  Ruiza R, Riquelmea J, Aguilar-Ruizb J (2006) Incremental wrapper-based gene selection from microarray data for cancer classification. Pattern Recognition 39: 2383–2392. doi: 10.1016/j.patcog.2005.11.001
[3]  Mitra P, Murthy C, Pal S (2002) Unsupervised feature selection using feature similarity. IEEE Transactions on Pattern Analysis and Machine Intelligence 24: 301–312. doi: 10.1109/34.990133
[4]  Jiang S, Wang L (2012) An unsupervised feature selection framework based on clustering. In: New Frontiers in Applied Data Mining.
[5]  Cai D, Zhang C, He X (2010) Unsupervised feature selection for multi-cluster data. In: KDD10 Washington DC USA.
[6]  Dy J, Brodley C, Kak A, Broderick L, Aisen A (2003) Unsupervised feature selection applied to content-based retrieval of lung images. IEEE Transactions on Pattern Analysis and Machine Intelligence 25: 373–378. doi: 10.1109/tpami.2003.1182100
[7]  Morita M, Oliveira L, Sabourin R (2004) Unsupervised feature selection for ensemble of classifiers. In: Frontiers in Handwriting Recognition.
[8]  Zhang Z, Hancock E (2011) A graph-based approach to feature selection. Springer.
[9]  Bahmani B, Kumar R, Vassilvitskii S (2012) Densest subgraph in streaming and mapreduce. VLDB Endowment 5: 454–465.
[10]  Li Y, Lu B, Wu Z (2006) A hybrid method of unsupervised feature selection based on ranking. In: IEEE Computer Society Washington DC USA.
[11]  Liu Y, Wang G, Chen H, Dong H, Zhu X, et al. (2011) An improved particle swarm optimization for feature selection. Journal of Bionic Engineering 97: 191–200. doi: 10.1016/s1672-6529(11)60020-6
[12]  Tang E, Suganthan P, Yao X (2005) Feature selection for microarray data using least squares svm and particle swarm optimization. In: IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology.
[13]  Chen LF, Su CT, Chen KH (2012) An improved particle swarm optimization for feature selection. Intelligent Data Analysis 16: 167–182.
[14]  Mohamad M, Omatu S, Deris S, Yoshioka M, Abdullah A, et al. (2013) An enhancement of binary particle swarm optimization for gene selection in classifying cancer classes. Algorithms for Molecular Biology 8.
[15]  Xue B, Cervante L, Shang L, Browne W, Zhang M (2012) A multi-objective particle swarm optimisation for filter-based feature selection in classification problems. Connect Sci 24: 91–116. doi: 10.1080/09540091.2012.737765
[16]  Lashkargir M, Monadjemi S, Dastjerdi A (2009) A hybrid multi-objective particle swarm optimization method to discover biclusters in microarray data. International Journal of Computer Science and Information Security 4.
[17]  Xue B, Zhang M, Browne W (2013) Particle swarm optimization for feature selection in classification: A multi-objective approach. IEEE Transactions on Cybernetics 43: 1656–1671. doi: 10.1109/tsmcb.2012.2227469
[18]  Deb K (2001) Multi-objective Optimization Using Evolutionary Algorithms. England: John Wiley and Sons.
[19]  Coello CC (2006) Evolutionary multiobjective optimization: a historical view of the field. IEEE Computational Intelligence Magazine 1: 28–36. doi: 10.1109/mci.2006.1597059
[20]  Chuang L, Hsiao C, Yang C (2011) An improved binary particle swarm optimization with complementary distribution strategy for feature selection. In: International Conference on Machine Learning and Computing.
[21]  Cheok M, Yang W, Pui C, Downing J, Cheng C, et al. (2003) Characterization of Pareto dominance. Operations Research Letters 31.
[22]  Maulik U, Mukhopadhyay A, Bandyopadhyay S (2009) Combining pareto-optimal clusters using supervised learning for identifying co-expressed genes. BMC Bioinformatics 10.
[23]  Yoon Y, Lee J, Park S, Bien S, Chung H, et al. (2008) Direct integration of microarrays for selecting informative genes and phenotype classification. Information Sciences 178: 88–105. doi: 10.1016/j.ins.2007.08.013
[24]  Alon U, Barkai N, Notterman D, Gish K, Ybarra S, et al. (1999) Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci USA 96: 6745–6750.
[25]  Gordon G, Jensen R, Hsiao L, Gullans S, Blumenstock J, et al. (2002) Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma. Cancer Res 62: 4963–4967.
[26]  Jaeger J, Sengupta R, Ruzzo W (2003) Improved gene selection for classification of microarrays. In: Pac Symp Biocomput.
[27]  Hanczar B, Courtine M, Benis A, Hennegar C, Clement K, et al. (2003) Improving classification of microarray data using prototype-based feature selection. SIGKDD Explorations Newsletter.
[28]  M-Cedeno A, Q-Dominguez J, C-Januchs M, Andina D (2010) Feature selection using sequential forward selection and classification applying artificial metaplasticity neural network. In: Proc of the IEEE Industrial Electronics Society.
[29]  Mao K (2004) Orthogonal forward selection and backward elimination algorithms for feature subset selection. IEEE Transactions on Systems, Man and Cybernetics-Part B: Cybernetics 34: 629–634. doi: 10.1109/tsmcb.2002.804363
[30]  Hall M, Smith L (1999) Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper. In: Proc. of the 12th International FLAIRS Conference.
[31]  Mankiewicz R (2000) The Story of Mathematics. Princeton University Press.
[32]  Troyanskaya O, Garber M, Brown P, Botstein D, Altman R (2002) Nonparametric methods for identifying differentially expressed genes in microarray data. Bioinformatics 18: 1454–1461. doi: 10.1093/bioinformatics/18.11.1454
[33]  Ding C, Peng H (2005) Minimum redundancy feature selection for microarray gene expression data. Journal of Bioinformatics and Computational Biology 3: 185–205. doi: 10.1142/s0219720005001004
[34]  Kamandar M, Ghassemian H (2011) Maximum relevance, minimum redundancy band selection for hyperspectral images. In: 19th Iranian Conference on Electrical Engineering (ICEE).
[35]  Cover T, Thomas J (2006) Entropy, relative entropy and mutual information. Elements of Information Theory John Wiley & Sons.
[36]  Kamandar M, Ghassemian H (2009) A cluster-based feature selection approach. In: International Conference on Hybrid Artificial Intelligence Systems.
[37]  Kamandar M, Ghassemian H (2011) A graph-based approach to feature selection. In: International Workshop on Graph-Based Representations in Pattern Recognition.
[38]  Eisen M, Spellman P, Brown P, Botstein D (1998) Cluster analysis and display of genome-wide expression patterns. Proc National Academy of Sciences 95: 14863–14867. doi: 10.1073/pnas.95.25.14863
[39]  Krause E. Taxicab Geometry. Addison-Wesley Innovative Series. Addison-Wesley Pub Co.
[40]  Baya A, Larese M, Granitto P, Gomez J, Tapia E (2007) Gene set enrichment analysis using non-parametric scores. Springer-Verlag Berlin Heidelberg.
[41]  Parsopoulos K (2010) Particle swarm optimization and intelligence: Advances and applications. Information science reference Hershey New York.
[42]  Unler A, Murat A (2010) A discrete particle swarm optimization method for feature selection in binary classification problems. European Journal of Operational Research 206: 528–539. doi: 10.1016/j.ejor.2010.02.032
[43]  Deb K, Pratap A, Agrawal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6: 182–197. doi: 10.1109/4235.996017
[44]  Sierra M, Coello CC (2006) Multi-objective particle swarm optimizers: A survey of the state-of-the-art. International Journal of Computational Intelligence Research 2: 287–308. doi: 10.5019/j.ijcir.2006.68
[45]  Lee I, Lushington G, Visvanathan M (2011) A filter-based feature selection approach for identifying potential biomarkers for lung cancer. Journal of Clinical Bioinformatics 1.
[46]  Wang X, Gotoh O (2009) Cancer classification using single genes. In: International Conference on Genome Informatics.
[47]  Fukuta K, Okada Y (2012) Informative gene discovery in dna microarray data using statistical approach. In: Proc of the Intelligent Control and Innovative Computing.
[48]  Shipp M, Ross K, Tamayo P, Weng A, Kutok J, et al. (2002) Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nature Medicine 8.
[49]  Cheok M, Yang W, Pui C, Downing J, Cheng C, et al. (2003) Treatment-specific changes in gene expression discriminate in vivo drug response in human leukemia cells. Nature Genetics 34: 85–90. doi: 10.1038/ng1151
