International Journal of Computer Science and Security 2012

A Comparative Analysis of Feature Selection Methods for Clustering DNA Sequences

B. Umamageswari, B. Karthikeyan, T. Nalini

Keywords: Evolutionary Tree, Data Mining, Bioinformatics, Euclidean Distance, PCA, Hierarchical Clustering Algorithms, mtDNA, Codons

Abstract:

Large-scale analysis of genome sequences is in progress around the world, a major application of which is to establish the evolutionary relationships among species using phylogenetic trees. Hierarchical agglomerative algorithms can generate such phylogenetic trees given a distance matrix representing the dissimilarity among the species. ClustalW and Muscle are two general-purpose programs that generate a distance matrix from input DNA or protein sequences. The limitation of these programs is that they are based on the Smith-Waterman algorithm, which uses dynamic programming to perform pairwise alignment. This is an extremely time-consuming process, and the existing systems may even fail to work for larger input data sets. To overcome this limitation, we have used the frequency of codon usage as an approximation to find the dissimilarity among species. The proposed technique further reduces the complexity by extracting only the significant features of the species from the mtDNA sequences, using techniques such as frequent codons, codons with maximum range value, or PCA. We have observed that the proposed system produces nearly accurate results in a significantly reduced running time.
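The alignment-free idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes codons are read as non-overlapping triplets from the first base, uses all 64 codons without any of the feature selection steps (frequent codons, maximum range, PCA), and uses a naive average-linkage merge as one possible hierarchical agglomerative algorithm. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from itertools import product
from math import sqrt

# All 64 codons in a fixed order, so every species maps to a vector of equal length.
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]

def codon_frequencies(seq):
    """Relative frequency of each codon, reading non-overlapping triplets from position 0 (assumed frame)."""
    seq = seq.upper()
    counts = dict.fromkeys(CODONS, 0)
    total = 0
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in counts:  # skip triplets containing ambiguous bases such as N
            counts[codon] += 1
            total += 1
    return [counts[c] / total if total else 0.0 for c in CODONS]

def euclidean(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def distance_matrix(sequences):
    """Pairwise dissimilarity between species, with no sequence alignment involved."""
    names = list(sequences)
    vectors = {n: codon_frequencies(sequences[n]) for n in names}
    dist = [[euclidean(vectors[a], vectors[b]) for b in names] for a in names]
    return names, dist

def average_linkage(names, dist):
    """Naive agglomerative clustering with average linkage; returns the tree as nested tuples."""
    clusters = {i: (names[i], [i]) for i in range(len(names))}
    next_id = len(names)
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for x in range(len(ids)):
            for y in range(x + 1, len(ids)):
                ma, mb = clusters[ids[x]][1], clusters[ids[y]][1]
                d = sum(dist[i][j] for i in ma for j in mb) / (len(ma) * len(mb))
                if best is None or d < best[0]:
                    best = (d, ids[x], ids[y])
        _, a, b = best
        tree_a, members_a = clusters.pop(a)
        tree_b, members_b = clusters.pop(b)
        clusters[next_id] = ((tree_a, tree_b), members_a + members_b)
        next_id += 1
    return next(iter(clusters.values()))[0]

# Toy sequences (made up, not real mtDNA) to exercise the pipeline.
toy = {
    "sp_a": "ATGATGATGCCC",
    "sp_b": "ATGATGCCCCCC",
    "sp_c": "GGGGGGTTTTTT",
}
names, dist = distance_matrix(toy)
tree = average_linkage(names, dist)
print(tree)
```

Each sequence is scanned once, so building the frequency vectors costs time linear in sequence length, whereas dynamic-programming pairwise alignment costs time proportional to the product of the two sequence lengths. A feature selection step, as proposed in the abstract, would shrink the 64-dimensional vectors before the distance computation.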
