


Similarity Studies of Corona Viruses through Chaos Game Representation

DOI: 10.4236/cmb.2020.103004, PP. 61-72

Keywords: Covid-19, Chaos Game Representation, Deoxyribonucleic Acid, Phylogenetic Analysis, Shannon Entropy


Abstract:

The novel coronavirus (SARS-CoV-2), generally referred to as the Covid-19 virus, has spread to 213 countries, with nearly 7 million confirmed cases and nearly 400,000 deaths. Such major outbreaks demand classification of the virus genomic sequence and identification of its origin for planning, containment, and treatment. Motivated by this need, we report two alignment-free methods that combine with Chaos Game Representation (CGR) to perform clustering analysis and build a phylogenetic tree from the result. To each DNA sequence we associate a matrix, and we define the distance between two DNA sequences as the distance between their associated matrices. These methods are used for phylogenetic analysis of coronavirus sequences. Our approach provides a powerful tool for analyzing and annotating genomes and their phylogenetic relationships. We also compare our tool with the ClustalX algorithm, one of the most popular alignment methods. Our alignment-free methods are shown to be capable of finding the closest genetic relatives of coronaviruses.
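The matrix-and-distance idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say which matrix or which distance the authors use, so everything here is an assumption made for illustration: the matrix is a k-mer frequency CGR grid of size 2^k by 2^k, the corner assignment (A, C, G, T) is one common convention, the distance is plain Euclidean, and the names `cgr_matrix` and `matrix_distance` are hypothetical.

```python
import math

# Assumed corner convention for the unit square: (x, y) per nucleotide.
CORNERS = {"A": (0, 0), "C": (0, 1), "G": (1, 1), "T": (1, 0)}


def cgr_matrix(seq, k=3):
    """Return a 2^k x 2^k matrix of normalized k-mer counts laid out as in a CGR.

    Each k-mer falls in the grid cell that the chaos-game midpoint walk
    reaches after reading that k-mer; row 0 is the bottom (y near 0).
    """
    size = 2 ** k
    counts = [[0] * size for _ in range(size)]
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in CORNERS for base in kmer):
            continue  # skip windows containing ambiguous symbols such as N
        row = col = 0
        # The last base is the most recent move, so it supplies the most
        # significant bit of the cell index.
        for base in reversed(kmer):
            cx, cy = CORNERS[base]
            col = col * 2 + cx
            row = row * 2 + cy
        counts[row][col] += 1
        total += 1
    if total == 0:
        return [[0.0] * size for _ in range(size)]
    return [[c / total for c in r] for r in counts]


def matrix_distance(m1, m2):
    """Euclidean distance between two matrices of the same shape."""
    return math.sqrt(
        sum((a - b) ** 2 for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))
    )


if __name__ == "__main__":
    # Toy sequences, not real viral genomes.
    s1 = "ATGGCGTACGTTAGC"
    s2 = "ATGGCGTACGTTAGG"
    print(matrix_distance(cgr_matrix(s1), cgr_matrix(s2)))
```

Under these assumptions, computing `matrix_distance` for every pair of genomes gives a distance matrix that could then be passed to a standard hierarchical clustering or tree-building routine; which routine the paper uses is not stated in the abstract.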

