Open Reading Frame Phylogenetic Analysis on the Cloud

DOI: 10.1155/2013/614923


Abstract:

Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among members of Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.

1. Introduction

Understanding the evolutionary relationships between groups of organisms has become increasingly reliant on phylogenetic analysis. Phylogenies are usually presented as tree diagrams, known as phylogenetic trees. These trees are constructed from genetic similarities and differences between organisms. Comparative sequence analysis is a useful method for identifying genes, inferring the function of a gene's product, and identifying novel functional elements. By comparing several sequences along their entire length, researchers can find conserved residues that are likely preserved by natural selection. Reconstructing ancestral sequences can reveal the timing and directionality of mutations. These comparative analyses rely on phylogenetic trees.

A reading frame is a set of consecutive, nonoverlapping nucleotide triplets. A codon is a triplet that specifies an amino acid or a stop signal during translation. An open reading frame (ORF) is the section of a reading frame that contains no stop codons. A protein cannot be made if RNA transcription ceases prior to reaching the stop codon. Therefore, to ensure that the stop codon is translated at the correct position, the transcription termination pause site is located after the ORF. ORFs can be used to identify translated regions in DNA sequences. Long ORFs indicate candidate protein coding regions in a DNA sequence. ORFs have also been used to classify various virus families [1–3], including members of Norovirus [3, 4]. The Open Reading
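
The definitions of reading frame, codon, and ORF given above can be illustrated with a minimal Python sketch. It is not the method used by the proposed service. It assumes the common convention that an ORF starts at an ATG codon and ends at the first in-frame stop codon (TAA, TAG, or TGA), and it scans only the three forward-strand frames. The function name find_orfs and the min_codons parameter are hypothetical names chosen for this illustration.

```python
# Illustrative sketch only: assumes ATG as the start codon and the
# standard stop codons; only the three forward reading frames are scanned.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def find_orfs(seq, min_codons=2):
    """Return a list of (frame, start, end, orf_sequence) tuples.

    start and end are 0-based positions in seq (end is exclusive), and
    the stop codon is included in the reported ORF. min_codons is the
    minimum ORF length in codons, counting the start and stop codons.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the frame one nonoverlapping triplet at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((frame, start, end, seq[start:end]))
                start = None  # close it and look for the next start
    return orfs


if __name__ == "__main__":
    for frame, start, end, orf in find_orfs("CATGCCCTAAG"):
        print(frame, start, end, orf)
```

Sorting the returned tuples by ORF length would put the longest ORFs first, which matches the observation that long ORFs indicate candidate protein coding regions.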

References

[1]  S. Zimmerly, G. Hausner, and X. C. Wu, “Phylogenetic relationships among group II intron ORFs,” Nucleic Acids Research, vol. 29, no. 5, pp. 1238–1250, 2001.
[2]  C. Brandt-Carlson, J. S. Butel, and D. Wheeler, “Phylogenetic and structural analyses of MMTV LTR ORF sequences of exogenous and endogenous origins,” Virology, vol. 193, no. 1, pp. 171–185, 1993.
[3]  G. Zhao, X. Lu, X. Gu et al., “Molecular evolution of the H6 subtype influenza A viruses from poultry in eastern China from 2002 to 2010,” Virology Journal, vol. 8, p. 470, 2011.
[4]  K. Motomura, T. Oka, M. Yokoyama et al., “Identification of monomorphic and divergent haplotypes in the 2006-2007 norovirus GII/4 epidemic population by genomewide tracing of evolutionary history,” Journal of Virology, vol. 82, no. 22, pp. 11247–11262, 2008.
[5]  ORF Finder, http://www.ncbi.nlm.nih.gov/projects/gorf/.
[6]  D. V. Dhar and M. S. Kumar, “ORF investigator: a new ORF finding tool combining Pairwise Global Gene Alignment,” Research Journal of Recent Sciences, vol. 1, pp. 32–35, 2012.
[7]  StarORF, http://star.mit.edu/orf/index.html.
[8]  National Center for Biotechnology Information (NCBI), http://www.ncbi.nlm.nih.gov/.
[9]  European Molecular Biology Laboratory (EMBL), http://www.ebi.ac.uk/embl/.
[10]  M. Sanjay Ram and V. Vijayaraj, “Analysis of the characteristics and trusted security of cloud computing,” International Journal on Cloud Computing, vol. 1, pp. 61–69, 2011.
[11]  Amazon EC2, http://aws.amazon.com/ec2/.
[12]  Google App Engine, http://code.google.com/appengine/.
[13]  Windows Azure, http://www.microsoft.com/windowsazure/windowsazure/.
[14]  D. Nurmi, R. Wolski, C. Grzegorczyk et al., “The eucalyptus open-source cloud-computing system,” in Proceedings of the Cloud Computing and Its Applications (CCA '08), pp. 124–131, May 2009.
[15]  P. Watson, P. Lord, F. Gibson, P. Periorellis, and G. Pitsilis, “Cloud computing for e-science with CARMEN,” in Proceedings of the 2nd Iberian Grid Infrastructure Conference (IBERGRID '08), pp. 1–5, 2008.
[16]  B. Rochwerger, D. Breitgand, E. Levy et al., “The Reservoir model and architecture for open federated cloud computing,” IBM Journal of Research and Development, vol. 53, no. 4, pp. 535–545, 2009.
[17]  C. Jin and R. Buyya, “MapReduce programming model for .NET-based cloud computing,” Lecture Notes in Computer Science, vol. 5704, pp. 417–428, 2009.
[18]  Hadoop, http://hadoop.apache.org/.
[19]  D. Borthakur, The Hadoop Distributed File System: Architecture and Design, 2007.
[20]  A. Matsunaga, M. Tsugawa, and J. Fortes, “CloudBLAST: combining MapReduce and virtualization on distributed resources for bioinformatics applications,” in Proceedings of the 4th IEEE International Conference on eScience (eScience '08), pp. 222–229, December 2008.
[21]  B. Langmead, M. C. Schatz, J. Lin, M. Pop, and S. L. Salzberg, “Searching for SNPs with cloud computing,” Genome Biology, vol. 10, no. 11, article R134, 2009.
[22]  K. Motomura, T. Oka, M. Yokoyama et al., “Identification of monomorphic and divergent haplotypes in the 2006-2007 norovirus GII/4 epidemic population by genomewide tracing of evolutionary history,” Journal of Virology, vol. 82, no. 22, pp. 11247–11262, 2008.
[23]  J. D. Thompson, D. G. Higgins, and T. J. Gibson, “CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice,” Nucleic Acids Research, vol. 22, no. 22, pp. 4673–4680, 1994.
[24]  A. Bessani, V. V. Cogo, M. Correia et al., “Making Hadoop MapReduce Byzantine fault-tolerant,” http://www.gsd.inesc-id.pt/~mpc/pubs/bft-mapreduce-fa-dsn10.pdf.
[25]  G. Eason, B. Noble, and I. N. Sneddon, “On certain integrals of Lipschitz-Hankel type involving products of Bessel functions,” Philosophical Transactions of the Royal Society London A, vol. 247, pp. 529–551, 1955.
[26]  J. N. Xi, D. Y. Graham, K. Wang, and M. K. Estes, “Norwalk virus genome cloning and characterization,” Science, vol. 250, no. 4987, pp. 1580–1583, 1990.
[27]  G. Belliot, S. V. Sosnovtsev, T. Mitra, C. Hammer, M. Garfield, and K. Y. Green, “In vitro proteolytic processing of the MD145 Norovirus ORF1 nonstructural polyprotein yields stable precursors and products similar to those detected in calicivirus-infected cells,” Journal of Virology, vol. 77, no. 20, pp. 10957–10974, 2003.
[28]  J. L. Hyde, S. V. Sosnovtsev, K. Y. Green, C. Wobus, H. W. Virgin, and J. M. Mackenzie, “Mouse norovirus replication is associated with virus-induced vesicle clusters originating from membranes derived from the secretory pathway,” Journal of Virology, vol. 83, no. 19, pp. 9709–9719, 2009.
[29]  P. J. Glass, L. J. White, J. M. Ball, I. Leparc-Goffart, M. E. Hardy, and M. K. Estes, “Norwalk virus open reading frame 3 encodes a minor structural protein,” Journal of Virology, vol. 74, no. 14, pp. 6581–6591, 2000.
[30]  A. Bertolotti-Ciarlet, S. E. Crawford, A. M. Hutson, and M. K. Estes, “The 3′ end of Norwalk virus mRNA contains determinants that regulate the expression and stability of the viral capsid protein VP1: a novel function for the VP2 protein,” Journal of Virology, vol. 77, no. 21, pp. 11603–11615, 2003.
[31]  G. S. Hansman, K. Natori, H. Shirato-Horikoshi et al., “Genetic and antigenic diversity among noroviruses,” Journal of General Virology, vol. 87, no. 4, pp. 909–919, 2006.
[32]  T. Kageyama, M. Shinohara, K. Uchida et al., “Coexistence of multiple genotypes, including newly identified genotypes, in outbreaks of gastroenteritis due to Norovirus in Japan,” Journal of Clinical Microbiology, vol. 42, no. 7, pp. 2988–2995, 2004.
[33]  K. Katayama, H. Shirato-Horikoshi, S. Kojima et al., “Phylogenetic analysis of the complete genome of 18 Norwalk-like viruses,” Virology, vol. 299, no. 2, pp. 225–239, 2002.
[34]  T. Ando, J. S. Noel, and R. L. Fankhauser, “Genetic classification of Norwalk-like viruses,” The Journal of Infectious Diseases, vol. 181, supplement 2, pp. S336–S348, 2000.
