TnpPred: A Web Service for the Robust Prediction of Prokaryotic Transposases

DOI: 10.1155/2012/678761

Abstract:

Transposases (Tnps) are enzymes that participate in the movement of insertion sequences (ISs) within and between genomes. Genes that encode Tnps are amongst the most abundant and widely distributed genes in nature. However, they are difficult to predict bioinformatically, and given the increasing availability of prokaryotic genomes and metagenomes, rapid, high-quality automatic annotation of ISs is needed. This need prompted us to develop a web service for Tnp discovery, termed TnpPred. As determined by ROC analysis, it provides better sensitivity and specificity for Tnp predictions than currently available programs. TnpPred should be useful for improving genome annotation. The TnpPred web service is freely available for noncommercial use.

1. Introduction

Insertion sequences (ISs) are small, mobile DNA elements that usually contain a gene encoding a transposase, which catalyzes the movement of the IS from one part of the genome to another. ISs are found in nearly all prokaryotes [1, 2], sometimes at very high frequency per genome, and their transposase genes are among the most abundant genes in nature [3]. They play a major role in lateral gene transfer, genome organization, and genome evolution [4]. Many ISs are bounded by short terminal inverted repeats (IRs), and some generate short direct repeats (DRs) when they integrate into the genome. ISs are classified into 19 families based on the amino acid sequence similarity of their transposases and on DNA sequence similarity, including the respective IRs and DRs; in some cases the classification is also supported by phylogenetic profiling [5, 6]. Given the increasing availability of prokaryotic genomes and metagenomes, rapid, high-quality automatic annotation of ISs is needed.
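
To make the notion of terminal inverted repeats concrete, the following minimal Python sketch measures the longest perfect IR at the ends of a DNA string by comparing the left end with the reverse complement of the right end. The function names and the toy sequence are hypothetical illustrations, not part of TnpPred or of any IS annotation tool, and real IRs are often imperfect, so a practical search would have to tolerate mismatches.

```python
# Illustrative sketch only: the sequence below is a made-up toy element,
# not a real insertion sequence, and only perfect repeats are detected.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element: str, max_len: int = 50) -> int:
    """Return the length of the longest perfect terminal inverted repeat.

    The left end of the element is compared base by base with the
    reverse complement of its right end, stopping at the first mismatch.
    """
    rc = reverse_complement(element)
    limit = min(max_len, len(element) // 2)
    length = 0
    while length < limit and element[length] == rc[length]:
        length += 1
    return length


# Toy element: 9 bp left IR + 18 bp interior + 9 bp right IR.
toy_element = "GGAAGGTGC" + "ATGAAACCCTTTGGGTAA" + "GCACCTTCC"
print(terminal_inverted_repeat_length(toy_element))
```

For this toy element the right end GCACCTTCC is the reverse complement of the left end GGAAGGTGC, so the function reports a 9 bp inverted repeat.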
Unfortunately, the transposases of many ISs are currently annotated incorrectly as having other functions or are labeled “hypothetical.” In addition, their annotation is complicated by the presence of numerous partial ISs scattered through most genomes, representing the remains of once active ISs. Recently, the web application ISsaga was released, providing high-quality IS annotation [7] based on information from the curated IS families in the ISfinder database [5]. One advantage of the ISsaga pipeline is that it combines IS (DNA) and transposase (protein) sequence searches for the prediction of complete and partial ISs. The DNA and protein sequence searches are based on a suite of BLAST programs (BLASTN, BLASTX, and BLASTP) [8, 9]. IScan is another application that makes use of BLAST to scan whole genomes for ISs and includes in its

References

[1]  M. Touchon and E. P. C. Rocha, “Causes of insertion sequences abundance in prokaryotic genomes,” Molecular Biology and Evolution, vol. 24, no. 4, pp. 969–981, 2007.
[2]  P. Siguier, J. Filée, and M. Chandler, “Insertion sequences in prokaryotic genomes,” Current Opinion in Microbiology, vol. 9, no. 5, pp. 526–531, 2006.
[3]  R. K. Aziz, M. Breitbart, and R. A. Edwards, “Transposases are the most abundant, most ubiquitous genes in nature,” Nucleic Acids Research, vol. 38, no. 13, Article ID gkq140, pp. 4207–4217, 2010.
[4]  F. De la Cruz and J. Davies, “Horizontal gene transfer and the origin of species: lessons from bacteria,” Trends in Microbiology, vol. 8, no. 3, pp. 128–133, 2000.
[5]  P. Siguier, J. Perochon, L. Lestrade, J. Mahillon, and M. Chandler, “ISfinder: the reference centre for bacterial insertion sequences,” Nucleic Acids Research, vol. 34, pp. D32–D36, 2006.
[6]  J. Mahillon and M. Chandler, Insertion Sequences Revisited. In Mobile DNA II, ASM Press, Washington, DC, USA, 2002.
[7]  A. M. Varani, P. Siguier, E. Gourbeyre, V. Charneau, and M. Chandler, “ISsaga is an ensemble of web-based methods for high throughput identification and semi-automatic annotation of insertion sequences in prokaryotic genomes,” Genome Biology, vol. 12, no. 3, article R30, 2011.
[8]  S. F. Altschul, T. L. Madden, A. A. Schäffer et al., “Gapped BLAST and PSI-BLAST: a new generation of protein database search programs,” Nucleic Acids Research, vol. 25, no. 17, pp. 3389–3402, 1997.
[9]  S. F. Altschul, J. C. Wootton, E. M. Gertz et al., “Protein database searches using compositionally adjusted substitution matrices,” FEBS Journal, vol. 272, no. 20, pp. 5101–5109, 2005.
[10]  A. Wagner, C. Lewis, and M. Bichsel, “A survey of bacterial insertion sequences using IScan,” Nucleic Acids Research, vol. 35, no. 16, pp. 5284–5293, 2007.
[11]  M. Madera and J. Gough, “A comparison of profile hidden Markov model procedures for remote homology detection,” Nucleic Acids Research, vol. 30, no. 19, pp. 4321–4328, 2002.
[12]  R. D. Finn, J. Mistry, and J. Tate, “The Pfam protein families database,” Nucleic Acids Research, vol. 38, pp. D211–D222, 2010.
[13]  R. Leplae, G. Lima-Mendez, and A. Toussaint, “ACLAME: a CLAssification of mobile genetic elements, update 2010,” Nucleic Acids Research, vol. 38, no. 1, Article ID gkp938, pp. D57–D61, 2009.
[14]  D. Wilson, M. Madera, C. Vogel, C. Chothia, and J. Gough, “The SUPERFAMILY database in 2007: families and functions,” Nucleic Acids Research, vol. 35, no. 1, pp. D308–D313, 2007.
[15]  A. Andreeva, D. Howorth, J. M. Chandonia et al., “Data growth and its impact on the SCOP database: new developments,” Nucleic Acids Research, vol. 36, no. 1, pp. D419–D425, 2008.
[16]  K. D. Pruitt, T. Tatusova, W. Klimke, and D. R. Maglott, “NCBI reference sequences: current status, policy and new initiatives,” Nucleic Acids Research, vol. 37, no. 1, pp. D32–D36, 2009.
[17]  R. Chenna, H. Sugawara, T. Koike et al., “Multiple sequence alignment with the Clustal series of programs,” Nucleic Acids Research, vol. 31, no. 13, pp. 3497–3500, 2003.
[18]  R. Durbin, S. R. Eddy, A. Krogh, and G. J. Mitchison, Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids, Cambridge University Press, 1998.
[19]  Upgrading to TLS Within HTTP/1.1, http://tools.ietf.org/html/rfc2817.
[20]  The text/css Media Type, http://tools.ietf.org/html/rfc2318.
[21]  T. Fawcett, “An introduction to ROC analysis,” Pattern Recognition Letters, vol. 27, no. 8, pp. 861–874, 2006.
[22]  B. Boeckmann, A. Bairoch, R. Apweiler et al., “The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003,” Nucleic Acids Research, vol. 31, no. 1, pp. 365–370, 2003.
[23]  R. Apweiler, M. J. Martin, C. O'Donovan et al., “Ongoing and future developments at the Universal Protein Resource,” Nucleic Acids Research, vol. 39, supplement 1, pp. D214–D219, 2011.
[24]  S. D. Hooper, K. Mavromatis, and N. C. Kyrpides, “Microbial co-habitation and lateral gene transfer: what transposases can tell us,” Genome Biology, vol. 10, no. 4, article R45, 2009.
[25]  S. Schmitz-Esser, T. Penz, A. Spang, and M. Horn, “A bacterial genome in transition—an exceptional enrichment of IS elements but lack of evidence for recent transposition in the symbiont Amoebophilus asiaticus,” BMC Evolutionary Biology, vol. 11, no. 1, article 270, 2011.
