%0 Journal Article
%T TnpPred: A Web Service for the Robust Prediction of Prokaryotic Transposases
%A Gonzalo Riadi
%A Cristobal Medina-Moenne
%A David S. Holmes
%J International Journal of Genomics
%D 2012
%I Hindawi Publishing Corporation
%R 10.1155/2012/678761
%X Transposases (Tnps) are enzymes that participate in the movement of insertion sequences (ISs) within and between genomes. Genes that encode Tnps are amongst the most abundant and widely distributed genes in nature. However, they are difficult to predict bioinformatically, and given the increasing availability of prokaryotic genomes and metagenomes, it is important to develop rapid, high-quality automatic annotation of ISs. This need prompted us to develop a web service, termed TnpPred, for Tnp discovery. As determined by ROC analysis, it provides better sensitivity and specificity for Tnp predictions than currently available programs. TnpPred should be useful for improving genome annotation. The TnpPred web service is freely available for noncommercial use.
1. Introduction
Insertion sequences (ISs) are small, mobile DNA elements that usually contain a gene encoding a transposase that catalyzes the movement of the IS from one part of the genome to another. ISs are found in nearly all prokaryotes [1, 2], sometimes at very high frequency per genome, and are among the most abundant genes in nature [3]. They play a major role in lateral gene transfer, genome organization, and genome evolution [4]. Many ISs are bounded by short terminal inverted repeats (IRs), and some generate short direct repeats (DRs) when they integrate into the genome. ISs are classified into 19 families based on amino acid sequence similarity of the transposases and DNA sequence similarity, including that of the respective IRs and DRs, supported in some cases by phylogenetic profiling [5, 6]. Given the increasing availability of prokaryotic genomes and metagenomes, it is important to develop rapid, high-quality automatic annotation of ISs.
Unfortunately, the transposases of many ISs are currently annotated incorrectly as having other functions or are identified as "hypothetical." In addition, their annotation is complicated by the presence of numerous partial ISs scattered in most genomes, representing the remains of once active ISs. Recently, the web application ISsaga was released, providing high-quality IS annotation [7] based on information available from curated IS families present in the ISfinder database [5]. One advantage of the ISsaga pipeline is that it combines IS (DNA) and transposase (protein) sequence searches for the prediction of complete and partial ISs. The DNA and protein sequence searches are based on a suite of BLAST programs (BLASTN, BLASTX, and BLASTP) [8, 9]. IScan is another application that makes use of BLAST to scan whole genomes for ISs and includes in its
%U http://www.hindawi.com/journals/ijg/2012/678761/