|
BMC Bioinformatics 2010
VIGOR, an annotation program for small viral genomes

Abstract: We have developed VIGOR (Viral Genome ORF Reader), a web application tool for gene prediction in influenza virus, rotavirus, rhinovirus and coronavirus subtypes. VIGOR detects protein-coding regions based on sequence similarity searches, can accurately detect genome-specific features such as frame shifts, overlapping genes and embedded genes, and can predict mature peptides within the context of a single polypeptide open reading frame. Genotyping capability for influenza and rotavirus is built into the program. We compared VIGOR to the previously described gene prediction programs ZCURVE_V, GeneMarkS and FLAN. The specificity and sensitivity of VIGOR are greater than 99% for the RNA viral genomes tested.

VIGOR is a user-friendly web-based genome annotation program for five different viral agents: influenza, rotavirus, rhinovirus, coronavirus and SARS coronavirus. It is the first publicly accessible gene prediction program for rotavirus and rhinovirus. VIGOR accurately predicts protein-coding genes for these five viral types, can assign function to the predicted open reading frames, and can genotype influenza virus. The prediction software was designed for high-throughput annotation and closure validation in a post-sequencing production pipeline.

Rapid and cost-effective genomic surveillance of RNA viruses is a critical component of vaccine and drug development pipelines for the control of emerging viral diseases. Improvements in sequencing technology and the concomitant decrease in costs have made re-sequencing of large genomes, as well as parallel sequencing of small genomes, easier and more common. This has led to an exponential increase in the genomic data available in public databases.
However, accurate gene prediction is a challenge that has created a bottleneck in the gene prediction pipeline.

Two major approaches, ab initio gene finding and similarity-based prediction [1], have been commonly applied to gene predict
|