|
BMC Bioinformatics 2005
Integrating alternative splicing detection into gene prediction

Abstract: We have used a new integrative approach that allows alternative splicing (AS) detection to be incorporated into ab initio gene prediction. This method relies on the analysis of genomically aligned transcript sequences (ESTs and/or cDNAs) and has been implemented in the dynamic programming algorithm of the graph-based gene finder EuGèNE. Given a genomic sequence and a set of aligned transcripts, this new version identifies the transcripts carrying evidence of alternative splicing events and provides, in addition to the classical optimal gene prediction, alternative optimal predictions (among those consistent with the AS events detected). A single gene can thus receive multiple annotations, with each predicted variant supported by transcript evidence (though not necessarily with full-length coverage). This automatic combination of experimental data analysis and ab initio gene finding offers an ideal integration of alternatively spliced gene prediction within a single annotation pipeline.

Alternative splicing (AS) is a biological process that occurs during the maturation of a pre-mRNA, allowing the production of different mature mRNA variants from a single transcription unit. AS is known to play a key role in the regulation of gene expression and in transcriptome/proteome diversity [1]. First considered an exceptional event, AS is now thought to involve the majority of human multi-exon genes, from 50% to 74% [1-3]. This observation raises new issues for genome annotation, especially for computational gene finding, which generally provides only one exon-intron structure per sequence.

In the context of structural gene prediction, two classes of approaches are usually considered.
In the first approach, usually called intrinsic or ab initio, the only information used for gene prediction lies in the statistical properties of the various gene elements (exons, splice sites and other biological signals). In contrast, so-called
|