|
BMC Genomics 2007
Identification of unannotated exons of low abundance transcripts in Drosophila melanogaster and cloning of a new serine protease gene upregulated upon injury

Abstract: Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis. In agreement with the proposal that this locus is co-regulated in response to microbial infection, we show here that SP212 is also up-regulated upon injury.

Using the ORESTES methodology, we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.

Genome sequence determination of the model organism Drosophila melanogaster was a landmark that launched a new era for functional genomic studies in complex organisms. The almost complete version of the euchromatic DNA sequence was first released in March 2000 as a result of a collaborative effort of the Drosophila Genome Projects and Celera Genomics [1].
Using gene prediction software in combination with searches of protein and EST databases, initial in silico analyses indicated the existence of 13,601 protein-coding genes (PCG), an extraordinarily small number when compared to the approximately 19,000 PCG encoded in the C. elegans genome [1]. After release 1, an intensive collective effort took place to improve sequence quality and annotation, fill in the gaps,
|