|
Genome Biology 2011
Coding potential of the products of alternative splicing in human

Abstract: In this study we analyze alternative splicing isoforms of human gene products that are unambiguously identified by mass spectrometry and compare their properties with those of isoforms of the same genes for which no peptide was found in publicly available mass spectrometry datasets. We analyze them in detail for the presence of uninterrupted functional domains and active sites, as well as for the plausibility of their predicted structure. We report how well each of these strategies, and their combination, can correctly identify translated isoforms, and we derive a lower limit for their specificity, that is, their ability to correctly identify non-translated products.

The most effective strategy for correctly identifying translated products relies on the conservation of active sites, but it can only be applied to a small fraction of isoforms, while reasonably high coverage, sensitivity and specificity can be achieved by analyzing the presence of non-truncated functional domains. Combining the latter with an assessment of the plausibility of the modeled structure of the isoform increases both coverage and specificity at a moderate cost in sensitivity.

Alternative splicing (AS) is a mechanism used by cells to diversify the proteins produced by a gene. Estimates of the amount of AS in humans have risen dramatically over recent years, especially since the advent of novel high-throughput sequencing technologies [1-3], reaching up to 95% of multi-exon genes [4].

While the role of AS in expanding the functional complexity of a genome is established, it is less clear whether all generated transcripts do indeed encode functional proteins and therefore expand the coding potential of a genome. Cases are known of events that produce splicing variants (isoforms) showing novel and sometimes unexpected structural and functional properties [5,6].
On the other hand, evidence from analysis of sequences, structures and homology models suggests that many AS isoforms, even if detectable a
|