BMC Bioinformatics 2009
In silico screening for functional candidates amongst hypothetical proteins

Abstract: Here, we present an in silico selection strategy in which eukaryotic hypothetical proteins are sorted according to two criteria that can be reliably identified in silico: the presence of subcellular targeting signals and the presence of characterized protein domains. To validate the selection strategy, we applied it to a database of human hypothetical proteins dating from 2006 and compared the proteins that our selection strategy predicted to be expressed with their status in 2008. For the comparison we focused on mitochondrial proteins, since a considerable amount of research was devoted to this field between 2006 and 2008. As a result, many proteins defined as hypothetical in 2006 have since been characterized as mitochondrial.

Among all human proteins that were hypothetical in 2006, 21% have since been experimentally characterized, and 6% of those have been shown to have a role in a mitochondrial context. In contrast, among the hypothetical proteins from the 2006 dataset that our strategy selected and predicted to have a mitochondrial role, 53-62% have since been experimentally characterized, and 85% of these had actually been assigned a role in mitochondria by 2008.

Therefore, our in silico selection strategy can be used to select the most promising candidates for subsequent in vitro and in vivo analyses.

According to the Human Genome Organization (HUGO), the human genome is predicted to consist of 19599 protein-encoding genes [1; Human Genome Project, http://www.hugo-international.org/]. A substantial fraction of these genes is predicted to encode proteins whose translation has not been demonstrated and which have not themselves been characterized. This group of proteins is accordingly defined as hypothetical.
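The two-criteria selection and the validation percentages quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the record type `HypotheticalProtein`, the functions `select_candidates` and `joint_fraction`, and the toy accessions are all hypothetical names introduced here, and the targeting-signal and domain annotations are assumed to have been produced beforehand by external prediction tools. The arithmetic reads "6% of those" and "85% of these" as fractions of the characterized proteins, as the abstract states.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HypotheticalProtein:
    """Illustrative record; field names are assumptions, not the paper's schema."""
    accession: str
    targeting_signal: Optional[str] = None  # e.g. "mitochondrial", or None if no signal predicted
    domains: List[str] = field(default_factory=list)  # characterized domains found in the sequence


def select_candidates(proteins, compartment):
    """Keep proteins that meet both criteria: a predicted targeting signal
    for the given compartment and at least one characterized domain."""
    return [
        p for p in proteins
        if p.targeting_signal == compartment and len(p.domains) > 0
    ]


def joint_fraction(characterized, role_given_characterized):
    """Fraction of a protein set that is both characterized and assigned the role."""
    return characterized * role_given_characterized


# Toy input with made-up accessions and annotations.
toy_set = [
    HypotheticalProtein("HP_A", "mitochondrial", ["domain_x"]),
    HypotheticalProtein("HP_B", "mitochondrial", []),   # signal but no domain
    HypotheticalProtein("HP_C", None, ["domain_y"]),    # domain but no signal
    HypotheticalProtein("HP_D", "nuclear", ["domain_z"]),
]
selected = select_candidates(toy_set, "mitochondrial")

# Percentages quoted in the abstract.
baseline = joint_fraction(0.21, 0.06)        # all proteins hypothetical in 2006
selected_low = joint_fraction(0.53, 0.85)    # selected set, lower bound
selected_high = joint_fraction(0.62, 0.85)   # selected set, upper bound

print([p.accession for p in selected])
print(f"baseline: {baseline:.2%}, selected: {selected_low:.2%} to {selected_high:.2%}")
```

Under this reading, roughly 1.3% of all 2006 hypothetical proteins were confirmed as mitochondrial by 2008, against roughly 45-53% of the selected set.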
Although many of the listed hypothetical proteins are most likely predicted products of pseudogenes, there is a reasonable probability that a number of them are truly novel and can perform uncharacterized biologica