|
Genome Biology 2008
Sequence-based prediction of protein-protein interactions by means of codon usage

Abstract: The need to transform the growing amount of biological information into knowledge has involved several disciplines that, by means of experimental and computational approaches, aim to decipher functional linkages and interactions between proteins [1,2]. Current computational methods for predicting protein-protein interactions demand data that, compared to the huge amount of available genomic sequences, are scarce. Only in a few organisms have features such as essentiality, biological function and mRNA co-expression of genes been partially determined. Also, a combination of different homology-based predictors, including phylogenetic profiles [3], Rosetta stone [4] and interolog mapping [5], has provided incomplete information about interactions of only one-third of all Saccharomyces cerevisiae proteins. Hence, a method to identify protein-protein interactions solely on the basis of gene sequences would significantly expand the ability to predict interaction networks.

A few studies have been performed on the prediction of protein-protein interactions based only on amino acid sequence information [6-8]. However, the highest specificity reported in these studies is 86%. Considering the number of possible protein pairs in a genome consisting of no more than 6,000 protein-coding genes, this level of specificity results in the unacceptable number of 2.5 × 10^6 false positives. These studies consider protein sequences, and ignore the plethora of information that exists in their coding sequences. The still-unsatisfied demand for reliable sequence-based prediction of protein-protein interactions encourages exploration of relevant sequence features in the genome instead of the proteome.

It has been widely acknowledged that codon usage is correlated with expression level [9].
In addition, it has been shown that codon usage is structured along the genome [10], with near neighbor genes having similar codon compositions. Some function-specific codon preferences have also been hypothes
|
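The false-positive figure quoted above (about 2.5 × 10^6 for a 6,000-gene genome at 86% specificity) can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper. It assumes that every unordered pair of distinct proteins is tested once and that the number of truly interacting pairs is negligible next to the total, so nearly all pairs count as negatives. The function name `expected_false_positives` and its parameters are hypothetical.

```python
from math import comb


def expected_false_positives(n_genes: int, specificity: float,
                             n_true_pairs: int = 0) -> float:
    """Expected number of false positives over all unordered protein pairs.

    Assumptions (not stated in the source): self-pairs are excluded, and
    n_true_pairs defaults to 0 because true interactions are a tiny
    fraction of all possible pairs.
    """
    total_pairs = comb(n_genes, 2)          # n * (n - 1) / 2 unordered pairs
    negatives = total_pairs - n_true_pairs  # pairs that do not interact
    return (1.0 - specificity) * negatives  # false-positive rate * negatives


total = comb(6000, 2)
fp = expected_false_positives(6000, 0.86)
print(f"possible pairs: {total:,}")
print(f"expected false positives: {fp:.2e}")
```

Under these assumptions, 6,000 genes give 17,997,000 possible pairs, and a 14% false-positive rate yields roughly 2.5 million spurious predictions, which matches the order of magnitude stated in the text.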