|
BMC Bioinformatics 2009
Multi-level learning: improving the prediction of protein, domain and residue interactions by allowing information flow between levels

Abstract: To link up the predictions at the three levels, we propose a multi-level machine-learning framework that allows for explicit information flow between the levels. We demonstrate, using representative yeast interaction networks, that our algorithm is able to utilize complementary feature sets to make more accurate predictions at the three levels than when the three problems are approached independently. To facilitate application of our multi-level learning framework, we discuss three key aspects of multi-level learning and the corresponding design choices that we have made in the implementation of a concrete learning algorithm. 1) Architecture of information flow: we show the greater flexibility of bidirectional flow over independent levels and unidirectional flow; 2) Coupling mechanism of the different levels: we show how this can be accomplished via augmenting the training sets at each level, and discuss the prevention of error propagation between different levels by means of soft coupling; 3) Sparseness of data: we show that the multi-level framework compounds data sparsity issues, and discuss how this can be dealt with by building local models in information-rich parts of the data. Our proof-of-concept learning algorithm demonstrates the advantage of combining levels, and opens up opportunities for further research.

The software and a readme file can be downloaded at http://networks.gersteinlab.org/mll. The programs are written in Java, and can be run on any platform with Java 1.4 or higher and Apache Ant 1.7.0 or higher installed. The software can be used without a license.

The functions of many proteins depend highly on their interactions with other proteins. Complete protein-protein interaction (PPI) networks provide insights into the working mechanisms of proteins at a global level.
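The coupling mechanism named in the abstract (augmenting the training set at one level with predictions from another, coupled softly to limit error propagation) can be illustrated with a minimal Python sketch. This is not the authors' Java implementation. The function names `propagate_up` and `augment`, the rule that a protein pair takes the score of its best-scoring domain pair, the `margin` threshold, and the toy data are all assumptions made for illustration. "Soft" is read here as keeping real-valued scores rather than rounding them to hard labels, and adding only the confident ones.

```python
from itertools import product


def propagate_up(domain_scores, protein_domains, protein_pairs):
    """Turn domain-pair scores into soft protein-pair scores.

    Assumption for this sketch: a protein pair is scored by its
    best-scoring domain pair; pairs with no scored domain pair are skipped.
    """
    soft = {}
    for a, b in protein_pairs:
        candidates = [
            domain_scores[frozenset((da, db))]
            for da, db in product(protein_domains[a], protein_domains[b])
            if frozenset((da, db)) in domain_scores
        ]
        if candidates:
            soft[(a, b)] = max(candidates)
    return soft


def augment(training_set, soft_scores, margin=0.3):
    """Add confident soft predictions to a training set.

    training_set maps pair -> label; known labels are 0.0 or 1.0 and are
    never overwritten. Only scores at least `margin` away from 0.5 are
    added, and they stay real-valued instead of being rounded to 0 or 1.
    """
    augmented = dict(training_set)
    for pair, score in soft_scores.items():
        if pair not in augmented and abs(score - 0.5) >= margin:
            augmented[pair] = score
    return augmented


# Toy data (hypothetical): domain-level predictions and domain membership.
domain_scores = {
    frozenset(("D1", "D2")): 0.9,   # confident domain-level prediction
    frozenset(("D3", "D4")): 0.55,  # uncertain domain-level prediction
}
protein_domains = {"P1": ["D1"], "P2": ["D2"], "P3": ["D3"], "P4": ["D4"]}
protein_training = {("P1", "P4"): 0.0}  # one known non-interacting pair
candidate_pairs = [("P1", "P2"), ("P3", "P4"), ("P1", "P4")]

soft = propagate_up(domain_scores, protein_domains, candidate_pairs)
augmented = augment(protein_training, soft)
```

In this toy run, only the confident prediction for `("P1", "P2")` is added to the protein-level training set. The uncertain score for `("P3", "P4")` is held back, which is one simple way to keep a weak prediction at one level from becoming a training label at another. Bidirectional flow would repeat the same step in the opposite direction, from protein-level predictions down to the domain level.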
While high-throughput experiments such as yeast two-hybrid (Y2H) [1-4] and tandem-affinity purification with mass spectrometry (TAP-MS) [5,6] have enabled the survey of whole
|