|
Predicting direct protein interactions from affinity purification mass spectrometry data

Abstract: We first propose a simple probabilistic model for the interactions captured by AP-MS experiments, under which we formulate the problem of separating direct interactions from indirect ones. Then, given idealized quantitative AP-MS data, we study the problem of identifying the most likely set of direct interactions that produced the observed data. We address this challenging graph-theoretical problem by first characterizing signatures that identify weakly connected nodes as well as dense regions of the network. The rest of the direct PPI network is then inferred using a genetic algorithm.

Our algorithm performs well on both simulated and biological networks, with very high sensitivity and specificity. We then use the algorithm to predict direct interactions from a set of AP-MS PPI data from yeast and measure its performance against a high-quality interaction dataset.

As the sensitivity of AP-MS pipelines improves, the fraction of indirect interactions detected will also increase, making the ability to distinguish them from direct interactions even more desirable. Despite the simplicity of our model for indirect interactions, our method performs well on the test networks.

Understanding the organization of protein-protein interactions (PPIs) as a complex network is one of the main pursuits in proteomics today. With the help of high-throughput experimental techniques, a large amount of PPI data has recently become available, providing us with a rough picture of how proteins interact in biological systems. However, the interaction data from these high-throughput experiments suffer from low resolution compared with data from low-throughput technologies such as protein co-crystallization. To make matters worse, they are prone to problems including relatively high error rates and protocol-specific biases.
Therefore, inferring the direct, physical PPI network from high-throughput data remains a challenge in systems biology.

The leading technologies for identif
|