|
BMC Bioinformatics 2009
Prediction of hot spot residues at protein-protein interfaces by combining machine learning and energy-based methods
Abstract: We present a novel computational strategy to identify hot spot residues, given the structure of a complex. We consider the basic energetic terms that contribute to hot spot interactions, i.e. van der Waals potentials, solvation energy, hydrogen bonds and Coulomb electrostatics. We treat them as input features and use machine learning algorithms such as Support Vector Machines and Gaussian Processes to optimally combine and integrate them, based on a set of training examples of alanine mutations. We show that our approach is effective in predicting hot spots and that it compares favourably to other available methods. In particular, we find the best performance using Transductive Support Vector Machines, a semi-supervised learning scheme. When hot spots are defined as those residues for which ΔΔG ≥ 2 kcal/mol, our method achieves a precision of 56% and a recall of 65%.
We have developed a hybrid scheme in which energy terms are used as input features of machine learning models. This strategy combines the strengths of machine learning and energy-based methods. Although these two types of approaches have so far mainly been applied separately to biomolecular problems, the results of our investigation indicate that there are substantial benefits to be gained by their integration.
Protein-protein interactions are central to most biological processes, including, for example, cellular communication, gene regulation, and immune response [1]. The complexity of these processes, coupled with the intricate interaction networks that biomolecules form in a cell, requires proteins to be able to selectively bind to other proteins. Indeed, erroneous or disrupted protein interactions can cause a number of diseases [2].
Elucidating the fundamental biophysical principles that govern molecular recognition and drive protein association is therefore a topic of primary importance in biomedical research. However, at present the energetic determinants of affinity and specificit
|