|
BMC Bioinformatics 2006
An interactive visualization tool to explore the biophysical properties of amino acids and their contribution to substitution matricesAbstract: We demonstrate the three fundamental types of question that our software can address by taking as a specific example the connections between 49 measures of amino acid biophysical properties (e.g., size, charge and hydrophobicity), a generalized model of amino acid substitution (as represented by the PAM74-100 matrix), and the mutational distance that separates amino acids within the standard genetic code (i.e., the number of point mutations required for interconversion during protein evolution). We show that our software allows a user to recapture the insights from several key publications on these topics in just a few minutes.Our software facilitates rapid, interactive exploration of three interconnected topics: (i) the multidimensional molecular descriptors of the twenty proteinaceous amino acids, (ii) the correlation of these biophysical measurements with observed patterns of amino acid substitution, and (iii) the causal basis for differences between any two observed patterns of amino acid substitution. This software acts as an intuitive bioinformatic exploration tool that can guide more comprehensive statistical analyses relating to a diverse array of specific research questions.Molecular biology has made great progress in observing and quantifying the patterns by which amino acids exchange for one another within protein sequences over time. A key motivation here has been to create amino acid substitution matrices (such as the PAM and BLOSUM matrix families), which lie at the heart of mainstream bioinformatics procedures, from algorithms that determine whether [1] and how exactly [2] two proteins are homologous, to those that predict protein tertiary structure by comparison with known folds [3]. However, these matrices represent generalized patterns of change "averaged" across all proteins: although they typically encompass the idea that patterns of substitution will vary with evolutionary distance, other systematic sources of variation are overlooked. An increa
|