|
BMC Bioinformatics 2010
Detection of distant evolutionary relationships between protein families using theory of sequence profile-profile comparison
Abstract: Here, we present a new homology detection method based on sequence profile-profile comparison. The method has a number of new features, including position-dependent gap penalties and a global score system. Position-dependent gap penalties provide a more biologically relevant way to represent and align protein families as sequence profiles. The global score system enables an analytical solution of the statistical parameters needed to estimate the statistical significance of profile-profile similarities. The new method, together with other state-of-the-art profile-based methods (HHsearch, COMPASS and PSI-BLAST), is benchmarked in an all-against-all comparison of a challenging set of SCOP domains that share at most 20% sequence identity. For benchmarking, we use a model-based evaluation framework that requires no reference ("gold standard"). Evaluation results show that, at the level of protein domains, our method compares favorably to all other tested methods. We also provide examples of the new method outperforming structure-based similarity detection and alignment. The implementation of the new method, both as a standalone software package and as a web server, is available at http://www.ibt.lt/bioinformatics/coma.
Due to a number of developments, the new profile-profile comparison method shows an improved ability to match distantly related protein domains. Therefore, the method should be useful for annotation and homology modeling of uncharacterized proteins.
Common evolutionary origin, or homology, is one of the key concepts in biology. Homologous proteins usually share a similar three-dimensional shape and often perform identical or similar molecular functions. Therefore, detection of homology is now routinely used to make inferences regarding structure, function or evolution for the protein of interest.
Protein sequence comparison is the primary means for establishing homology. For closely related proteins, sequence similarity can be detected even by an untrained eye; however, the
|