%0 Journal Article
%T Exploring protein structural dissimilarity to facilitate structure classification
%A Pooja Jain
%A Jonathan D Hirst
%J BMC Structural Biology
%D 2009
%I BioMed Central
%R 10.1186/1472-6807-9-60
%X We compute a coefficient of dissimilarity (Ω) between proteins, based on structural and sequence-based descriptors characterising the respective constituent SSEs. For a set of 1,661 pairs of proteins with sequence identity up to 35%, the performance of Ω in predicting shared Class, Fold and Super-family levels is comparable to that of DaliLite Z score and shows a greater than four-fold increase in the true positive rate (TPR) for proteins sharing the Family level. On a larger set of 600 domains representing 200 families, the performance of Z score improves in predicting a shared Family, but still only achieves about half of the TPR of Ω. The TPR for structures sharing a Super-family is lower than in the first dataset, but Ω performs slightly better than Z score. Overall, the sensitivity of Ω in predicting common Fold level is higher than that of the DaliLite Z score. Classification to a deeper level in the hierarchy is specific and difficult, so the efficiency of Ω may be attractive to the curators and the end-users of SCOP. We suggest Ω may be a better measure for structure classification than the DaliLite Z score, with the caveat that currently we are restricted to comparing structures with an equal number of SSEs. The increased pace of protein structure determination, due to high-throughput, synchrotron-based X-ray crystallography and multi-dimensional NMR, promises rapid growth in the number of known protein structures [1-3]. Comparison and classification of newly resolved structures contributes to our understanding of the structural architecture, evolution and function of proteins, especially those with low sequence identity to well characterised proteins [4,5]. This information is important for the identification of new protein folds, drug discovery, and phylogenetic analysis of the proteome. Classification schemes, such as SCOP (Structural Classification Of Proteins) [6] and CATH [7], are well established. 
SCOP is a curated database and probably the leading classif
%U http://www.biomedcentral.com/1472-6807/9/60