%0 Journal Article
%T CUSP: an algorithm to distinguish structurally conserved and unconserved regions in protein domain alignments and its application in the study of large length variations
%A Sankaran Sandhya
%A Barah Pankaj
%A Madabosse Govind
%A Bernard Offmann
%A Narayanaswamy Srinivasan
%A Ramanathan Sowdhamini
%J BMC Structural Biology
%D 2008
%I BioMed Central
%R 10.1186/1472-6807-8-28
%X CUSP examines protein domain structural alignments to distinguish regions of conserved structure common to related proteins from structurally unconserved regions that vary in length and type of structure. On a non-redundant dataset of 353 domain superfamily alignments from PASS2, we find that 'length-deviant' protein superfamilies show > 30% length variation from their average domain length. 60% of the additional lengths that occur in indels are short structures (< 5 residues), while 6% of indels are > 15 residues in length. Structural types in indels also show class-specific trends. The extent of length variation varies across different superfamilies, and indels show class-specific trends for preferred lengths and structural types. Such indels of different lengths, even within a single protein domain superfamily, could have structural and functional consequences that drive their selection, underlining their importance in similarity detection and computational modelling. The availability of systematic algorithms, like CUSP, should enable decision making in a domain superfamily-specific manner. Protein databanks such as the PDB [1], with nearly 47,000 structures in the current year, are growing at a rapid pace. Interestingly, the increase in the number of protein structures in the last decade is not accompanied by a concomitant rise in the number of novel folds. This suggests that protein folds are resilient enough to exploit their large degrees of conformational freedom and can tolerate large modifications in sequence and length.
Structural comparisons of related proteins show that changes, in the form of substitutions, deletions or insertions, are accommodated into existing protein scaffolds. Protein domains show anything from two- to three-residue variations to over two-fold length variations, as in the PDB entries for P-loop NTP hydrolases and the TIM fold. Recent studies correlating domain length variations with the taxonomy spans of domains report that over one-third of all domains tend
%U http://www.biomedcentral.com/1472-6807/8/28