BMC Bioinformatics 2007
Insertions and the emergence of novel protein structure: a structure-based phylogenetic study of insertions

Abstract: Phylogenies were reconstructed from protein 3D structural data. The phylogenetic trees were used to infer ancestral structures with a consensus method. In these ancestral reconstructions, 42.7% of the observed insertions are nested insertions, that is, insertions located within regions created by earlier insertions. The average size of inserts tends to increase with the insert rank, or total number of insertions, in the variable regions. We found that the structures of some nested inserts show complex or even domain-like fold patterns with helices, strands and loops. Furthermore, a basal level of structural innovation was found in inserts that displayed significant structural similarity exclusively to themselves. The β-Lactamase/D-ala carboxypeptidase domain family is provided as an example to illustrate the inference of insertion events and how the incremental growth of a variable region can generate novel structural patterns.

Using 3D data, we propose a method to reconstruct phylogenies. We applied the method to reconstruct the sequences of insertion events leading to the emergence of potentially novel structural elements within existing protein domains. The results suggest that structural innovation is possible via the stochastic process of insertions and rapid evolution within variable regions, where inserts tend to be nested. We also demonstrate that structure-based phylogeny enables the study of new questions relating to the evolution of protein domains and biological function.

The majority of protein folds descend from a relatively small set of ancestral domains through divergent evolution [1-4]. The mechanism by which new structures emerge or evolve from existing proteins is still an open question. Unlike sequence evolution, drift in the core of a domain structure is unlikely to leave the domain stable and functional.
Therefore, it is reasonable to postulate that structural innovation is more likely to be the result of evolution at the periphery of the conserved core of domains. A recent