Biophysics 2022
Analysis of Residue Interface Preferences in Protein-DNA Complexes and Its Application in Interface Identification
Abstract:
Protein-DNA recognition plays an important role in biological processes, and binding is influenced by both sequence-specific recognition and structural features. To investigate how residue types and protein secondary structure contribute to binding, we constructed a new non-redundant database of 1545 protein-DNA complex structures. Statistical analysis shows that residue type and secondary structure type both contribute significantly to protein-DNA binding; among the secondary structures, π-helix and β-ladder show the strongest preference for the interface. We classified the protein secondary structures according to their interface preferences and constructed a 60 × 4 table of amino acid-nucleotide pairwise interface preferences. Amino acid interface preferences derived from this pairwise table were then used to explore whether protein-DNA binding interfaces can be predicted, using 47 complexes from the docking benchmark dataset. For 87.23% of these systems, the score of the true interface ranked within the top 10% of all surface patches. These results indicate that the 60 × 4 pairwise interface preferences capture protein-DNA interface recognition well and are of value for interface and complex structure prediction.
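The abstract does not give the preference formula or the scoring procedure, so the following is only a minimal sketch under stated assumptions. It assumes a log-odds definition of interface preference (interface frequency relative to surface frequency), and it reads the 60 rows as 20 amino acids × 3 secondary structure classes, which is an interpretation rather than something stated above. The function names and the toy counts are hypothetical. The last helper shows how a success rate such as 87.23% follows from 41 of 47 systems having their true interface ranked in the top 10% of patches.

```python
import math


def interface_preferences(interface_counts, surface_counts):
    """Assumed log-odds preference: ln(interface frequency / surface frequency).

    Keys can be any category, e.g. an (amino acid, structure class) pair.
    Categories with a zero count are skipped because the log is undefined.
    """
    n_int = sum(interface_counts.values())
    n_surf = sum(surface_counts.values())
    prefs = {}
    for key, surf in surface_counts.items():
        inter = interface_counts.get(key, 0)
        if inter == 0 or surf == 0:
            continue
        prefs[key] = math.log((inter / n_int) / (surf / n_surf))
    return prefs


def rank_fraction(patch_scores, true_patch):
    """Fraction of surface patches scoring at least as high as the true one."""
    true_score = patch_scores[true_patch]
    at_least = sum(1 for s in patch_scores.values() if s >= true_score)
    return at_least / len(patch_scores)


def success_rate(rank_fractions, cutoff=0.10):
    """Share of systems whose true interface ranks within the cutoff."""
    hits = sum(1 for f in rank_fractions if f <= cutoff)
    return hits / len(rank_fractions)


# Toy counts (hypothetical, not taken from the paper's database).
prefs = interface_preferences(
    {"ARG": 30, "LEU": 10},
    {"ARG": 40, "LEU": 60},
)

# 41 of 47 systems within the top 10% reproduces the reported 87.23%.
fractions = [0.05] * 41 + [0.50] * 6
rate_percent = round(success_rate(fractions) * 100, 2)
```

A positive preference marks a category that is enriched at the interface relative to the surface as a whole, and a negative one marks depletion. A patch score could then be built by summing the preferences of the residues in that patch, although the paper's actual scoring scheme may differ.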