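The protein folding problem has been called the second genetic code and remains undeciphered; the cryptic text of protein sequences is still a book whose punctuation is unknown and whose puzzles are unresolved. Building on our recent work, we propose the "Confined Lowest Energy Structure Fragments" hypothesis for protein structure. The hypothesis states that proteins contain a number of key sites of strong long-range interaction; these sites act as punctuation marks, turning the cryptic text of the protein sequence into readable sentences (polypeptide fragments). The native structures of these fragments are the lowest-energy states under the confinement imposed by these strong long-range interaction sites. The complete protein structure is assembled from these confined lowest energy structure fragments, and the overall protein structure is not necessarily the global energy minimum. During protein folding, the native-structure propensity of the local fragments provides a driving force, based mainly on enthalpic effects, for the formation of the strong long-range interactions, while the formation of the native strong long-range interactions provides stability, based mainly on entropic effects, to the native structures of the local fragments. Early in protein evolution there may have been a "Stone Age," in which polypeptide fragments stabilized by the confinement of various interfaces (such as rocks) evolved first, and proteins later evolved step by step from these fragments, including by joining them together.

The protein folding problem is regarded as the second genetic code, which has yet to be deciphered. To date, Anfinsen's thermodynamic hypothesis, which holds that the native structure of a protein is its most stable state, is the only generally accepted theory of protein folding, although exceptions have been reported. However, this hypothesis is a simple overall statement that gives no information about where or how a protein folds. The mechanism underlying protein folding has not yet been elucidated, and it is still unclear how the overall sequence (context) determines the structure of a protein. Based on our recent study, we propose a "Confined Lowest Energy Structure Fragments" (CLESFs) hypothesis. This hypothesis states that proteins are CLESFs joined together by a small number of strong constraints (key long-range interactions). Although the native structure of a protein contains various long-range interactions between amino acids that are far apart in the sequence, only a few strong interactions, such as disulfide bonds, hydrophobic packing, structural ion coordination as in zinc fingers, and hydrogen-bonding networks within beta-sheets, are critical. These key long-range interactions serve as a form of punctuation in the "language" of the protein sequence and divide the sequence into different "sentences," i.e., fragments (CLESFs). The local native structures of these CLESFs are the lowest-energy structures under the confinement of those key long-range interactions, but the overall protein structure is not necessarily the global minimum, as Anfinsen hypothesized.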
The same fragment may adopt different native structures in different proteins. Each of these native structures is a local minimum for the free fragment and the "global minimum" for the fragment under the specific confinement in its specific protein. Essentially, the native local structures of the CLESFs have an enthalpic advantage (local minimum) that serves as a driving force for forming the key long-range interactions; the key long-range interactions in turn stabilize the native local structures through entropic effects by excluding an enormous number of random conformations