Добавил:
Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Friesner R.A. (ed.) - Advances in chemical physics, computational methods for protein folding (2002)(en)

.pdf
Скачиваний:
12
Добавлен:
15.08.2013
Размер:
6.52 Mб
Скачать

ab initio protein structure prediction

235

Energy

3.5

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

'ARG.ARG.1'

 

 

3

 

 

 

 

 

'ARG.ARG.2'

 

 

 

 

 

 

 

 

'ARG.ARG.3'

 

 

2.5

 

 

 

 

 

'ARG.ARG.4'

 

 

 

 

 

 

 

 

'ARG.ARG.5'

 

 

2

 

 

 

 

 

'ARG.ARG.6'

 

 

 

 

 

 

 

 

 

 

 

 

1.5

 

 

 

 

 

 

 

 

 

 

1

 

 

 

 

 

 

 

 

 

 

0.5

 

 

 

 

 

 

 

 

 

 

0

 

 

 

 

 

 

 

 

 

 

−0.5

 

 

 

 

 

 

 

 

 

 

0

5

10

15

20

25

30

35

40

45

50

Distance

Figure 2. Size dependence of three representative terms in Eq. (4) for the amino acid pair arginine–arginine–arginine. Data for the first six size bins are shown.

Energy

2.2

 

2

'ARG.ILE.1'

'ARG.ILE.2'

 

1.8

'ARG.ILE.3'

'ARG.ILE.4'

 

1.6

'ARG.ILE.5'

'ARG.ILE.6'

 

1.4

 

1.2

 

1

0.8

0.6

0.4

0

5

10

15

20

25

30

35

40

45

50

 

 

 

 

Distance

 

 

 

 

Figure 3. Size dependence of three representative terms in Eq. (4) for the amino acid pair arginine–isoleucine. Data for the first six size bins are shown.

236 volker a. eyrich, richard a. friesner, and daron m. standley

Energy

3

 

 

'ILE.ILE.1'

2.5

'ILE.ILE.2'

'ILE.ILE.3'

 

 

'ILE.ILE.4'

2

'ILE.ILE.5'

'ILE.ILE.6'

1.5

 

1

 

0.5

 

0

0

5

10

15

20

25

30

35

40

45

50

 

 

 

 

Distance

 

 

 

 

Figure 4. Size dependence of three representative terms in Eq. (4) for the amino acid pair isoleucine–isoleucine. Data for the first six size bins.

 

7

 

 

 

 

 

 

 

 

 

 

 

6.5

 

 

 

 

 

 

 

 

 

 

 

6

 

 

 

 

 

 

 

 

 

 

 

5.5

 

 

 

 

 

 

 

 

 

 

Energy

5

 

 

 

 

 

 

 

 

 

 

4.5

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

4

 

 

 

 

 

 

 

 

 

 

 

3.5

 

 

 

 

 

 

'size1'

 

 

 

 

 

 

 

 

 

'size2'

 

 

 

 

 

 

 

 

 

 

 

 

 

3

 

 

 

 

 

 

'size3'

 

 

 

 

 

 

 

 

 

'size4'

 

 

 

 

 

 

 

 

 

 

 

 

 

2.5

 

 

 

 

 

 

'size5'

 

 

 

 

 

 

 

 

 

'size6'

 

 

 

2

5

10

15

20

25

30

35

40

45

50

 

0

Distance

Figure 5. Density profiles for the first six size bins.

ab initio protein structure prediction

237

be easily rationalized: The arginine–arginine residues are pushed apart, while the isoleucine–isoleucine interaction is attractive. The arginine–isoleucine term is repulsive is well, but the minimum values occur at shorter distances than in the corresponding arginine–arginine plots, consistent with our intuitive picture of a spheroid with hydrophilic residues residing primarily on the surface. Not surprisingly, the basic effect of the density profile is to restrict the interresidue separation as a function of protein size. Note also that the density profile is the most sensitive to protein size (although the isoleucine–isoleucine pair potential clearly decreases with size).

Figure 6 illustrates the effect of adding the excluded volume and density profile to the arginine–arginine, arginine–isoleucine, and isoleucine-isoleucine potentials, respectively, for size bin 6. We see here that the linear portions of the

˚

potential are now restricted to a small range in distance (about 6–12 A), outside of which the density profile and excluded volume become the dominant terms.

˚

The energies of each of the three residue pairs at large separation (e.g., 25 A) relative to their minimum values increase in the expected order (EIle-Ile >

EArg-Ile > EArg-Arg).

Energy

5

4.5

4

3.5

3

2.5

'ARG.ARG.6' 2 'ARG.ILE.6'

'ILE.ILE.6'

1.5

0

5

10

15

20

25

30

35

40

45

50

 

 

 

 

Distance

 

 

 

 

Figure 6. Total energy for three representative residue pairs: arginine–arginine, arginine– isoleucine, and isoleucine–isoleucine. The data corresponds to size bin 6.

238volker a. eyrich, richard a. friesner, and daron m. standley

III. TERTIARY FOLDING SIMULATIONS: PDB DERIVED AND IDEAL SECONDARY STRUCTURES

A.Physical Model

The physical model of the polypeptide chain we use has been described previously [2]; a few minor modifications are introduced as noted below. All bond angles and bond lengths are fixed at ideal values. The variables in the optimization are the torsional angles f and c of the peptide backbone. Each residue is represented by a Ca atom and a Cb-like atom. The Cb atom position is given by the average projection of the side-chain center of mass onto the Ca–Cb bond vector.

We employ three different methods to describe the location and threedimensional structure of secondary structure elements (i.e., a-helices and b-strands). The first is to take both the sequence location and backbone angles (which are frozen during the simulation) directly from the PDB entry. This is obviously not a realistic data set in a predictive situation, but is an essential computational experiment in that it indicates what level of accuracy is possible with ‘‘perfect’’ secondary structure information. The second is the replacement of PDB backbone angles with ideal backbone angles; this separates the effects of distortion of secondary structural elements from ideal geometries from errors in location in the sequence or in length. For these two types of calculations the correct size-dependent potential is selected by evaluating the radius of the gyration of the corresponding native structure. The third is to employ predicted, rather than PDB, secondary structure (along with the use of ideal geometries for the predicted elements) and to select the correct potential by predicting the radius of gyration from the number of residues of the target [22]. We have carried out an extensive investigation in this regard, using secondary structure prediction from various secondary structure prediction servers that are available over the Internet. These results are then combined to produce genuine ab initio structural prediction. The results, while far from a robust ab initio methodology over all protein types, yield important insights into the key obstacles to ab initio prediction and are in many cases surprisingly accurate. Predictions from the CASP3 contest are also included so that comparisons can be made with the work of others. While we are not generating these predictions as a blind test, it is the case that our CASP3 calculations were carried out using our software in a completely automated fashion, with no readjustment of parameters after obtaining results for the CASP3 targets.

B.Simulation Methodology

Our simulation methodology is identical to that presented in previous publications [2], so we will describe it only briefly here. The algorithm is based

ab initio protein structure prediction

239

on the Monte Carlo plus minimization (MCM) strategy proposed by Li and Scheraga [23]. This approach has proven to be extraordinarily efficacious in our previous work, and the present results reinforce our conclusions concerning its robustness and efficiency in enumerating the low-energy basins of attraction for low-resolution models such as those employed here. As in previous work [2], we have incorporated several key modifications of the algorithm, the most important of which is that the number of minimization steps is annealed as a function of the simulation temperature (i.e., more steps are taken later in the simulation), which yields a factor of 5–10 times reduction in computational effort. Finally, calculations are performed using a parallelized version of the code (an MPI implementation) on a network of PCs using Intel microprocessors and also on a large SGI Origin at the National Center for Supercomputing Applications.

The MCM procedure produces a large number of low-energy structures. The structurally unique predictions are extracted from the raw simulation data by a clustering algorithm. Figure 7 illustrates this process for the protein 1ACP. The raw simulation data (red dots) are combined into structurally similar clusters using a procedure discussed in Ref. 24. The criterion for separating structures into clusters is that the average RMSD between clusters (calculated over all

˚

structures in a particular cluster) be at least 5 A. Clusters are represented by their lowest energy structure (black circles), which means that energies and RMSDs reported for clusters are based on their lowest-energy structure. The

1ACP

RMSD from the native structure

14

 

 

 

 

 

 

12

 

 

 

 

 

 

10

 

 

 

 

 

 

8

 

 

 

 

 

 

6

 

 

 

 

 

 

4

 

 

 

Clusters

 

2

 

 

 

Raw Data

 

 

 

 

Minimized native

0

 

 

 

 

 

 

5400

5450

5500

5550

5600

5650

5700

Energy

Figure 7. (See also color insert.) Comparison of raw data and clustered results (red dots: raw simulation data, black circles: cluster representatives, green square: locally minimized native structure).

240 volker a. eyrich, richard a. friesner, and daron m. standley

TABLE III

RMSD with Respect to the Native of the ten Lowest Energy Clusters (Represented by Their Lowest Energy Member) for the Protein 1acpa

Cluster #

RMSD

Energy

N

 

 

 

 

1

6.65

5415.43

43

2

5.72

5417.83

58

3

8.64

5420.17

23

4

7.54

5420.80

25

5

11.08

5422.82

31

6

9.65

5424.04

34

7

4.79

5427.21

22

8

5.91

5431.10

16

9

8.27

5431.23

12

10

12.10

5432.68

19

aN gives the number of structures combined into a cluster.

˚

RMSD between resulting representative structures is usually at least 5 A, but this is not guaranteed by the clustering algorithm because we use the average RMSD as the clustering criterion. For the 10 representative structures lowest in energy we list energy, RMSD with respect to the native and number of structures combined into a cluster in Table III, and the RMSD between the representative structures themselves in Table IV. Derivation of the ranks of structures (discussed below) is straightforward given the data in Table III.

The lowest-energy structure obtained from the simulations is generally highly refined, meaning that its energy cannot be lowered significantly by performing more extensive searches. Refinement of higher-energy structures, structures that do not rank first, is possible though and in some of the cases,

TABLE IV

RMSD Between the Representative Structures from the Ten Lowest-Energy Clusters for the Protein 1acp

Cluster #

1

2

3

4

5

6

7

8

9

10

 

 

 

 

 

 

 

 

 

 

 

1

0.00

2.23

8.73

2.31

9.91

5.01

7.18

9.35

8.20

12.23

2

 

0.00

8.70

3.41

9.94

6.04

6.08

8.35

8.16

11.93

3

 

 

0.00

8.85

11.86

8.63

8.46

8.24

3.04

8.63

4

 

 

 

0.00

9.70

4.08

7.97

9.91

8.53

12.14

5

 

 

 

 

0.00

8.69

11.31

11.02

11.77

8.40

6

 

 

 

 

 

0.00

10.54

11.91

8.45

11.75

7

 

 

 

 

 

 

0.00

3.01

8.27

10.81

8

 

 

 

 

 

 

 

0.00

8.46

9.91

9

 

 

 

 

 

 

 

 

0.00

9.04

10

 

 

 

 

 

 

 

 

 

0.00

 

 

 

 

 

 

 

 

 

 

 

ab initio protein structure prediction

241

especially the larger proteins, actually results in improved ranks. We have not yet developed the optimal refinement strategy though and therefore do not report results for this approach.

C.Comparison of the Size-Dependent Potential with Previous Results

Using PDB-Derived Secondary Structure

As a test set, we employed the subset of the 95 proteins used in Ref. 2 which are less than 100 residues and are not all b-strand. There is some overlap with the training set; but in tertiary folding, this is less of a concern than in secondary structure prediction because the three-dimensional phase space of the protein is so large that as long as an adequate number of proteins are used to generate the pair potential statistics, systematic bias of the results coming from the training set is unlikely to be large. In fact, we see little difference in performance for proteins depending upon whether they were included in the training set or not (or for the CASP3 targets we examined). By retaining the test set used in the previous chapter, we are able to directly compare our new potential with the older potential lacking size dependence, and thus assess the degree of progress that has been made by incorporating size dependence into the potential function.

As discussed above, after the tertiary folding simulations are completed, we group the resulting structures into clusters (without any reference to the native

structure, which is presumed to be unknown during clustering)

and

report

˚

˚

˚

the highest-ranking clusters with RMSD from the native below 4 A , 5 A, 6 A,

˚

and 7 A, respectively.

In Table V, we compare these results for our test set with those obtained in Ref. 2. Note that Ref. 2 also included postsimulation screening algorithms; we have not developed such methods for the new potentials because some of the ideas have been incorporated directly into the energy function. Consequently we compare only with results taken directly from the simulations in Table V. However, we note that the overall quality of the results from the new potential is substantially better than those from the old, even when screening is employed in the latter. Table VI summarizes performance for various types of proteins and size classes.

The performance of the new potential function is particularly striking for proteins in the 50–100 residue size. For a-helical proteins in this category, the

˚

average rank of the best structure less than 7 A is 3.6; furthermore, in the overwhelming majority of cases, the rank is 5 or better. This is a sufficient reduction in the number of possible structures that discrimination among the resulting structures via more expensive calculations at an atomic level of detail [25] becomes feasible. The reliability of the results demonstrates that the basic physics of the low-resolution model have been qualitatively improved as compared to previous efforts.

242 volker a. eyrich, richard a. friesner, and daron m. standley

TABLE V

Comparison to Previous Resultsa

 

 

 

 

‘‘Old’’ Potential

 

 

 

Size-dependent Potential

 

 

 

 

 

 

 

——————— ————————————————————————

 

 

 

 

PDB—X-RAY

 

PDB—X-RAY

 

 

 

PDB—IDEAL

 

 

 

 

 

——————— ———————————— ———————————

 

Nres

Na

Nb

˚

˚

˚

˚

˚

˚

˚

LER

˚

˚

˚

˚

LER

 

<5 A

<6 A

<7 A

<4 A

<5 A

<6 A

<7 A

<4 A <5 A <6 A <7 A

 

 

 

 

 

 

Alpha Proteins (Nres <50)

 

 

 

 

 

 

1ajj

17

6

0

1

1

1

1

1

4.0

1

1

1

4.9

1bgk

27

18

0

4

2

1

2

2

2

1

6.5

2

2

2

1

6.2

1erd

29

25

0

1

1

1

1

1

1

1

3.8

1

1

1

1

3.3

2erl

35

29

0

2

2

2

1

1

1

4.9

1

1

1

1

2.8

1res

35

27

0

3

1

1

1

1

1

1

3.5

1

1

1

1

3.8

1roo

17

14

0

1

1

1

1

1

1

1

3.7

1

1

1

1

3.7

1uxd

43

31

0

1

1

1

4

4

4

1

6.0

4

4

1

6.4

 

 

 

 

 

Mixed Alpha/Beta Proteins (Nres <50)

 

 

 

 

 

1aho

31

10

10

5

3

1

7

5

2

1

6.8

3

3

2

2

7.5

1ayj

46

11

15

33

1

1

2

2

7.7

3

3

2

8.6

1cmr

26

8

10

3

1

1

3

2

2

1

6.6

4

4

3

1

6.8

1gpt

47

13

19

23

2

2

13

13

12

3

8.1

2

2

8.9

1hev

25

7

11

1

1

1

3

1

1

1

5.0

3

2

2

7.1

2ktx

34

11

14

1

1

1

1

1

1

1

3.6

1

1

1

4.2

1pce

30

12

10

2

2

2

1

1

1

1

2.8

1

1

5.1

1ptq

43

6

8

732

21

18

20

11

8.6

16

1

6.8

2sn3

48

8

15

94

21

7

29

2

2

8.5

13

3

3

8.9

2vgh

34

6

12

126

61

21

4

7.1

3

8.2

1vtx

36

7

10

78

2

34

3

7.8

9

1

7.0

5znf

25

12

11

1

1

1

1

1

1

1

2.6

1

1

6.0

 

 

 

 

 

 

Alpha Proteins (50 Nres < 100)

 

 

 

 

 

 

1acp

73

45

0

256

115

30

7

2

1

6.7

11

11

11.3

1ail

67

60

0

5

5

2

1

1

1

1

3.0

1

1

1

1

3.9

1aj3

95

86

0

2

2

2

2

2

2

2

9.3

2

1

1

1

4.6

1am3

57

45

0

8

8

6

6

2

10.7

24

5

1

6.1

1c5a

62

49

0

1

1

1

3

3

2

8.2

10

3

3

3

8.0

1cc5

76

41

0

78

21

6

6

2

8.5

18

6

3

7.2

1ddf

87

66

0

7

7

63

3

2

12.7

58

8

8

7.1

2ezh

59

45

0

16

5

2

1

1

1

1

3.8

3

3

3

2

9.7

2ezk

76

64

0

28

8

1

1

1

5.7

1

1

5.9

2hp8

56

44

0

4

2

2

2

2

9.7

2

2

2

7.1

1hsn

62

46

0

88

88

67

19

19

11.4

98

17

8.3

1jvr

74

59

0

5

5

5

31

31

1

1

5.3

10

9

7

10.4

1Ifb

69

48

0

94

94

5

5

10.4

15

11

11

10.6

1mzm

71

54

0

8

8

5

4

4

10.7

3

2

2

11.0

1nkl

70

56

0

2

1

1

1

1

3.9

2

2

2

2

9.6

1nre

66

55

0

22

22

22

22

1

1

1

4.9

19

1

1

1

4.6

2pac

77

26

0

136

53

1

6.4

76

5

11.2

1pou

70

57

0

6

6

1

1

1

1

2.3

4

4

4

4

11.2

1r69

61

41

0

46

9

8

6

6

3

11.3

23

12

5

10.7

 

 

 

 

ab initio protein structure prediction

 

 

 

243

 

 

 

 

 

 

TABLE V

(Continued)

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

‘‘Old’’ Potential

 

 

 

 

Size-dependent Potential

 

 

 

 

 

 

 

——————— ————————————————————————

 

 

 

 

PDB—X-RAY

 

PDB—X-RAY

 

 

 

PDB—IDEAL

 

 

 

 

 

——————— ———————————— ———————————

 

Nres

Na

Nb

˚

˚

˚

˚

 

˚

˚

˚

LER

˚

˚

˚

˚

LER

 

<5 A

<6 A

<7 A

<4 A

<5 A

<6 A

<7 A

<4 A <5 A <6 A <7 A

1utg

62

53

0

4

2

1

21

 

1

1

5.6

14

1

1

5.3

5icb

72

52

0

8

8

 

2

1

6.1

8

1

6.2

 

 

 

 

 

Mixed Alpha/Beta Protein (50 < Nres < 100)

 

 

 

 

 

1aa3

56

31

8

19

19

 

6

3

8.4

7

7

7

5

9.4

2acy

92

24

41

16

 

5

5

12.0

13.0

1ag2

97

58

8

349

 

87

10.9

— 187

12.3

1bor

52

9

14

187

22

8

 

17

6

7.2

40

12

8.3

1btb

89

45

19

274

24

1

1

 

1

1

3.8

31

28

8.1

1ctf

67

38

19

15

12

4

1

1

 

1

1

3.0

4

4

11.1

2fdn

53

8

6

123

4

4

 

38

6

8.1

30

10.3

2fow

66

29

8

181

56

8

 

23

8

10.6

69

4

7.9

1fwp

66

22

17

484

2

2

3

 

3

3

10.3

42

10

10

10.3

1gb1

54

13

16

1

1

1

 

15

1

6.5

2

1

6.5

1pgx

57

15

33

4

4

4

2

2

 

2

2

9.5

35

28

11

8.1

1leb

63

36

6

142

27

4

3

 

3

3

10.9

6

6

6

8.7

1orc

56

25

17

2

2

1

8

6

 

6

6

7.1

46

2

2

1

6.2

5pti

55

16

14

109

16

16

 

14

4

10.1

47

14

7.1

2ptl

60

15

34

1

1

1

1

1

 

1

1

3.4

35

4

4

8.2

1ris

92

25

42

180

11

9

9

 

9

9

11.1

— 129

11

11.7

1svq

90

22

34

— 119

 

117

32

12.5

— 462

43

9.0

aFollowing global energy minimization, structures are clustered without reference to the native; the energetic ranks of clusters that have an RMSD close to the native (for old results, three RMSD

˚ ˚ ˚ ˚ ˚ ˚

cutoffs—5 A, 6 A, and 7 A—were used; for new results, four RMSD cutoffs—4 A, 5 A, 6 A, and

˚

7 A—were used). Energetic rank was defined so that the lowest-energy structure ranks 1, the secondlowest ranks 2, and so on. LER refers to the RMSD of the lowest-energy structure. The column ‘‘PDB—X-Ray’’ list’s results of runs using location and configuration of secondary structure derived from the PDB entry. Column ‘‘PDB-Ideal’’ lists results for calculations where the location of secondary structure was derived from the PDB, but configuration of secondary structural elements was assumed to be ideal.

For mixed a/b-proteins, the absolute quality of the results is somewhat diminished, but the improvement as compared to previous work is even larger. There are two cases, 1ag2 and 1svq, where the rank obtained for the best low RMSD structure is above 10, with the 1ag2 result being particularly problematic. We have investigated this case further and show improved results for 1ag2 below. On the other hand, there is a significant number of cases for which no reasonable structures were recovered previously which now rank in the top 10.

The energies of structures located by the global optimization algorithm are lower than the native and locally minimized native structures in all cases, a

244 volker a. eyrich, richard a. friesner, and daron m. standley

TABLE VI

Summary of Ranks Listed in Table Va

(a)

 

 

 

˚

 

 

 

˚

RMSD <

6

˚

 

 

 

˚

 

RMSD < 4 A

 

RMSD < 5 A

A

 

RMSD < 7 A

 

——————————

————————— ————————— ————————

 

 

 

Ave

Max

 

Ave

Max

 

Ave

 

Max

 

Ave

Max

Class

Nprot

Nconv Rank

Rank

Nconv

Rank

Rank

Nconv

Rank

 

Rank

Nconv

Rank Rank

Small a

7

6

2

4

7

1

 

 

2

7

1

2

Small a/b

12

11

93

732

12

16

 

 

78

12

5

21

Medium a

21

11

43

256

18

26

 

 

115

20

21

136

Medium a/b

17

11

114

484

13

46

 

 

274

15

30

349

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(b)

 

 

 

˚

 

 

 

˚

 

 

˚

 

 

˚

 

 

RMSD < 4 A

 

RMSD < 5 A

RMSD < 6 A

RMSD < 7 A

 

——————————

—————————

————————— ————————

 

 

 

Ave

Max

 

Ave

Max

 

Ave

Max

 

Ave

Max

Class

Nprot

Nconv Rank

Rank

Nprot

Rank

Rank

Nconv

Rank

Rank

Nprot

Rank

Rank

Small a

7

5

2

4

7

2

4

7

2

4

7

1

1

Small a/b

12

7

4

13

8

7

29

11

7

34

12

3

11

Medium a

21

8

8

31

17

10

63

21

6

53

21

3

19

Medium a/b

17

7

6

19

10

16

119

16

16

117

17

10

87

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(c)

 

 

 

˚

 

 

˚

 

 

 

˚

 

 

˚

 

 

RMSD < 4 A

 

RMSD < 5 A

 

RMSD < 6 A

RMSD < 7 A

 

——————————— —————————

————————— ————————

 

 

 

Ave

Max

 

Ave

Max

 

Ave

Max

 

Ave

Max

Class

Nprot

Nconv Rank

Rank

Nconv

Rank

Rank

Nconv

Rank

Rank

Nconv

Rank

Rank

Small a

7

5

1

2

7

2

4

7

2

4

7

1

1

Small a/b

12

2

4

4

6

5

13

11

4

16

12

2

3

Medium a

21

7

6

19

16

11

58

21

13

98

21

4

17

Medium a/b

17

2

26

46

6

21

42

14

60

462

16

23

187

aPart a lists old results; part b lists results using the size-dependent potential and X-ray-derived secondary structure; part c lists results using the size-dependent potential and ideal secondary structure. The number of proteins Nprot is listed in column 2; the number of cases that converged

˚ ˚ ˚ ˚

within a specified RMSD from the native (< 4 A, < 5 A, < 6 A, or < 7 A) Nconv is listed in columns

˚

3, 6, 9, and 12. (Note that the rank < 4 A was not calculated for the old results, so a ‘‘—’’ is shown). Also listed are the average and maximum rank of converged clusters within each RMSD range.

feature that other groups using similar approaches have also observed [25]. A very important aspect of the results though, not apparent in the data presented here, is that for all simulations discussed above, the energy gap between the lowest-energy misfolded structures and low-energy native-like structures is quite small, on the order of 5–30 energy units where the energy scale is