Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf

192 Simonson

scaled by a coupling parameter λ, the second term in Eq. (39) contributes a constant term $\lambda q_i^2 \xi_{\mathrm{Ew}}$ to the free energy derivative.

Although lattice summation methods avoid the introduction of an electrostatic cutoff, they impose a periodicity at all times that does not exist in any real system (even a crystal, let alone a liquid). This affects the polarization in a nonrandom way; e.g., the alignment of dipoles was shown to be overstabilized at long (10 Å) distances with common Ewald protocols (“tinfoil” boundary conditions) [60]. Periodicity artifacts appear to be small for the free energy of charge creation in water [59] but have not yet been estimated for a macromolecule in solution. Methods to correct for them have been proposed for simple system geometries, which calculate the free energy difference between the periodic lattice and the nonperiodic system of interest from a dielectric continuum model; see, e.g., Refs. 12 and 61. With increasing computer power and simulation cell sizes, such artifacts will decrease.

VI. IMPROVING SAMPLING

The fundamental difficulty in free energy calculations lies in obtaining adequate sampling of conformations. Because of the ruggedness of the energy landscapes of proteins and nucleic acids, many energy barriers cannot be crossed in simulations spanning even a few nanoseconds. Therefore, specific strategies are needed to identify and sample all the energy basins, or substates, that contribute to a given free energy difference. The general problem of exploring and characterizing complex energy surfaces is much too vast to be discussed in detail here; see, e.g., Ref. 62. Even the techniques developed specifically for free energy calculations are so numerous that only a brief overview can be given.

A. Multisubstate Approaches

An alchemical free energy calculation compares two systems A and B, each of which usually possesses several slightly different, stable conformations. Thus Asp and Asn (Fig. 3) each possess three distinct stable rotamers around the χ1 torsion angle as well as multiple shallow energy basins corresponding to different orientations of the backbone groups and the χ2 torsion angle [26]. Whereas the latter basins are separated by small energy barriers (of order $kT$), the χ1 wells are separated by barriers of 3 kcal/mol, which are rarely crossed on the 100–1000 ps time scale. Therefore, it is best to view each system (Asp in solution, Asn in solution) as a superposition of three conformational substates, identified by the side chain χ1 rotamer. The A → B free energy calculation can then be based on a thermodynamic cycle analogous to the one in Figure 6 [38,63]. The free energy of Asp (system A) can be written (to within a constant c(N, V, T) [see Eq. (4)]),

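$$
\begin{aligned}
F_A &= -kT \ln\left( \int_1 e^{-U_A/kT}\,dr^N + \int_2 e^{-U_A/kT}\,dr^N + \int_3 e^{-U_A/kT}\,dr^N \right) \\
    &= -kT \ln\left( e^{-F_{A1}/kT} + e^{-F_{A2}/kT} + e^{-F_{A3}/kT} \right)
\end{aligned}
\tag{40}
$$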

Here, the configuration integral $Q_A$ has been split into three integrals; the integration $\int_i$ is over all conformations where χ1 is in the $i$th rotameric state. $F_{Ai}$ is the “configurational free energy” of that state. More precisely, $F_{Ai}$ is the free energy [to within the constant c(N, V, T)] of a hypothetical system where the potential energy inside the $i$th χ1 energy well is the same as for A, but the potential energy outside the well is infinite. Whereas the absolute free energies $F_{Ai}$ are difficult to compute [see discussion following Eq. (4)], the relative free energies of the three rotameric states are readily obtained by the methods of Section IV. The same calculation is performed for system B (Asn).

Free Energy Calculations

Figure 6 Thermodynamic cycle for multi-substate free energy calculation. System A has n substates; system B has m. The free energy difference between A and B is related to the substate free energy differences through Eq. (41). A numerical example is shown in the graph (from Ref. 39), where A and B are two isomers of a surface loop of staphylococcal nuclease, related by cis–trans isomerization of proline 117. The cis → trans free energy calculation took into account 20 substates for each isomer; only the six or seven most stable are included in the plot.

Finally, an alchemical free energy simulation is needed to obtain the free energy difference between any one substate of system A and any one substate of system B, e.g., $F_{B1} - F_{A1}$. In practice, one chooses two substates that resemble each other as much as possible. In the alchemical simulation, it is necessary to restrain appropriate parts of the system to remain in the chosen substate. Thus, for the present hybrid Asp/Asn molecule, the Asp side chain should be confined to the Asp substate 1 and the Asn side chain confined to its substate 1. Flat-bottomed dihedral restraints can achieve this very conveniently [38], in such a way that the most populated configurations (near the energy minimum) are hardly perturbed by the restraints. Note that if the substates A1 and B1 differ substantially, the transformation will be difficult to perform with a single-topology approach.
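As an illustration of the flat-bottomed restraint idea, the following minimal Python sketch defines a dihedral restraint that is exactly zero within a window around the rotamer center and rises harmonically outside it. The harmonic form of the walls, the function names, and the example values (window half-width, force constant) are assumptions made for illustration; they are not taken from Ref. 38.

```python
def wrap_angle(delta_deg: float) -> float:
    """Wrap an angular difference into the interval [-180, 180) degrees."""
    return (delta_deg + 180.0) % 360.0 - 180.0


def flat_bottom_dihedral_energy(chi_deg: float, center_deg: float,
                                half_width_deg: float, k: float) -> float:
    """Restraint energy for a dihedral angle (hypothetical functional form).

    The energy is zero while chi stays within +/- half_width_deg of the
    well center, so configurations near the minimum are not perturbed.
    Outside the window a harmonic wall with force constant k
    (energy units per degree squared) pushes the angle back.
    """
    deviation = abs(wrap_angle(chi_deg - center_deg))
    excess = max(0.0, deviation - half_width_deg)
    return 0.5 * k * excess ** 2
```

For instance, with a well centered at -60 degrees and a half-width of 30 degrees, any χ1 value between -90 and -30 degrees feels no restraint at all, whereas a value of -100 degrees is penalized only through its 10 degree excursion beyond the window.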


The A → B free energy change takes the final form

$$
F_B - F_A = F_{B1} - F_{A1} - kT \ln \frac{1 + \exp[-\Delta F_B(1 \to 2)/kT] + \exp[-\Delta F_B(1 \to 3)/kT]}{1 + \exp[-\Delta F_A(1 \to 2)/kT] + \exp[-\Delta F_A(1 \to 3)/kT]}
\tag{41}
$$

where $\Delta F_A(1 \to 2) = F_{A2} - F_{A1}$, and the other notations are defined similarly. Illustrative applications of this technique are found in, e.g., Refs. 38, 39, and 63–65.
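The way the substate free energies combine can be made concrete with a short Python sketch. It evaluates the total A → B free energy change from one alchemical result (substate 1 of A to substate 1 of B) and the relative free energies of the other substates within each system, following the logarithmic sum over substates described in the text. The function names, the value of kT (about 0.593 kcal/mol, roughly room temperature), and the numbers in the usage example are assumptions for illustration only.

```python
import math

KT_ROOM = 0.593  # kcal/mol near 298 K (assumed value)


def substate_sum_correction(rel_free_energies, kT=KT_ROOM):
    """Return -kT ln(1 + sum_j exp(-dF(1->j)/kT)).

    rel_free_energies lists dF(1->2), dF(1->3), ... for one system,
    i.e., the free energy of each other substate relative to substate 1.
    """
    total = 1.0 + sum(math.exp(-dF / kT) for dF in rel_free_energies)
    return -kT * math.log(total)


def total_free_energy_difference(dF_B1_A1, rel_A, rel_B, kT=KT_ROOM):
    """Combine the alchemical step F_B1 - F_A1 with the substate sums of A and B."""
    return (dF_B1_A1
            + substate_sum_correction(rel_B, kT)
            - substate_sum_correction(rel_A, kT))


if __name__ == "__main__":
    # Hypothetical numbers (kcal/mol): three rotamers for each system.
    print(total_free_energy_difference(2.0, rel_A=[0.4, 1.1], rel_B=[0.9, 1.5]))
```

When the two systems have identical substate spacings, the two corrections cancel and the total reduces to the single alchemical difference; the corrections matter when substates are populated differently in A and B.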

The multisubstate approach requires initially identifying all important substates, a difficult and expensive operation. In cases of moderate complexity (e.g., a nine-residue protein loop), systematic searching and clustering have been used [39,66]. For larger systems, methods are still being developed.

B. Umbrella Sampling

A powerful and general technique to enhance sampling is the use of umbrella potentials, discussed in Section IV. In the context of alchemical free energy simulations, for example, umbrella potentials have been used both to bias the system toward an experimentally determined conformation [26] and to promote conformational transitions by reducing dihedral and van der Waals energy terms involving atoms near a mutation site [67].

Similar to the approaches described in Section IV, free energies for the unbiased system can be recovered from the biased simulations in at least two ways. First, one can introduce steps where the umbrella potential is turned on (initially) and off (at the end) and compute the corresponding free energies in analogy to Eq. (30) [67]. Second, although the configurational probabilities are modified by the umbrella potential [Eqs. (31), (33)], it is possible in principle to recover ensemble averages for the system of interest, i.e., the system without the umbrella potential [37]. For an observable O, we obtain [e.g., by integrating Eq. (31) over q]

$$
\langle O \rangle = \frac{\langle O\, e^{U_r/kT} \rangle_r}{\langle e^{U_r/kT} \rangle_r}
\tag{42}
$$

where the brackets $\langle\;\rangle_r$ and $\langle\;\rangle$ indicate averages over the system with and without the umbrella potential, respectively. In particular, the free energy derivatives $\partial F/\partial\lambda$, $\partial^2 F/\partial\lambda^2$, . . . are ensemble averages [Eqs. (11), (12)] and can be obtained in this way from simulations with the umbrella potential [26]. Equation (42) can be generalized [37] to cases where simulations are run at a higher temperature than that of the system of interest (e.g., to promote conformational transitions further).
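The reweighting of biased averages can be sketched in a few lines of Python. Given the value of an observable and of the umbrella energy for each frame of a simulation run with the umbrella potential, the function below returns the estimate of the unbiased average as a ratio of exponentially weighted sums. The function name and the list-based interface are assumptions for illustration; subtracting the largest umbrella energy before exponentiating is a numerical safeguard added here, which cancels between numerator and denominator.

```python
import math


def unbiased_average(observable, umbrella_energy, kT):
    """Estimate <O> for the unbiased system from frames sampled with the umbrella.

    observable[i] and umbrella_energy[i] are O and U_r evaluated on frame i
    of the biased trajectory; kT uses the same energy units as U_r.
    """
    if not observable or len(observable) != len(umbrella_energy):
        raise ValueError("need equally long, non-empty lists of O and U_r values")
    shift = max(umbrella_energy)  # keeps every exponent <= 0, avoiding overflow
    weights = [math.exp((u - shift) / kT) for u in umbrella_energy]
    weighted_sum = sum(o * w for o, w in zip(observable, weights))
    return weighted_sum / sum(weights)
```

Frames where the umbrella energy is high are rare in the biased run but receive large weights, so in practice the estimate is reliable only if the biased and unbiased ensembles overlap well.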

C. Moving Along

Another way to improve sampling for some problems is to treat the coupling coordinate or coordinates as dynamic variables. Thus, free energy simulations have been done where changes in a coupling coordinate λ were treated as Monte Carlo moves instead of being determined ahead of time [41,68]. More recently, coupling coordinates were included in the simulation as coordinates participating in the molecular dynamics, with artificial masses, akin to “pseudoparticles” [47,69]. An umbrella potential was used to drive the coupling coordinates from 0 to 1. The alchemical free energy calculation is thus treated as a pmf calculation along the coupling coordinate(s). Data from such a “λ-dynamics” simulation can be efficiently processed with the WHAM approach (above), where the coupling coordinates λ1, λ2, . . . play the role of the multidimensional reaction coordinates q, s, . . . .

These approaches can be used to simulate several ligands simultaneously, either in solution or in a receptor binding site. Each ligand $i$ is associated with its own coupling constant or weight, $\lambda_i$, and with a term $\lambda_i U_i$ in the energy function. The different weights obey $\sum_i \lambda_i = 1$. As the system evolves, the weights tend to adjust spontaneously in such a way that the most favorable ligand has the largest weight. Alternatively, the ligands can be made equiprobable by incorporating their free energies $F_i$ into the energy function: Each term $\lambda_i U_i$ is replaced by $\lambda_i (U_i - F_i)$. $F_i$ is not known ahead of time but can be determined iteratively [47]. This provides a new route to determining the relative solvation or binding free energies of two or more ligands, which was found to be more efficient than traditional thermodynamic perturbation or integration protocols in applications to simple systems. The variation of the $\lambda_i$ with time implies that the system is never truly at equilibrium; to limit this effect, sufficiently large pseudomasses are needed for the $\lambda_i$; large masses in turn slow the exploration along each $\lambda_i$ and limit efficiency. The performance for macromolecules has yet to be determined.
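The iterative determination of the $F_i$ can be illustrated with a deliberately simplified toy model in Python. This is not the procedure of Ref. 47: the continuous weights are replaced by discrete ligand states, and the simulation is replaced by exact Boltzmann populations computed from assumed “true” free energies G (which a real calculation would not know and would instead probe through sampled populations). The update rule, the damping factor, and all names are assumptions chosen only to show how the biases converge toward the relative free energies once the ligands become equiprobable.

```python
import math


def populations(G, F, kT):
    """Toy stand-in for a simulation: p_i proportional to exp(-(G_i - F_i)/kT)."""
    exponents = [-(g - f) / kT for g, f in zip(G, F)]
    top = max(exponents)  # shift for numerical stability
    weights = [math.exp(x - top) for x in exponents]
    norm = sum(weights)
    return [w / norm for w in weights]


def refine_biases(G, kT, n_iter=40, damping=0.5):
    """Iteratively adjust the biases F_i until all ligands are equally populated.

    Overpopulated ligands have their F_i lowered, which raises G_i - F_i and
    penalizes them in the next round. Only differences matter, so F[0] is
    fixed at zero after every update.
    """
    F = [0.0] * len(G)
    for _ in range(n_iter):
        p = populations(G, F, kT)
        F = [f - damping * kT * math.log(pi) for f, pi in zip(F, p)]
        F = [f - F[0] for f in F]
    return F


if __name__ == "__main__":
    G_toy = [0.0, 1.2, -0.7]  # hypothetical ligand free energies, kcal/mol
    print(refine_biases(G_toy, kT=0.593))
```

At convergence the populations are uniform and each bias equals the free energy of its ligand relative to ligand 0, which is how flattening the weights yields relative free energies in this toy setting. The damping factor mimics the cautious updates one would want when populations come from noisy sampling.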

VII. PERSPECTIVES

With significant advances in recent years, free energy simulations can now be performed reliably for many biochemical problems if sufficient computing resources are available. Only a few representative applications could be mentioned above, including point mutations of buried residues [64,65], the creation of net charges in proteins [9,26], and conformational changes as large as the unfolding of small proteins [13]. References to many other applications can be found in papers just cited and in the review articles cited in Section I. Calculations of enthalpy and entropy changes are also becoming common. With increasing computer power, it will become straightforward to study charge creation with fully solvated simulation cells, lattice summation or reaction field methods, and force fields including atomic polarizability. All these calculations provide a direct connection between the macroscopic thermodynamics and the microscopic interactions of the investigated system.

Several important developing areas could only be touched on. One is the use of simplified free energy techniques to rapidly screen series of ligands or receptors [11,21,70]. These make use of a single simulation of a reference state and obtain the relative free energy of other, related molecules from a perturbation formula such as Eq. (9), e.g., truncated at second order (linear response). A recent twist has been to simulate a mixture of ligands with adjustable weights simultaneously, in analogy to a competitive binding experiment in solution [47,68]. Such calculations are parallelizable and will eventually be applicable to much larger, truly combinatorial libraries of ligands [71].

The protein folding problem is a central problem in computational biophysics that requires global optimization of the free energy. Although the prediction of structure from sequence alone is still out of reach, progress is being made in developing techniques to search broad ranges of conformations and estimate their free energies at different levels of approximation [62]. Some were mentioned above and are being used to study the folding and unfolding of proteins of known structures [13,43–45].