Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf


Simonson

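Many applications are concerned only with binding free energy differences. Comparing the binding of two ligands, L and L′, to the receptors R and R′, we have

$$\Delta\Delta F_b^0(L, L') = \Delta F_b^0(RL') - \Delta F_b^0(RL) = -kT \ln\frac{Z_{RL'}}{Z_{RL}} + kT \ln\frac{Z_{L'}}{Z_{L}} \tag{25}$$

$$\Delta\Delta F_b^0(R, R') = \Delta F_b^0(R'L) - \Delta F_b^0(RL) = -kT \ln\frac{Z_{R'L}}{Z_{RL}} + kT \ln\frac{Z_{R'}}{Z_{R}} \tag{26}$$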

Thus, the standard state concentration cancels from these double free energy differences. The calculation can be done by mutating L to L′ (or R to R′) both in the complex and in solution (horizontal legs of Fig. 1a).
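As a minimal numerical sketch of Eq. (25), the double difference is the free energy of the mutation in the complex minus the free energy of the same mutation in solution. The function name and the input values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_binding_free_energy(dF_mutation_complex, dF_mutation_solution):
    """Double free energy difference of Eq. (25).

    dF_mutation_complex:  free energy of mutating L to L' in the complex,
                          i.e. -kT ln(Z_RL' / Z_RL)
    dF_mutation_solution: free energy of mutating L to L' in solution,
                          i.e. -kT ln(Z_L' / Z_L)
    The result is in the units of the inputs.
    """
    return dF_mutation_complex - dF_mutation_solution


# Hypothetical mutation free energies in kcal/mol (illustration only).
ddF = relative_binding_free_energy(-3.2, -1.7)
print(f"ddF_b(L, L') = {ddF:.2f} kcal/mol")
```

A negative value indicates that L′ binds more strongly than L under these assumed inputs.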

IV. CONFORMATIONAL FREE ENERGIES

Free energy changes associated with conformational changes are the second major application of free energy calculations. Simple examples are the free energy profile for rotating a protein side chain around one or more dihedral torsion angles or for modifying the length of an individual covalent bond. Recent applications have been as complex as the unfolding of a protein [13]. In all cases, a reaction coordinate q is defined, involving one or more conformational degrees of freedom. The Helmholtz free energy W(q) along this coordinate is a configuration integral over all other degrees of freedom and takes the form

$$W(q) = -kT \ln P(q) \tag{27}$$

where P(q) is the reaction coordinate probability density. W(q) is known as the potential of mean force (pmf). When comparing two or a few conformations separated by very low energy barriers (~kT ~ 1 kcal/mol), the relative probabilities of each conformation can be estimated from an ordinary simulation, and Eq. (27) can be used directly to obtain the relative free energies. When the conformations are separated by larger barriers, barrier crossings in a simulation will be rare and P(q) statistically unreliable. The system must then be driven along q with an appropriate set of constraints or restraints. The formalism is simpler in the case of restraints, so this case is treated first.
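When barriers are low enough for direct sampling, Eq. (27) can be applied to a histogram of q. The sketch below assumes a list of q values sampled in an ordinary (unbiased) simulation; the function name, the binning scheme, and the value used for the Boltzmann constant in kcal/(mol K) are assumptions made for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K), assumed unit system


def pmf_from_samples(q_samples, q_min, q_max, n_bins, temperature=300.0):
    """Estimate W(q) = -kT ln P(q) on a regular grid of bins.

    Returns (bin centers, W values). W is shifted so that its minimum is
    zero; bins that were never visited get W = inf.
    """
    kT = K_B * temperature
    width = (q_max - q_min) / n_bins
    counts = [0] * n_bins
    for q in q_samples:
        k = math.floor((q - q_min) / width)
        if 0 <= k < n_bins:          # samples outside the grid are ignored
            counts[k] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [q_min, q_max)")
    centers = [q_min + (k + 0.5) * width for k in range(n_bins)]
    # P(q) is a density: counts / (total * bin width)
    w = [-kT * math.log(c / (total * width)) if c > 0 else math.inf
         for c in counts]
    w_min = min(w)
    return centers, [x - w_min for x in w]


# Hypothetical two-state example: 300 samples in one well, 100 in the other.
samples = [0.25] * 300 + [0.75] * 100
centers, w = pmf_from_samples(samples, 0.0, 1.0, n_bins=2)
```

With populations in a 3:1 ratio, the less populated state lies kT ln 3 above the other, which is what the second entry of `w` contains.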

A. Conformational Restraints or Umbrella Sampling

To bias the sampling toward a region of interest that would not otherwise be significantly populated, a restraining potential Ur(q) is added to the potential energy of the system. Ur is often referred to as an umbrella potential [37]. For concreteness, we assume the harmonic form

$$U_r(q; \lambda) = k_h\,[q - q_0(\lambda)]^2 \tag{28}$$

where kh is a force constant, q0(λ) is a target value of q, and λ a coupling parameter. However, umbrella potentials are by no means limited to a harmonic form (see below and Section VI.B). The reaction coordinate q could be a dihedral angle, a distance between two selected atoms, or a more complicated, collective degree of freedom such as a normal mode amplitude. q0(λ) is constructed so that as the coupling coordinate λ varies, q0(λ) traverses the region or regions of interest.

Free Energy Calculations


The free energy difference between two stable conformations can be obtained by a thermodynamic integration approach [38,39]. Let qA and qB represent the centers of the two corresponding energy wells. The free energy derivative is seen to be

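$$\frac{\partial F}{\partial \lambda}(\lambda) = -2k_h\,\langle q - q_0(\lambda) \rangle_\lambda\, \frac{dq_0}{d\lambda}(\lambda) \tag{29}$$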

which can be obtained from a simulation with the restraining potential Ur(q; λ). Equation (29) is a generalization of Eq. (15) (where q = b). Integrating between qA and qB gives the free energy difference between the two wells, but with restraints present at each endpoint. Additional steps are needed in which the restraints are removed at the endpoints. The corresponding free energies can be obtained from the thermodynamic perturbation formula (8),

$$\Delta F(\text{restrained} \rightarrow \text{unrestrained}) = -kT \ln \left\langle \exp(U_r/kT) \right\rangle_{\lambda_A} \tag{30}$$

where the averaging is performed over the restrained endpoint simulation, i.e., q0(λA) = qA; a similar calculation is made at the B endpoint. This approach is easily generalized to nonharmonic restraint terms and to cases where several restraint terms are used. Thus, if restraints are applied to several dihedral angles or several interatomic distances, each will contribute a term of the form (29) to the free energy derivative.
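The two steps above, integrating Eq. (29) over λ and releasing the restraint at an endpoint with Eq. (30), can be sketched as follows. This is a minimal illustration and not a production protocol: the function names are hypothetical, the integration uses a simple trapezoidal rule, and the simulation averages are replaced by the analytic values of a harmonic model so that the result can be checked.

```python
import math


def ti_free_energy(lambdas, mean_q, q0, dq0_dlambda, k_h):
    """Trapezoidal integration of Eq. (29) over the given lambda values.

    mean_q[i] is the average <q> from the simulation run at lambdas[i];
    q0 and dq0_dlambda are callables giving q0(lambda) and its derivative.
    """
    deriv = [-2.0 * k_h * (mq - q0(lam)) * dq0_dlambda(lam)
             for lam, mq in zip(lambdas, mean_q)]
    total = 0.0
    for i in range(len(lambdas) - 1):
        total += 0.5 * (deriv[i] + deriv[i + 1]) * (lambdas[i + 1] - lambdas[i])
    return total


def restraint_release_free_energy(restraint_energies, kT):
    """Eq. (30): -kT ln <exp(U_r/kT)> over a restrained endpoint simulation.

    restraint_energies holds U_r for each saved configuration. The largest
    exponent is factored out to avoid overflow.
    """
    x = [u / kT for u in restraint_energies]
    x_max = max(x)
    log_mean = x_max + math.log(sum(math.exp(v - x_max) for v in x) / len(x))
    return -kT * log_mean


# Assumed test model: unrestrained energy U(q) = a q^2 plus the restraint
# k_h (q - q0)^2, for which <q> = k_h q0 / (a + k_h) exactly.
a, k_h = 1.0, 3.0
lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
mean_q = [k_h * lam / (a + k_h) for lam in lambdas]  # stand-in for simulation data
dF = ti_free_energy(lambdas, mean_q,
                    q0=lambda lam: lam,              # q0 moves linearly from 0 to 1
                    dq0_dlambda=lambda lam: 1.0,
                    k_h=k_h)
```

For this model the restrained free energy changes by a k_h/(a + k_h) as q0 goes from 0 to 1, and the integrand is linear in λ, so the trapezoidal rule reproduces that value. In a real application the averages would carry statistical noise and a finer λ grid may be needed.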

Application of Eq. (30) corrects the free energies of the endpoints but not those of the intermediate conformations. Therefore, the above approach yields a free energy profile between qA and qB that is altered by the restraint(s). In particular, the barrier height is not that of the natural, unrestrained system. It is possible to correct the probability distributions Pr observed all along the pathway (with restraints) to obtain those of the unrestrained system [8,40]. From the relation P(q) Zur = Pr(q) Zr exp(Ur/kT) and Eqs. (6)–(8), one obtains

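$$P(q) = P_r(q)\, e^{U_r(q)/kT} \big/ \left\langle e^{U_r/kT} \right\rangle_r \tag{31}$$

$$\phantom{P(q)} = P_r(q)\, e^{U_r(q)/kT} \left\langle e^{-U_r/kT} \right\rangle_{ur} \tag{32}$$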

where the subscripts r and ur refer to the restrained and unrestrained systems, respectively. For reasons already discussed [see Fig. 2 and the discussion following Eq. (7)], the resulting P(q) is expected to be accurate only if there is a large overlap between probable conformations of the unrestrained system and conformations where the restraint energy is small. Thus in practice, q0(λ) must be close to a stable energy minimum of the unrestrained system, and P(q) will be accurate only close to q0(λ). To obtain P(q) over a broader range, a series of umbrella potentials is required, covering a range of q0 values. Let Ur′ be a second umbrella potential, corresponding to a q0(λ′) slightly displaced relative to q0(λ). One can show that [40]

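$$P(q) = P_{r'}(q)\, e^{U_{r'}(q)/kT} \Big/ \left( \left\langle e^{U_r/kT} \right\rangle_r \left\langle e^{(U_{r'} - U_r)/kT} \right\rangle_{r'} \right) \tag{33}$$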

This formula is expected to be accurate close to q0(λ′). Continuing in this manner, one can obtain accurate formulas for P(q) over a broad range of q, provided the regions sampled with the successive umbrellas overlap.
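For a single restrained simulation, the correction of Eq. (31) amounts to reweighting each histogram bin by exp(Ur/kT) and renormalizing. The sketch below is an illustration under simplifying assumptions: the function name is hypothetical, bin probabilities are used in place of densities, and the average of exp(Ur/kT) over the restrained ensemble is estimated from the same histogram.

```python
import math


def unbias_histogram(counts, centers, umbrella, kT):
    """Apply Eq. (31) to the histogram of one restrained simulation.

    counts[k]  : number of samples in bin k (restrained simulation)
    centers[k] : value of q at the center of bin k
    umbrella   : callable returning U_r(q)
    Returns the estimated unrestrained bin probabilities, which sum to 1.
    """
    total = sum(counts)
    p_r = [c / total for c in counts]
    weight = [math.exp(umbrella(q) / kT) for q in centers]
    # <exp(U_r/kT)>_r estimated as a histogram average
    norm = sum(p * w for p, w in zip(p_r, weight))
    return [p * w / norm for p, w in zip(p_r, weight)]


# Hypothetical check: a flat unrestrained distribution over three bins,
# sampled with a bias that halves the population from one bin to the next.
kT = 0.6
centers = [0.0, 1.0, 2.0]
bias = lambda q: q * kT * math.log(2.0)   # U_r = 0, kT ln 2, kT ln 4
counts = [400, 200, 100]                  # restrained populations 4:2:1
p = unbias_histogram(counts, centers, bias, kT)
```

In this constructed case the reweighting restores equal probabilities for the three bins. With real data, bins far from q0 hold few counts and large weights, which is the overlap problem described above.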

For many problems, the ideal umbrella potential would be one that completely flattens the free energy profile along q, i.e., Ur(q) = −W(q). Such a potential cannot be determined in advance. However, iterative approaches exist that are known as adaptive umbrella sampling [41]. Such approaches are especially important for large-scale sampling of many very different conformations, such as folded and unfolded conformations of a protein or peptide. Recent applications to protein folding have used the potential energy as a "reaction coordinate" q, building up an umbrella potential that leads to a flat probability distribution and smooth sampling over a broad range of potential energies [42–45].

B. Weighted Histogram Analysis Method

It is often of interest to investigate not a one-dimensional but a two- or higher-dimensional reaction coordinate. Free energy maps of polypeptides as a function of a pair of (φ, ψ) backbone torsion angles are an example. Equation (33) can be used to explore more than one coordinate q by using sets of umbrellas whose minima span a two-dimensional grid covering the range of interest. However, as the number of dimensions increases, propagation of error through Eq. (33) increases rapidly, and this approach becomes increasingly difficult. The weighted histogram analysis method (WHAM) is an alternative approach designed to minimize propagation of error by making optimal use of the information from multiple simulations with different umbrella potentials [42,46].

We consider the case of a two-dimensional reaction coordinate (q, s) first. R simulations are carried out, each having its own restraint energy term Uj(q, s). The (q, s) values observed in each simulation j are binned and counted, giving a series of R two-dimensional histograms. Let the bins along q be indexed by k and those along s by l; let the number of counts in each bin in simulation j be nj,kl, and let Nj = Σkl nj,kl be the total counts in simulation j. Let cj,kl = exp[−Uj(qk, sl)/kT], where qk and sl are the centers of the bins k and l. The problem is to combine the histograms to obtain an estimate of the probability distribution p0kl = P(qk, sl) of the unrestrained system. Making use of Eq. (31), assuming the observed counts nj,kl follow a multinomial distribution, and maximizing a likelihood function, one obtains [42,46] the WHAM equations:

 

 

$$p^0_{kl} = \frac{\sum_j n_{j,kl}}{\sum_j N_j f_j c_{j,kl}} \tag{34}$$

$$f_j = \frac{1}{\sum_{kl} c_{j,kl}\, p^0_{kl}} \tag{35}$$

Here, j runs over all simulations and k, l run over all bins. These equations can be solved iteratively, assuming an initial set of fj (e.g., fj = 1), then calculating p0kl from Eq. (34) and updating the fj by Eq. (35), and so on, until the p0kl no longer vary, i.e., the two equations are self-consistent. From the p0kl = P(qk, sl) and Eq. (27), one then obtains the free energy of each bin center (qk, sl). Error estimates are also obtained [46]. The method can be applied to a one-dimensional reaction coordinate or generalized to more than two dimensions and to cases in which simulations are run at several different temperatures [46]. It also applies when the reaction coordinates are alchemical coupling coordinates (see below and Ref. 47).
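The self-consistent iteration just described can be sketched for a one-dimensional coordinate, where the double bin index kl reduces to a single index k. This is a minimal illustration, not an optimized WHAM code: the function name, the convergence test on successive p0 values, and the final normalization are assumptions, and no error estimates are computed.

```python
import math


def wham_1d(counts, bias_energies, kT, tol=1e-10, max_iter=10000):
    """Iterate Eqs. (34) and (35) for a one-dimensional reaction coordinate.

    counts[j][k]        : n_{j,k}, counts of simulation j in bin k
    bias_energies[j][k] : U_j(q_k), umbrella energy of simulation j at bin k
    Returns (p0, f) with p0 normalized to sum to 1.
    """
    n_sims, n_bins = len(counts), len(counts[0])
    N = [sum(row) for row in counts]
    c = [[math.exp(-u / kT) for u in row] for row in bias_energies]
    numer = [sum(counts[j][k] for j in range(n_sims)) for k in range(n_bins)]

    f = [1.0] * n_sims                     # initial guess f_j = 1
    p0 = [0.0] * n_bins
    for _ in range(max_iter):
        # Eq. (34): combine all histograms into one estimate of p0
        p_new = [numer[k] / sum(N[j] * f[j] * c[j][k] for j in range(n_sims))
                 for k in range(n_bins)]
        # Eq. (35): update the normalization factor of each simulation
        f = [1.0 / sum(c[j][k] * p_new[k] for k in range(n_bins))
             for j in range(n_sims)]
        change = max(abs(x - y) for x, y in zip(p_new, p0))
        p0 = p_new
        if change < tol:
            break

    # The equations fix p0 only up to a common scale; normalize p0 and
    # rescale f so that Eq. (35) still holds.
    norm = sum(p0)
    return [p / norm for p in p0], [fj * norm for fj in f]


# Hypothetical synthetic data: expected counts for a known unrestrained
# distribution sampled under two different biases (kT = 1).
p_true = [0.2, 0.3, 0.5]
biases = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
counts = []
for u in biases:
    w = [p * math.exp(-x) for p, x in zip(p_true, u)]
    counts.append([1000.0 * x / sum(w) for x in w])
p0, f = wham_1d(counts, biases, kT=1.0)
```

Because the synthetic counts are exact expectation values, the iteration should recover `p_true`. The free energy of each bin then follows from Eq. (27) as −kT ln p0. For strongly biased windows the exponentials can overflow, and a log-space formulation would be a safer choice.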


C. Conformational Constraints

The foregoing approaches used an umbrella potential to restrain q. The pmf W(q) can also be obtained from simulations where q is constrained to a series of values spanning the region of interest [48,49]. However, the introduction of rigid constraints complicates the theory considerably. Space limitations allow only a brief discussion here; for details, see Refs. 8 and 50–52.

To obtain thermodynamic perturbation or integration formulas for changing q, one must go back and forth between expressions of the configuration integral in Cartesian coordinates rN and in suitably chosen generalized coordinates uN [51]. This introduces Jacobian factors

$$J(q) = \det\left( \frac{\partial r_i}{\partial u_j} \right)$$

into the formulas (where i, j = 1, . . . , N and "det" represents the matrix determinant). Furthermore, it becomes necessary to perform averages in an ensemble where q is fixed (at some value q0) but the conjugate momentum pq is unconstrained [50,52]. Indeed, we seek the probability distribution P(q) of the natural system, whose momentum is not subjected to any particular constraints. This is not a problem in a Monte Carlo simulation, where the configurational degrees of freedom can be sampled without any assumptions about the velocities. But in a molecular dynamics simulation, fixing q = q0 immediately constrains the conjugate momentum to be zero. Averages over such a simulation must therefore be corrected to remove the biasing effect of the momentum constraint. This introduces factors containing the mass-metric tensor; if q is one-dimensional, this tensor is a scalar function

 

$$Z(r^N; q) = \sum_{i=1}^{N} \frac{1}{m_i} \left( \frac{\partial q}{\partial r_i} \right)^2$$

where mi is the mass of the particle corresponding to the coordinate ri [50]. The free energy to change the reaction coordinate from q − δq to q + δq takes the rather formidable form [51]

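$$W(q + \delta q) - W(q - \delta q) = -kT \ln \frac{\left\langle J(q)^{-1} J(q + \delta q)\, Z(r^N; q)^{-1/2} \exp(-\Delta U_+/kT) \right\rangle_q}{\left\langle J(q)^{-1} J(q - \delta q)\, Z(r^N; q)^{-1/2} \exp(-\Delta U_-/kT) \right\rangle_q} \tag{36}$$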

where ΔU+ and ΔU− represent the potential energy differences required to change q into q + δq or q − δq, respectively, with all other coordinates unchanged. The brackets indicate averages over a simulation where q is constrained (and the conjugate momentum is consequently zero).
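The mass-metric factor Z(rN; q) defined above can be evaluated numerically for any reaction coordinate that is a function of the Cartesian coordinates. The sketch below is an illustration only: the function names are hypothetical, and the derivatives of q are approximated by central finite differences rather than computed analytically.

```python
import math


def mass_metric_z(q_func, coords, masses, h=1e-6):
    """Z(r^N; q) = sum_i (1/m_i) (dq/dr_i)^2 by central differences.

    coords : flat list of Cartesian coordinates r_i
    masses : masses[i] is the mass of the particle that owns coordinate i
    """
    z = 0.0
    for i, m in enumerate(masses):
        plus = list(coords)
        minus = list(coords)
        plus[i] += h
        minus[i] -= h
        dq = (q_func(plus) - q_func(minus)) / (2.0 * h)
        z += dq * dq / m
    return z


def distance(coords):
    """Distance between two particles stored as [x1, y1, z1, x2, y2, z2]."""
    dx, dy, dz = (coords[3 + k] - coords[k] for k in range(3))
    return math.sqrt(dx * dx + dy * dy + dz * dz)


# Hypothetical two-particle example with masses 12 and 16 (arbitrary units).
coords = [0.0, 0.0, 0.0, 1.0, 2.0, 2.0]
masses = [12.0] * 3 + [16.0] * 3
z_example = mass_metric_z(distance, coords, masses)
```

When q is the distance between two particles, the gradient of q with respect to each particle is a unit vector, so Z reduces to 1/m1 + 1/m2 whatever the geometry. A constant Z cancels from ratios of averages such as Eq. (36).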

A tractable example is the pmf between two particular particles in a macromolecule as a function of their separation q. The free energy to increase q by δq becomes

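$$W(q + \delta q) - W(q) = -kT \ln \left\langle \exp\left( -\frac{\Delta U}{kT} \right) \right\rangle_q - 2kT \ln \frac{q + \delta q}{q} \tag{37}$$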
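Equation (37) is simple enough to evaluate directly from the perturbation energies collected in a constrained simulation. The following sketch assumes that a list of ΔU values (the energy change on stretching the separation from q to q + δq, all other coordinates fixed) is available; the function name and the test values are hypothetical.

```python
import math


def pmf_increment_distance(delta_u_samples, q, dq, kT):
    """Eq. (37): W(q + dq) - W(q) for an interparticle distance q.

    delta_u_samples : Delta U for each configuration of a simulation
                      in which the separation is constrained to q
    The second term, -2 kT ln((q + dq)/q), is the Jacobian contribution.
    """
    x = [-du / kT for du in delta_u_samples]
    x_max = max(x)                      # factor out the largest exponent
    log_avg = x_max + math.log(sum(math.exp(v - x_max) for v in x) / len(x))
    return -kT * log_avg - 2.0 * kT * math.log((q + dq) / q)


# Hypothetical limiting case: two non-interacting particles, Delta U = 0.
kT = 0.6
dW = pmf_increment_distance([0.0, 0.0, 0.0], q=1.0, dq=1.0, kT=kT)
```

With ΔU = 0 only the Jacobian term survives, giving −2kT ln 2 for a doubling of the separation. This reflects the growth of the available volume with q² for a distance coordinate.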