Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf

Internal Coordinate Simulation


alized accelerations if the system can be treated as a tree of articulated rigid bodies. They are directly applicable to molecular models considered in the previous section, and when these algorithms were first implemented [8,36] it appeared that, as in Newtonian dynamics, the cost of the time step could be made close to that of the evaluation of forces.

Yet another difficulty was encountered in the numerical integration of the dynamics equations. The general structure of the internal coordinate equations precludes the use of the familiar Verlet or leapfrog algorithms, which is why general-purpose predictor–corrector and Runge–Kutta integrators were used at first [8,36,39,40]. The results, however, clearly indicated that the quality of the trajectories was much inferior to that of conventional MD, even though the possibility of a considerable increase in time step length was demonstrated in some examples [36,39,40]. This difficulty could not have been anticipated, because only recently was it realized that the exceptional stability of the integrators of the Störmer–Verlet–leapfrog group is bound to their symplectic property, which in turn is due to the fact that the Newtonian equations are essentially Hamiltonian. A very recent approach [42] seems to overcome this difficulty, and it has been demonstrated that ICMD is able to give a net gain in terms of computations per picosecond of dynamics.

The last problem to be mentioned concerns the physical factors that limit time steps in ICMD. All biopolymers have a hierarchical spatial organization, and this structural hierarchy is naturally mapped onto the spectrum of their motions. That is, the fast motions involve individual atoms and chemical groups, whereas the slow ones correspond to displacements of secondary structures, domains, etc. Every such movement considered separately can be characterized by a certain maximum time step, and in this sense one can say that there exists a hierarchy of fast motions and, accordingly, of step size limits. The lowest limit is determined by bond-stretching vibrations of hydrogens. It was always assumed that stretching of bonds between non-hydrogen atoms follows next [47], but in fact, until very recently, other fast motions did not attract much attention because only bond length constraints were technically possible anyway. With the development of ICMD, this issue acquired practical importance, and simultaneously it became possible to try different sorts of constraints and study the hierarchy of fast motions in greater detail. It was found [48] that, in proteins, this hierarchy does not always agree with common intuitive suggestions. For instance, very fast collective vibrations in which hydrogen bonding plays a major role are rather common. On the other hand, nonbonded interatomic interactions impose ubiquitous anharmonic limitations starting from rather small step sizes. This type of limitation is the most important for ICMD, but, unfortunately, it is also the most difficult to reveal and overcome.

The last two problems have been realized only recently, and additional progress in these research directions may be expected in the near future. At present it is clear that with the standard geometry approximation all time step limitations below 10 fs can be overcome rather easily. This time step increase gives a substantial net increase in performance compared to conventional MD. The possibility of larger step sizes now looks problematic, although it has been demonstrated for small molecules. Larger steps should be possible, however, with constraints beyond the standard geometry approximation.

B. Dynamics of Molecular Trees

For the model of free point particles, the Newtonian equations present by far the simplest and most efficient analytical formalism. In contrast, for chains of rigid bodies, there are several different, but equally applicable, analytical methods in mechanics, with their specific advantages and disadvantages. Recent studies in this field have made clear, however, that the analytical difficulties connected with the size and chemical complexity of biological macromolecules can probably be overcome with any such formalism, and the main question is whether a given method is numerically efficient. Until now, only a few approaches have been able to treat large molecules, and they all take advantage of tree topologies in fast recurrent algorithms similar to the one outlined in Section III.B. The best performance has been achieved by combining the fast mass matrix inversion technique resulting from the Newton–Euler analysis of rigid-body dynamics [35] with equations of motion in canonical variables [42], which make symplectic numerical integration possible.

Mazur

The symplectic property is a key feature of an integrator for computing long-time trajectories in classical mechanics [49]. The term "symplectic" means that the discrete mapping corresponding to one time step must conserve the set of symplectic invariants of mechanics, one of which is the phase volume [50]. This condition looks very complex, but in practice it just means that one iteration of the integrator can be represented as a sequence of moves, each of which is an exact mechanical trajectory corresponding to some Hamiltonian. For instance, leapfrog or Verlet steps can be represented as several moves in which only the kinetic or the potential part of the full Hamiltonian is used. However, such steps are possible only in the space of canonical variables, and that is why the generalized velocities and accelerations corresponding to internal coordinates are not appropriate as dynamic variables.
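The decomposition described above can be illustrated with a minimal Python sketch. It is not taken from the source: the one-dimensional pendulum, the function names, and the finite-difference check are all assumptions chosen for illustration. Each sub-move is the exact flow of either the potential part or the kinetic part of the Hamiltonian, and the Jacobian determinant of a full step is checked numerically against the phase-volume condition.

```python
import math

def kick(q, p, h, force):
    """Exact flow of H = U(q) over time h: q is frozen, p changes."""
    return q, p + h * force(q)

def drift(q, p, h, mass):
    """Exact flow of H = p^2 / (2 m) over time h: p is frozen, q changes."""
    return q + h * p / mass, p

def leapfrog_step(q, p, h, force, mass=1.0):
    """One leapfrog (velocity Verlet) step as half kick, drift, half kick."""
    q, p = kick(q, p, 0.5 * h, force)
    q, p = drift(q, p, h, mass)
    q, p = kick(q, p, 0.5 * h, force)
    return q, p

def pendulum_force(q):
    """Force for the illustrative potential U(q) = -cos(q)."""
    return -math.sin(q)

def jacobian_determinant(step, q, p, eps=1e-6):
    """Central-difference estimate of det d(q', p')/d(q, p) for a one-step map."""
    qa, pa = step(q + eps, p)
    qb, pb = step(q - eps, p)
    qc, pc = step(q, p + eps)
    qd, pd = step(q, p - eps)
    dq_dq, dp_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    dq_dp, dp_dp = (qc - qd) / (2 * eps), (pc - pd) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq

def max_energy_deviation(n_steps=10000, h=0.1, q=1.0, p=0.0):
    """Largest deviation of H = p^2/2 - cos(q) from its initial value."""
    e0 = 0.5 * p * p - math.cos(q)
    worst = 0.0
    for _ in range(n_steps):
        q, p = leapfrog_step(q, p, h, pendulum_force)
        worst = max(worst, abs(0.5 * p * p - math.cos(q) - e0))
    return worst

if __name__ == "__main__":
    one_step = lambda q, p: leapfrog_step(q, p, 0.1, pendulum_force)
    print("det J =", jacobian_determinant(one_step, 1.0, 0.3))
    print("max |E - E0| =", max_energy_deviation())
```

Because every sub-move is a shear in phase space, the determinant equals one up to finite-difference error, and the energy error stays bounded instead of drifting.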

By definition, the vector of conjugate momenta corresponding to the vector of generalized internal coordinates q is

$$p = \frac{\partial L(q,\dot{q})}{\partial \dot{q}} = M(q)\,\dot{q} \tag{3}$$

where L is the Lagrangian and M(q) is the mass matrix. It can be shown [42] that the conjugate momentum of a translational variable is given by the projection of the total Cartesian momentum of the articulated body onto the direction of translation. The conjugate momentum of a rotational variable is the projection of the angular momentum of the articulated body onto the rotation axis. Neither of these quantities is convenient as an independent variable, but one avoids the difficulties by using equations of motion of the form

$$\dot{p} = -\frac{\partial U}{\partial q} + w(q,\dot{q}) \tag{4a}$$

and

$$\dot{q} = M^{-1}p \tag{4b}$$

where $w(q,\dot{q})$ is an inertial term. For translational and rotational variables, respectively, its components read

$$w_i = \dot{e}_i \cdot P_i \tag{5a}$$

and

$$w_i = \dot{e}_i \cdot Q_i - P_i \cdot (e_i \times \dot{r}_i) \tag{5b}$$

where $P_i$ and $Q_i$ are the translational and angular momenta, respectively, of the articulated body $D_i$. Like the forces and torques in Eqs. (1) and (2), these can be rapidly computed by a recurrent summation. Similar summation techniques are employed in computing the product $M^{-1}p$ in Eq. (4b). The corresponding algorithms originate from robot mechanics and are based on a special factorization of the mass matrix. A very clear unified presentation of these methods is given by Jain [51].
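As a hedged illustration that is not part of the source, Eqs. (3) and (4) can be written out for the simplest system with a coordinate-dependent mass matrix: a single particle of mass m in a plane, described by polar coordinates. The choice of system and the symbols r, θ, and m are assumptions made only for this example; the inertial term is obtained from the standard Lagrangian identity that it equals the derivative of the kinetic energy with respect to q at fixed generalized velocities.

```latex
% Illustrative assumption: one particle of mass m in a plane,
% q = (r, \theta), potential energy U(r, \theta).
\begin{align*}
T &= \tfrac{1}{2}\, m \left( \dot{r}^{2} + r^{2}\dot{\theta}^{2} \right),
\qquad
M(q) = \begin{pmatrix} m & 0 \\ 0 & m r^{2} \end{pmatrix}, \\
p &= M(q)\,\dot{q}
   = \begin{pmatrix} m\dot{r} \\ m r^{2}\dot{\theta} \end{pmatrix}, \\
w(q,\dot{q}) &= \left.\frac{\partial T}{\partial q}\right|_{\dot{q}}
   = \begin{pmatrix} m r \dot{\theta}^{2} \\ 0 \end{pmatrix}, \\
\dot{p}_{r} &= -\frac{\partial U}{\partial r} + m r \dot{\theta}^{2},
\qquad
\dot{p}_{\theta} = -\frac{\partial U}{\partial \theta}.
\end{align*}
```

Here the radial momentum is the projection of the linear momentum onto the direction of translation, and the angular one is the projection of the angular momentum onto the rotation axis, in line with the statements above. The radial inertial term is the familiar centrifugal contribution, and it also follows from Eq. (5a), since the radial unit vector rotates with angular velocity equal to the time derivative of θ.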

Equations (4) are called quasi-Hamiltonian because, even though they employ generalized velocities, they describe the motion in the space of canonical variables. Accordingly, numerical trajectories computed with appropriate integrators will conserve the symplectic structure. For example, an implicit leapfrog integrator can be expressed as

$$f_n = f(q_n) \tag{6a}$$

$$\circ\quad q_{n+1/2} = q_n + \dot{q}_{n+1/2}\,\frac{h}{2} \tag{6b}$$

$$\circ\quad p_{n+1/2} = p_{n-1/2} + f_n h + \left(w_{n-1/2} + w_{n+1/2}\right)\frac{h}{2} \tag{6c}$$

$$\circ\quad \dot{q}_{n+1/2} = M^{-1}_{n+1/2}\,p_{n+1/2} \tag{6d}$$

$$q_{n+1} = q_n + \dot{q}_{n+1/2}\,h \tag{6e}$$

where the conventional notation is used for on-step and half-step values. The lines marked by circles ($\circ$) are iterated until Eqs. (6b) and (6c) converge. When the mass matrix does not depend on the coordinates, $w(q,\dot{q})$ vanishes and this integrator reduces to the standard leapfrog. It is symplectic in the same sense as leapfrog; namely, the symplectic structure is conserved for the pairs $(p_{n-1/2}, q_n)$ and $(p_{n+1/2}, q_n)$.
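The iteration in Eqs. (6a)-(6e) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the source: the half-step lines are solved by plain fixed-point iteration, the mass matrix is inverted with a dense solver instead of the fast recursive algorithms, and the test system (a particle in polar coordinates attached to a radial spring, with the hypothetical parameters `M_PARTICLE`, `K_SPRING`, and `R_REST`) is chosen only because its mass matrix depends on the coordinates. The startup values for the first half step are a simple first-order guess.

```python
import numpy as np

def implicit_leapfrog_step(q_n, p_prev, qdot_guess, w_prev, h,
                           force, mass_matrix, inertial,
                           tol=1e-12, max_iter=100):
    """One step of Eqs. (6a)-(6e).

    Returns (q_{n+1}, p_{n+1/2}, qdot_{n+1/2}, w_{n+1/2}).
    """
    f_n = force(q_n)                                             # (6a)
    qdot = np.array(qdot_guess, dtype=float)
    for _ in range(max_iter):
        q_half = q_n + qdot * h / 2                              # (6b)
        w_half = inertial(q_half, qdot)
        p_half = p_prev + f_n * h + (w_prev + w_half) * h / 2    # (6c)
        qdot_new = np.linalg.solve(mass_matrix(q_half), p_half)  # (6d)
        change = np.max(np.abs(qdot_new - qdot))
        qdot = qdot_new
        if change < tol:
            break
    else:
        raise RuntimeError("fixed-point iteration did not converge")
    return q_n + qdot * h, p_half, qdot, w_half                  # (6e)

# Illustrative test system (an assumption, not from the source):
# q = (r, theta), U = K_SPRING * (r - R_REST)^2 / 2.
M_PARTICLE, K_SPRING, R_REST = 1.0, 10.0, 1.0

def force(q):
    return np.array([-K_SPRING * (q[0] - R_REST), 0.0])

def mass_matrix(q):
    return np.diag([M_PARTICLE, M_PARTICLE * q[0] ** 2])

def inertial(q, qdot):
    # w = dT/dq at fixed qdot; only the radial component is nonzero.
    return np.array([M_PARTICLE * q[0] * qdot[1] ** 2, 0.0])

def run_demo(n_steps=2000, h=0.01):
    q = np.array([1.2, 0.0])
    qdot = np.array([0.0, 0.5])
    w = inertial(q, qdot)
    # Crude startup: step the momentum back by half a step from t = 0.
    p = mass_matrix(q) @ qdot - (force(q) + w) * h / 2
    radii, p_theta = [], []
    for _ in range(n_steps):
        q, p, qdot, w = implicit_leapfrog_step(
            q, p, qdot, w, h, force, mass_matrix, inertial)
        radii.append(q[0])
        p_theta.append(p[1])
    return np.array(radii), np.array(p_theta)

if __name__ == "__main__":
    radii, p_theta = run_demo()
    print("r range:", radii.min(), radii.max())
    print("p_theta range:", p_theta.min(), p_theta.max())
```

In this example the angular force and the angular inertial term are both zero, so the angular momentum is carried through Eq. (6c) unchanged, while the radial coordinate oscillates between its turning points.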

C. Simulation of Flexible Rings

Treatment of flexible rings is a special and inherently difficult task for algorithms that exploit the specific advantages of tree topologies. If such a topology is imposed on a ring, the ring will be broken once all internal coordinates start to change independently. Several possible ways out of this can be considered. The simplest consists in applying harmonic restraints to the broken ring bonds. In this case, in dynamics, the time step may be limited by the frequencies introduced by these restraints. The rigorous but complex way is to treat some of the internal coordinates as dependent variables and exclude them from the equations of motion [52]. However, this involves mass matrix transformations that would be incompatible with the fast inversion algorithms. The third way is to impose ring closure constraints explicitly, similarly to the method of constraints in Cartesian MD. The last possibility has been checked recently, and it gives an acceptable solution [53,54]. This difficulty is most critical for simulations of nucleic acids, in which the bases are connected with the sugar–phosphate backbone via five-membered rings, and we now consider this specific example.

Figure 3 shows how the tree is constructed for a sugar ring in a nucleic acid. Ring atoms are numbered 1, …, 5, corresponding to C4′, C3′, C2′, C1′, O4′. The base is placed at the 5′ end; the main chain goes along the backbone with branching for bases at C3′ atoms. The ring conformation is determined by five valence and dihedral angles q1, …, q5 indicated by arrows. The bond C4′–O4′, shown by the broken line, is excluded from the tree and replaced by the distance constraint

$$C = |r_5 - r_1| - l_{15} = 0. \tag{7}$$


Figure 3 The underlying tree of a furanose ring in nucleic acids. Atoms are numbered 1, . . . , 5 corresponding to the natural tree ordering. All bond lengths are fixed. Arrows illustrate five internal coordinates that determine the ring conformation.

Here and below, $r_i$, $l_{ij}$, and $e_{ij}$, with i, j = 1, …, 5, denote atomic position vectors, atom–atom distances, and the corresponding unit vectors, respectively. In order to construct a correctly closed conformation, variables q1, …, q4 are considered independent, and the last valence angle q5 is computed from Eq. (7) as follows. Variables q1, …, q4 determine the orientation of the plane of q5, specified by vector $e_{34}$ and an in-plane unit vector $e_{345}$ orthogonal to it. In the basis of these two vectors, condition (7) results in

$$x\,(e_{14}\cdot e_{34}) + y\,(e_{14}\cdot e_{345}) = \frac{l_{15}^{2} - l_{45}^{2} - l_{14}^{2}}{2\,l_{14}\,l_{45}} \tag{8a}$$

$$x^{2} + y^{2} = 1 \tag{8b}$$

where x and y are the local coordinates of vector $e_{45}$. This system reduces to a quadratic equation and gives a single solution with $x > 0$, which solves the problem.
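The closure condition can be sketched numerically as below. This is a minimal illustration, not the source's implementation. It assumes that each unit vector points from the first atom to the second, e_ij = (r_j - r_i)/l_ij, and the function name `ring_closure_roots`, the helper `example`, and the atomic positions are hypothetical. The linear condition (8a) is intersected with the unit circle (8b) in closed form, both roots are returned, and the caller selects one by the sign of x.

```python
import numpy as np

def ring_closure_roots(e14, e34, e345, l14, l45, l15):
    """Solve Eqs. (8a)-(8b) for the local coordinates (x, y) of e45.

    Assumes e_ij = (r_j - r_i) / l_ij. Returns both roots of the
    quadratic; raises ValueError if the ring cannot be closed.
    """
    a = float(np.dot(e14, e34))
    b = float(np.dot(e14, e345))
    c = (l15 ** 2 - l45 ** 2 - l14 ** 2) / (2.0 * l14 * l45)
    n2 = a * a + b * b
    disc = n2 - c * c
    if disc < 0.0:
        raise ValueError("ring cannot be closed for this geometry")
    s = np.sqrt(disc)
    return [((c * a - b * s) / n2, (c * b + a * s) / n2),
            ((c * a + b * s) / n2, (c * b - a * s) / n2)]

def example():
    """Hypothetical geometry: place atom 5 given atoms 1, 3, and 4."""
    r1 = np.array([0.0, 0.0, 0.0])
    r3 = np.array([2.0, 1.2, 0.0])
    r4 = np.array([2.3, 0.0, 0.3])
    l45 = l15 = 1.45
    l14 = np.linalg.norm(r4 - r1)
    e14 = (r4 - r1) / l14
    e34 = (r4 - r3) / np.linalg.norm(r4 - r3)
    # Any unit vector orthogonal to e34 fixes the plane for this demo.
    u = np.array([-1.0, 0.0, 0.0])
    e345 = u - np.dot(u, e34) * e34
    e345 /= np.linalg.norm(e345)
    results = []
    for x, y in ring_closure_roots(e14, e34, e345, l14, l45, l15):
        r5 = r4 + l45 * (x * e34 + y * e345)
        closure_error = abs(np.linalg.norm(r5 - r1) - l15)
        results.append((x, y, closure_error))
    return results

if __name__ == "__main__":
    for x, y, err in example():
        print(f"x = {x:+.4f}, y = {y:+.4f}, closure error = {err:.2e}")
```

For this geometry the two roots differ in the sign of x, so a sign criterion picks a unique closed conformation.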

When the equations of motion are integrated, all five generalized coordinates shown in Figure 3 are considered independent. The constraint condition of Eq. (7) means, however, that there is an additional reaction force applied between atoms C4′ and O4′. Such forces in all sugar rings result in a generalized reaction force f that has to be added to the other forces in the system. Reactions depend upon both coordinates and velocities, but it appears, fortunately, that their explicit calculation is unnecessary. It is sufficient that the components of velocities along constrained bonds be canceled, which is achieved by projecting the vector of generalized velocities, predicted with constraints ignored, upon a certain multidimensional plane. Integrator (6) is modified as follows:

 

$$f_n = f(q_n) \tag{9a}$$

$$q_{n+1/2} = q_n + \dot{q}_{n+1/2}\,\frac{h}{2} \tag{9b}$$

 

 

 

 

 

 

 

$$\tilde{p}_{n+1/2} = p_{n-1/2} + f_n h + \left(w_{n-1/2} + w_{n+1/2}\right)\frac{h}{2} + f_{n-1/2}\,\frac{h}{2} \tag{9c}$$