Internal Coordinate Simulation |
123 |
alized accelerations if the system can be treated as a tree of articulated rigid bodies. They are directly applicable to molecular models considered in the previous section, and when these algorithms were first implemented [8,36] it appeared that, as in Newtonian dynamics, the cost of the time step could be made close to that of the evaluation of forces.
Yet another difficulty was encountered in the numerical integration of the dynamics equations. The general structure of the internal coordinate equations precludes the use of the familiar Verlet or leapfrog algorithms, and that is why, at first, general-purpose predictor-corrector and Runge–Kutta integrators were used [8,36,39,40]. The results, however, clearly indicated that the quality of the trajectories is much inferior to that of conventional MD, even though the possibility of a considerable increase in time step length was demonstrated on some examples [36,39,40]. This difficulty could not have been anticipated, because only recently was it realized that the exceptional stability of the integrators of the Störmer–Verlet–leapfrog group is bound to their symplectic property, which in turn is due to the fact that the Newtonian equations are essentially Hamiltonian. A very recent approach [42] seems to overcome this difficulty, and it has been demonstrated that ICMD is able to give a net gain in terms of computations per picosecond of dynamics.
The last problem to be mentioned concerns the physical factors that limit time steps in ICMD. All biopolymers have a hierarchical spatial organization, and this structural hierarchy is naturally mapped onto the spectrum of their motions. That is, the fast motions involve individual atoms and chemical groups, whereas the slow ones correspond to displacements of secondary structures, domains, etc. Every such movement considered separately can be characterized by a certain maximum time step, and in this sense one can say that there exists a hierarchy of fast motions and, accordingly, of step size limits. The lowest limit is determined by bond-stretching vibrations of hydrogens. It was always assumed that stretching of bonds between non-hydrogen atoms follows next [47], but in fact, until very recently, other fast motions did not attract much attention because only bond length constraints were technically possible anyway. With the development of ICMD, this issue acquired practical importance, and simultaneously it became possible to try different sorts of constraints and study the hierarchy of fast motions in greater detail. It was found [48] that, in proteins, this hierarchy does not always agree with common intuition. For instance, very fast collective vibrations in which hydrogen bonding plays a major role are rather common. On the other hand, nonbonded interatomic interactions impose ubiquitous anharmonic limitations starting from rather small step sizes. This type of limitation is most important for ICMD, but, unfortunately, it is also the most difficult to reveal and overcome.
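The link between the frequency of a motion and its maximum time step can be illustrated with the standard linear stability bound of the leapfrog scheme for a harmonic mode, h < 2/ω. The sketch below is only an order-of-magnitude illustration: the wavenumbers are assumed round values that do not come from the text, the function name is hypothetical, and the anharmonic limitations discussed above are not captured by this harmonic estimate.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def leapfrog_step_limit_fs(wavenumber_cm1):
    """Largest linearly stable leapfrog step (fs) for a harmonic mode: h < 2 / omega."""
    omega = 2.0 * math.pi * C_CM_PER_S * wavenumber_cm1  # angular frequency, rad/s
    return 2.0 / omega * 1.0e15


# Illustrative wavenumbers only (assumed round values, not taken from the text).
modes = {
    "X-H bond stretch": 3000.0,
    "heavy-atom bond stretch": 1000.0,
    "slow collective mode": 50.0,
}

for name, nu in modes.items():
    limit = leapfrog_step_limit_fs(nu)
    print(f"{name:>24s}: {nu:7.1f} cm^-1 -> h_max ~ {limit:7.2f} fs")
```

Step sizes used in practice are well below this bound, so the numbers indicate only the ordering of the limits, not recommended values.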
The last two problems have been realized only recently, and additional progress in these research directions may be expected in the near future. At present it is clear that with the standard geometry approximation all time step limitations below 10 fs can be overcome rather easily. This time step increase gives a substantial net increase in performance compared to conventional MD. The possibility of larger step sizes now looks problematic, although it has been demonstrated for small molecules. Larger steps should be possible, however, with constraints beyond the standard geometry approximation.
B. Dynamics of Molecular Trees
For the model of free point particles the Newtonian equations present by far the simplest and most efficient analytical formalism. In contrast, for chains of rigid bodies, there are several different, but equally applicable, analytical methods in mechanics, with their specific advantages and disadvantages. Recent studies in this field made clear, however, that the analytical difficulties connected with the size and chemical complexity of biological macromolecules can probably be overcome with any such formalism, and the main question is whether a given method is numerically efficient. Until now, only a few approaches have been able to treat large molecules, and they all take advantage of tree topologies in fast recurrent algorithms similar to the one outlined in Section III.B. The best performance has been achieved by combining the fast mass matrix inversion technique resulting from the Newton–Euler analysis of rigid-body dynamics [35] with equations of motion in canonical variables [42], which make possible symplectic numerical integration.
The symplectic property is a key feature of an integrator in the calculation of long-time trajectories of classical mechanics [49]. The term "symplectic" means that the discrete mapping corresponding to one time step must conserve the set of symplectic invariants of mechanics, one of which is the phase volume [50]. This condition looks very complex, but in practice it just means that one iteration of the integrator can be represented as a sequence of moves, each of which is an exact mechanical trajectory corresponding to some Hamiltonian. For instance, leapfrog or Verlet steps can be represented as several moves with only the kinetic or potential part of the full Hamiltonian used. However, such steps are possible only in the space of canonical variables, and that is why the generalized velocities and accelerations corresponding to internal coordinates are not appropriate as dynamic variables.
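The decomposition of a leapfrog step into exact moves can be made concrete with a one-dimensional toy problem. The following sketch is an illustration under stated assumptions, not code from the cited works: the anharmonic potential, the step size, and all function names are chosen here for the example. Each `kick` is the exact flow of the potential part of the Hamiltonian and each `drift` is the exact flow of the kinetic part; the finite-difference Jacobian determinant of the composed step checks conservation of phase volume.

```python
def kick(q, p, h, force):
    """Exact flow of H = U(q) for time h: momentum changes, position is fixed."""
    return q, p + h * force(q)


def drift(q, p, h, mass):
    """Exact flow of H = p^2 / (2 m) for time h: position changes, momentum is fixed."""
    return q + h * p / mass, p


def leapfrog(q, p, h, force, mass):
    """One step composed of a half kick, a full drift, and a half kick."""
    q, p = kick(q, p, 0.5 * h, force)
    q, p = drift(q, p, h, mass)
    q, p = kick(q, p, 0.5 * h, force)
    return q, p


def jacobian_det(step, q, p, eps=1e-6):
    """Central-difference determinant of d(q', p') / d(q, p) for a one-step map."""
    qa, pa = step(q + eps, p)
    qb, pb = step(q - eps, p)
    qc, pc = step(q, p + eps)
    qd, pd = step(q, p - eps)
    dq_dq, dp_dq = (qa - qb) / (2 * eps), (pa - pb) / (2 * eps)
    dq_dp, dp_dp = (qc - qd) / (2 * eps), (pc - pd) / (2 * eps)
    return dq_dq * dp_dp - dq_dp * dp_dq


# Toy anharmonic oscillator U(q) = q^2/2 + q^4/4 with unit mass (assumed example).
def force(q):
    return -q - q ** 3


def energy(q, p):
    return 0.5 * p * p + 0.5 * q * q + 0.25 * q ** 4


def step(q, p):
    return leapfrog(q, p, 0.1, force, 1.0)


det = jacobian_det(step, 0.7, -0.3)

q, p = 1.0, 0.0
e0 = energy(q, p)
max_drift = 0.0
for _ in range(20000):
    q, p = step(q, p)
    max_drift = max(max_drift, abs(energy(q, p) - e0))

print(f"det J = {det:.8f}, max |E - E0| = {max_drift:.2e}")
```

The determinant stays at unity to within finite-difference error, and the energy error remains bounded over many steps instead of growing, which is the practical signature of a symplectic map.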
By definition, the vector of conjugate momenta corresponding to the vector of generalized internal coordinates q is

$$
p = \frac{\partial L(q, \dot{q})}{\partial \dot{q}} = M(q)\,\dot{q} \tag{3}
$$

where L is the Lagrangian and M(q) is the mass matrix. It can be shown [42] that the conjugate momentum of a translational variable is given by the projection of the total Cartesian momentum of the articulated body onto the direction of translation. The conjugate momentum of a rotational variable is the projection of the angular momentum of the articulated body onto the rotation axis. Neither of these quantities is convenient as an independent variable, but one avoids the difficulties by using equations of motion of the form
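The two projection statements can be checked numerically on the smallest possible example. The sketch below assumes a single point mass in a plane described by one translational variable r and one rotational variable θ about the z axis, so that M(q) = diag(m, m r²); this toy system and all names in it are assumptions made for illustration and do not appear in the source.

```python
import math


def conjugate_momenta(m, r, theta, r_dot, theta_dot):
    """p = M(q) q_dot for q = (r, theta) with M = diag(m, m r^2)."""
    return m * r_dot, m * r * r * theta_dot


def cartesian_state(m, r, theta, r_dot, theta_dot):
    """Cartesian position and momentum of the same particle."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = r * c, r * s
    vx = r_dot * c - r * theta_dot * s
    vy = r_dot * s + r * theta_dot * c
    return (x, y), (m * vx, m * vy)


# Arbitrary test state (made-up numbers).
m, r, theta, r_dot, theta_dot = 2.0, 1.5, 0.4, 0.3, -0.7

p_r, p_theta = conjugate_momenta(m, r, theta, r_dot, theta_dot)
(x, y), (px, py) = cartesian_state(m, r, theta, r_dot, theta_dot)

# Projection of the Cartesian momentum onto the direction of translation e_r.
proj_on_translation = px * math.cos(theta) + py * math.sin(theta)
# Projection of the angular momentum about the origin onto the rotation axis e_z.
angular_about_axis = x * py - y * px

print(p_r, proj_on_translation)
print(p_theta, angular_about_axis)
```

Both pairs of printed numbers agree to rounding error, as expected from Eq. (3) for this simple mass matrix.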

$$
\dot{p} = -\frac{\partial U}{\partial q} + w(q, \dot{q}) \tag{4a}
$$

and

$$
\dot{q} = M^{-1} p \tag{4b}
$$

where $w(q, \dot{q})$ is an inertial term. For translational and rotational variables, respectively, its components read

$$
w_i = \dot{e}_i \cdot P_i \tag{5a}
$$

and

$$
w_i = \dot{e}_i \cdot Q_i - P_i \cdot (e_i \times \dot{r}_i) \tag{5b}
$$

where $P_i$ and $Q_i$ are the translational and angular momenta, respectively, of the articulated body $D_i$. Similarly to forces and torques in Eqs. (1) and (2), these can be rapidly computed by a recurrent summation. Similar summation techniques are employed in computing the product $M^{-1}p$ in Eq. (4b). The corresponding algorithms originate from robot mechanics and are based on a special factorization of the mass matrix. A very clear unified presentation of these methods is given by Jain [51].
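The idea of a recurrent summation over a tree can be sketched as a single pass from the leaves toward the root. The code below is a minimal illustration, not the algorithm of the cited references: it assumes that bodies are indexed so that every parent precedes its children, that each body's own momenta are already known, and that the angular momentum of an articulated body is taken about its joint. The function name and the toy data are hypothetical.

```python
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def articulated_momenta(parent, joint, P_own, L_own):
    """Accumulate momenta of each body together with all bodies distal to it.

    parent[i] -- index of the parent body (-1 for the root); parents precede children
    joint[i]  -- position of the joint that connects body i to its parent
    P_own[i]  -- linear momentum of body i alone
    L_own[i]  -- angular momentum of body i alone, taken about the origin
    Returns (P, Q): P[i] is the linear momentum of articulated body i and
    Q[i] is its angular momentum about joint[i].
    """
    n = len(parent)
    P, L = list(P_own), list(L_own)
    for i in range(n - 1, 0, -1):      # leaves first: one pass over the tree
        k = parent[i]
        P[k] = add(P[k], P[i])
        L[k] = add(L[k], L[i])
    # Shift the reference point of the angular momentum from the origin to the joint.
    Q = [sub(L[i], cross(joint[i], P[i])) for i in range(n)]
    return P, Q


# Toy tree of five point-like bodies (made-up numbers).
parent = [-1, 0, 1, 1, 0]
pos = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.5, 1.0, 0.0),
       (1.5, -1.0, 0.5), (-1.0, 0.5, 0.0)]
mom = [(0.1, 0.0, 0.0), (0.0, 0.2, 0.1), (0.3, -0.1, 0.0),
       (-0.2, 0.0, 0.4), (0.0, 0.1, -0.3)]
joint = pos  # assumption for the demo: each joint sits at the body's own position

P, Q = articulated_momenta(parent, joint, mom,
                           [cross(r, p) for r, p in zip(pos, mom)])
for i in range(len(parent)):
    print(i, P[i], Q[i])
```

The cost is linear in the number of bodies because every body contributes to its parent exactly once, which is why such sums do not dominate the cost of a time step.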
Equations (4) are called quasi-Hamiltonian because, even though they employ generalized velocities, they describe the motion in the space of canonical variables. Accordingly, numerical trajectories computed with appropriate integrators will conserve the symplectic structure. For example, an implicit leapfrog integrator can be expressed as
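
$$
\begin{aligned}
& f_n = f(q_n) && \text{(6a)} \\
\circ\;\; & q_{n+1/2} = q_n + \dot{q}_{n+1/2}\,\frac{h}{2} && \text{(6b)} \\
\circ\;\; & p_{n+1/2} = p_{n-1/2} + f_n h + \left(w_{n-1/2} + w_{n+1/2}\right)\frac{h}{2} && \text{(6c)} \\
\circ\;\; & \dot{q}_{n+1/2} = M^{-1}_{n+1/2}\, p_{n+1/2} && \text{(6d)} \\
& q_{n+1} = q_n + \dot{q}_{n+1/2}\, h && \text{(6e)}
\end{aligned}
$$
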
where the conventional notation is used for denoting on-step and half-step values. The lines marked by circles (∘) are iterated until the convergence of Eqs. (6b) and (6c). When the mass matrix does not depend on coordinates, $w(q, \dot{q})$ vanishes, and this integrator is reduced to the standard leapfrog. It is symplectic in the same sense as leapfrog, namely, the symplectic structure is conserved for pairs $(p_{n-1/2}, q_n)$ and $(p_{n+1/2}, q_n)$.
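The structure of integrator (6) can be tried on a toy system whose mass matrix depends on the coordinates. The sketch below assumes a point mass in a plane with q = (r, θ), M = diag(m, m r²), a radial harmonic potential, and therefore w = (m r θ̇², 0), consistent with Eqs. (5a) and (5b) for a fixed rotation axis. The system, the parameters, the fixed-point iteration with its tolerance, and the half-step start-up are all assumptions made for illustration, not details taken from the source.

```python
def mass_diag(q, m):
    """Diagonal of the mass matrix M(q) for q = (r, theta)."""
    return (m, m * q[0] * q[0])


def force(q, k, r0):
    """Generalized force f = -dU/dq for U = k (r - r0)^2 / 2."""
    return (-k * (q[0] - r0), 0.0)


def inertial(q, qdot, m):
    """Inertial term w(q, qdot): centrifugal term for r, zero for theta."""
    return (m * q[0] * qdot[1] ** 2, 0.0)


def energy(q, p, m, k, r0):
    M = mass_diag(q, m)
    return 0.5 * (p[0] ** 2 / M[0] + p[1] ** 2 / M[1]) + 0.5 * k * (q[0] - r0) ** 2


def implicit_leapfrog_step(q, p_prev, w_prev, h, m, k, r0, tol=1e-12, max_iter=100):
    f = force(q, k, r0)                                          # (6a)
    M = mass_diag(q, m)
    qdot = (p_prev[0] / M[0], p_prev[1] / M[1])                  # initial guess
    w_half = w_prev
    for _ in range(max_iter):
        q_half = (q[0] + 0.5 * h * qdot[0], q[1] + 0.5 * h * qdot[1])      # (6b)
        p_half = tuple(p_prev[i] + f[i] * h + 0.5 * h * (w_prev[i] + w_half[i])
                       for i in range(2))                                  # (6c)
        M = mass_diag(q_half, m)
        new_qdot = (p_half[0] / M[0], p_half[1] / M[1])                    # (6d)
        w_half = inertial(q_half, new_qdot, m)
        change = max(abs(new_qdot[0] - qdot[0]), abs(new_qdot[1] - qdot[1]))
        qdot = new_qdot
        if change < tol:
            break
    q_next = (q[0] + h * qdot[0], q[1] + h * qdot[1])                      # (6e)
    return q_next, q_half, p_half, w_half


# Made-up parameters and initial state.
m, k, r0, h = 1.0, 100.0, 1.0, 0.01
q = (1.1, 0.0)
qdot0 = (0.0, 1.0)
M0 = mass_diag(q, m)
p0 = (M0[0] * qdot0[0], M0[1] * qdot0[1])
e0 = energy(q, p0, m, k, r0)

# Assumed start-up: step the momentum back by half a step to obtain p_{-1/2}.
w_prev = inertial(q, qdot0, m)
f0 = force(q, k, r0)
p_prev = tuple(p0[i] - 0.5 * h * (f0[i] + w_prev[i]) for i in range(2))

max_dev = 0.0
for _ in range(5000):
    q, q_half, p_half, w_half = implicit_leapfrog_step(q, p_prev, w_prev, h, m, k, r0)
    p_prev, w_prev = p_half, w_half
    max_dev = max(max_dev, abs(energy(q_half, p_half, m, k, r0) - e0))

p_theta_drift = abs(p_prev[1] - p0[1])
print(f"max |E - E0| = {max_dev:.2e}, drift of p_theta = {p_theta_drift:.2e}")
```

In this example the momentum conjugate to θ receives neither a force nor an inertial contribution, so it is conserved, while the energy evaluated at half steps fluctuates within a small bound. Setting the mass matrix to a constant makes `inertial` vanish and the loop converge immediately, which recovers the ordinary leapfrog.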
C. Simulation of Flexible Rings
Treatment of flexible rings is a special and inherently difficult task for algorithms that exploit the specific advantages of tree topologies. If such a topology is imposed on a ring, the ring will be broken once all internal coordinates start to change independently. Several possible ways out of this can be considered. The simplest consists in applying harmonic restraints to the broken ring bonds. In this case, in dynamics, the time step may be limited by the frequencies introduced by these restraints. The rigorous but complex way is to treat some of the internal coordinates as dependent variables and exclude them from the equations of motion [52]. However, this involves mass matrix transformations that would be incompatible with the fast inversion algorithms. The third way is to impose ring closure constraints explicitly, similarly to the method of constraints in Cartesian MD. The last possibility has been recently checked, and it gives an acceptable solution [53,54]. This difficulty is most critical for simulations of nucleic acids, in which the bases are connected with the sugar–phosphate backbone via five-membered rings, and we now consider this specific example.
Figure 3 shows how the tree is constructed for a sugar ring in a nucleic acid. Ring atoms are numbered 1, . . . , 5 corresponding to C4′, C3′, C2′, C1′, O4′. The base is placed at the 5′ end; the main chain goes along the backbone with branching for bases at C3′ atoms. The ring conformation is determined by five valence and dihedral angles $q_1, \ldots, q_5$ indicated by arrows. The bond C4′–O4′ shown by the broken line is excluded from the tree and replaced by the distance constraint

$$
C = |r_5 - r_1| - l_{15} = 0. \tag{7}
$$

Figure 3 The underlying tree of a furanose ring in nucleic acids. Atoms are numbered 1, . . . , 5 corresponding to the natural tree ordering. All bond lengths are fixed. Arrows illustrate five internal coordinates that determine the ring conformation.
Here and below, $r_i$, $l_{ij}$, and $e_{ij}$, $i, j = 1, \ldots, 5$, denote atomic position vectors, atom–atom distances, and the corresponding unit vectors, respectively. In order to construct a correctly closed conformation, variables $q_1, \ldots, q_4$ are considered independent, and the last valence angle $q_5$ is computed from Eq. (7) as follows. Variables $q_1, \ldots, q_4$ determine the orientation of the plane of $q_5$ specified by vector $e_{34}$ and an in-plane unit vector $e_{345}$ orthogonal to it. In the basis of these two vectors, condition (7) results in

$$
x\,(e_{14} \cdot e_{34}) + y\,(e_{14} \cdot e_{345}) = \frac{l_{15}^2 - l_{45}^2 - l_{14}^2}{2\, l_{14}\, l_{45}} \tag{8a}
$$

$$
x^2 + y^2 = 1 \tag{8b}
$$

where x and y are the local coordinates of vector $e_{45}$. This system reduces to a quadratic equation and gives a single solution with x > 0, which solves the problem.
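The closure step can be sketched as a direct solution of the linear condition (8a) together with the normalization (8b). The code below is an illustration under assumptions that the source does not spell out: unit vectors $e_{ij}$ are taken to point from atom i to atom j, the physically relevant root is taken to be the one with x > 0, and the test geometry is a planar regular pentagon rather than a real furanose ring. The function name `close_ring` is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def unit(a):
    n = norm(a)
    return tuple(x / n for x in a)


def close_ring(r1, r3, r4, e345, l45, l15):
    """Place atom 5 so that |r5 - r4| = l45 and |r5 - r1| = l15.

    e345 is the in-plane unit vector orthogonal to e34; e45 = x e34 + y e345.
    Assumed conventions: e_ij points from atom i to atom j, and the root
    with x > 0 is selected.
    """
    e34 = unit(sub(r4, r3))
    d14 = sub(r4, r1)
    l14 = norm(d14)
    e14 = unit(d14)
    a, b = dot(e14, e34), dot(e14, e345)
    c = (l15 ** 2 - l45 ** 2 - l14 ** 2) / (2.0 * l14 * l45)    # right side of (8a)
    n2 = a * a + b * b
    disc = n2 - c * c
    if disc < 0.0:
        raise ValueError("ring cannot be closed with these bond lengths")
    s = math.sqrt(disc)
    # Both intersections of the line a x + b y = c with the unit circle (8b).
    roots = [((a * c - sign * b * s) / n2, (b * c + sign * a * s) / n2)
             for sign in (1.0, -1.0)]
    positive = [xy for xy in roots if xy[0] > 0.0]
    if len(positive) != 1:
        raise ValueError("expected exactly one root with x > 0")
    x, y = positive[0]
    e45 = tuple(x * u + y * v for u, v in zip(e34, e345))
    return tuple(p + l45 * e for p, e in zip(r4, e45))


# Test geometry: planar regular pentagon with unit circumradius (not a real sugar ring).
ring = [(math.cos(2 * math.pi * i / 5), math.sin(2 * math.pi * i / 5), 0.0)
        for i in range(5)]
r1, r2, r3, r4, r5_ref = ring
side = norm(sub(r2, r1))
e34 = unit(sub(r4, r3))
e345 = (-e34[1], e34[0], 0.0)   # in-plane unit vector orthogonal to e34

r5 = close_ring(r1, r3, r4, e345, side, side)
print(r5, r5_ref)
```

For the pentagon the selected root reproduces the fifth vertex, while the rejected root has x < 0 and would fold the bond back toward atom 3.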
When equations of motion are integrated, all five generalized coordinates shown in Figure 3 are considered independent. The constraint condition of Eq. (7) means, however, that there is an additional reaction force applied between atoms C4′ and O4′. Such forces in all sugar rings result in a generalized reaction force f that has to be added to other forces in the system. Reactions depend upon both coordinates and velocities, but it appears, fortunately, that their explicit calculation is unnecessary. It is sufficient that the components of velocities along constrained bonds be canceled, which is achieved by projecting the vector of generalized velocities, predicted with constraints ignored, upon a certain multidimensional plane. Integrator (6) is modified as follows:
|

$$
\begin{aligned}
& f_n = f(q_n) && \text{(9a)} \\
\circ\;\; & q_{n+1/2} = q_n + \dot{q}_{n+1/2}\,\frac{h}{2} && \text{(9b)} \\
\circ\;\; & \tilde{p}_{n+1/2} = p_{n-1/2} + f_n h + \left(w_{n-1/2} + w_{n+1/2}\right)\frac{h}{2} + f_{n-1/2}\,\frac{h}{2} && \text{(9c)}
\end{aligned}
$$