- •Foreword
- •Preface
- •Contents
- •Introduction
- •Oren M. Becker
- •Alexander D. MacKerell, Jr.
- •Masakatsu Watanabe*
- •III. SCOPE OF THE BOOK
- •IV. TOWARD A NEW ERA
- •REFERENCES
- •Atomistic Models and Force Fields
- •Alexander D. MacKerell, Jr.
- •II. POTENTIAL ENERGY FUNCTIONS
- •D. Alternatives to the Potential Energy Function
- •III. EMPIRICAL FORCE FIELDS
- •A. From Potential Energy Functions to Force Fields
- •B. Overview of Available Force Fields
- •C. Free Energy Force Fields
- •D. Applicability of Force Fields
- •IV. DEVELOPMENT OF EMPIRICAL FORCE FIELDS
- •B. Optimization Procedures Used in Empirical Force Fields
- •D. Use of Quantum Mechanical Results as Target Data
- •VI. CONCLUSION
- •REFERENCES
- •Dynamics Methods
- •Oren M. Becker
- •Masakatsu Watanabe*
- •II. TYPES OF MOTIONS
- •IV. NEWTONIAN MOLECULAR DYNAMICS
- •A. Newton’s Equation of Motion
- •C. Molecular Dynamics: Computational Algorithms
- •A. Assigning Initial Values
- •B. Selecting the Integration Time Step
- •C. Stability of Integration
- •VI. ANALYSIS OF DYNAMIC TRAJECTORIES
- •B. Averages and Fluctuations
- •C. Correlation Functions
- •D. Potential of Mean Force
- •VII. OTHER MD SIMULATION APPROACHES
- •A. Stochastic Dynamics
- •B. Brownian Dynamics
- •VIII. ADVANCED SIMULATION TECHNIQUES
- •A. Constrained Dynamics
- •C. Other Approaches and Future Direction
- •REFERENCES
- •Conformational Analysis
- •Oren M. Becker
- •II. CONFORMATION SAMPLING
- •A. High Temperature Molecular Dynamics
- •B. Monte Carlo Simulations
- •C. Genetic Algorithms
- •D. Other Search Methods
- •III. CONFORMATION OPTIMIZATION
- •A. Minimization
- •B. Simulated Annealing
- •IV. CONFORMATIONAL ANALYSIS
- •A. Similarity Measures
- •B. Cluster Analysis
- •C. Principal Component Analysis
- •REFERENCES
- •Thomas A. Darden
- •II. CONTINUUM BOUNDARY CONDITIONS
- •III. FINITE BOUNDARY CONDITIONS
- •IV. PERIODIC BOUNDARY CONDITIONS
- •REFERENCES
- •Internal Coordinate Simulation Method
- •Alexey K. Mazur
- •II. INTERNAL AND CARTESIAN COORDINATES
- •III. PRINCIPLES OF MODELING WITH INTERNAL COORDINATES
- •B. Energy Gradients
- •IV. INTERNAL COORDINATE MOLECULAR DYNAMICS
- •A. Main Problems and Historical Perspective
- •B. Dynamics of Molecular Trees
- •C. Simulation of Flexible Rings
- •A. Time Step Limitations
- •B. Standard Geometry Versus Unconstrained Simulations
- •VI. CONCLUDING REMARKS
- •REFERENCES
- •Implicit Solvent Models
- •II. BASIC FORMULATION OF IMPLICIT SOLVENT
- •A. The Potential of Mean Force
- •III. DECOMPOSITION OF THE FREE ENERGY
- •A. Nonpolar Free Energy Contribution
- •B. Electrostatic Free Energy Contribution
- •IV. CLASSICAL CONTINUUM ELECTROSTATICS
- •A. The Poisson Equation for Macroscopic Media
- •B. Electrostatic Forces and Analytic Gradients
- •C. Treatment of Ionic Strength
- •A. Statistical Mechanical Integral Equations
- •VI. SUMMARY
- •REFERENCES
- •Steven Hayward
- •II. NORMAL MODE ANALYSIS IN CARTESIAN COORDINATE SPACE
- •B. Normal Mode Analysis in Dihedral Angle Space
- •C. Approximate Methods
- •IV. NORMAL MODE REFINEMENT
- •C. Validity of the Concept of a Normal Mode Important Subspace
- •A. The Solvent Effect
- •B. Anharmonicity and Normal Mode Analysis
- •VI. CONCLUSIONS
- •ACKNOWLEDGMENT
- •REFERENCES
- •Free Energy Calculations
- •Thomas Simonson
- •II. GENERAL BACKGROUND
- •A. Thermodynamic Cycles for Solvation and Binding
- •B. Thermodynamic Perturbation Theory
- •D. Other Thermodynamic Functions
- •E. Free Energy Component Analysis
- •III. STANDARD BINDING FREE ENERGIES
- •IV. CONFORMATIONAL FREE ENERGIES
- •A. Conformational Restraints or Umbrella Sampling
- •B. Weighted Histogram Analysis Method
- •C. Conformational Constraints
- •A. Dielectric Reaction Field Approaches
- •B. Lattice Summation Methods
- •VI. IMPROVING SAMPLING
- •A. Multisubstate Approaches
- •B. Umbrella Sampling
- •C. Moving Along
- •VII. PERSPECTIVES
- •REFERENCES
- •John E. Straub
- •B. Phenomenological Rate Equations
- •II. TRANSITION STATE THEORY
- •A. Building the TST Rate Constant
- •B. Some Details
- •C. Computing the TST Rate Constant
- •III. CORRECTIONS TO TRANSITION STATE THEORY
- •A. Computing Using the Reactive Flux Method
- •B. How Dynamic Recrossings Lower the Rate Constant
- •IV. FINDING GOOD REACTION COORDINATES
- •A. Variational Methods for Computing Reaction Paths
- •B. Choice of a Differential Cost Function
- •C. Diffusional Paths
- •VI. HOW TO CONSTRUCT A REACTION PATH
- •A. The Use of Constraints and Restraints
- •B. Variationally Optimizing the Cost Function
- •VII. FOCAL METHODS FOR REFINING TRANSITION STATES
- •VIII. HEURISTIC METHODS
- •IX. SUMMARY
- •ACKNOWLEDGMENT
- •REFERENCES
- •Paul D. Lyne
- •Owen A. Walsh
- •II. BACKGROUND
- •III. APPLICATIONS
- •A. Triosephosphate Isomerase
- •B. Bovine Protein Tyrosine Phosphate
- •C. Citrate Synthase
- •IV. CONCLUSIONS
- •ACKNOWLEDGMENT
- •REFERENCES
- •Jeremy C. Smith
- •III. SCATTERING BY CRYSTALS
- •IV. NEUTRON SCATTERING
- •A. Coherent Inelastic Neutron Scattering
- •B. Incoherent Neutron Scattering
- •REFERENCES
- •Michael Nilges
- •II. EXPERIMENTAL DATA
- •A. Deriving Conformational Restraints from NMR Data
- •B. Distance Restraints
- •C. The Hybrid Energy Approach
- •III. MINIMIZATION PROCEDURES
- •A. Metric Matrix Distance Geometry
- •B. Molecular Dynamics Simulated Annealing
- •C. Folding Random Structures by Simulated Annealing
- •IV. AUTOMATED INTERPRETATION OF NOE SPECTRA
- •B. Automated Assignment of Ambiguities in the NOE Data
- •C. Iterative Explicit NOE Assignment
- •D. Symmetrical Oligomers
- •VI. INFLUENCE OF INTERNAL DYNAMICS ON THE EXPERIMENTAL DATA
- •VII. STRUCTURE QUALITY AND ENERGY PARAMETERS
- •VIII. RECENT APPLICATIONS
- •REFERENCES
- •II. STEPS IN COMPARATIVE MODELING
- •C. Model Building
- •D. Loop Modeling
- •E. Side Chain Modeling
- •III. AB INITIO PROTEIN STRUCTURE MODELING METHODS
- •IV. ERRORS IN COMPARATIVE MODELS
- •VI. APPLICATIONS OF COMPARATIVE MODELING
- •VII. COMPARATIVE MODELING IN STRUCTURAL GENOMICS
- •VIII. CONCLUSION
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Roland L. Dunbrack, Jr.
- •II. BAYESIAN STATISTICS
- •A. Bayesian Probability Theory
- •B. Bayesian Parameter Estimation
- •C. Frequentist Probability Theory
- •D. Bayesian Methods Are Superior to Frequentist Methods
- •F. Simulation via Markov Chain Monte Carlo Methods
- •III. APPLICATIONS IN MOLECULAR BIOLOGY
- •B. Bayesian Sequence Alignment
- •IV. APPLICATIONS IN STRUCTURAL BIOLOGY
- •A. Secondary Structure and Surface Accessibility
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Computer Aided Drug Design
- •Alexander Tropsha and Weifan Zheng
- •IV. SUMMARY AND CONCLUSIONS
- •REFERENCES
- •Oren M. Becker
- •II. SIMPLE MODELS
- •III. LATTICE MODELS
- •B. Mapping Atomistic Energy Landscapes
- •C. Mapping Atomistic Free Energy Landscapes
- •VI. SUMMARY
- •REFERENCES
- •Toshiko Ichiye
- •II. ELECTRON TRANSFER PROPERTIES
- •B. Potential Energy Parameters
- •IV. REDOX POTENTIALS
- •A. Calculation of the Energy Change of the Redox Site
- •B. Calculation of the Energy Changes of the Protein
- •B. Calculation of Differences in the Energy Change of the Protein
- •VI. ELECTRON TRANSFER RATES
- •A. Theory
- •B. Application
- •REFERENCES
- •Fumio Hirata and Hirofumi Sato
- •Shigeki Kato
- •A. Continuum Model
- •B. Simulations
- •C. Reference Interaction Site Model
- •A. Molecular Polarization in Neat Water*
- •B. Autoionization of Water*
- •C. Solvatochromism*
- •F. Tautomerization in Formamide*
- •IV. SUMMARY AND PROSPECTS
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Nucleic Acid Simulations
- •Alexander D. MacKerell, Jr.
- •Lennart Nilsson
- •D. DNA Phase Transitions
- •III. METHODOLOGICAL CONSIDERATIONS
- •A. Atomistic Models
- •B. Alternative Models
- •IV. PRACTICAL CONSIDERATIONS
- •A. Starting Structures
- •C. Production MD Simulation
- •D. Convergence of MD Simulations
- •WEB SITES OF INTEREST
- •REFERENCES
- •Membrane Simulations
- •Douglas J. Tobias
- •II. MOLECULAR DYNAMICS SIMULATIONS OF MEMBRANES
- •B. Force Fields
- •C. Ensembles
- •D. Time Scales
- •III. LIPID BILAYER STRUCTURE
- •A. Overall Bilayer Structure
- •C. Solvation of the Lipid Polar Groups
- •IV. MOLECULAR DYNAMICS IN MEMBRANES
- •A. Overview of Dynamic Processes in Membranes
- •B. Qualitative Picture on the 100 ps Time Scale
- •C. Incoherent Neutron Scattering Measurements of Lipid Dynamics
- •F. Hydrocarbon Chain Dynamics
- •ACKNOWLEDGMENTS
- •REFERENCES
- •Appendix: Useful Internet Resources
- •B. Molecular Modeling and Simulation Packages
- •Index
the most widely used energy functions are those included with the CHARMM [8,9], AMBER [10], and GROMOS [11] programs. Two extensions beyond the terms in Eqs. (2) and (3) are often included in biomolecular force fields. A harmonic term for improper dihedrals is often used to treat out-of-plane distortions, such as those that occur with aromatic hydrogens (i.e., Wilson wags). Historically, the improper term was also used to maintain the proper chirality in extended-atom models of proteins (e.g., without the Hα hydrogen, the chirality of amino acids is undefined). Some force fields also contain a Urey–Bradley term that treats 1,3 atoms (the two terminal atoms in an angle; see Fig. 1) with a harmonic bond-stretching term in order to model vibrational spectra more accurately.
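The two extensions just described can be written down compactly. The following Python sketch is illustrative only: it assumes the harmonic form E = K(x − x0)^2 without a factor of 1/2 (a convention that differs between force fields), and the function names and force constants are hypothetical rather than taken from any published parameter set.

```python
import math


def improper_energy(k_imp, phi, phi0=0.0):
    """Harmonic improper dihedral term, E = k_imp * (phi - phi0)**2.

    phi and phi0 are in radians; phi0 = 0 corresponds to a planar center.
    The absence of a factor of 1/2 is an assumed convention.
    """
    return k_imp * (phi - phi0) ** 2


def distance_13(r12, r23, theta):
    """Distance between the two terminal atoms of an angle (law of cosines).

    r12 and r23 are the two bond lengths; theta is the angle in radians.
    """
    return math.sqrt(r12 ** 2 + r23 ** 2 - 2.0 * r12 * r23 * math.cos(theta))


def urey_bradley_energy(k_ub, s, s0):
    """Harmonic Urey-Bradley term on the 1,3 distance s, E = k_ub * (s - s0)**2."""
    return k_ub * (s - s0) ** 2


# Hypothetical example: an angle opened by 5 degrees from its reference value.
s0 = distance_13(1.0, 1.0, math.radians(109.5))
s = distance_13(1.0, 1.0, math.radians(114.5))
e_ub = urey_bradley_energy(30.0, s, s0)  # k_ub = 30.0 is an illustrative value
```

Because the 1,3 distance depends on both bond lengths and on the angle, the Urey–Bradley term couples bond stretching to angle bending, which is why it helps in fitting vibrational spectra.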
Beyond the extensions mentioned in the previous paragraph, a variety of terms are included in force fields used for the modeling of small molecules that can also be applied to biological systems. These types of force fields are often referred to as Class II force fields, to distinguish them from the Class I force fields such as AMBER, CHARMM, and GROMOS discussed above. For example, the bond term in Eq. (2) can be expanded to include cubic and quartic terms, which will more accurately treat the anharmonicity associated with bond stretching. Another extension is the addition of cross terms that express the influence that stretching of a bond has on the stretching of an adjacent bond. Cross terms may also be used between the different types of terms, such as bond–angle or angle–dihedral terms, allowing for the influence of bond length on angle bending or of angle bending on dihedral rotations, respectively, to be more accurately modeled [12]. Extensions may also be made to the interaction portion of the force field [Eq. (3)]. These may include terms for electronic polarizability (see below) or the use of 1/r^4 terms to treat ion–dipole interactions associated with interactions between, for example, ions and the peptide backbone [13]. In all cases the extension of a potential energy function should, in principle, allow for the system of interest to be modeled with more accuracy. The gains associated with the additional terms, however, are often significant only in specific cases (e.g., the use of a 1/r^4 term in the study of specific cation–peptide interactions), making their inclusion for the majority of calculations on biochemical systems unwarranted, especially when those terms increase the demand on computational resources.
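As a rough illustration of the Class II extensions described above, the sketch below compares a harmonic bond term with a quartic expansion and adds a bond–bond cross term. The functional forms are generic polynomial expansions, and all names and constants are hypothetical; they are not the parameters of any specific Class II force field.

```python
def bond_harmonic(k2, b, b0):
    """Class I style bond term, E = k2 * (b - b0)**2."""
    return k2 * (b - b0) ** 2


def bond_quartic(k2, k3, k4, b, b0):
    """Bond term with cubic and quartic corrections (Class II style).

    A negative k3 makes stretching cheaper than compression, which is the
    anharmonicity a purely harmonic term cannot represent.
    """
    d = b - b0
    return k2 * d ** 2 + k3 * d ** 3 + k4 * d ** 4


def bond_bond_cross(k_bb, b1, b1_0, b2, b2_0):
    """Cross term coupling the stretching of two adjacent bonds."""
    return k_bb * (b1 - b1_0) * (b2 - b2_0)


# Illustrative constants only (energy per length**n units, arbitrary).
K2, K3, K4, B0 = 300.0, -600.0, 700.0, 1.5
e_stretch = bond_quartic(K2, K3, K4, B0 + 0.1, B0)   # bond lengthened
e_compress = bond_quartic(K2, K3, K4, B0 - 0.1, B0)  # bond shortened
```

With these illustrative values the lengthened bond costs less energy than the equally shortened one, whereas the harmonic term assigns both the same energy.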
D. Alternatives to the Potential Energy Function
The form of the potential energy function in Eqs. (1)–(3) was developed based on a combination of simplicity with required accuracy. However, a number of other forms can be used to treat the different terms in Eqs. (2) and (3). One alternative form used to treat the bond is referred to as the Morse potential. This term allows for bond-breaking events to occur and includes anharmonicity in the bond-stretching surface near the equilibrium value. The ability to break bonds, however, leads to forces close to zero at large bond distances, which may present a problem when crude modeling techniques are used to generate structures [14]. A number of variations in the form of the equation to treat the VDW interactions have been applied. The 1/r^12 term used for modeling exchange repulsion overestimates the distance dependence of the repulsive wall, leading to the use of a 1/r^9 term [15] or exponential repulsive terms [16]. A more recent variation is the buffered 14-7 form, which was selected because of its ability to reproduce interactions between rare gas atoms [17]. Concerning electrostatic interactions, the majority of potential energy functions employ the standard Coulombic term shown in Eq. (3), with one variation being the use of bond dipoles rather than atom-centered partial atomic charges [16]. As with the extensions to the force fields discussed above, the alternative forms discussed in this paragraph generally do not yield significant gains in accuracy for biomolecular simulations performed in condensed phase environments at room temperature, although for specific situations they may.
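The alternative forms mentioned in this section can be compared numerically. The sketch below is a minimal illustration, assuming the standard textbook Morse expression, 12-6 and 9-6 forms written in terms of a well depth and a minimum-energy distance, and the buffered 14-7 expression with buffering constants 0.07 and 0.12; the function names and all numerical inputs are hypothetical.

```python
import math


def morse(d_e, a, r, r0):
    """Morse bond energy: zero at r0, approaching d_e as the bond dissociates."""
    return d_e * (1.0 - math.exp(-a * (r - r0))) ** 2


def morse_force(d_e, a, r, r0):
    """Force -dE/dr for the Morse potential; it decays to zero at large r."""
    x = math.exp(-a * (r - r0))
    return -2.0 * d_e * a * (1.0 - x) * x


def lj_12_6(eps, rmin, r):
    """12-6 form with well depth eps at r = rmin."""
    q = (rmin / r) ** 6
    return eps * (q * q - 2.0 * q)


def lj_9_6(eps, rmin, r):
    """9-6 form with well depth eps at r = rmin; softer repulsive wall."""
    return eps * (2.0 * (rmin / r) ** 9 - 3.0 * (rmin / r) ** 6)


def buffered_14_7(eps, rstar, r):
    """Buffered 14-7 form; the energy equals -eps at r = rstar."""
    rho = r / rstar
    return eps * (1.07 / (rho + 0.07)) ** 7 * (1.12 / (rho ** 7 + 0.12) - 2.0)


# Repulsive wall at 80% of the minimum-energy distance (reduced units).
wall_12 = lj_12_6(1.0, 1.0, 0.8)
wall_9 = lj_9_6(1.0, 1.0, 0.8)
wall_14_7 = buffered_14_7(1.0, 1.0, 0.8)
```

Evaluating `morse_force` at a large separation shows the near-zero restoring force noted above: a badly stretched bond in a crude starting structure experiences almost no pull back toward its equilibrium length.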
III. EMPIRICAL FORCE FIELDS
A. From Potential Energy Functions to Force Fields
Equations (1)–(3) in combination are a potential energy function that is representative of those commonly used in biomolecular simulations. As discussed above, the form of this equation is adequate to treat the physical interactions that occur in biological systems. The accuracy of that treatment, however, is dictated by the parameters used in the potential energy function, and it is the combination of the potential energy function and the parameters that comprises a force field. In the remainder of this chapter we describe various aspects of force fields including their derivation (i.e., optimization of the parameters), those widely available, and their applicability.
B. Overview of Available Force Fields
Currently there are a variety of force fields that may, in principle, be used for computational studies of biological systems. Of these force fields, however, only a subset have been designed specifically for biomolecular simulations. As discussed above, the majority of biomolecular simulations are performed with the CHARMM, AMBER, and GROMOS packages. Recent publication of new CHARMM [18–20] and AMBER [21] force fields allows for these to be discussed in detail. Although the forms of the potential energy functions in CHARMM and AMBER are similar, with CHARMM including the additional improper and Urey–Bradley terms (see above), significant philosophical and parameter optimization differences exist (see below). The latest versions of both force fields are all-atom representations, although extended-atom representations are available [22,23].
To date, a number of simulation studies have been performed on nucleic acids and proteins using both AMBER and CHARMM. A direct comparison of crystal simulations of bovine pancreatic trypsin inhibitor shows that the two force fields behave similarly, although differences in solvent–protein interactions are evident [24]. Side-by-side tests have also been performed on a DNA duplex, showing both force fields to be in reasonable agreement with experiment, although significant, and different, problems were evident in both cases [25]. It should be noted that as of the writing of this chapter revised versions of both the AMBER and CHARMM nucleic acid force fields had become available. Several simulations of membranes have been performed with the CHARMM force field for both saturated [26] and unsaturated [27] lipids. The availability of both protein and nucleic acid parameters in AMBER and CHARMM allows for protein–nucleic acid complexes to be studied with both force fields (see Chapter 20), whereas protein–lipid (see Chapter 21) and DNA–lipid simulations can also be performed with CHARMM.
A number of more general force fields for the study of small molecules are available that can be extended to biological molecules. These force fields have been designed with the goal of being able to treat a wide variety of molecules, based on the ability to transfer parameters between chemical systems and the use of additional terms (e.g., cross terms) in their potential energy functions. Typically, these force fields have been optimized to treat small molecules in the gas phase, although exceptions do exist. Such force fields may also be used for biological simulations; however, the lack of emphasis on properly treating biological systems generally makes them inferior to those discussed in the previous paragraphs. The optimized potential for liquid simulations (OPLS) force field was initially developed for liquid and hydration simulations on a variety of organic compounds [28,29]. This force field has been extended to proteins [30], nucleic acid bases [31], and carbohydrates [32], although it has not come into widespread use. Some of the most widely used force fields for organic molecules are MM3 and its predecessors [33]. An MM3 force field for proteins has been reported [34]; however, it too has not been widely applied to date.
The consistent force field (CFF) series of force fields has also been developed to treat a wide selection of small molecules and includes parameters for peptides. However, those parameters were developed primarily on the basis of optimization of the internal terms [35]. A recent extension of CFF, COMPASS, has been published that concentrates on producing a force field suitable for condensed phase simulations [36], although no condensed phase simulations of biological molecules have been reported. Another force field to which significant effort was devoted to allow for its application to a wide variety of compounds is the Merck Molecular Force Field (MMFF) [37]. During the development of MMFF, significant effort was placed on optimizing the internal parameters to yield good geometries and energetics of small compounds as well as on the accurate treatment of nonbonded interactions. This force field has been shown to be well behaved in condensed phase simulations of proteins; however, the results appear to be inferior to those of the AMBER and CHARMM models. Two other force fields of note are UFF [38] and DREIDING [14]. These force fields were developed to treat a much wider variety of molecules, including inorganic compounds, than the force fields mentioned previously, although their application to biological systems has not been widespread.
It should also be noted that a force field for a wide variety of small molecules, CHARMm (note the small “m,” indicating the commercial version of the program and parameters), is available [39] and has been applied to protein simulations with limited success. Efforts are currently under way to extend the CHARMm small molecule force field to make the nonbonded parameters consistent with those of the CHARMM force fields, thereby allowing for a variety of small molecules to be included in computational studies of biological systems.
Although the list of force fields discussed in this subsection is by no means complete, it does emphasize the wide variety of force fields that are available for different types of chemical systems as well as differences in their development and optimization.
C. Free Energy Force Fields
All of the force fields discussed in the preceding sections are based on potential energy functions. To obtain free energy information when using these force fields, statistical mechanical ensembles must be obtained via various simulation techniques. An alternative approach is to use a force field that has been optimized to reproduce free energies directly rather than potential energies. For example, a given set of dihedral parameters in a potential energy function may be adjusted to reproduce a QM-determined torsional potential energy surface for a selected model compound. In the case of a free energy force field, the dihedral parameters would be optimized to reproduce the experimentally observed probability distribution of that dihedral in solution. Because the experimentally determined probability distribution corresponds to a free energy surface, a dihedral energy surface calculated using this force field would correspond to the free energy surface in solution. This allows for calculations to be performed in vacuum while yielding results that, in principle, correspond to the free energy in solution.
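The correspondence between a probability distribution and a free energy surface follows from the Boltzmann relation F(φ) = −k_B T ln P(φ), up to an additive constant. The sketch below is one simple way to apply this relation to a set of observed dihedral angles; the function name, bin count, and temperature are assumptions made for illustration and do not describe how any particular free energy force field was parameterized.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def free_energy_from_samples(phi_deg, temperature=298.15, n_bins=36):
    """Estimate a relative free energy profile from sampled dihedral angles.

    phi_deg: dihedral angles in degrees, in the range [-180, 180].
    Returns bin centers (degrees) and free energies (kcal/mol) shifted so
    that the most populated bin is zero. Empty bins are returned as +inf.
    """
    counts, edges = np.histogram(phi_deg, bins=n_bins, range=(-180.0, 180.0))
    centers = 0.5 * (edges[:-1] + edges[1:])
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):
        free = -K_B * temperature * np.log(prob)
    free -= free[np.isfinite(free)].min()
    return centers, free


# Hypothetical data: one conformer observed twice as often as another.
samples = [-175.0] * 200 + [5.0] * 100
centers, free = free_energy_from_samples(samples)
```

In this toy data set the less populated conformer lies k_B T ln 2 (roughly 0.4 kcal/mol at room temperature) above the more populated one, which is the quantity a free energy force field would be fitted to reproduce.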
The best known of the free energy force fields is the Empirical Conformational Energy Program for Peptides (ECEPP) [40]. ECEPP parameters (both internal and external) were derived primarily on the basis of crystal structures of a wide variety of peptides. Such an approach yields significant savings in computational costs when sampling large numbers of conformations; however, microscopic details of the role of solvent on the biological molecules are lost. This type of approach is useful for the study of protein folding [41,42] as well as protein–protein or protein–ligand interactions [43].
An alternative to obtaining free energy information is the use of potential energy functions combined with methods to calculate the contribution of the free energy of solvation. Examples include methods based on the solvent accessibilities of atoms [44,45], continuum electrostatics–based models [46–49], and the generalized Born equation [50,51]. With some of these approaches the availability of analytical derivatives allows for their use in MD simulations; however, they are generally most useful for determining solvation contributions associated with previously generated conformations. See Chapter 7 for a detailed overview of these approaches.
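To make the generalized Born idea concrete, the sketch below evaluates a commonly quoted form of the generalized Born expression for the electrostatic solvation free energy of a set of point charges. It assumes that effective Born radii are already available as input (estimating them is the difficult part in practice), and the function name, dielectric constants, and unit-conversion constant are illustrative assumptions rather than the specific formulations of the cited references.

```python
import math

COULOMB = 332.0637  # converts e^2/Angstrom to kcal/mol (approximate)


def gb_solvation_energy(charges, born_radii, coords, eps_in=1.0, eps_out=78.5):
    """Generalized Born estimate of the electrostatic solvation free energy.

    charges: partial charges in units of e.
    born_radii: effective Born radii in Angstrom (assumed given).
    coords: list of (x, y, z) tuples in Angstrom.
    The double sum includes i == j, which gives the Born self-energy terms.
    """
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            aa = born_radii[i] * born_radii[j]
            f_gb = math.sqrt(r2 + aa * math.exp(-r2 / (4.0 * aa)))
            total += charges[i] * charges[j] / f_gb
    return prefactor * total


# Hypothetical single ion of charge +1 e and Born radius 2 Angstrom.
dg_ion = gb_solvation_energy([1.0], [2.0], [(0.0, 0.0, 0.0)])
```

For a single ion the expression reduces to the Born equation, which provides a convenient sanity check. Because the function is an analytic expression in the coordinates, its derivatives can be obtained in closed form, which is the property that makes such models usable in MD simulations.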
D. Applicability of Force Fields
Clearly, the wide variety of force fields requires the user to carefully consider those that are available and choose the one most appropriate for the particular application. Most important in this selection process is a knowledge of the information to be obtained from the computational study. If atomic details of specific interactions are required, then all-atom models with the explicit inclusion of solvent will be necessary. For example, suppose experimental results indicate that a single point mutation in a protein increases its stability. Application of an all-atom model with explicit solvent in MD simulations would allow for atomic details of interactions of the two side chains (wild type and mutant) with the environment to be understood, allowing for more detailed interpretation of the experimental data. Furthermore, the use of free energy perturbation techniques would allow for more quantitative data to be obtained from the calculations, although this approach requires proper treatment of the unfolded states of the proteins, which is difficult (see Chapter 9 for more details). In other cases, a more simplified model, such as an extended-atom force field with the solvent treated implicitly via the use of an R-dependent dielectric constant, may be appropriate. Examples include cases in which sampling of a large number of conformations of a protein or peptide is required [7]. In these cases the use of the free energy force fields may be useful. Another example is a situation in which the interaction of a number of small molecules with a macromolecule is to be investigated. In such a case it may be appropriate to treat both the small molecules and the macromolecule with one of the small-molecule-based force fields, although the quality of the treatment of the macromolecule may be sacrificed. In these cases the reader is advised against using one force field for the macromolecule and a second, unrelated, force field for the small molecules.
There are often significant differences in the assumptions made when the parameters were being developed that would lead to a severe imbalance between the energetics and forces dictating the individual macromolecule and small molecule structures and the interactions between those molecules. If possible, the user should select a model system related to the particular