

and data sample means. The weight of the prior variance is represented by the degrees of freedom, ν0, while the weight of the data variance is n − 1.
F. Simulation via Markov Chain Monte Carlo Methods
In practice, it may not be possible to use conjugate prior and likelihood functions that result in analytical posterior distributions, or the distributions may be so complicated that the posterior cannot be calculated as a function of the entire parameter space. In either case, statistical inference can proceed only if random values of the parameters can be drawn from the full posterior distribution:
\[
p(\theta|y) = \frac{p(y|\theta)\,p(\theta)}{\int_{\Theta} p(y|\theta)\,p(\theta)\,d\theta}
\tag{18}
\]
We can also calculate expected values for any function of the parameters:
\[
E[f(\theta)|y] = \frac{\int_{\Theta} f(\theta)\,p(y|\theta)\,p(\theta)\,d\theta}{\int_{\Theta} p(y|\theta)\,p(\theta)\,d\theta}
\tag{19}
\]
If we could draw directly from the posterior distribution, then we could plot p(θ|y) from a histogram of the draws on θ. Similarly, we could calculate the expectation value of any function of the parameters by making random draws of θ from the posterior distribution and calculating
\[
E[f(\theta)] = \frac{1}{n}\sum_{t=1}^{n} f(\theta_t)
\tag{20}
\]
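As a concrete illustration (a minimal Python sketch; the "posterior draws" here are simulated from a normal distribution purely for demonstration, since any real draws would come from a sampler such as the one described below), the estimate of Eq. (20) is just a sample mean over the draws:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for n draws of theta from p(theta|y); in practice these
# would come from an MCMC sampler.
theta_draws = rng.normal(loc=1.0, scale=0.2, size=10_000)

# Eq. (20): E[f(theta)] ~ (1/n) * sum_t f(theta_t)
f = lambda th: th**2
print(f(theta_draws).mean())  # Monte Carlo estimate of E[theta^2]
```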
In some cases, we may not be able to draw directly from the posterior distribution. The difficulty lies in calculating the denominator of Eq. (18), the marginal data distribution p(y). But usually we can evaluate the ratio of the probabilities of two values for the parameters, p(θt|y)/p(θu|y), because the denominator in Eq. (18) cancels out in the ratio. The Markov chain Monte Carlo method [40] proceeds by generating draws from some distribution of the parameters, referred to as the proposal distribution, such that the new draw depends only on the value of the old draw, i.e., some function q(θt|θt−1). We accept the new draw with probability
\[
\pi(\theta_t|\theta_{t-1}) = \min\left[1,\; \frac{p(\theta_t|y)\,q(\theta_{t-1}|\theta_t)}{p(\theta_{t-1}|y)\,q(\theta_t|\theta_{t-1})}\right]
\tag{21}
\]
and otherwise we set θt = θt−1. This is the Metropolis–Hastings method, first proposed by Metropolis and Ulam [45] in the context of equation of state calculations [46] and further developed by Hastings [47]. This scheme can be shown to have a stationary distribution that is the posterior distribution, which the chain approaches asymptotically.
Several variations of this method go under different names. The Metropolis algorithm uses only symmetrical proposal distributions, such that q(θt|θt−1) = q(θt−1|θt). The expression for π(θt|θt−1) then reduces to
\[
\pi(\theta_t|\theta_{t-1}) = \min\left[1,\; \frac{p(\theta_t|y)}{p(\theta_{t-1}|y)}\right]
\tag{22}
\]
This is the form that chemists and physicists are most accustomed to. The probabilities are calculated from the Boltzmann equation and the energy difference between state t and state t − 1. Because we are using a ratio of probabilities, the normalization factor, i.e., the partition function, drops out of the equation. Another variant when θ is multidimensional (which it usually is) is to update one component at a time. We define θt,−i = {θt,1, θt,2, . . . , θt,i−1, θt−1,i+1, . . . , θt−1,m}, where m is the number of components in θ. So θt,−i contains all of the components except the ith; all the components that precede the ith component have been updated in step t, while the components that follow have not yet been updated. The m components are updated one at a time with this probability:
\[
\pi(\theta_{t,i}|\theta_{t,-i}) = \min\left[1,\; \frac{p(\theta_{t,i}|y,\theta_{t,-i})\,q(\theta_{t-1,i}|\theta_{t,i},\theta_{t,-i})}{p(\theta_{t-1,i}|y,\theta_{t,-i})\,q(\theta_{t,i}|\theta_{t-1,i},\theta_{t,-i})}\right]
\tag{23}
\]
If draws can be made from the posterior distribution for each component conditional on values for the others, i.e., from p(θt,i|y, θt,−i), then this conditional posterior distribution can be used as the proposal distribution. In this case, the probability in Eq. (23) is always 1, and all draws are accepted. This is referred to as Gibbs sampling and is the most common form of MCMC used in statistical analysis.
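As an illustration, here is a minimal Python sketch of the Metropolis algorithm of Eq. (22) with a symmetrical Gaussian proposal. The target function, data values, and step size are all hypothetical; only the log of the unnormalized posterior is needed, because the normalization cancels in the ratio.

```python
import numpy as np

def metropolis(log_post, theta0, n_draws, step=0.5, seed=0):
    """Metropolis sampler with a symmetrical Gaussian proposal [Eq. (22)].

    log_post returns the log of the *unnormalized* posterior,
    log p(y|theta) + log p(theta); the normalization cancels in the ratio.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta)
    draws = np.empty((n_draws, theta.size))
    for t in range(n_draws):
        prop = theta + step * rng.standard_normal(theta.size)  # symmetric q
        lp_prop = log_post(prop)
        # Accept with probability min[1, p(prop|y)/p(theta|y)]
        if np.log(rng.random()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws[t] = theta  # on rejection, theta_t = theta_{t-1}
    return draws

# Hypothetical example: posterior for the mean of data with unit variance
# and a flat prior; the data values are invented for illustration.
data = np.array([1.2, 0.8, 1.5, 0.9])
draws = metropolis(lambda th: -0.5 * np.sum((data - th[0])**2),
                   theta0=[0.0], n_draws=5000)
print(draws[1000:].mean())  # posterior-mean estimate via Eq. (20)
```

Working with logarithms of the probabilities avoids numerical under- and overflow; with a symmetric proposal, q drops out and only the posterior ratio of Eq. (22) enters the acceptance test.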
G. Mixture Models
Mixture models have come up frequently in Bayesian statistical analysis in molecular and structural biology [16,28] as described below, so a description is useful here. Mixture models can be used when simple forms such as the exponential or Dirichlet function alone do not describe the data well. This is usually the case for a multimodal data distribution (as might be evident from a histogram of the data), when clearly a single Gaussian function will not suffice. A mixture is a sum of simple forms for the likelihood:
\[
p(y|\theta) = \sum_{i=1}^{n} q_i\, p(y|\theta_i)
\tag{24}
\]
where ∑i qi = 1 over the n components of the mixture. For instance, if the terms in Eq. (24) are normal, then each term is of the form (for a single data point yj)
\[
p(y_j|\theta_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\!\left(-\frac{(y_j-\mu_i)^2}{2\sigma_i^2}\right)
\tag{25}
\]
so each θi = {µi, σi²}.
Maximum likelihood methods used in classical statistics are not valid for estimating the θ's or the q's. Bayesian treatment became practical only with the development of the Gibbs sampling methods described above, because forming the likelihood of a full data set entails a product of many sums of the form of Eq. (24):
\[
p(\{y_1, y_2, \ldots, y_N\}|\theta) = \prod_{j=1}^{N}\sum_{i=1}^{n} q_i\, p(y_j|\theta_i)
\tag{26}
\]
Because we are dealing with count data and proportions for the values qi, the appropriate conjugate prior distribution for the q’s is the Dirichlet distribution,
p(q1, q2, . . . , qk) = Dirichlet(α1, α2, . . . , αk)
where the α's are prior counts for the components of the mixture. A simplification is to associate each data point with a single component, usually the component with the nearest location (i.e., µi). In this case, it is necessary to associate with each data point yj a variable cj that denotes the component to which yj belongs. These variables cj are unknown and are therefore called "missing data." Equation (26) then simplifies to
\[
p(\{y_1, y_2, \ldots, y_N\}|\theta) = \prod_{j=1}^{N} p(y_j|\theta_{c_j})
\tag{27}
\]
A straightforward Gibbs sampling strategy when the number of components is known (or fixed) is as follows [48]; a code sketch follows step 5 below.
Step 1. From a histogram of the data, partition the data into n components, each roughly corresponding to a mode of the data distribution. This defines the cj. Set the parameters for prior distributions on the θ parameters that are conjugate to the likelihoods. For the normal distribution the priors are defined in Eq. (15), so the full prior for the n components is
\[
p(\theta_1, \theta_2, \ldots, \theta_n) = \prod_{i=1}^{n} N(\mu_{0i}, \sigma_{0i}^2/\kappa_0)\cdot \text{Inv-}\chi^2(\nu_{0i}, \sigma_{0i}^2)
\tag{28}
\]
The prior hyperparameters, µ0i, etc., can be estimated from the data assigned to each component. First define Ni = ∑Nj=1 I(cj = i), where I(cj = i) = 1 if cj = i and 0 otherwise. Then, for instance, the prior hyperparameters for the mean values are defined by
\[
\mu_{0i} = \frac{1}{N_i}\sum_{j=1}^{N} I(c_j = i)\, y_j
\tag{29}
\]
The parameters of the Dirichlet prior for the q's should be proportional to the counts for each component in this preliminary data analysis. So we now have a collection of prior parameters {θ0i = (µ0i, κ0i, σ²0i, ν0i)} and a preliminary assignment of each data point to a component, {cj}, and therefore the preliminary number of data points for each component, {Ni}.
Step 2. Draw a value for each θi = {µi, σi²} from the normal posterior distribution for Ni data points with average ȳi,
\[
p(\theta_i|\{y_i\}) = N(\mu_{N_i}, \sigma_{N_i}^2/\kappa_{N_i})\cdot \text{Inv-}\chi^2(\nu_{N_i}, \sigma_{N_i}^2)
\tag{30}
\]
where [as in Eq. (17)]
\[
\mu_{N_i} = \frac{1}{\kappa_{N_i}}\left(\kappa_{0i}\,\mu_{0i} + N_i\,\bar{y}_i\right)
\tag{31a}
\]
\[
\kappa_{N_i} = \kappa_{0i} + N_i, \qquad \nu_{N_i} = \nu_{0i} + N_i
\tag{31b}
\]
\[
\nu_{N_i}\,\sigma_{N_i}^2 = \nu_{0i}\,\sigma_{0i}^2 + (N_i - 1)\,s_i^2 + \frac{N_i\,\kappa_{0i}}{\kappa_{N_i}}\left(\bar{y}_i - \mu_{0i}\right)^2
\tag{31c}
\]
\[
s_i^2 = \frac{1}{N_i - 1}\sum_{k=1}^{N_i}\left(y_k - \bar{y}_i\right)^2
\tag{31d}
\]
Draw (q1, q2, . . . , qn) from Dirichlet(α1 + N1, α2 + N2, . . . , αn + Nn), which is the posterior distribution with prior counts αi and data counts Ni.
Step 3. Reset the cj by drawing a random number uj between 0 and 1 for each cj and setting cj to i′ if
\[
\frac{1}{Z}\sum_{i=1}^{i'-1} q_i\, p(y_j|\theta_i) \;<\; u_j \;\le\; 1 - \frac{1}{Z}\sum_{i=i'+1}^{n} q_i\, p(y_j|\theta_i)
\tag{32}
\]
where Z = ∑ni=1 qi p(yj|θi) is the normalization factor.
Step 4. Sum up the Ni and calculate the averages ȳi from the data and the values of cj.
Step 5. Repeat steps 2–4 until convergence.
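Below is the code sketch referred to above: a minimal Python implementation of steps 1–5 for a two-component normal mixture. For brevity it uses noninformative priors, so the conditional draws in step 2 use only the data assigned to each component rather than the full conjugate hyperparameters of Eqs. (28)–(31); the data are synthetic and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic bimodal data, purely for illustration
y = np.concatenate([rng.normal(-2.0, 1.0, 150), rng.normal(3.0, 1.0, 250)])
n_comp = 2
alpha = np.ones(n_comp)                      # Dirichlet prior counts

# Step 1: a crude initial partition of the data defines the c_j
c = (y > y.mean()).astype(int)

for sweep in range(200):                     # Step 5: iterate until converged
    Ni = np.bincount(c, minlength=n_comp)
    mu = np.empty(n_comp)
    sig2 = np.empty(n_comp)
    for i in range(n_comp):                  # Step 2: draw (mu_i, sigma_i^2)
        yi = y[c == i]
        # sigma_i^2 ~ Inv-chi^2(N_i - 1, s_i^2), drawn as SS_i / chi^2_{N_i-1}
        sig2[i] = np.sum((yi - yi.mean())**2) / rng.chisquare(Ni[i] - 1)
        # mu_i | sigma_i^2 ~ N(ybar_i, sigma_i^2 / N_i)
        mu[i] = rng.normal(yi.mean(), np.sqrt(sig2[i] / Ni[i]))
    # Draw mixture weights q from Dirichlet(alpha + N), as in the text
    q = rng.dirichlet(alpha + Ni)
    # Steps 3-4: reassign each c_j with probability q_i p(y_j|theta_i)/Z,
    # via the inverse-CDF draw of Eq. (32)
    lik = q * np.exp(-(y[:, None] - mu)**2 / (2.0 * sig2)) \
            / np.sqrt(2.0 * np.pi * sig2)
    prob = lik / lik.sum(axis=1, keepdims=True)
    c = (rng.random(len(y))[:, None] > np.cumsum(prob, axis=1)).sum(axis=1)

print(mu, np.sqrt(sig2), q)                  # component estimates
```

Because the conditional draws for the θi and for the cj are made exactly from the conditional posteriors, every draw is accepted; this is the Gibbs sampling property noted at the end of Section II.F.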
The number of components necessary can usually be judged from the data, and the appropriateness of a particular value of n can be assessed by comparing models with different values of n and calculating the entropy distance, or Kullback–Leibler divergence,
\[
ED(g, h) = \int g(x)\,\ln\frac{g(x)}{h(x)}\,dx
\tag{33}
\]
where, for instance, g might be a three-component model and h might be a two-component model. If ED(g, h) > 0, then model g is better than model h.
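As a sketch (Python with SciPy; the mixture parameters below are invented purely for illustration), Eq. (33) can be evaluated by numerical quadrature:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# g: a hypothetical three-component model; h: a two-component model
def g(x):
    return (0.3 * norm.pdf(x, -2.0, 1.0) + 0.5 * norm.pdf(x, 1.0, 0.8)
            + 0.2 * norm.pdf(x, 4.0, 1.0))

def h(x):
    return 0.4 * norm.pdf(x, -2.0, 1.0) + 0.6 * norm.pdf(x, 1.5, 1.5)

# Eq. (33): ED(g, h) = integral of g(x) ln[g(x)/h(x)] dx
ed, _ = quad(lambda x: g(x) * np.log(g(x) / h(x)), -10.0, 12.0)
print(ed)
```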
H. Explanatory Variables
There is some confusion in using Bayes' rule on what are sometimes called explanatory variables. As an example, we can try to use Bayesian statistics to derive the probabilities of each secondary structure type for each amino acid type, that is, p(µ|r), where µ is α, β, or γ (for coil) secondary structure and r is one of the 20 amino acids. It is tempting to write p(µ|r) = p(r|µ)p(µ)/p(r) using Bayes' rule. This expression is, of course, correct and can be used on PDB data to relate these probabilities. But this is not Bayesian statistics, which relates parameters that represent underlying properties to (limited) data that are manifestations of those parameters in some way. In this case, the parameters we are after are θµ(r) = p(µ|r). The data from the PDB are in the form of counts yµ(r), the number of amino acids of type r in the PDB that have secondary structure µ. There are 60 such numbers (20 amino acid types × 3 secondary structure types). We then have for each amino acid type a Bayesian expression for the posterior distribution for the values of θµ(r):
\[
p(\boldsymbol{\theta}(r)|\mathbf{y}(r)) \propto p(\mathbf{y}(r)|\boldsymbol{\theta}(r))\,p(\boldsymbol{\theta}(r))
\tag{34}
\]
where θ and y are vectors of three components, α, β, and γ. The prior is a Dirichlet distribution with some number of counts for the three secondary structure types for amino acid type r, i.e., Dirichlet(nα(r), nβ(r), nγ(r)). We could choose the three nµ(r) to be equal to some small number, say 10. Or we could set them equal to 100 pµ, where pµ is the