Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf

12

MacKerell

the most widely used energy functions are those included with the CHARMM [8,9], AMBER [10], and GROMOS [11] programs. Two extensions beyond the terms in Eqs. (2) and (3) are often included in biomolecular force fields. A harmonic term for improper dihedrals is often used to treat out-of-plane distortions, such as those that occur with aromatic hydrogens (i.e., Wilson wags). Historically, the improper term was also used to maintain the proper chirality in extended-atom models of proteins (e.g., without the Hα hydrogen, the chirality of amino acids is undefined). Some force fields also contain a Urey–Bradley term that treats 1,3 atoms (the two terminal atoms in an angle; see Fig. 1) with a harmonic bond-stretching term in order to model vibrational spectra more accurately.
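As a minimal sketch of these two extensions, the functions below evaluate a harmonic improper term and a Urey–Bradley term. The functional forms K(x - x0)^2 without a factor of 1/2, the function names, and the numerical parameters are assumptions made for illustration; they are not taken from any published parameter set.

```python
import math

def improper_energy(phi, k_imp, phi0=0.0):
    """Harmonic improper term k_imp * (phi - phi0)**2 (angles in radians)."""
    return k_imp * (phi - phi0) ** 2

def distance_13(b1, b2, theta):
    """1,3 distance between the terminal atoms of an angle (law of cosines).

    b1, b2 are the two bond lengths and theta is the valence angle in radians.
    """
    return math.sqrt(b1 * b1 + b2 * b2 - 2.0 * b1 * b2 * math.cos(theta))

def urey_bradley_energy(s, k_ub, s0):
    """Harmonic Urey-Bradley term k_ub * (s - s0)**2 on the 1,3 distance s."""
    return k_ub * (s - s0) ** 2

if __name__ == "__main__":
    # Hypothetical parameters, chosen only to exercise the functions.
    s = distance_13(1.09, 1.09, math.radians(109.5))
    print("1,3 distance:", s)
    print("Urey-Bradley energy:", urey_bradley_energy(s, k_ub=5.0, s0=1.80))
    print("improper energy:", improper_energy(math.radians(5.0), k_imp=20.0))
```

Because the 1,3 distance depends on both bond lengths and on the angle, the Urey–Bradley term couples bond stretching to angle bending, which is why it helps with vibrational spectra.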

Beyond the extensions mentioned in the previous paragraph, a variety of terms are included in force fields used for the modeling of small molecules that can also be applied to biological systems. These types of force fields are often referred to as Class II force fields, to distinguish them from the Class I force fields such as AMBER, CHARMM, and GROMOS discussed above. For example, the bond term in Eq. (2) can be expanded to include cubic and quartic terms, which more accurately treat the anharmonicity associated with bond stretching. Another extension is the addition of cross terms that express the influence that stretching of a bond has on the stretching of an adjacent bond. Cross terms may also couple different types of terms, such as bonds with angles or angles with dihedrals, so that the influence of bond length on angle bending or of angle bending on dihedral rotations, respectively, is modeled more accurately [12]. Extensions may also be made to the interaction portion of the force field [Eq. (3)]. These may include terms for electronic polarizability (see below) or the use of 1/r^4 terms to treat ion–dipole interactions associated with interactions between, for example, ions and the peptide backbone [13]. In all cases the extension of a potential energy function should, in principle, allow the system of interest to be modeled with more accuracy. The gains associated with the additional terms, however, are often significant only in specific cases (e.g., the use of a 1/r^4 term in the study of specific cation–peptide interactions), making their inclusion unwarranted for the majority of calculations on biochemical systems, especially when those terms increase the demand on computational resources.
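The Class II bond extensions described above can be sketched as follows. The polynomial form, the bond-bond cross term, and all force constants are illustrative assumptions rather than the parameters of any particular force field.

```python
def harmonic_bond(b, k2, b0):
    """Class I style bond term: k2 * (b - b0)**2."""
    return k2 * (b - b0) ** 2

def quartic_bond(b, k2, k3, k4, b0):
    """Class II style bond term with cubic and quartic corrections."""
    d = b - b0
    return k2 * d**2 + k3 * d**3 + k4 * d**4

def bond_bond_cross(b1, b2, k_bb, b01, b02):
    """Cross term coupling the stretching of two adjacent bonds."""
    return k_bb * (b1 - b01) * (b2 - b02)

if __name__ == "__main__":
    # Hypothetical constants; a negative cubic constant makes stretching
    # cheaper than compressing by the same amount (anharmonicity).
    k2, k3, k4, b0 = 300.0, -600.0, 700.0, 1.0
    for b in (0.9, 1.0, 1.1):
        print(b, harmonic_bond(b, k2, b0), quartic_bond(b, k2, k3, k4, b0))
    print("cross term:", bond_bond_cross(1.1, 1.05, 50.0, 1.0, 1.0))
```

The harmonic term is symmetric about b0, whereas the expanded term is not, which is the anharmonicity the text refers to. Each added term also adds parameters and arithmetic per interaction, which is the computational cost mentioned at the end of the paragraph.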

D. Alternatives to the Potential Energy Function

The form of the potential energy function in Eqs. (1)–(3) was developed based on a combination of simplicity with required accuracy. However, a number of other forms can be used to treat the different terms in Eqs. (2) and (3). One alternative form used to treat the bond is referred to as the Morse potential. This term allows for bond-breaking events to occur and includes anharmonicity in the bond-stretching surface near the equilibrium value. The ability to break bonds, however, leads to forces close to zero at large bond distances, which may present a problem when crude modeling techniques are used to generate structures [14]. A number of variations in the form of the equation to treat the VDW interactions have been applied. The 1/r^12 term used for modeling exchange repulsion overestimates the distance dependence of the repulsive wall, leading to the use of a 1/r^9 term [15] or exponential repulsive terms [16]. A more recent variation is the buffered 14-7 form, which was selected because of its ability to reproduce interactions between rare gas atoms [17]. Concerning electrostatic interactions, the majority of potential energy functions employ the standard Coulombic term shown in Eq. (3), with one variation being the use of bond dipoles rather than atom-centered partial atomic charges [16]. As with

Atomistic Models and Force Fields

13

the extensions to the force fields discussed above, the alternative forms discussed in this paragraph generally do not yield significant gains in accuracy for biomolecular simulations performed in condensed phase environments at room temperature, although for specific situations they may.
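The alternative forms discussed in this section can be compared numerically with a short sketch. The Morse expression and the 12-6, 9-6, and buffered 14-7 expressions below are written in a common textbook parameterization (well depth eps and minimum-energy distance rmin); this choice, the function names, and the numbers used are assumptions for illustration, and individual force fields may use different conventions.

```python
import math

def morse_energy(b, d_e, a, b0):
    """Morse bond energy d_e * (1 - exp(-a * (b - b0)))**2."""
    return d_e * (1.0 - math.exp(-a * (b - b0))) ** 2

def morse_force(b, d_e, a, b0):
    """Force along the bond, -dE/db, for the Morse term."""
    x = math.exp(-a * (b - b0))
    return -2.0 * d_e * a * (1.0 - x) * x

def vdw_12_6(r, eps, rmin):
    """Lennard-Jones 12-6 form with minimum -eps at r = rmin."""
    q = (rmin / r) ** 6
    return eps * (q * q - 2.0 * q)

def vdw_9_6(r, eps, rmin):
    """9-6 form with a softer repulsive wall and minimum -eps at r = rmin."""
    q = rmin / r
    return eps * (2.0 * q**9 - 3.0 * q**6)

def vdw_buffered_14_7(r, eps, rmin):
    """Buffered 14-7 form with minimum -eps at r = rmin."""
    rho = r / rmin
    return eps * (1.07 / (rho + 0.07)) ** 7 * (1.12 / (rho**7 + 0.12) - 2.0)

if __name__ == "__main__":
    # Hypothetical parameters in arbitrary consistent units.
    for b in (1.0, 1.5, 3.0, 6.0):
        print("Morse", b, morse_energy(b, 100.0, 2.0, 1.0), morse_force(b, 100.0, 2.0, 1.0))
    for r in (0.8, 0.9, 1.0, 1.5):
        print("VDW", r, vdw_12_6(r, 0.1, 1.0), vdw_9_6(r, 0.1, 1.0), vdw_buffered_14_7(r, 0.1, 1.0))
```

All three VDW forms share the same well depth and minimum position here, so the differences show up in the steepness of the repulsive wall at short distances. The Morse force decays toward zero at large separation, which is the behavior that can cause trouble when starting from poorly built structures.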

III. EMPIRICAL FORCE FIELDS

A. From Potential Energy Functions to Force Fields

Equations (1)–(3) in combination are a potential energy function that is representative of those commonly used in biomolecular simulations. As discussed above, the form of this equation is adequate to treat the physical interactions that occur in biological systems. The accuracy of that treatment, however, is dictated by the parameters used in the potential energy function, and it is the combination of the potential energy function and the parameters that comprises a force field. In the remainder of this chapter we describe various aspects of force fields including their derivation (i.e., optimization of the parameters), those widely available, and their applicability.

B. Overview of Available Force Fields

Currently there are a variety of force fields that may, in principle, be used for computational studies of biological systems. Of these force fields, however, only a subset have been designed specifically for biomolecular simulations. As discussed above, the majority of biomolecular simulations are performed with the CHARMM, AMBER, and GROMOS packages. Recent publication of new CHARMM [18–20] and AMBER [21] force fields allows these to be discussed in detail. Although the forms of the potential energy functions in CHARMM and AMBER are similar, with CHARMM including the additional improper and Urey–Bradley terms (see above), significant differences exist in philosophy and in parameter optimization (see below). The latest versions of both force fields are all-atom representations, although extended-atom representations are available [22,23].

To date, a number of simulation studies have been performed on nucleic acids and proteins using both AMBER and CHARMM. A direct comparison of crystal simulations of bovine pancreatic trypsin inhibitor shows that the two force fields behave similarly, although differences in solvent–protein interactions are evident [24]. Side-by-side tests have also been performed on a DNA duplex, showing both force fields to be in reasonable agreement with experiment, although significant, and different, problems were evident in both cases [25]. It should be noted that as of the writing of this chapter revised versions of both the AMBER and CHARMM nucleic acid force fields had become available. Several simulations of membranes have been performed with the CHARMM force field for both saturated [26] and unsaturated [27] lipids. The availability of both protein and nucleic acid parameters in AMBER and CHARMM allows protein–nucleic acid complexes to be studied with both force fields (see Chapter 20), whereas protein–lipid (see Chapter 21) and DNA–lipid simulations can also be performed with CHARMM.

A number of more general force fields for the study of small molecules are available that can be extended to biological molecules. These force fields have been designed with the goal of being able to treat a wide variety of molecules, based on the ability to transfer parameters between chemical systems and the use of additional terms (e.g., cross terms) in their potential energy functions. Typically, these force fields have been optimized to treat small molecules in the gas phase, although exceptions do exist. Such force fields may also be used for biological simulations; however, the lack of emphasis on properly treating biological systems generally makes them inferior to those discussed in the previous paragraphs. The optimized potential for liquid simulations (OPLS) force field was initially developed for liquid and hydration simulations on a variety of organic compounds [28,29]. This force field has been extended to proteins [30], nucleic acid bases [31], and carbohydrates [32], although it has not come into widespread use. Some of the most widely used force fields for organic molecules are MM3 and its predecessors [33]. An MM3 force field for proteins has been reported [34]; however, it too has not been widely applied to date.

The consistent force field (CFF) series of force fields has also been developed to treat a wide selection of small molecules and includes parameters for peptides. However, those parameters were developed primarily on the basis of optimization of the internal terms [35]. A recent extension of CFF, COMPASS, has been published that concentrates on producing a force field suitable for condensed phase simulations [36], although no condensed phase simulations of biological molecules have been reported. Another force field to which significant effort was devoted to allow for its application to a wide variety of compounds is the Merck Molecular Force Field (MMFF) [37]. During the development of MMFF, significant effort was placed on optimizing the internal parameters to yield good geometries and energetics of small compounds as well as on the accurate treatment of nonbonded interactions. This force field has been shown to be well behaved in condensed phase simulations of proteins; however, the results appear to be inferior to those of the AMBER and CHARMM models. Two other force fields of note are UFF [38] and DREIDING [14]. These force fields were developed to treat a much wider variety of molecules, including inorganic compounds, than the force fields mentioned previously, although their application to biological systems has not been widespread.

It should also be noted that a force field for a wide variety of small molecules, CHARMm (note the lowercase "m," indicating the commercial version of the program and parameters), is available [39] and has been applied to protein simulations with limited success. Efforts are currently under way to extend the CHARMm small molecule force field to make the nonbonded parameters consistent with those of the CHARMM force fields, thereby allowing for a variety of small molecules to be included in computational studies of biological systems.

Although the list of force fields discussed in this subsection is by no means complete, it does emphasize the wide variety of force fields that are available for different types of chemical systems as well as differences in their development and optimization.

C. Free Energy Force Fields

All of the force fields discussed in the preceding sections are based on potential energy functions. To obtain free energy information when using these force fields, statistical mechanical ensembles must be obtained via various simulation techniques. An alternative approach is to use a force field that has been optimized to reproduce free energies directly rather than potential energies. For example, a given set of dihedral parameters in a potential energy function may be adjusted to reproduce a QM-determined torsional potential energy surface for a selected model compound. In the case of a free energy force field, the dihedral parameters would be optimized to reproduce the experimentally observed probability distribution of that dihedral in solution. Because the experimentally determined probability distribution corresponds to a free energy surface, a dihedral energy surface calculated using this force field would correspond to the free energy surface in solution. This allows for calculations to be performed in vacuum while yielding results that, in principle, correspond to the free energy in solution.
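The link between a probability distribution and a free energy surface is the Boltzmann relation F(φ) = -kB T ln P(φ) + constant. The sketch below applies it to a set of dihedral samples. The function name, the binning, and the units (kcal/mol) are assumptions made for illustration; the fitting of dihedral parameters to such a target surface is not shown.

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def free_energy_profile(dihedrals_deg, temperature=298.15, n_bins=36):
    """Convert sampled dihedral angles (degrees) into a relative free energy profile.

    Returns bin centers and F = -kB*T*ln(P), shifted so the minimum is zero.
    Empty bins are reported as +inf.
    """
    counts, edges = np.histogram(dihedrals_deg, bins=n_bins, range=(-180.0, 180.0))
    centers = 0.5 * (edges[:-1] + edges[1:])
    prob = counts / counts.sum()
    with np.errstate(divide="ignore"):
        free = -KB * temperature * np.log(prob)
    free -= free[np.isfinite(free)].min()
    return centers, free

if __name__ == "__main__":
    # Synthetic angles standing in for an observed distribution (not real data).
    rng = np.random.default_rng(0)
    samples = np.concatenate([rng.normal(-60.0, 15.0, 3000), rng.normal(60.0, 15.0, 1000)])
    samples = (samples + 180.0) % 360.0 - 180.0
    for c, f in zip(*free_energy_profile(samples)):
        print(f"{c:7.1f} {f:8.3f}")
```

A dihedral term fitted to such a profile already contains the averaged effect of the solvent, which is why the resulting force field can be evaluated in vacuum.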

The best known of the free energy force fields is the Empirical Conformational Energy Program for Peptides (ECEPP) [40]. ECEPP parameters (both internal and external) were derived primarily on the basis of crystal structures of a wide variety of peptides. Such an approach yields significant savings in computational costs when sampling large numbers of conformations; however, microscopic details of the effect of solvent on the biological molecules are lost. This type of approach is useful for the study of protein folding [41,42] as well as protein–protein or protein–ligand interactions [43].

An alternative to obtaining free energy information is the use of potential energy functions combined with methods to calculate the contribution of the free energy of solvation. Examples include methods based on the solvent accessibilities of atoms [44,45], continuum electrostatics–based models [46–49], and the generalized Born equation [50,51]. With some of these approaches the availability of analytical derivatives allows for their use in MD simulations; however, they are generally most useful for determining solvation contributions associated with previously generated conformations. See Chapter 7 for a detailed overview of these approaches.
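As one illustration of such a solvation term, the sketch below evaluates the polarization energy from the generalized Born equation in its commonly used pairwise form, with f_GB = sqrt(r^2 + Ri*Rj*exp(-r^2/(4*Ri*Rj))). The effective Born radii are assumed to be supplied by the caller (computing them is the difficult part and is not shown), and the function name, dielectric constants, and units are assumptions for illustration.

```python
import numpy as np

COULOMB = 332.0636  # Coulomb constant in kcal*Angstrom/(mol*e^2)

def gb_polarization_energy(coords, charges, born_radii, eps_in=1.0, eps_out=78.5):
    """Generalized Born polarization energy in kcal/mol.

    coords: (N, 3) positions in Angstrom; charges: (N,) in units of e;
    born_radii: (N,) effective Born radii in Angstrom (assumed given).
    The double sum runs over all pairs, including the i == j self terms.
    """
    coords = np.asarray(coords, dtype=float)
    q = np.asarray(charges, dtype=float)
    radii = np.asarray(born_radii, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    r2 = np.sum(diff**2, axis=-1)
    rr = radii[:, None] * radii[None, :]
    f_gb = np.sqrt(r2 + rr * np.exp(-r2 / (4.0 * rr)))
    prefactor = -0.5 * COULOMB * (1.0 / eps_in - 1.0 / eps_out)
    return prefactor * np.sum(q[:, None] * q[None, :] / f_gb)

if __name__ == "__main__":
    # Hypothetical ion pair 4 Angstrom apart with made-up Born radii.
    xyz = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0]]
    print(gb_polarization_energy(xyz, [1.0, -1.0], [2.0, 2.0]))
```

For a single ion the expression reduces to the Born equation, and for well-separated charges f_GB approaches the interatomic distance. The expression is an analytical function of the coordinates, which is what makes derivatives, and therefore use in MD, possible.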

D. Applicability of Force Fields

Clearly, the wide variety of force fields requires the user to carefully consider those that are available and choose the one most appropriate for his or her particular application. Most important in this selection process is a knowledge of the information to be obtained from the computational study. If atomic details of specific interactions are required, then all-atom models with the explicit inclusion of solvent will be necessary. For example, suppose experimental results indicate that a single point mutation in a protein increases its stability. Application of an all-atom model with explicit solvent in MD simulations would allow atomic details of the interactions of the two side chains with the environment to be understood, allowing for more detailed interpretation of the experimental data. Furthermore, the use of free energy perturbation techniques would allow more quantitative data to be obtained from the calculations, although this approach requires proper treatment of the unfolded states of the proteins, which is difficult (see Chapter 9 for more details). In other cases, a more simplified model, such as an extended-atom force field with the solvent treated implicitly via the use of an R-dependent dielectric constant, may be appropriate. Examples include cases in which sampling of a large number of conformations of a protein or peptide is required [7]. In these cases the use of the free energy force fields may be useful. Another example is a situation in which the interaction of a number of small molecules with a macromolecule is to be investigated. In such a case it may be appropriate to treat both the small molecules and the macromolecule with one of the small-molecule-based force fields, although the quality of the treatment of the macromolecule may be sacrificed. In these cases the reader is advised against using one force field for the macromolecule and a second, unrelated, force field for the small molecules. There are often significant differences in the assumptions made when the parameters were being developed that would lead to a severe imbalance between the energetics and forces dictating the individual macromolecule and small molecule structures and the interactions between those molecules. If possible, the user should select a model system related to the particular