Becker O.M., MacKerell A.D., Roux B., Watanabe M. (eds.) Computational biochemistry and biophysic.pdf


MacKerell and Nilsson

supercoiled forms was observed. Another approach is the use of internal coordinates combined with an implicit treatment of solvent, as in the program JUMNA [117]. This method is basically atomistic, but the movements of the system are described entirely in internal coordinates, which greatly facilitates locating minima and sampling conformational space via Monte Carlo methods [118]. The internal coordinate method has also been used with a minimal hydration model in which a 5 Å shell of explicit water molecules hydrates the DNA, allowing for an integration time step of 10 fs [119]. Results showed the structures to be close to the B form of DNA; however, variations in RMS differences were smaller than those that occur in simulations using full solvent representations with periodic boundary conditions. Another “low resolution” method involves treating individual basepairs as three-point representations. The degrees of freedom between the individual “basepairs” can then be sampled to investigate the structural properties of extended DNA or RNA duplexes [120]. This approach can be combined with atomistic models so that both the overall fold of the oligonucleotide and specific interactions in small portions of the structure can be modeled, an approach that has been used to study portions of the 16S rRNA. A method based on the use of a segmented rod model along with Brownian dynamics allows for studies of DNA molecules hundreds of basepairs in length [121,122]. Although these methods sacrifice varying levels of detail, they extend computational approaches to significantly larger oligonucleotides, allowing access to a wide variety of biological processes, such as the winding of DNA into supercoils and mechanisms associated with nucleosome and, ultimately, chromatin formation.

IV. PRACTICAL CONSIDERATIONS

Performing successful calculations on nucleic acids requires selection of the appropriate models for the goals of the calculations, followed by determination of the proper starting configuration. When designing a computational study one should carefully consider the type of information desired from the calculations along with the available resources. In many instances, atomic details of interactions between oligonucleotides and the environment or with a bound protein are desired, making the use of atomistic models appropriate. These methods, however, require significant computational resources for the generation, storage, and analysis of the MD simulations. Continuing increases in computational power, with the simultaneous decrease in computer costs, make the required facilities accessible to most laboratories. An alternative is the use of supercomputing centers. For systems larger than about 50 basepairs, or where atomistic details of interactions between the nucleic acid and the solvent are not required, the methods discussed in the preceding section are appropriate.

The remainder of this chapter focuses on practical aspects of the preparation and implementation of atomistically based computations of nucleic acids. A flow diagram of the steps involved in system preparation and the performance of MD studies of nucleic acids is presented in Figure 1. Additional details on many of the procedures described here may be found in books by Allen and Tildesley [123] and Frenkel and Smit [124].

A. Starting Structures

A significant advantage of computational studies on nucleic acids is that reasonable guesses of the starting geometries can be made. When studying duplexes, these are typically based on the canonical forms of DNA and RNA [2,125–130]. A number of available modeling and graphics packages have the ability to generate canonical structures for a given sequence. Alternatively, experimental structures from crystal or NMR studies, obtained from the nucleic acid [114] or protein databanks [131], can be used. For DNA and RNA duplexes, crystal and NMR structures generally do not differ significantly from canonical structures; however, in cases where there are loops, bulges, hairpins, or unstacked bases, as in tRNA, the use of experimental structures is helpful. Alternatively, if the helical and nonhelical regions are known, reasonable guesses for a starting geometry, followed by relaxation of the structure via MD simulations, can be applied. This approach is useful when low resolution data on a nucleic acid structure are available [132–136]. A useful alternative is the program NAB, which generates structures of both helical and nonhelical regions of oligonucleotides [137] and is accessible via the Internet. When creating starting models of RNA or DNA, efforts should be made to check that the model is consistent with available biophysical and biochemical experimental data.

Nucleic Acid Simulations

Figure 1 Flow diagram of the parameter optimization process. Loops I–II represent iterative stages of the optimization process as discussed in the text.
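To give a feel for what generating a canonical duplex from a sequence involves, the following minimal sketch lays out one reference frame per basepair along an idealized B-form helix axis. The rise (3.4 Å) and twist (36°, about 10 basepairs per turn) are approximate textbook values assumed here for illustration, and the function name `canonical_b_form_frames` is hypothetical; real modeling packages place full atomic coordinates and use more refined helical parameters.

```python
# Approximate B-form helical parameters (assumed textbook values, not taken from the chapter)
RISE_ANGSTROM = 3.4   # translation along the helix axis per basepair
TWIST_DEGREES = 36.0  # rotation about the helix axis per basepair

# Watson-Crick partners for DNA
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def canonical_b_form_frames(sequence):
    """Return one idealized helical frame per basepair of a DNA duplex.

    Each frame records the basepair (strand 1 base + its complement),
    the position along the helix axis, and the cumulative twist angle.
    """
    frames = []
    for index, base in enumerate(sequence.upper()):
        frames.append({
            "pair": base + COMPLEMENT[base],
            "z": index * RISE_ANGSTROM,
            "twist_deg": (index * TWIST_DEGREES) % 360.0,
        })
    return frames


if __name__ == "__main__":
    for frame in canonical_b_form_frames("CGCGAATTCGCG"):
        print(frame["pair"], round(frame["z"], 1), frame["twist_deg"])
```

A full builder would attach base and backbone atoms to each frame; the sketch stops at the frames because those are enough to estimate the overall length of the duplex when planning the solvent box.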


B. System Configuration, Solvation, and Ion Placement

Essential for MD simulations of nucleic acids is a proper representation of the solvent environment. This typically requires the use of an explicit solvent representation that includes counterions. Examples exist of DNA simulations performed in the absence of counterions [24], but these are rare. In most cases neutralizing salt concentrations are used, in which only the number of counterions required to create an electrically neutral system is included. In other cases excess salt is used, and both counterions and co-ions are included [30]. Though this approach should allow for systematic studies of the influence of salt concentration on the properties of oligonucleotides, calculations have indicated that the time required for ion distributions around DNA to properly converge is on the order of 5 ns or more [31]. Preparation of nucleic acid MD simulation systems therefore requires careful consideration of both solvent placement and the addition of ions.
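The ion counts implied by the two salt treatments above can be estimated with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, monovalent ions are assumed, each phosphate is taken to carry a charge of −1, and strands are assumed to lack terminal phosphates unless stated otherwise.

```python
AVOGADRO = 6.02214076e23          # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1e-27  # 1 Å^3 = 1e-30 m^3 = 1e-27 L


def neutralizing_counterions(nucleotides_per_strand, strands=2,
                             terminal_phosphates=False):
    """Monovalent cations needed to neutralize the phosphate charges.

    Assumes one -1 charge per phosphate; without terminal phosphates a
    strand of N nucleotides has N - 1 phosphates.
    """
    per_strand = nucleotides_per_strand
    if not terminal_phosphates:
        per_strand -= 1
    return strands * per_strand


def excess_salt_pairs(concentration_molar, box_volume_cubic_angstrom):
    """Cation/anion pairs to add for a target excess salt concentration.

    Uses the full box volume, which slightly overestimates the solvent
    volume because the solute's own volume is not subtracted.
    """
    liters = box_volume_cubic_angstrom * LITERS_PER_CUBIC_ANGSTROM
    return round(concentration_molar * AVOGADRO * liters)


if __name__ == "__main__":
    # A 12-basepair duplex without terminal phosphates
    print(neutralizing_counterions(12))        # 22
    # 0.15 M excess salt in a 60 Å cubic box
    print(excess_salt_pairs(0.15, 60.0 ** 3))  # 20
```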

As a first step in setting up an MD study of nucleic acids in solution, the overall configuration of the system must be considered. This configuration is defined by the boundary conditions to be used in the solvent simulation. Boundary conditions are required to maintain the proper density of the system and, when the condensed phase environment is finite and therefore in direct contact with vacuum, to minimize edge effects. The most commonly used and most rigorously correct are periodic boundary conditions (PBCs). In this approach the system (nucleic acid, surrounding solvent, and ions) is typically created in a cubic or rectangular shape. The edges of the system, however, do not see a vacuum but rather the edge on the opposite side of the cube or rectangle, so that the solvent on each edge interacts with that on the opposing edge. In addition to cubic or rectangular systems, PBC simulations may also be performed using octahedral or rhombic dodecahedral symmetries, which are appropriate for spherical molecules and minimize the total number of solvent molecules required to properly solvate the spherical macromolecule [138,139]. An advantage of the PBC approach is that it can be used with Ewald methods [22,123], which are currently considered the most rigorous methods for treating long-range electrostatic interactions (see Chapter 5).

Alternative approaches for the treatment of boundaries are reaction field based methods [140,141], which include a potential energy barrier that keeps the solvent molecules from diffusing away from the simulation system and reaction field terms that account for the absence of water (or the presence of vacuum) outside the barrier. If a reaction field is not present, the water molecules at the surface will tend to interact to a greater extent with each other than with interior water molecules, leading to problems with solvent density and solvent transport at the surface of the system that may adversely affect the properties of the entire simulation system. These approaches are typically used on spherical systems, although cylinders and planes have also been used. For treatment of long-range electrostatics, reaction field methods can use atom truncation [26], extended electrostatic [142], or fast multipole methods [143]. Reaction field methods are most useful for systems too large to treat via PBC, such as protein–nucleic acid complexes. In all cases it is important that care be taken to ensure that the boundary conditions being used do not adversely affect the properties of the systems under study. This can typically be checked by performing simulations on the system and comparing properties calculated from the system with available experimental data, the most accessible being structural properties from X-ray or NMR experiments.
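The way atoms near one face of a periodic box interact with those near the opposite face is commonly implemented through the minimum-image convention, in which each pairwise displacement is wrapped to the nearest periodic copy. The sketch below shows this for a rectangular box only; the function name is hypothetical, and octahedral or rhombic dodecahedral cells need a more general treatment that is not shown.

```python
import math


def minimum_image_distance(r_i, r_j, box_lengths):
    """Distance between two atoms using the nearest periodic image.

    r_i, r_j: (x, y, z) coordinates in Å.
    box_lengths: (Lx, Ly, Lz) edge lengths of a rectangular box in Å.
    """
    squared = 0.0
    for a, b, length in zip(r_i, r_j, box_lengths):
        delta = b - a
        # Shift the displacement by whole box lengths so |delta| <= length / 2
        delta -= length * round(delta / length)
        squared += delta * delta
    return math.sqrt(squared)


if __name__ == "__main__":
    box = (30.0, 30.0, 30.0)
    # Two atoms near opposite faces are close neighbors through the boundary
    print(minimum_image_distance((1.0, 1.0, 1.0), (29.0, 1.0, 1.0), box))
```

In the example the two atoms are 28 Å apart inside the box but only 2 Å apart through the periodic boundary, which is the separation the nonbonded calculation would use.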

When preparing a PBC or reaction field calculation, the total size of the system is important. In general, the larger the system, as judged by the amount of solvent, the better, in that adverse contributions from the boundary condition on the nucleic acid become less likely. The minimum amount of solvent surrounding the nucleic acid is dictated by the treatment of long-range electrostatics. With PBC the solvent box should be of such a size that the nucleic acid molecules in adjacent cells are further apart than (1) the real space cutoff for Ewald-based methods or (2) the atom truncation interaction cutoff distance. For reaction field based calculations, the distance from the nucleic acid to the edge of the solvation shell should be greater than the atom truncation distance. Concerning these distances, for Ewald methods a real space cutoff of 10 Å or greater and for atom truncation a cutoff distance of 12 Å or more are suggested. In all cases the simulator should perform tests to ensure that the applied boundary methods and treatment of long-range electrostatic interactions do not adversely affect the calculated result.
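The PBC sizing rule above can be turned into a quick check. The sketch below assumes a rectangular box aligned with the coordinate axes and uses the solute's bounding box, so the gap it reports is a conservative lower bound on the true distance between the solute and its nearest periodic image. The function names are hypothetical, and the check does not account for the solute rotating or changing shape during the simulation, which in practice calls for extra padding.

```python
def solute_extents(coords):
    """Bounding-box extent of the solute along x, y, and z (Å)."""
    return tuple(
        max(atom[axis] for atom in coords) - min(atom[axis] for atom in coords)
        for axis in range(3)
    )


def minimum_box_edges(coords, cutoff):
    """Edge lengths at which solute and image are exactly `cutoff` apart.

    The box actually used should be larger than this on every axis.
    """
    return tuple(extent + cutoff for extent in solute_extents(coords))


def images_farther_than_cutoff(coords, box_lengths, cutoff):
    """True if the gap to the periodic image exceeds the cutoff on every axis."""
    return all(
        length - extent > cutoff
        for length, extent in zip(box_lengths, solute_extents(coords))
    )


if __name__ == "__main__":
    # Two corner atoms standing in for a duplex roughly 20 x 20 x 40 Å in size
    solute = [(0.0, 0.0, 0.0), (20.0, 20.0, 40.0)]
    print(minimum_box_edges(solute, cutoff=12.0))
    print(images_farther_than_cutoff(solute, (35.0, 35.0, 55.0), cutoff=12.0))
```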

Once the geometry and size of the system to be studied are determined, a pure solvent system (i.e., no DNA or RNA) of those dimensions should be built. This can typically be done via standard procedures included with the various modeling packages. These systems should then be subjected to MD simulations using the identical methods for treatment of the nonbonded interactions to be used in the final calculation. This will (1) allow the solvent to properly equilibrate with respect to itself and any ions included at this stage and (2) offer a test of the proposed methodology by ensuring that water density and transport properties are in satisfactory agreement with experiment. Once the solvent is equilibrated, it is overlaid onto the nucleic acid molecule, and all solvent molecules with nonhydrogen atoms within a given distance of solute nonhydrogen atoms (typically 1.8 Å) are then deleted. At this stage ions can be added to the system as required.

Ion placement in simulations of nucleic acids can have a significant impact on the computed results, consistent with the role of water activity on the structure of DNA. A comparison of the influence of ion placement on MD simulations was reported by Young et al. [31]. Methods applied included a Monte Carlo based method that places counterions at low energy positions around DNA using a sigmoidal distance-dependent dielectric function for calculation of the interaction energy of the ion with the environment. The second method used was based on calculation of the electrostatic potential around the DNA followed by the placement of ions at the most favorable locations in the potential. This is performed in an iterative fashion such that subsequent ions take into account previously placed ions. The final method used involved the placement of sodium counterions “6 Å from the P atoms along the bisector of the backbone OPO groups.” All three methods yielded similar results when the electrostatic interactions were treated with the Ewald method. Additional methods include the replacement of water oxygens in a previously solvated system with counterions, with the selection criteria based on the interaction with the surrounding water molecules and the oligonucleotide [144,145]. This can also be done in an iterative fashion that allows the positions of ions to be sensitive to the presence of other ions. In this approach, the omission of water hydrogens from the interaction energy calculation eliminates orientational problems that would require energy minimization or dynamics, thereby significantly decreasing the computational requirements. Other simulators simply replaced randomly selected water molecules with ions [146]. A final approach is to initially overlay the DNA or RNA with a solvent box or sphere that already contains ions at the desired concentration and has been previously equilibrated [24]. In this method all ions and solvent molecules that overlap the solute are removed, and additional ions are then added at random positions, or those furthest from the DNA or RNA are deleted, to obtain electrical neutrality. The last two methods are particularly well suited for the placement of excess salt (counter- and co-ions beyond those needed to