
Figure 12.23 Patterns of hybridization seen when an immobilized target sequence and a control sequence are probed successively with adjacent radiolabeled octanucleotides chosen as complements to the target sequence. Taken from Strezoska et al. (1991).

 

 

 

 

 

 


a comparable negative control sequence. Oligonucleotides were selected on the basis of the known sequence; others were added to serve as negative controls. The results are fairly convincing. As shown in Figure 12.23, the discrimination between true positives and negatives is quite good in most of the individual hybridizations. Of course the obvious criticism of this experiment is that with a sequence known in advance, the test is not a truly objective one.

 

 

 

 

 

 

 

 

 

 

To address these concerns, Drmanac and Crkvenjakov performed a second pilot test of SBH on three closely related unknown sequences containing a total of 343 bases. The design of the test was based on an uninvolved third party who analyzed these sequences and designed a set of oligonucleotides in which only about half corresponded to the sequences in the target samples. In addition the challenge was to determine all three unknown samples and not generate erroneous composites of them by errors in reconstruction. The test was a total success: all three unknown sequences were correctly determined. However, one caveat needs to be considered. Because not all 65,536 8-mers were provided, the choice of probes automatically supplies enormous amounts of information about the true sequence. Any compound omitted from the set provided is automatically a true negative. Just this information alone restricts the possible sequences tremendously, even before a single experiment has been done. Thus, while the experimental results that have been achieved are impressive, it cannot yet be said that a definitive test of SBH for de novo DNA sequencing has been done. Indeed, in defense of all who work in this field, it will probably not be possible to test the methods definitively until the gamble is taken to make, directly on chips or in bulk for distribution, all of the 65,536 8-mers.
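The caveat about an incomplete probe set can be made concrete with a minimal sketch. This is not the analysis used in the pilot test: it uses 3-mers and an 8-base sequence, rather than 8-mers and 343 bases, so that every candidate sequence can be enumerated. The sequence `true_seq`, the decoy words, and the function names are illustrative assumptions.

```python
from itertools import product


def kmers(seq, k):
    """Return the set of all length-k words found in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def count_consistent(length, k, provided):
    """Count sequences of the given length whose every k-mer lies in `provided`.

    Words left out of the provided probe set are treated as known negatives.
    """
    count = 0
    for letters in product("ACGT", repeat=length):
        if kmers("".join(letters), k) <= provided:
            count += 1
    return count


# Size of the complete 8-mer set discussed in the text.
all_octamers = 4 ** 8

# Toy setting (hypothetical): about half of the supplied probes match the target.
true_seq = "ACGTTGCA"
k = 3
decoys = {"AAA", "CCC", "GGG", "TTT", "ATA", "CAC"}
provided = kmers(true_seq, k) | decoys

candidates_before = 4 ** len(true_seq)
candidates_after = count_consistent(len(true_seq), k, provided)
print(all_octamers, candidates_before, candidates_after)
```

Comparing `candidates_before` with `candidates_after` shows how far the list of possible sequences shrinks from the composition of the probe set alone, before any hybridization signal is considered.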

 

 

DATA ACQUISITION AND ANALYSIS

 

 

 

 

 

 

 

 

 

 

Three different methods have been used thus far to detect hybridization in pilot SBH experiments. In each case quantitative data are needed so that positive signals can be discriminated as clearly as possible from background. Southern used image plate analyzers to examine radioisotope decay for the results shown in Figure 12.22. Others have used autoradiograms quantitated with a CCD camera. These approaches were discussed in Chapter 9. Fluorescent probes have been used by Fodor and by Mirzabekov. Here a CCD camera can be used in conjunction with a fluorescence microscope to record quantitative signals. Alternatively, a confocal scanning fluorescence microscope can be used. Other approaches such as mass spectrometry (see Chapter 11) are under development. The very notion of an oligonucleotide or sample chip raises the expectation that it should be possible to find a way to read out the amount of hybridization by a direct electronic method. Kenneth Beattie and Mitchell Eggers have developed one approach to this by detecting the mass of bound sample as it changes the local impedance on a silicon surface. In principle, one ought to be able to enhance such detection by providing the DNA probes or targets with attachments that generate more dramatic effects through altered conductivity, as a source of electrons or holes, or through magnetic properties. Perhaps the ultimate notion, as shown by the purely hypothetical example in Figure 12.24, would be to use the stability of the duplex formed in hybridization to directly manipulate elements of a nanoscale chip and thus lead to a detectable electrical signal.

 

 

 

 

 

 

 

However the data are obtained, current methods for analyzing data are already quite advanced. While it is difficult to convince people to synthesize 65,536 compounds before a method has proved itself, it is much easier to ask people to simulate the results of these

416 FUTURE DNA SEQUENCING WITHOUT LENGTH FRACTIONATION

experiments and design software to reconstruct sequences from imperfect n-tuple word content. We have already indicated that these simulations are very encouraging, and they suggest that SBH will be a very powerful method, especially if the branch point ambiguities can somehow be dealt with. Two different proposals for handling branch points have been discussed. In the first, shown in Figure 12.25, one takes advantage of the fact that it should be possible to make a sample that consists of a dense set of small overlapping clones. This is what is done for ordinary shotgun ladder sequencing, except that for SBH the clones would probably have to be even smaller. In these clones unique sequences will lie outside and between the repeats that cause branching ambiguities. Matching up these unique sequences not only places the clones in the proper order, it also resolves the ambiguous internal arrangement of sequences on a clone with three repeats, since the order is determined by the identity of these sequences on the flanking clones. This looks like a powerful approach, but it requires a great deal of experimental redundancy with little overall gain.

Figure 12.24 Possible future direct reading oligonucleotide hybridization chip. Figure also appears in color insert.

Figure 12.25 Overcoming branch point ambiguities by the simultaneous analysis of clones from a dense overlapping library. Recurrent sequences are shown as hollow bars. Unique hybridization probes are indicated by a, b, c. Known clone order implies that b, and not c, follows a.

Figure 12.26 Overcoming branch point ambiguities by the use of several homologous but not identical DNA sequence targets.
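The branch point problem discussed above can be illustrated with a short sketch. It is an idealized toy, not the reconstruction software referred to in the text: it assumes error-free data, assumes that the number of occurrences of each word is known, and uses 3-mers instead of 8-mers. The example sequence and all function names are hypothetical.

```python
from collections import Counter


def spectrum(seq, k):
    """Count every length-k word in seq (the idealized n-tuple word content)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def reconstructions(spec, k):
    """Enumerate every sequence that uses each k-mer exactly as often as spec says."""
    total = sum(spec.values())
    results = set()

    def extend(seq, remaining, used):
        if used == total:
            results.add(seq)
            return
        suffix = seq[-(k - 1):]
        for base in "ACGT":
            word = suffix + base
            if remaining[word] > 0:
                remaining[word] -= 1
                extend(seq + base, remaining, used + 1)
                remaining[word] += 1  # backtrack

    for start in list(spec):
        remaining = Counter(spec)
        remaining[start] -= 1
        extend(start, remaining, 1)
    return sorted(results)


# The 2-base word CG recurs three times, which creates a branch point for 3-mers.
target = "ACGTCGACGG"
found = reconstructions(spectrum(target, 3), 3)
print(found)
```

In this toy case the segments lying between the recurrences of CG can be exchanged without changing the word content, so the true sequence cannot be distinguished from the rearranged one by word content alone. Extra information, such as the clone order in Figure 12.25, is what breaks the tie.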

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

A second strategy for resolving branch point ambiguities is shown in Figure 12.26. Here the notion is to determine the DNA sequence of several similar but not identical samples. Because of sequence variations among the samples, exact recurrences in one sample will not necessarily be exact in all the others. Any imperfections in the repeats will break the branch point ambiguities in all of the samples because they can be aligned by homology. In principle, one could use different individuals of the same species and take advantage of natural sequence polymorphism. However, simulations show that the most effective application of this approach would use samples that have about 10% divergence on average. In practice, this may mean that it would be more useful to compare three to five similar species, like human and chimp, rather than compare individuals within a species. Here, as in the previous method, the cost of resolving branch point ambiguities is a considerable increase in the number of samples that have to be examined. However, the additional information that will be obtained will be highly interesting if the species are well chosen.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

OBSTACLES TO SUCCESSFUL SBH

 

 

 

 

 

 

 

 

 

 

 

 

 

 

The base composition dependence of the melting temperature, $T_m$, poses a very serious challenge to simple and effective implementations of SBH. If a temperature is chosen that allows effective discrimination between perfect matches and mismatches in G–C-rich compounds, many A–T-rich sequences may not form enough duplex to be detected. Alternatively, if one chooses a low enough temperature to stabilize the weakest A–T-rich duplexes, there will not be enough discrimination against mispairing in G–C-rich compounds, and many false positives will result. There are many possible ways to circumvent this problem; quite a few of them are being tested, but no generally acceptable solution has yet been demonstrated in practice.
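A rough feel for the size of the base composition effect can be had from the following sketch. It swaps in the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is a crude approximation for short oligonucleotides and is not the set of equations from Chapter 3 used elsewhere in this chapter; the absolute numbers should therefore not be compared with Table 12.1. The function name is an assumption.

```python
from itertools import product


def wallace_tm(oligo):
    """Rule-of-thumb melting temperature (deg C): 2 per A/T plus 4 per G/C."""
    at = sum(oligo.count(base) for base in "AT")
    gc = sum(oligo.count(base) for base in "GC")
    return 2 * at + 4 * gc


# Evaluate the rule for every possible 8-mer.
tms = [wallace_tm("".join(letters)) for letters in product("ACGT", repeat=8)]
lowest, highest = min(tms), max(tms)
print(len(tms), lowest, highest)
```

Even under this crude rule the estimated melting temperatures of the pure A–T and pure G–C 8-mers differ by a factor of two, which is why no single hybridization temperature suits the whole set.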

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Ed Southern has been experimenting with the use of high concentrations of tetramethylammonium salts (TMA) instead of more usual low to moderate ionic strength NaCl solutions. These salts have the undesirable feature of slowing down the kinetics of hybridization, but this can be compensated for, if necessary, by adding other agents that speed up hybridizations, such as dextrans which increase the effective concentration of nucleic acids. It has been known for a long time that TMA at the proper concentration can almost equalize the $T_m$ of polynucleotides that are pure A–T and those that are pure G–C. However, when Southern tried TMA in oligonucleotide hybridization, he found that while the $T_m$'s of compounds with extreme base compositions were equalized, a very large effect of DNA sequence on the $T_m$ of compounds with intermediate base compositions emerged. Unless this turns out to be an idiosyncrasy caused by the use of pure homopurine sequences, it probably means that TMA will have to be abandoned.

 

 

 

 

 

 

 

An alternative way to even out base composition effects is to use base analogs (Fig. 12.27). One can substitute 2,6-diaminopurine for A (an analog that makes three hydrogen bonds with T) and 5-bromoU for T (an analog that has increased vertical stacking energy). This will raise the relative stability of A–T-rich sequences considerably. The base analog 7-deaza G can be used instead of G to lower the stability of G–C-rich sequences. Many more analogs exist that could be tested. The problem is that one really wants to test their effect across the full spectrum of 65,536 8-mers, and there is simply no way to do this efficiently until we have developed much more effective ways to make oligonucleotide chips. Such devices not only provide a way to do SBH, they provide a source of samples that allow the accumulation of massive amounts of duplex $T_m$ data. In model experiments Southern was able to characterize the $T_m$'s of all of the 256 possible homopurine-homopyrimidine 8-mer duplexes under a wide set of experimental conditions. This single set of experiments undoubtedly provided more $T_m$ data than a decade of previous work by several different laboratories.

 

 

 

 

 

 

 

 

An alternative approach for compensating for $T_m$ differences has been demonstrated by Mirzabekov. This takes advantage of the fact that chips made of thin gels can rebind significant amounts of released sample at low temperatures. The rate of this rebinding will depend on the concentration of oligonucleotide, since renaturation shows second-order kinetics or pseudo-first-order kinetics (Chapter 3). To reveal these kinetic effects, one first hybridizes a sample to the immobilized probe and then allows a fraction of the duplexes to dissociate with a washing step. By adjusting the relative concentrations of different compounds, one can bring their $T_m$'s very close to the same value. An example is shown in Figure 12.28. These results are very impressive. However, the two samples involved had to be used at a 300-fold concentration difference to achieve them. It is not immediately obvious that this can be done, in general, without leading to serious complications in the detection system used to monitor the hybridization. One will need a system with a very wide dynamic range. It will also be a major effort to try to equalize the melting properties of not just two compounds but 65,536.
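The idea that concentration can offset a stability difference can be sketched with a simple equilibrium model. This is an assumption made for illustration, not the kinetic rebinding analysis described above: it treats duplex formation as a two-state equilibrium with one strand in large excess at concentration C, so that $T_m = \Delta H / (\Delta S + R \ln C)$. The thermodynamic parameters for the two duplexes are hypothetical, as are the function names.

```python
import math

R = 1.987  # gas constant in cal/(mol K)


def tm_kelvin(delta_h, delta_s, conc):
    """Two-state Tm (K) for a probe paired with a strand in excess at `conc` (mol/L).

    delta_h in cal/mol and delta_s in cal/(mol K), both negative for duplex formation.
    """
    return delta_h / (delta_s + R * math.log(conc))


def conc_for_tm(delta_h, delta_s, target_tm):
    """Concentration (mol/L) at which the duplex melts at target_tm (K)."""
    return math.exp((delta_h / target_tm - delta_s) / R)


# Hypothetical thermodynamic parameters, chosen only for illustration.
weak = (-50_000.0, -140.0)    # e.g. an A-T-rich duplex
strong = (-60_000.0, -165.0)  # e.g. a G-C-rich duplex

base_conc = 1e-6
tm_strong = tm_kelvin(*strong, base_conc)
tm_weak = tm_kelvin(*weak, base_conc)
needed = conc_for_tm(*weak, tm_strong)
print(tm_weak - 273.15, tm_strong - 273.15, needed / base_conc)
```

The last printed value is the fold increase in concentration that the weaker duplex needs in order to melt at the same temperature as the stronger one under this model. With real sequences spanning the full range of 8-mer stabilities, the required ratios can become large, which is the dynamic range concern raised in the text.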

Figure 12.27 Base analogs useful in decreasing the differences in stability between A–T-rich and G–C-rich sequences: (a) 2-Aminoadenine. (b) 5-Bromouracil. (c) 7-Deazaguanine.

Figure 12.28 Adjusting the concentration of different oligonucleotides can compensate for differences in their melting temperatures. Adapted from Mirzabekov et al.

 

Instead of attempting to compensate for effects of sequence on the stability of duplexes, one can just measure the hybridization across a range of temperatures. This does not increase the number of samples needed. Instead, one would effectively be recording a melting profile for each sample with the entire set of oligonucleotides. This would increase the experimental time by a factor of ten or more, which is tolerable. In the long run, once extensive data on the thermal stability of each of the 8-mer complexes are known, it may be possible to use a much simpler approach. The set of compounds could be split into groups, each studied at a different optimal temperature. In principle, this could still involve a single chip, except that different regions would be kept at different temperatures. The manufacture of such a split chip would require custom placement of each compound, so simple masking strategies like that employed by Southern are unlikely to suffice. However, this is really not a serious additional manufacturing problem. Ultimately a combination of split chips, base analogs, and special solvents may all be needed for the most effective SBH throughput.

Secondary structure in the target is another potential complication in SBH. This is probably easily circumvented in the sample chip strategy. Here the target could be attached at random but frequent places to the surface under denaturing conditions. This would not be expected to interfere with oligonucleotide hybridization very much. It should effectively remove all but the most stable short sample hairpins (Fig. 12.29). The problem of secondary structure is likely to be more serious when oligonucleotide chips are used. The effect of such structures will be to cause a gap in the readable sequence. This is a serious problem, but since the gaps will be small, they can be filled rather easily by PCR-based cycle sequencing, using the sequence flanking the gaps to design appropriate primers. Thus the real issue is how frequent such gaps will be. If one occurs on each target sample, it will be best to forget SBH and just do the entire project by standard cycle sequencing. Presumably conditions will be found where the problem of secondary structure can be reduced to a much lower level. One way to do this would be to place base analogs in the sample that destabilize intramolecular base pairing more than intermolecular base pairing. These might, for example, be bulky groups where one could be tolerated in the groove of a duplex when the target binds to the probe, but two cannot be tolerated if the target tries to pair with itself. There is undoubtedly room for much development here and much clever chemistry. A second approach would be to use probes with uncharged backbones. Then low ionic strength conditions can be used to suppress target secondary structure without affecting target-probe interactions. One example of such compounds is polypeptide nucleic acids (PNAs; see Chapter 14). Another example is phosphotriesters in which the oxygen that is normally charged in natural nucleic acids is esterified with an alkyl group. However, this creates an additional optically active center at each phosphorus, which leads to severe stereochemical complexities unless optically pure phosphotriesters are available.

Figure 12.29 An example of a hairpin that is too stable to be detected in SBH.

 

 

 

 

 

 

 

 

The effects of secondary structure or unusual DNA structures are significant but not yet known in any great depth. In Chapter 2 we discussed the peculiar features of a centromere-associated repeat where the single strands may have a more stable secondary structure than the duplex. In Chapter 10 we illustrated the abnormally stable hairpin formed by a particular short DNA sequence. Whether these cases are representative of 1% of all the DNA sequences, or more or less, is simply unknown at the present time. About the only way we will be able to uncover such idiosyncratic behavior, understand it, and learn to deal with it, is to make large oligonucleotide arrays and start to study them. Unfortunately, this appears to be one of those cases in science where a timid approach is likely to be misleading. At some point we will have to dive in.

 

 

 

 

 

 

SBH IN COMPARATIVE DNA SEQUENCING

 

 

 

 

 

 

 

 

Some of the difficulties just described with full de novo SBH approaches have led some experts to doubt that SBH will ever mature into a widespread user-friendly method. For this reason much effort has been concentrated on developing SBH for comparative (or differential) DNA sequencing where one assumes that a reference sequence is known and the objective is to compare it with another sample and look for any potential differences. Comparative sequencing is needed in checking existing sequence data for errors. It is the type of sequencing required for horizontal studies in which many members of a population are examined. This is needed in genetic map construction, genetic diagnostics, the search for disease genes, in mutation detection, and for more biological objectives including ecology, evolution, and profiling gene expression. Some of these applications are discussed in Chapters 13 and 14.

 

 

 

 

 

 

 

 

 

 

When SBH is considered in the context of sequence comparisons, two problems of the method for de novo sequencing are immediately resolved. First, it is not necessary to have a probe array consisting of all possible $4^n$ oligonucleotides of length $n$. Instead the array can be customized to look for the desired target and simple sequence variations of that target. Second, since a reference sequence is known, issues of branch point ambiguities are virtually always resolvable by use of the information in that sequence. A particularly powerful version of SBH for comparative sequencing has been developed by Affymetrix, Inc. Here a probe array is made that corresponds to all possible strings of length $n$ contained in the original sequence (for a target with $L$ base pairs, $L - n + 1$ substrings are required). For each substring four variants are made corresponding to the expected sequence at the middle position of the substring and all three possible single-base variants there. Thus the array of probes will have $4(L - n + 1)$ elements. This is quite manageable with current photolithographic syntheses for targets in the range of 10 kb.
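The array design just described can be sketched in a few lines. This is an illustrative sketch, not the actual chip layout software: the function names, the 40-base reference, and the choice of 15-base probes are assumptions made for the example.

```python
def array_size(target_length, n):
    """Number of probes: four central-base variants for each of the L - n + 1 windows."""
    return 4 * (target_length - n + 1)


def tiling_probes(reference, n):
    """Yield (start, central_base, probe) for every window of length n.

    Probes are written in the sense of the reference strand for readability;
    a real array would carry the complementary sequences.
    """
    mid = n // 2
    for start in range(len(reference) - n + 1):
        window = reference[start:start + n]
        for base in "ACGT":
            yield start, base, window[:mid] + base + window[mid + 1:]


# Hypothetical 40-base reference used only to exercise the functions.
reference = "ACGTTGCAAGCTTAGGCTAACGTTAGCCATGGATCCGTAA"
probes = list(tiling_probes(reference, 15))
print(len(probes), array_size(10_000, 15))
```

For a 10-kb target and 15-base probes the count stays in the tens of thousands, far fewer than the roughly 10^9 probes a complete set of 15-mers would need.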

 

 

 

 

 

 

 

 

In actual practice this approach was tested on 16.6-kb human mitochondrial DNA using arrays containing up to 130,000 elements, each of which is a 15- to 25-base probe (Chee et al., 1996). For convenience these nested targets are arranged, serially, horizontally in the array as shown schematically in Figure 12.30a, with the four possible variants for each central eighth base located vertically. The target is randomly sheared into short fragments (but longer than the length of the probes). A perfectly matched target will hybridize strongly to one member of each vertical set of four probes. A target with a single mismatch will show strong hybridization only to one particular probe in which the central base variant matches the sequence perfectly. For all possible flanking probes, there will be one or two internal mismatches between that target and the probe; hence hybridization will be weak or undetectable. A sample of the actual data seen using this approach is shown in Figure 12.30b. It is impressive. In practice, in most cases a two-color competitive hybridization is used. This allows a sample of the normal sequence (in one color) to be compared with a potential variant (in another color) with most differences in sequence-dependent hybridization efficiency nulled out.
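Reading such an array amounts to picking, in each vertical set of four, the probe with the strongest signal. The sketch below shows only that step, under simplifying assumptions: the intensities are invented, the weakened signal of flanking probes around a variant is ignored, and no two-color normalization is modeled. The function names are hypothetical.

```python
def call_bases(columns):
    """Call, for each column, the central base whose probe gave the strongest signal."""
    return "".join(max(column, key=column.get) for column in columns)


def differences(reference, called):
    """List (position, reference_base, called_base) wherever the call differs."""
    return [(i, r, c) for i, (r, c) in enumerate(zip(reference, called)) if r != c]


# Hypothetical intensities (arbitrary units) for five adjacent columns.
reference = "GATTC"
columns = [
    {"A": 0.1, "C": 0.1, "G": 0.9, "T": 0.2},
    {"A": 0.8, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.2},  # strongest signal is not the reference T
    {"A": 0.1, "C": 0.1, "G": 0.2, "T": 0.9},
    {"A": 0.2, "C": 0.8, "G": 0.1, "T": 0.1},
]
called = call_bases(columns)
print(called, differences(reference, called))
```

A practical caller would also have to weigh the signal loss in neighboring columns and the ratio between the two colors, rather than trusting a single maximum.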

 

 

 

 

 

 

 

 

 

 

 

 

OLIGONUCLEOTIDE STACKING HYBRIDIZATION

 

 

 

 

 

 

 

 

 

 

 

 

 

There are a number of ways to potentially increase the length of sequence that can be read with a fixed length oligonucleotide. This is one major way to improve the efficiency of SBH, since the longer the effective word length, the higher the sequencing throughput and also the smaller the number of branch point ambiguities. One approach, specifically designed by Mirzabekov to help resolve branch point ambiguities, is shown in Figure 12.31. It is based on the fact that once a duplex has been formed by hybridization of the target with an 8-mer, it becomes thermodynamically quite favorable to bind a second oligomer immediately adjacent to the 8-mer. The extra thermodynamic stabilization comes from the stacking between the two adjacent duplexes. This same principle was discussed earlier in schemes for directed primer walking (Chapter 11). In practice, Mirzabekov uses pools of ninety 5-mers, chosen specifically to try to resolve known branch points. A test of this approach, with a single perfectly matched 5-mer or various mismatches, is shown in Figure 12.32. It is apparent that the discrimination power of oligonucleotide stacking hybridization is considerable.

 

 

 

 

 

 

 

 

 

 

 

Some calculated $T_m$'s for perfect and mismatched duplexes are given in Table 12.1. These are based on average base compositions. The calculations were performed using the equations given in Chapter 3. In the case of oligonucleotide stacking, it is assumed that the first duplex is fully formed under the conditions where the second oligomer is being tested; in practice, this may not always be the case. It is, however, approximately true for the conditions used for the experiments shown in Figure 12.32. The calculations reveal a number of interesting features about stacking hybridization. Note that the binding

 

 

 


Figure 12.30 Use of SBH for comparative hybridization. (a) Schematic layout of 15-base probes. (b) Example of actual data probing for differences in human mitochondrial DNA. Top panel shows hybridization with the same sequence as used to design the array. Bottom panel shows hybridization with a sequence with a single T to C transition in position 16,493. (c) Example of hybridization to a full array. Panels (b) and (c) from Chee et al. (1996).

 

 


Figure 12.31 Basic strategy in oligonucleotide stacking hybridization.

TABLE 12.1 Calculated Thermodynamic Stabilities of Some Ordinary Oligonucleotide Complexes and Other Complexes Involved in Stacking Hybridization

Energetics of Stacking Hybridization

Structure (a)                  n = 8   n = 7   n = 6   n = 5
Ordinary hybridization (1)       38      33      25      15
Ordinary hybridization (2)       33      25      15       3
Ordinary hybridization (3)       25      15       3      14
Stacking hybridization (1)       51      46      40      31
Stacking hybridization (2)       46      40      31      21
Stacking hybridization (3)       40      31      21      11

Note: Calculated $T_m$ (°C, average base composition).
(a) Structures consist of a long target and a probe of length n. The top three samples are ordinary hybridization; the bottom three are stacking hybridization.

 

 

 

Figure 12.32 Example of the ability of oligonucleotide stacking hybridization to discriminate against mismatches. Taken from Mirzabekov et al.
