Genomics- The Science and Technology Behind the Human Genome Project. Charles R. Cantor, Cassandra L / genomics1-10 / 8


TABLE 8.1  Restriction Enzymes Useful for Genomic Mapping

Enzyme (a)                  Recognition Site (5′–3′) (b)            Source (a, c)

Enzymes with Extended Recognition Sites
I-Sce I                     TAGGGATAA/CAGGGTAAT                     B
VDE                         TATSYATGYYGGTGY/GGRGAARKMGKKAAWGAAWG    O
I-Ceu I                     TAACTATAACGGTCCTA/AGGTAGCGA             N
I-Tli I                     GGTTCTTTATGCGGACAC/TGACGGCTTTATG        N
I-Ppo I                     CTCTCTTAA/GGTAGC                        P

Enzymes with >6-bp Recognition Site
Pac I                       TTAAT/TAA                               N
Pme I                       GTTT/AAAC                               N
Swa I                       ATTT/AAAT                               B
Sse8387 I                   CCTGCA/GG                               T

Enzymes with >6-bp Recognition that Cut in CpG Islands
Rsr II (Csp I*)             CG/GWCCG                                N, P*
SgrA I                      CR/CCGGYG                               B
Not I                       GC/GGCCGC                               B, N, S, P
Srf I                       GCCC/GGGC                               P, S
Fse I                       GGCCGGCC                                B
Sfi I (d)                   GGCCNNNN/NGGCC                          B, N, P
Asc I                       GG/CGCGCC                               N

Enzymes that Cut in CpG Islands: Fragments Average >200 kb
Mlu I                       A/CGCGT                                 B, N, P, S
Sal I                       G/TCGAC                                 N
Nru I                       TCG/CGA                                 N
BssH II                     G/CGCGC
Sac II                      CCGC/GG                                 N
Eag I (Ecl XI*, Xma III)    C/GGCCG                                 B*, N

Enzymes that Cut in CpG Islands: Fragments Average <200 kb
Nar I                       GG/CGCC                                 N, P
Sma I                       CCC/GGG                                 N
Xho I                       C/TCGAG                                 N
Pvu I                       CGAT/CG                                 N
Apa I                       GGGCC/C                                 N

Enzymes with TAG in their Recognition Sequence
Avr II (Bln I*)             C/CTAGG                                 N, T*
Nhe I                       G/CTAGC                                 N, P, S
Xba I                       T/CTAGA                                 B, N, P, S
Spe I                       A/CTAGT                                 B, N, P, S, T
Nhe I                       G/CTAGC                                 P
Dra I                       TTT/AAA                                 B, P, S
Ssp I                       AAT/ATT                                 P

(a) Asterisk indicates preferred enzyme and source.
(b) R = A or G; Y = C or T; M = A or C; K = G or T; S = G or C; W = A or T; N = A, C, G, or T.
(c) B, Boehringer Mannheim; N, New England Biolabs; O, not commercially available; P, Promega; S, Stratagene; T, Takara.
(d) Two sites are needed in order for cleavage to occur at both of them.

Note that the recognition site for Sfi I contains no CpG's, while the site recognized by Not I contains 2 CpG's. In Chapter 1 we showed that the frequency of occurrence of this dinucleotide sequence is reduced in mammalian DNA to about 1/4 of the level expected statistically. On this basis Not I can be expected to behave like a ten-base specific enzyme rather than an eight-base specific enzyme; as a result the expected fragment sizes are predicted to lie around 1 Mb, in agreement with experimental results. However, this is an oversimplified argument, for it ignores the effect of DNA methylation. Overall, in the human genome about 80% of the CpG sequences are methylated to 5-meC, and Not I (like most other restriction nucleases) is unable to cleave at sites that are methylated. If we factor this effect into the calculation, we can now predict that for random methylation only about 1/25 of the Not I sites have no 5-meC and thus are cleavable. This yields an average DNA fragment size of 25 Mb, making Not I all but useless for conventional macrorestriction mapping. Fortunately the distribution of CpG methylation is not random. In practice, it appears that about 90% of the Not I sites in the human genome are not methylated, and so they are accessible to cleavage by the enzyme. This explains why Not I can generate fragments that average about 1 Mb in size.
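
The arithmetic behind these three estimates can be checked with a short calculation. The sketch below is illustrative only: it assumes equal frequencies of the four bases and treats the two CpG's in the Not I site as independent. The function and constant names are hypothetical, not taken from the text.

```python
# Expected spacing between cleavable Not I sites under the assumptions in the text.
SITE_LENGTH = 8             # Not I recognizes GCGGCCGC
CPG_IN_SITE = 2             # CpG dinucleotides inside the site
CPG_SUPPRESSION = 0.25      # CpG occurs at about 1/4 of the statistical expectation
METHYLATED_FRACTION = 0.8   # about 80% of CpG's are methylated


def expected_fragment_size(site_length, n_cpg=0, cpg_factor=1.0, unmethylated=1.0):
    """Mean distance (bp) between cleavable sites for equal base frequencies."""
    site_probability = 0.25 ** site_length
    site_probability *= cpg_factor ** n_cpg      # depletion of each CpG
    site_probability *= unmethylated ** n_cpg    # every CpG must be unmethylated
    return 1.0 / site_probability


# Plain eight-base cutter: 4**8 bp.
naive_bp = expected_fragment_size(SITE_LENGTH)
# With CpG suppression the enzyme behaves like a ten-base cutter (about 1 Mb).
suppressed_bp = expected_fragment_size(SITE_LENGTH, CPG_IN_SITE, CPG_SUPPRESSION)
# With random methylation only (0.2)**2 = 1/25 of the sites remain cleavable.
methylated_bp = expected_fragment_size(
    SITE_LENGTH, CPG_IN_SITE, CPG_SUPPRESSION, 1.0 - METHYLATED_FRACTION
)

for label, size in [("naive", naive_bp),
                    ("CpG suppression", suppressed_bp),
                    ("plus random methylation", methylated_bp)]:
    print(f"{label}: {size / 1e6:.2f} Mb")
```

The last figure comes out near 26 Mb, which the text rounds to 25 Mb.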

HTF ISLANDS

The peculiar distribution of methylated CpG, which has such a dramatic effect on the frequency of Not I cutting sites, is a reflection of a more general statistical unevenness in mammalian genomes. This was first discovered when genomic digestions were carried out with a much less specific restriction enzyme, Hpa II. A typical genomic digest generated by this enzyme, which recognizes the sequence CCGG, is shown in Figure 8.9. A statistically random genome would be expected to give a roughly Gaussian distribution of fragment sizes. Instead, the two-phase distribution observed in practice is striking. Hpa II is inhibited by DNA methylation: it cannot cut the sequence CmCGG. To a good approximation, the fragment sizes shown in Figure 8.9 can be fit by assuming that the genome is divided into regions where no methylation occurs and regions where most of the CpG's are methylated. The large fragments in the latter regions are what were expected from an Hpa II digest. The small fragments were unexpected. They were named Hpa II tiny fragments (HTFs), and the regions that contain them have been called HTF islands. As we have learned more about the properties of these regions, many researchers have preferred to call them CpG islands, but for others, the original term has stuck.

Figure 8.9 Distribution of the sizes of the DNA fragments generated by a complete Hpa II digest of a typical mammalian genome.

PHYSICAL MAPPING

HTF islands have a number of very interesting properties. They tend to be located near genes, most often at the 5′-edge of genes. They are very rich in G + C. The frequency of CpG is as expected from binomial statistics; that is, there is no suppression of CpG in these regions and no elevation of TpG (which results from CpG mutagenesis, as described in Chapter 1). In HTF islands the CpG sequences are unmethylated. These results are self-consistent: if there is no methylation in these regions, there should be no progressive loss of CpG by mutation, and thus the frequency of this sequence should just reflect the local G + C content. More than 90% of the known Not I sites appear to be located in HTF islands; this is understandable. To produce a cleavable Not I site requires two nearby unmethylated CpG's. This is an event most unlikely to occur in the bulk of the genome, where CpG's are both very rare and methylated.

Many HTF islands have been studied by sequencing the DNA flanking Not I sites. This is relatively easy to do because, as we will discuss later in the chapter, there are straightforward ways to clone DNA that contains a cutting site for this enzyme (or almost any other enzyme for that matter). Two representative human DNA sequences flanking Not I sites on chromosome 21 are described in Figure 8.10, which shows the local base composition as a function of distance from the Not I site. The first of these examples is extraordinarily G + C rich throughout a 600–800 base region. The second example shows a transition from an HTF island to more ordinary genomic DNA; the Not I site is near the edge of the island. The distribution of CpG, GpC, and TpG sequences in the first of these examples is plotted as a function of position in Figure 8.11. It is evident that CpG's and GpC's are extraordinarily prevalent; more significantly, their prevalence is roughly equivalent, showing the lack of any significant CpG suppression.
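The comparison of CpG and GpC prevalence can be made quantitative with a small script. The sketch below is a minimal illustration rather than the method used for Figure 8.11: it counts dinucleotides in a sequence and reports an observed/expected CpG ratio, where the expectation (number of C's times number of G's divided by the sequence length) is a commonly used convention and an assumption here. The function names are hypothetical.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides along one strand."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def cpg_observed_over_expected(seq):
    """Ratio of observed CpG count to the count expected from base composition.

    Expected count is taken as (#C * #G) / length, an assumed convention.
    Values near 1 indicate no CpG suppression; values well below 1 indicate depletion.
    """
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    observed = dinucleotide_counts(seq)["CG"]
    expected = c_count * g_count / len(seq)
    return observed / expected


if __name__ == "__main__":
    island_like = "CGCGCGCG"   # toy sequence, not real data
    depleted = "CCCCGGGG"      # same base composition, only one CpG
    print(cpg_observed_over_expected(island_like))
    print(cpg_observed_over_expected(depleted))
    print(dinucleotide_counts(island_like)["GC"])
```

Applied in a sliding window along a clone, the same ratio would show where an island ends and ordinary, CpG-depleted DNA begins.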

ORDERING RESTRICTION FRAGMENTS

The simplest way to view top-down mapping is as the projection of a low-resolution map with ill-defined distances onto a higher-resolution map with more carefully defined distances. Suppose that a genetic or cytogenetic map already exists for a chromosome of interest; this is actually the case today for almost all regions of the human genome, at least at 2 Mb resolution (Chapter 6). Each genetically mapped or cytogenetically mapped DNA marker can be radiolabeled and used as a hybridization probe to identify the corresponding large DNA fragments that it resides on, as shown in Figure 8.12. If DNA fragments can be generated that are comparable in size to the spacing of available DNA markers, then most of the construction of a restriction map would be accomplished with a relatively small number of direct DNA hybridizations.

In reality, for almost all regions of the genome, the probes available today are not dense enough to order a complete set of restriction fragments. Instead one must utilize procedures that allow the restriction map to be extended outward from the position of known fragments into neighboring regions. A second problem is that the DNA fragment sizes generated by total digestion with available rare-cutting enzymes are quite diverse. Thus, although the average fragment size seen with Not I is about 1 Mb, many fragments are 3 Mb in size, and many are 0.2 Mb or smaller. Available DNA probes are very likely to be found that recognize the 3-Mb fragments; it is much less likely that any preexisting probes will correspond to the 0.2-Mb fragments at typical probe densities of 1 per 1 to 2 Mb.


Figure 8.10 Base composition of two cloned DNA segments derived from HTF islands on human chromosome 21. Plotted is the local average base composition as a function of the position within the clone. A Not I site that allowed the selective cloning of these DNA pieces is also indicated. (a) Clone centered in an HTF island. (b) Clone at one edge of an HTF island. (Adapted from Zhu et al., 1993.)


Figure 8.11 Distribution of three dinucleotide sequences within the clone described in Figure 8.10a.

To carry out macrorestriction mapping projects efficiently, a number of needs must be met that allow one to circumvent the general problems raised above:

1. Probes must be isolated that are not preferentially located on large restriction fragments.

2. DNA must be cut less frequently to generate large fragments in any desired area of the genome.

3. Isolated probes must correspond to fragments of interest, those fragments where probes are needed to complete a map.

4. Neighboring fragments must be unequivocally identified.

5. Any tiny fragments generated by chance, by enzymes that on average yield very large fragments, must not be ignored.

Methods now exist that deal with all of these problems reasonably efficiently. Most will be described in this chapter; a few will be deferred to Chapter 14, where we deal with some of the specialized methods that have been developed to manipulate particular DNA sequences.

Figure 8.12 The genetic or cytogenetic map provides a set of anchor points to place selected large DNA fragments in order.

IDENTIFYING THE DNA FRAGMENTS GENERATED BY A RARE-CUTTING RESTRICTION ENZYME

The restriction digestion pattern generated by any enzyme that yields pieces large enough to be useful in genomic or chromosomal restriction mapping must be analyzed by PFG. Since this separation technique has a limited window of high resolution under any fixed experimental conditions, it is usually necessary to fractionate the digest by PFG using a range of three or four different pulse times: 30-second pulses for fragments 0 to 200 kb, 60-second pulses for fragments 200 to 1000 kb, 1000-second pulses at lower field strengths for 1 to 3 Mb pieces, and 3600-second pulses or secondary PFG for fragments larger than this.

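The pulse-time windows just listed amount to a simple lookup table. The sketch below encodes them as one; the names are hypothetical, and the decision to assign a fragment lying exactly on a boundary to the shorter pulse time is an assumption made only for illustration.

```python
# (upper size limit in kb, running condition), taken from the ranges in the text.
PULSE_WINDOWS = [
    (200, "30-second pulses"),
    (1000, "60-second pulses"),
    (3000, "1000-second pulses at lower field strength"),
]
LARGEST_FRAGMENTS = "3600-second pulses or secondary PFG"


def pulse_condition(fragment_kb):
    """Return the PFG condition whose resolution window covers this fragment size."""
    if fragment_kb < 0:
        raise ValueError("fragment size must be non-negative")
    for upper_kb, condition in PULSE_WINDOWS:
        # Boundary sizes go to the shorter pulse time (an assumption).
        if fragment_kb <= upper_kb:
            return condition
    return LARGEST_FRAGMENTS


if __name__ == "__main__":
    for size in (50, 500, 2300, 5000):
        print(size, "kb:", pulse_condition(size))
```
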
The entire distribution of DNA fragments can be visualized by staining the PFG fractionated material in the gel with ethidium bromide. This dye binds in between every other base pair (total stoichiometry 0.5 dye per base pair), and shows a more than 25-fold fluorescence enhancement when bound. Ethidium bromide appears to have no significant specificity for any particular base compositions or base sequences. It is sensitive enough for all routine use. Recently a number of other dyes have been reported that may offer the promise of higher-sensitivity DNA detection than ethidium bromide. However, the interactions of these dyes with very large DNA molecules have not yet been fully characterized.

If the genome under analysis is a relatively simple one, ethidium staining allows all the pieces to be visualized, as we showed earlier in Figure 8.5. Since the amount of ethidium bound is proportional to the size of the DNA fragment, the resulting distribution of fluorescence intensity in a result like Figure 8.5 gives the weight average distribution of DNA. Small fragments are very hard to see. In general, a monotonic increase in fragment intensity is expected as fragment size increases. Deviations from this pattern indicate either multiplets, size fractions that contain two or more unresolved DNA pieces, or heterogeneity, DNA fragments present in substoichiometric amounts because they arise from a contaminant in the sample, from a restriction site that has been only partially cleaved, or from DNA partially degraded (by nuclease or even by electrophoresis itself). These complications aside, a quantitative analysis of the pattern of ethidium staining will usually allow an accurate analysis of the number and sizes of DNA fragments in the genome. When these sizes are summed, an estimate for the total genome size is produced. Indeed, when ethidium staining is carried out very carefully, the stained intensity of DNA bands is not just a monotonic function of their size; it is very close to a linear function of their size up to a few Mb. Thus from the relative staining intensity alone it is sometimes possible to make reliable estimates of DNA sizes, even when proper size standards, for some reason, are not present or usable.
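The linear relation between band intensity and fragment size suggests a simple calculation, sketched below under stated assumptions: intensity is modeled as k times size with no offset, k is fitted by least squares through the origin from bands of known size, and all numbers are invented for illustration. The function names are hypothetical.

```python
def fit_intensity_per_kb(standards):
    """Least-squares slope k through the origin for intensity = k * size_kb.

    `standards` is a list of (size_kb, intensity) pairs for single-copy bands.
    """
    numerator = sum(size * intensity for size, intensity in standards)
    denominator = sum(size * size for size, _ in standards)
    return numerator / denominator


def size_from_intensity(intensity, k):
    """Estimate the size (kb) of a single-copy band from its intensity alone."""
    return intensity / k


def relative_stoichiometry(intensity, size_kb, k):
    """Observed intensity divided by that expected for one copy of this size.

    About 2 suggests a doublet; well below 1 suggests a substoichiometric band.
    """
    return intensity / (k * size_kb)


if __name__ == "__main__":
    # Invented calibration bands: (size in kb, arbitrary intensity units).
    k = fit_intensity_per_kb([(100, 10.0), (200, 20.0), (400, 40.0)])
    print(size_from_intensity(30.0, k))          # size of an unknown band
    print(relative_stoichiometry(60.0, 300, k))  # twice as bright as expected
```

Summing the estimated sizes, each weighted by its rounded stoichiometry, would give the genome size estimate described in the paragraph.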

With complex genomes only a smear of ethidium staining intensity is generally seen. At the highest obtainable PFG resolution, the entire human and mouse genomes actually show a very discrete and reproducible banding pattern in ethidium-stained gels of DNA digested with Not I or other rare-cutting restriction enzymes (Fig. 8.13). The significance of this pattern has never been explained, but it does allow DNA from different species to be identified. In those unusual cases where a major repeating sequence has a rare-cutting site in it, a bright DNA band will sometimes be seen above the background ethidium smear.

To develop a restriction map of a chromosome from a complex genome, like the human genome, it is best to start with that chromosome in a hybrid cell. Even better, for the larger mammalian chromosomes, are hybrid cells that contain only a 50 Mb fragment of the chromosome of interest. There are two major advantages to starting with such a hybrid cell. First, the chromosome of interest will almost certainly have a unique genotype. Even if multiple copies are present, these are likely to be identical. In effect the sample is homozygous. This eliminates any confusion that would otherwise result if two different polymorphic structures of a region were merged into the same map.



Figure 8.13 Ethidium-stained PFG fractionation of Not I digested human and mouse DNA from different cell lines. Fragments resolved in this gel range in size from 50 kb to 1 Mb. The distinct pattern of banding seen depends on the particular species and enzyme used.

An example of such a useful cell line is WAV17, which has two to three identical copies of human chromosome 21 in a mouse background.

The second advantage of using a hybrid cell is that it is possible to view most or all of the human component above the background of DNA from the rodent. This is accomplished by hybridizing a blot of a PFG-fractionated restriction enzyme digest of the cellular DNA with various human-specific repeating DNA sequences. For example, the most common human-specific repeat is the Alu sequence. This will be described in much greater detail in Chapter 14. Here, however, it is sufficient to note that this sequence occurs on average about once every 3 kb in some regions of the human genome and about once every 10 kb in others. Probes for Alu exist that show little significant cross-hybridization with rodent DNA. These can then be used as a human-specific stain to detect the presence of human restriction fragments in a mouse or hamster cell. An example of such an analysis is shown in Figure 8.14.

The human Alu sequence should be seen on almost every large fragment of human DNA if our genome were well fit by a statistically random model. This can be shown by using a simple Poisson model for the distribution of Alu's. From the Poisson distribution (Chapter 6) we can estimate that the probability of an Alu not occurring on a fragment of interest will be

$$P(\text{no } Alu) = e^{-(\text{average fragment size} \,/\, Alu \text{ spacing})}$$


Figure 8.14 PFG fractionation of Not I-digested DNA from a mouse cell line that contains chromosome 21 as the only human component. After electrophoresis the gel was blotted, and the blot was hybridized with the highly repeated human-specific Alu sequence. Shown in some of the panels are size standards used to estimate the length of particular human DNA fragments. (Taken from Sainz et al., 1992.)

The smallest fragments of much interest in human macrorestriction mapping are around 50 kb; the least Alu-dense regions of the genome have a 10-kb spacing between Alu's; thus in these regions the chance that a 50-kb fragment has no Alu is exp(−5). The chance, at random, that a 1 Mb fragment should lack an Alu is infinitesimal. We have actually characterized the distribution of Alu's among the Not I fragments of human chromosome 21. In practice, all but one contain Alu, as detected by hybridization experiments. The sole exception, however, is a 2.3 Mb fragment. The probability of seeing such an event at random is exp(−230), assuming it derives from an Alu-poor region of the chromosome. Clearly we will have to refine our statistical picture of human DNA sequences quite considerably as more cases like this surface.
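The two probabilities quoted above follow directly from the Poisson zero term. A minimal numerical check is sketched below; the function name is hypothetical, and the only inputs are the fragment sizes and Alu spacings given in the text.

```python
import math


def prob_no_alu(fragment_kb, alu_spacing_kb):
    """Poisson probability that a fragment contains no Alu repeat at all."""
    return math.exp(-fragment_kb / alu_spacing_kb)


if __name__ == "__main__":
    # 50-kb fragment in an Alu-poor region (one Alu per 10 kb): exp(-5).
    print(f"{prob_no_alu(50, 10):.4f}")
    # The same fragment in an Alu-rich region (one Alu per 3 kb).
    print(f"{prob_no_alu(50, 3):.2e}")
    # The 2.3-Mb chromosome 21 fragment in an Alu-poor region: exp(-230).
    print(f"{prob_no_alu(2300, 10):.2e}")
```

The first value is a little under 1%, so an occasional Alu-free 50-kb fragment is unremarkable, whereas the last is of order 10^-100, which is why the observed 2.3 Mb fragment cannot be reconciled with a random model.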

When human DNA fragments are detected by hybridization with human-specific repeated sequences, the resulting distribution of intensity should still be generally proportional to fragment size. Note, however, that considerable variations around this mean will occur because repeated sequences tend to cluster and because small fragments with small numbers of repeats on average will show typical small number fluctuations. One way to generate a labeling intensity that is independent of fragment size is to label the ends. This can be done with enzymes like polynucleotide kinase that place a phosphate on the 5′-end of a DNA chain, or it can be done by filling in any inset 3′-ends of duplex DNA with DNA polymerase and radiolabeled dpppN's. This procedure is readily applied within agarose gels, and it has been useful in the analysis of small genomes. For mammalian genomes the procedure is not useful because there is no specific way to label just the ends of the human DNA fragments in a rodent hybrid cell.

MAPPING IN CASES WHERE FRAGMENT LENGTHS CAN BE MEASURED DIRECTLY

The classical approach to constructing restriction maps of small DNAs like plasmids and viruses is illustrated in Figure 8.15. Two or more restriction enzymes are used separately and in double digests to fragment the DNA of interest. The sizes of all pieces seen in an ethidium-stained gel are measured. Usually the pattern of sizes allows alignment of the different cutting sites in a single map. This procedure is clearly not a rigorous one, and it breaks down severely once the maps become complex, or when many similar-sized fragments are involved. In principle, each fragment could be isolated by electrophoretic fractionation, radiolabeled, and used as a probe. This would allow all overlapping fragments from digests with other enzymes to be unambiguously identified. In practice, however, it is usually easier to employ partial digestion strategies with end-labeled probes, as we will describe later for macrorestriction mapping.
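
The relation between a map and its single and double digests can be sketched in a few lines. The example below runs in the forward direction only: given assumed cutting positions for two hypothetical enzymes A and B on a linear DNA, it lists the fragment sizes each digest would show on a gel. Map construction is the inverse problem of finding positions consistent with all three size lists. All names and numbers are illustrative.

```python
def fragment_sizes(length, cut_positions):
    """Sorted fragment sizes from a complete digest of a linear DNA.

    `cut_positions` are distances from the left end, strictly between 0 and `length`.
    """
    cuts = sorted(set(cut_positions))
    edges = [0] + cuts + [length]
    return sorted(right - left for left, right in zip(edges, edges[1:]))


def double_digest(length, sites_a, sites_b):
    """Fragment sizes when both enzymes cut the same molecule to completion."""
    return fragment_sizes(length, list(sites_a) + list(sites_b))


if __name__ == "__main__":
    length_kb = 10
    sites_a = [3, 7]   # hypothetical enzyme A
    sites_b = [5]      # hypothetical enzyme B
    print("A:  ", fragment_sizes(length_kb, sites_a))
    print("B:  ", fragment_sizes(length_kb, sites_b))
    print("A+B:", double_digest(length_kb, sites_a, sites_b))
```

In this toy case the A digest contains two 3-kb pieces and the double digest two 2-kb and two 3-kb pieces, a small instance of the ambiguity that similar-sized fragments introduce.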

With mammalian DNAs, multiple restriction enzyme digest mapping is much less effective. Although one can determine the size of the fragments generated by each enzyme, by using repeated sequence hybridizations, there is an annoying tendency for many of the restriction enzymes that yield large DNA fragments to cut in the same regions. This is because most such enzymes prefer HTF islands. Thus the usefulness of double digestion in most regions of mammalian genomes is far less than illustrated by the example in Figure 8.15. A more serious problem is that with large numbers of fragments in single-enzyme digests, double digests become hopelessly complicated to analyze. With mammalian DNA, in contrast to DNA from simple genomes, one cannot easily access each purified fragment because it is contaminated by other human or rodent fragments. PCR can help circumvent this problem, as we will demonstrate in Chapter 14. In general, though, one must rely on hybridization with single-copy probes in order to simplify the pattern of DNA fragments to the point where it can be analyzed. The example in Figure 8.15 shows clearly that fragments from the end of large DNA pieces are particularly useful hybridization probes. The figure also indicates that with only a limited set of probes from the region, double digests are frequently impossible to analyze because many of the fragments in these digests will not correspond to any of the available probes. This discussion should make it clear that new strategies had to be developed to simplify the construction of macrorestriction maps of segments of complex genomes.

Figure 8.15 Schematic illustration of the double digestion procedure used to assemble simple restriction maps.


GENERATION OF LARGER DNA FRAGMENT SIZES

Some of the problems illustrated by the example in Figure 8.15 would be alleviated if there were a systematic way to generate large DNA fragments in a region of interest. One general approach for doing this will be discussed here; others, like the RARE method, are deferred to Chapter 14.

It has already been mentioned that most restriction enzymes are inhibited by DNA methylation. One can take advantage of this to increase the specificity of certain restriction enzymes by methylating a subset of their cutting sites. Methylation can also be used as a general way to promote partial digests by using a methylase that recognizes all the cutting sites but not allowing the reaction to go to completion. Here, however, we deal with methylation reactions that are carried to completion.

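Consider the DNA sequence shown below. It contains a Not I site flanked by an additional G–C pair, the rightmost base pair as written; mC marks a methylated cytosine:

    5′- G  C  G  G  C  C  G  C  G -3′          5′- G  C  G  G  C mC  G  C  G -3′
    3′- C  G  C  C  G  G  C  G  C -5′   --->   3′- C  G  C  C  G  G  C  G mC -5′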

Roughly one-quarter of all Not I sites will have an extra C at their 5′-end; an additional quarter will have an extra G at their 3′-end, as shown in the above example. These extra residues generate a recognition site for the Fnu DII methylase, which converts CGCG to mCGCG. Since some of the methylation is within the Not I cutting site, this inhibits any subsequent attempts to cleave the site with Not I. Thus by methylation one can inactivate about half of all the Not I cleavage sites and double the average fragment size generated by this enzyme. Many variations on this theme exist.
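
The flanking-base rule can be expressed as a short sequence scan. The sketch below is illustrative only: it works on one strand of an unmethylated sequence, treats a Not I site as blocked whenever a C precedes it or a G follows it (so that a CGCG methylase site overlaps the Not I site), and uses hypothetical function names. The test sequence is invented.

```python
NOT_I = "GCGGCCGC"


def not_i_sites(seq):
    """Start positions of every Not I recognition site on the given strand."""
    n = len(NOT_I)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == NOT_I]


def blocked_by_cgcg_methylation(seq, start):
    """True if a CGCG site overlaps the Not I site beginning at `start`.

    That happens with a C just 5' of the site (C + GCG...) or a G just 3' (...GC + G... as CGCG).
    """
    end = start + len(NOT_I)
    extra_c_at_5_prime = start > 0 and seq[start - 1] == "C"
    extra_g_at_3_prime = end < len(seq) and seq[end] == "G"
    return extra_c_at_5_prime or extra_g_at_3_prime


def cleavable_after_methylation(seq):
    """Not I sites that remain cleavable once all CGCG sites are methylated."""
    return [i for i in not_i_sites(seq) if not blocked_by_cgcg_methylation(seq, i)]


if __name__ == "__main__":
    # Three sites: one followed by G, one with neutral flanks, one preceded by C.
    seq = "AT" + NOT_I + "GA" + "TT" + NOT_I + "TA" + "C" + NOT_I + "A"
    print(not_i_sites(seq))
    print(cleavable_after_methylation(seq))
    # For random flanking bases the blocked fraction is 1 - (3/4)**2.
    print(1 - 0.75 ** 2)
```

The exact fraction for random flanks is 7/16, which the text rounds to about half.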

Methylation also plays a key role in a whole set of potential schemes for site-selective cleavage of DNA employing the unusual restriction endonuclease Dpn I. This enzyme recognizes the sequence GATC, but it requires that the A be methylated in order for cleavage to occur. The preferred substrate is

    5′- G mA  T  C -3′
    3′- C  T mA  G -5′

A major complication is that the monomethyl derivative is also cut, but much more slowly. Hence, in the schemes described below, it is essential to expose the DNA substrate to Dpn I for the minimum time needed to cleave the desired sites before the background of undesired additional cleavages becomes overwhelming.

Dpn I is converted to an infrequently cutting enzyme by starting with unmethylated target DNA, which Dpn I cannot cleave at all, and selectively introducing methyl groups by treatment with methylases with recognition sequences that overlap part of the Dpn I site. The simplest example, shown below, employs the Taq I methylase. This enzyme recognizes the sequence TCGA and converts it to TCGmA. If two such sequences lie adjacent to each other in a genome, the result, once both are methylated, is

    5′- T  C  G mA  T  C  G mA -3′
    3′-mA  G  C  T mA  G  C  T -5′
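
The bookkeeping in this example can be followed with a short script. The sketch below is a simplified model under stated assumptions: the input is one strand of initially unmethylated DNA, the Taq I methylase modifies the A of every TCGA on both strands, and only GATC sites methylated on both strands are counted as Dpn I substrates (the slower cutting of monomethyl sites is ignored). Function names are hypothetical.

```python
def taq_methylated_adenines(seq):
    """Positions of methylated A's after complete Taq I methylation.

    Returns (top, bottom): indices along the written strand. For a TCGA starting
    at j, the top-strand A is at j + 3; because the site is palindromic, the
    bottom-strand A is the base paired with the top-strand T at j.
    """
    top, bottom = set(), set()
    for j in range(len(seq) - 3):
        if seq[j:j + 4] == "TCGA":
            top.add(j + 3)
            bottom.add(j)
    return top, bottom


def dpn_i_sites(seq):
    """Start positions of GATC sites that carry methyl-A on both strands."""
    top, bottom = taq_methylated_adenines(seq)
    sites = []
    for i in range(len(seq) - 3):
        # Top-strand A is at i + 1; bottom-strand A pairs with the top T at i + 2.
        if seq[i:i + 4] == "GATC" and (i + 1) in top and (i + 2) in bottom:
            sites.append(i)
    return sites


if __name__ == "__main__":
    print(dpn_i_sites("AATCGATCGATT"))  # two adjacent TCGA sites
    print(dpn_i_sites("AAGATCTT"))      # GATC with no Taq I site nearby
    print(dpn_i_sites("TCGATCAA"))      # methylated on one strand only
```

Only the eight-base sequence TCGATCGA yields a fully methylated GATC in this model, which is what makes the combination cut so rarely.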
