
Genomics: The Science and Technology Behind the Human Genome Project.

Charles R. Cantor, Cassandra L. Smith

 

Copyright © 1999 John Wiley & Sons, Inc.

 

ISBNs: 0-471-59908-5 (Hardback); 0-471-22056-6 (Electronic)

2 A Genome Overview at the Level of Chromosomes

BASIC PROPERTIES OF CHROMOSOMES

Chromosomes were first seen by light microscopy, and their name reflects the deep color they take on with a number of commonly used histological stains. In a cell, a chromosome consists of a single piece of DNA packaged with various accessory proteins. Chromosomes are the fundamental elements of inheritance, since it is they that are passed from cell to daughter cell, from parent to progeny. Indeed, a cell that did not need to reproduce would not have to keep its DNA organized into specific large molecules. Some single-cell organisms, like the ciliate Tetrahymena, actually fragment a working copy of their DNA into gene-sized pieces for expression while maintaining an unbroken master copy for reproductive purposes.

 

 

 

BACTERIAL CHROMOSOMES

 

 

 

 

Bacteria generally have a single chromosome. This is usually a circular DNA duplex. As shown in Figure 2.1, the chromosome has at least three functional elements. The replication origin (ori) is the location of the start of DNA synthesis. The termination (ter) region provides a mechanism for stopping DNA synthesis of the two divergent replication forks. Also present are par sequences, which ensure that chromosomes are partitioned relatively uniformly between daughter cells. A description of a bacterial chromosome is complicated by the fact that bacteria are continually replicating and transcribing their DNA.

In rapidly growing organisms, a round of replication is initiated before the previous round is completed. Hence the number of copies of genomic DNA depends on how rapidly the organism is growing and where in the genome one looks. Genes near the origin are often present at several times the copy number of genes near the terminus. In general, this seems to have little effect on the bacterium. Rather, bacteria appear to take advantage of this fact. Genes whose products are required early in the replication cycle, or in large amounts (like the ribosomal RNAs and proteins), are located in the early replicated regions of the chromosomes. Although the bacterial chromosome is relatively tolerant of deletions (involving nonessential genes) and small insertions, many large rearrangements involving inversions or insertions are lethal. This prohibition appears to be related to conflicts that arise between convergent DNA replication forks and the transcription machinery of highly expressed genes.
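The dependence of copy number on growth rate and map position can be made concrete with a short calculation. The sketch below uses the Cooper-Helmstetter model of overlapping replication rounds, which is a standard textbook model and not something taken from this chapter. The C period (time to replicate the chromosome), D period (time between termination and division), and doubling time used in the example are illustrative assumptions, as are the function names.

```python
def copies_per_cell(position: float, c_min: float, d_min: float, tau_min: float) -> float:
    """Average copies per cell of a locus under the Cooper-Helmstetter model.

    position: fractional map position (0.0 = ori, 1.0 = ter).
    c_min:    time to replicate the chromosome, in minutes (assumed value).
    d_min:    time from termination to cell division, in minutes (assumed value).
    tau_min:  culture doubling time, in minutes.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0 (origin) and 1 (terminus)")
    return 2 ** ((c_min * (1.0 - position) + d_min) / tau_min)


def ori_ter_ratio(c_min: float, tau_min: float) -> float:
    """Ratio of origin-proximal to terminus-proximal gene copies."""
    return 2 ** (c_min / tau_min)


if __name__ == "__main__":
    # Hypothetical fast-growing culture: C = 40 min, D = 20 min, doubling every 20 min.
    for tau in (20.0, 40.0, 80.0):
        print(tau, round(ori_ter_ratio(40.0, tau), 2))
```

Under these assumed numbers the ratio falls toward 1 as growth slows, which is the qualitative behavior described in the paragraph above.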


Figure 2.1 Basic functional elements in chromosomes: ori (replication origin), tel (telomere), cen (centromere), ter (termination region). Little is known about eukaryotic termination regions.

Bacteria will frequently harbor additional DNA molecules smaller than their major chromosome. Some of these may be subject to stringent copy number control and to orderly partitioning, like the major chromosome. These low copy number plasmids use the same DNA replication machinery as the chromosome, and they are present in a copy number that is equal to the chromosomal copy number. An example is the naturally occurring F DNA of E. coli, which represents about 2% of the DNA present in the cell.

In some bacteria essential genes are located on two genetic elements. For instance, in Rhodobacter sphaeroides the genes encoding the ribosomal RNAs (rRNAs) are located on a 0.9 Mb chromosome, whereas the remainder of the genome is located on a 3 Mb chromosome. In Pseudomonas species, many genes that code for catabolic (degradative) enzymes are located on large extrachromosomal plasmids. Plasmids containing antibiotic resistance genes have been isolated from many species. In part, it is the rapid transfer of plasmids, sometimes even between different genera, that accounts for the rapid development of large populations of antibiotic-resistant bacteria.

The control of replication of other, usually smaller, plasmids is more relaxed. These plasmids can have a very high intracellular copy number, and they are usually the focus of recombinant DNA cloning experiments. These plasmids use a DNA replication mechanism that is distinct from that used by the chromosomes. In fact, selective inhibition of the chromosomal replication machinery focuses the cell's replication machinery on producing more plasmid, such that it is possible to increase the copy number of these plasmids to about 1000 copies per cell. Selection of growth conditions that depend on genes carried by the plasmid can also be used to increase or decrease its copy number. Some plasmids do not contain a functioning par region and are not partitioned in an orderly fashion to daughter cells. This means that their inheritance is subject to statistical fluctuations that may lead to significant instabilities, especially with low copy plasmids. Since the genome is defined as all of the DNA in a cell, plasmids and other extrachromosomal DNA elements must be counted as part of it.
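The statistical instability of plasmids that lack orderly partitioning can be illustrated with a simple probability estimate. The sketch below assumes that each plasmid copy segregates independently and with equal probability to either daughter cell; this binomial model and the function name are assumptions made for illustration and are not taken from the text.

```python
def p_plasmid_free_daughter(copies_at_division: int) -> float:
    """Probability that one of the two daughters receives no plasmid at all.

    Assumes every copy goes to either daughter independently with
    probability 1/2. A given daughter is empty with probability
    (1/2)**n; the two "empty daughter" events cannot both occur when
    n >= 1, so their probabilities simply add.
    """
    if copies_at_division < 1:
        raise ValueError("need at least one plasmid copy at division")
    return 2 * 0.5 ** copies_at_division


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20, 50):
        print(n, p_plasmid_free_daughter(n))
```

With only a few copies per cell the chance of producing a plasmid-free daughter at each division is large, while at high copy number it becomes negligible, consistent with the remark that instability is most significant for low copy plasmids.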

 

Bacterial chromosomes contain bound protein molecules that are essential for normal growth. Some of these promote an organization or packaging of DNA similar to that seen in higher organisms. However, the proteins that appear to be responsible for packaging are present in such small amounts that they can only interact with about 20% of the genomic DNA. Furthermore, the packaging does not seem to be as orderly or as stable as the packaging of DNA in eukaryotic chromosomes. Bacterial chromosomal DNA is organized into topologically constrained domains (Box 2.1) that average 75 kb in size. The way in which this occurs, and its functional consequences, are not yet understood in a rigorous way.

BOX 2.1

TOPOLOGICAL PROPERTIES OF DNA

Because the two strands of DNA twist around a common axis, for unbent DNA, each turn of the helix is equivalent to twisting one strand 360° around the other. Thus, when the DNA is circular, if one strand is imagined to be planar, the other is wrapped around it once for each helix turn. The two circular strands are thus linked topologically. They cannot be pulled apart except by cutting one of the strands. A single nick anywhere in either strand removes this topological constraint and allows strand separation. The topological linkage of the strands in circular DNA leads to a number of fascinating phenomena, and the interested reader is encouraged to look elsewhere for detailed descriptions of how these are studied experimentally and how they are analyzed mathematically (Cantor and Schimmel, 1980; Cozzarelli and Wang, 1990).

For the purpose of this book, the major thing the reader must bear in mind is that physical interactions between DNA strands can be blocked by topological constraints.

If a linear strand of DNA contacts a surface at two points, the region between these points is topologically equivalent to a circle that runs through the molecule and then through the surface. Hybridization of a complementary strand to this immobilized strand will require twisting the former around the latter. This may be difficult if the latter is close to the surface, and it will be impossible if the former is circular itself.

Cells deal with the topological constraints of DNA double helices by having a series of enzymes that relax these constraints. Type I topoisomerases make a transient single-stranded nick in DNA, which allows one strand to rotate about the other at that point. Type II topoisomerases make a transient double-strand break and pass an intact segment of the duplex through this break. Though it is less obvious, this has the effect of allowing the strands to rotate 720° around each other. When DNA is replicated or transcribed, the double helix must be unwound ahead of and rewound behind the moving polymerases. Topoisomerases are recruited to enable these motions to occur.
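The bookkeeping behind these statements can be sketched numerically. The example below counts how many times the two strands of a relaxed, topologically closed domain are linked, and how that number changes per topoisomerase event. The helical repeat of 10.5 bp per turn is an assumed value for B-form DNA in solution, and the function names are hypothetical; the step sizes of one turn (type I, 360°) and two turns (type II, 720°) follow the description in the box.

```python
BP_PER_TURN = 10.5  # assumption: helical repeat of relaxed B-form DNA in solution


def relaxed_linking_number(length_bp: int) -> int:
    """Number of times one strand wraps around the other in a relaxed closed domain."""
    return round(length_bp / BP_PER_TURN)


def after_topoisomerase(linking_number: int, enzyme_type: int,
                        events: int = 1, direction: int = -1) -> int:
    """Linking number after a number of strand-passage events.

    enzyme_type: 1 changes the linkage by one turn per event (360 degrees),
                 2 changes it by two turns per event (720 degrees).
    direction:   -1 removes links, +1 adds them.
    """
    step = {1: 1, 2: 2}[enzyme_type]
    return linking_number + direction * step * events


if __name__ == "__main__":
    lk = relaxed_linking_number(75_000)  # a 75 kb domain, the average size quoted for bacteria
    print(lk, after_topoisomerase(lk, 1), after_topoisomerase(lk, 2))
```

A single 75 kb domain therefore carries several thousand links between its strands, all of which must be removed, one or two at a time, before the strands can separate completely.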

 

In condensed chromatin, loops of 300 Å fiber are attached to a scaffold. Although the DNA of mammalian chromosomes is linear, these frequent attachment points make each constrained loop topologically equivalent to a circle. Thus topoisomerases are needed for the DNA in chromatin to function. Type II topoisomerases are a major component of the proteins that make up the chromosome scaffold.


CHROMOSOMES OF EUKARYOTIC ORGANISMS

Higher organisms usually have linear chromosomal DNA molecules, although circles can be produced under special circumstances. These molecules have a minimum of three functional features, as shown in Figure 2.1. Telomeres are specialized structures at the ends of the chromosome. These serve at least two functions. They provide a mechanism by which the ends of the linear chromosomes can be replicated, and they stabilize the ends. Normal double-strand DNA ends are very unstable in eukaryotic cells. Such ends could be the result of DNA damage, such as that caused by X rays, and could be lethal events. Hence very efficient repair systems exist that rapidly ligate together ends not containing telomeres. If several DNAs are broken simultaneously in a single cell, the correct fragment pairs are unlikely to be reassembled, and one or more translocations will result. If the ends are not repaired fast enough by ligation, they may invade duplex DNA in order to be repaired by recombination. The result, even for a single, original DNA break, is a rearranged genome.

Centromeres are DNA regions necessary for precise segregation of chromosomes to daughter cells during cell division. They are the binding site for proteins that make up the kinetochore, which in turn serves as the attachment site for microtubules, the cellular organelles that pull the chromosomes apart during cell division.

Another feature of eukaryotic chromosomes is replication origins. We know much less about the detailed structural properties of eukaryotic origins than prokaryotic ones. What is clear from inspecting the pattern of DNA synthesis along chromosomes is that most chromosomes have many active replication origins. These do not necessarily all initiate at the same time, but a typical replicating chromosome will have many active origins. The presence of multiple replication origins allows for complete replication of the entire human genome in only 8 hours. It is not known how these replication processes terminate.
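A back-of-the-envelope calculation shows why many origins are required. The sketch below assumes a haploid human genome of 3 × 10^9 bp and a fork speed of 50 bp per second; both numbers are illustrative assumptions rather than values from this chapter, and only the 8 hour replication time comes from the text. It also assumes every origin fires at time zero and sends out two forks, so the result is a lower bound.

```python
import math

GENOME_BP = 3.0e9      # assumption: approximate haploid human genome size
FORK_BP_PER_S = 50.0   # assumption: illustrative eukaryotic fork speed
S_PHASE_HOURS = 8.0    # replication time quoted in the text


def single_origin_time_hours(genome_bp: float, fork_bp_per_s: float) -> float:
    """Hours needed if one bidirectional origin had to copy the whole genome."""
    return genome_bp / (2.0 * fork_bp_per_s) / 3600.0


def min_origins(genome_bp: float, fork_bp_per_s: float, hours: float) -> int:
    """Fewest simultaneously firing bidirectional origins that finish in `hours`."""
    bp_per_origin = 2.0 * fork_bp_per_s * hours * 3600.0
    return math.ceil(genome_bp / bp_per_origin)


if __name__ == "__main__":
    print(single_origin_time_hours(GENOME_BP, FORK_BP_PER_S))
    print(min_origins(GENOME_BP, FORK_BP_PER_S, S_PHASE_HOURS))
```

Because origins do not all initiate at the same time, the real number of origins in use would be expected to exceed this idealized minimum.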

 

CENTROMERES

 

 

 

 

In the yeast S. cerevisiae, the centromere has been defined by genetic and molecular experiments to reside in a small DNA region, about 100 bp in length. This region contains several A–T rich sequences that may be of key importance for function. The centromere of S. cerevisiae is very different in size and characteristics from the centromeres of more advanced organisms, or even the yeast S. pombe. This is perhaps not too surprising in view of the key role that the centromere plays in cell division. Unlike these other species, S. cerevisiae does not undergo symmetrical cell division. Instead, it buds and exports one copy of each of its chromosomes to the daughter cell. In species that produce two nominally identical daughter cells, centromeres appear to be composed mostly of DNA with tandemly repeating sequences. The most striking feature of these repeats is that the number of copies can vary widely. A small repeating sequence, (GGAAT)n, has recently been found to be conserved across a wide range of species. This conservation suggests that the sequence may be a key functional element in centromeres. Although this is not yet proved, physical studies on this DNA sequence reveal it to have rather unusual helical properties that at least make it an attractive candidate for a functional element. As shown in Figure 2.2, the G-rich strand of the repeat has a very stable helical structure of its own, although the detailed nature of this structure is not yet understood.
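Locating simple tandem repeats such as (GGAAT)n in a sequence is straightforward to sketch in code. The function below is a minimal illustration that finds only perfect, uninterrupted runs of a repeat unit on the given strand; real centromeric repeats are often imperfect, so this is a simplification. The function name, the `min_copies` threshold, and the test sequence are hypothetical.

```python
import re
from typing import List, Tuple


def find_tandem_runs(sequence: str, unit: str, min_copies: int = 3) -> List[Tuple[int, int, int]]:
    """Return (start, end, copies) for each perfect tandem run of `unit`.

    Coordinates are 0-based and `end` is exclusive. Only the strand given
    in `sequence` is scanned; imperfect repeats are not detected.
    """
    unit = unit.upper()
    # Build a pattern such as (?:GGAAT){3,} for a unit of GGAAT and min_copies of 3.
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}")
    runs = []
    for match in pattern.finditer(sequence.upper()):
        copies = (match.end() - match.start()) // len(unit)
        runs.append((match.start(), match.end(), copies))
    return runs


if __name__ == "__main__":
    # Made-up test sequence: four copies of GGAAT, then a run too short to report.
    demo = "ACGT" + "GGAAT" * 4 + "TTT" + "GGAAT" * 2
    print(find_tandem_runs(demo, "GGAAT"))
```

The same function can be pointed at any other short repeat unit by changing the `unit` argument.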


Figure 2.2 Evidence for the presence of some kind of unusual helical structure in a simple centromeric repeating sequence. The relative amount of absorbance of 260 nm UV light is measured as a function of temperature. Shown are results for the two separated strands of a duplex, and the duplex itself. What is unusual is that one of the separated strands shows a change in absorbance that is almost as sharp and as large as the intact duplex. This indicates that it, alone, can form some sort of helical structure. (From Moyzis et al., 1988.)

The other repeats in the centromeres are much longer tandemly repeated structures. In the human these were termed satellite DNA because they were originally seen as shoulders or side bands when genomic DNA fragments were fractionated by base composition by equilibrium ultracentrifugation on CsCl gradients (see Chapter 5). For example, the alpha satellite of the African green monkey is a 171 base pair tandem repeat. Considerable effort has gone into determining the lengths and sequences of some of these satellites, and their organization on the chromosome. The results are complex. The repeats are often not perfect; they can be composed of blocks of different lengths or different orientation. The human alpha satellite, which is very similar in composition (65% identity) to the African green monkey satellite, does not form a characteristic separate band during density ultracentrifugation. Thus the term satellite has evolved to now include tandemly repeated DNA sequences that may have the same base composition as the majority of the genome. An example of a satellite sequence is shown in Figure 2.3.

Figure 2.3 Example of a tandemly repeating DNA sequence in centromeric DNA.


 

 

 

The implications of the specific sequences of the repeats, and their organization, for centromere function are still quite cloudy.

The size of centromeres, judged by the total length of their simple sequence blocks, varies enormously among species, and even among the different chromosomes contained in one cell, without much indication that this is biologically important. Centromeres in S. pombe are only 10⁴ to 10⁵ bases in size, while those in the human can be several Mb. Centromeres on different human chromosomes appear to be able to vary in size widely, and within the population there is great heterogeneity in the apparent size of some centromeres. It is as though nature, having found a good thing, doesn't care how much of it there is as long as it is more than a certain minimum.

TELOMERES

 

 

 

 

 

 

 

 

 

 

 

Telomeres in almost all organisms with linear chromosomes are strikingly similar. They consist of two components, as shown in Figure 2.4. At the very end of the chromosome is a long stretch of tandemly repeating sequence. In most organisms the telomere is dominated by the hexanucleotide repeat (TTAGGG)n. The repeating pattern is not perfect; several other sequences can be interspersed. In a few species the basic repeat is different, but it always has the characteristic that one strand is T- and G-rich. The G- and T-rich strand is longer than the complementary strand, and thus the ends of the chromosome have a protruding 3′-end strand of some considerable length. This folds back on itself to make a stable helical structure. The best evidence suggests that this structure is four stranded, in analogy to the four-strand helix made by aggregates of G itself. Pairs of telomeres might have to associate in order to make this structure. Alternatively, it could be made by looping the 3′-end of one chromosome back on itself three times. The details notwithstanding, these structures are apparently effective in protecting the ends of the chromosomes from attack by most common nucleases.

Next to the simple sequence telomeric repeats, most chromosomes have a series of more complex repeats. These frequently occur in blocks of a few thousand base pairs. Within the blocks there may be some tandemly repeated sequence more complex than the hexanucleotide telomeric repeat. The blocks themselves are of a number of different types, and these are distributed in different ways on different chromosomes. Some unique sequences, including genes, may occur between the repeating sequences. It is not clear if any chromosome really has a unique telomere, or if this matters. Some researchers feel that the subtelomeric repeats may play a role in positioning the ends of the chromosomes at desired places within the nucleus. Whether and how this information might be coded by the pattern of blocks on a particular chromosome remains to be determined. At least some subtelomeric sequences vary widely from species to species.

Figure 2.4 Structure of a typical telomere.

 

 

 

DYNAMIC BEHAVIOR OF TELOMERES

 

 

 

The actual length of the simple telomeric repeating DNA sequence is highly variable both within species and between species. This appears to be a consequence, at least in part, of the way in which telomeres are synthesized and broken down. Telomeres are not static structures. If the growth of cells is monitored through successive generations, telomeres are observed to gradually shrink and then sometimes lengthen considerably. Nondividing cells and some cancer cells appear to have abnormal telomere lengths. At least two mechanisms are known that can lead to telomere degradation. These are shown in Figure 2.5a. In one mechanism the single-strand extension is subject to some nuclease cleavage. This shortens that strand. The alternate mechanism is based on the fact that the 5′-ended strand must serve as a starting position for DNA replication. This presumably occurs by the generation of an RNA primer that is then extended inward by DNA replication. Because of the tandem nature of the repeated sequence, the primer can easily be displaced inward before synthesis continues. This will shorten the 5′-strand. Both of these mechanisms seem likely to occur in practice.

 

 

 

A totally different mechanism exists to synthesize telomeres and to lengthen existing telomeres (Fig. 2.5b). The enzyme telomerase is present in all cells with telomeres. It is a ribonucleoprotein. The RNA component is used as a template to direct the synthesis of the 3′-overhang of the telomere. Thus the telomere is lengthened by integral numbers of repeat units. It is not known how the complex and subtle variations seen in this simple sequence arise in practice. Perhaps there is a family of telomerases with different templates. More likely, some of the sequences are modified after synthesis, or the telomerase may just be sloppy.
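The interplay of gradual shortening and stepwise lengthening can be illustrated with a toy simulation. In the sketch below a fixed number of base pairs is lost at each cell division, and telomerase occasionally adds a whole number of six-base repeat units. Every parameter value, the function name, and the random model itself are hypothetical choices made for illustration; none of them are measurements or mechanisms asserted by the text beyond the idea that growth occurs in integral repeat units.

```python
import random
from typing import List

REPEAT_UNIT = "TTAGGG"  # hexanucleotide repeat named in the text


def simulate_telomere(start_bp: int, divisions: int, loss_bp_per_division: int,
                      telomerase_prob: float, max_repeats_added: int,
                      seed: int = 0) -> List[int]:
    """Return the telomere length (bp) after each division, starting with start_bp.

    loss_bp_per_division: bp removed per division (hypothetical value).
    telomerase_prob:      chance that telomerase acts in a given division.
    max_repeats_added:    upper limit on repeat units added per telomerase event.
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    length = start_bp
    history = [length]
    for _ in range(divisions):
        length = max(0, length - loss_bp_per_division)
        if rng.random() < telomerase_prob:
            # Growth happens only in whole repeat units.
            length += len(REPEAT_UNIT) * rng.randint(1, max_repeats_added)
        history.append(length)
    return history


if __name__ == "__main__":
    print(simulate_telomere(12_000, 20, 100, 0.2, 100))
```

Setting `telomerase_prob` to zero gives steady shrinkage, while larger values produce the pattern of gradual loss punctuated by occasional lengthening.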

 

 

 

 

 

Figure 2.5 Dynamics of telomeric simple repeating sequences. Mechanisms for telomere shrinkage: (a) nuclease cleavage; (b) downstream priming by telomerase; (c) mechanism of telomere growth.

 

 


 

 

 

 

 

 

The total length of telomeric DNA in the human ranges from 10 to 30 kb. Because of its heterogeneity, fragments of DNA cut from any particular telomere by restriction enzymes have a broad size range and appear in gel electrophoretic size fractionations as broad, fuzzy bands. This is sufficiently characteristic of telomere behavior to constitute reasonable evidence that the band in question is telomeric. In mice, telomeres are much longer, typically 100 kb. We do not know what this means. In general, the message from both telomeres and centromeres is that eukaryotic cells apparently feel no pressure to minimize the size of their genome or the sizes of these important functional elements. Unlike viruses, which must fit into small packages for cell escape and reinfection, and bacterial cells, which are under selection pressure in rich media to replicate as fast as they can, eukaryotic cells are not mean and lean.

CHROMATIN AND THE HIGHER-ORDER STRUCTURE OF CHROMOSOMES

A eukaryotic chromosome is only about half DNA by weight. The remainder is protein, mostly histones. There are five related, very basic histone proteins that bind tightly to DNA. The rest is a complex mixture called, loosely, nonhistone chromosomal proteins. This mixture consists of proteins needed to mediate successive higher-order packaging of the DNA, proteins needed for chromosome segregation, and proteins involved in gene expression and regulation. An enormous effort has gone into characterizing these proteins and the structures they form. Nevertheless, for the most part their role in higher-order structure or function is unknown.

At the lowest level of chromatin folding, eight histones (two each of four types) assemble into a globular core structure that binds about 140 bp of DNA, forming it into a coiled structure called the nucleosome (Fig. 2.6). Not only are nucleosomes very similar in all organisms from yeast to humans, but the four core histones are also among the most evolutionarily conserved proteins known. Nucleosomes pack together to form a filament that is 100 Å in diameter and is known as the 100 Å filament. The details of the filament are different in different species because the lengths of the spacer DNA between the nucleosomes vary. In turn, the 100 Å filament is coiled upon itself to make a thicker structure, called the 300 Å fiber. This appears to be solenoidal in shape. Stretches of solenoid containing on average 50 to 100 kb of DNA are attached to a protein core. In condensed metaphase chromosomes, this core appears as a central scaffold of the chromosome; it can be seen when the chromosome is largely stripped of other proteins and examined by electron microscopy (Fig. 2.7). A major component of this scaffold is the enzyme topoisomerase II (Box 2.1), which probably serves a role analogous to DNA gyrase in E. coli of acting as a swivel to circumvent any topological problems caused by the interwound nature of DNA strands.
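The numbers in this hierarchy can be combined into a rough packaging estimate. The sketch below uses the 140 bp core and the 50 to 100 kb loop size quoted in the text; the 60 bp spacer length and the 3 × 10^9 bp genome size are assumptions chosen for illustration (spacer lengths vary between species), and the function names are hypothetical.

```python
from typing import Tuple

GENOME_BP = 3.0e9                  # assumption: approximate haploid human genome size
CORE_BP = 140                      # DNA bound per histone core, from the text
SPACER_BP = 60                     # assumption: spacer DNA between nucleosomes
LOOP_BP_RANGE = (50_000, 100_000)  # DNA per scaffold-attached loop, from the text


def nucleosome_count(genome_bp: float, core_bp: int = CORE_BP,
                     spacer_bp: int = SPACER_BP) -> int:
    """Approximate number of nucleosomes needed to package genome_bp."""
    return int(genome_bp // (core_bp + spacer_bp))


def loop_count_range(genome_bp: float,
                     loop_bp_range: Tuple[int, int] = LOOP_BP_RANGE) -> Tuple[int, int]:
    """(fewest, most) scaffold loops, using the largest and smallest loop sizes."""
    smallest, largest = loop_bp_range
    return int(genome_bp // largest), int(genome_bp // smallest)


def nucleosomes_per_loop(loop_bp: int, core_bp: int = CORE_BP,
                         spacer_bp: int = SPACER_BP) -> int:
    """Approximate number of nucleosomes contained in one loop."""
    return loop_bp // (core_bp + spacer_bp)


if __name__ == "__main__":
    print(nucleosome_count(GENOME_BP))
    print(loop_count_range(GENOME_BP))
    print(nucleosomes_per_loop(50_000), nucleosomes_per_loop(100_000))
```

Under these assumptions a genome of this size needs on the order of ten million nucleosomes, organized into tens of thousands of loops, each holding a few hundred nucleosomes.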

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

At other stages in the cell cycle, the scaffold proteins may actually attach to the nuclear envelope. The chromosome is then suspended from this envelope into the interior of the nucleus. During mitosis, if this picture is correct, the chromosomes are then essentially turned inside out, separated into daughters, and reinverted. The topological domains created by DNA attachment to the scaffold at 50 to 100 kb intervals are similar in size to those seen in bacteria, where they appear to be formed by much simpler structures.

Figure 2.6 Hierarchy of structural elements in chromatin and chromosomes: (a) the nucleosome; (b) a 100 Å filament; (c) a 300 Å solenoid; (d) chromosome loop anchored to a protein scaffold; (e) successive loops stacked along the scaffold; (f) the appearance of a condensed metaphase chromosome. The path of the scaffold is unknown, but it is not straight.


The most compact form of chromosomes occurs in metaphase. This is reasonable because it is desirable to pass chromosomes efficiently to daughter cells. Surely this is facilitated by having more compact structures than the unfolded objects seen in Figure 2.7. Metaphase chromosomes appear to consist of stacks of packed 300 Å chromatin fiber loops. Their structures are quite well controlled. For example, they have a helical polarity that is opposite in the two sister chromatids (the chromosome pairs about to separate in cell division). When the location of specific genes on metaphase chromosomes is examined (as described later), they appear to have a very fixed position within the morphologically visible structure of the chromosome. A characteristic feature of all metaphase chromosomes is that the centromere region appears to be constricted. The reason for this is not known. Also for unknown reasons, some eukaryotic organisms, such as the yeast S. cerevisiae, do not have characteristic condensed metaphase chromosomes.

 

The higher-order structures of chromatin and chromosomes pose an extraordinary challenge for structural biologists because they are so complex and so large. The effort needed to reveal the details of these structures may not be worthwhile. It remains to be shown whether the details of much of the structure actually matter for any particular biological function. One extreme possibility is that this is all cheap packaging, and that most of it is swept away whenever the underlying DNA has to be uncovered for function in gene expression or in recombination. We do not yet know if this is the case, but our ability to manipulate DNAs and chromosomes has grown to the point where it should soon be possible to test such notions explicitly. It would be far more elegant and satisfying if we were to uncover sophisticated mechanisms that allow DNA packaging and unpackaging to be used to modulate DNA function.

Figure 2.7 Electron micrograph of a single mammalian chromosome, denatured to remove most of the protein and allow the DNA to expand. The X-shaped structure is the protein scaffold that defines the shape of the condensed chromosome. The DNA appears as a barely resolvable mass of fiber covering almost the entire field.
