

274 PHYSICAL MAPPING
Figure 8.42 Fingerprinting a clone by hybridization with different repeated DNA sequences. See Chapter 14 for a description of these sequences.
Figure 8.43 An example of how repeated sequence hybridization simplifies the identification of two contiguous cosmid clones. (a) Two overlapping clones. (b) Restriction fragments and Southern blot. (Adapted from Stallings et al., 1990.)

MEASUREMENTS OF PROGRESS IN BUILDING ORDERED LIBRARIES

The process of assembling contigs by fingerprinting clones can be treated in relatively straightforward mathematical ways. One makes the key assumption that the genome is being sampled uniformly, and sets, as a parameter, the degree of overlap between two clones necessary to constitute positive evidence that they are contiguous. Lander and Waterman have modeled this process of clone ordering. The sorts of results they obtained are shown in Figure 8.44. It is assumed that clones are fingerprinted one at a time, and the number of clones assembled into contigs of two or more clones is plotted as a function of the number of clones fingerprinted. At early times in the project, there are almost no contigs because the odds of picking overlapping clones, chosen at random, are small. Eventually overlaps start to build up, but most contigs contain just two clones. These begin to coalesce into larger contigs as the genome is sampled deeper and deeper. However, the effectiveness of the contig building begins to saturate long before all clones are assembled into a single contig. This saturation is partly determined by the lack of completeness of the library; if any regions are not represented at all, contigs cannot be built across them. The saturation is also a function of the effectiveness of overlap detection; due to chance, some clones that actually are contiguous may not overlap enough to be counted as a positive score. Several ongoing programs in contig building have been evaluated by the Lander-Waterman approach. Actual progress on these projects is in remarkably good agreement with predictions.

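The kind of calculation behind such a progress curve can be sketched in a few lines of Python. The formulas used here are the commonly quoted expectations from the Lander-Waterman model (they are not spelled out in this chapter), and the function name, parameter names, and the constant clone length are assumptions made for illustration only.

```python
import math

def lander_waterman(n_clones, clone_len, genome_len, min_overlap_frac):
    """Expected island statistics under the Lander-Waterman model.

    n_clones         -- number of clones fingerprinted so far (N)
    clone_len        -- clone insert length in bp (L), assumed constant
    genome_len       -- genome length in bp (G)
    min_overlap_frac -- fraction of a clone that must be shared for an
                        overlap to be detected (theta)
    """
    c = n_clones * clone_len / genome_len        # redundancy of coverage, c = NL/G
    sigma = 1.0 - min_overlap_frac               # sigma = 1 - theta
    islands = n_clones * math.exp(-c * sigma)    # contigs plus isolated clones
    singletons = n_clones * math.exp(-2.0 * c * sigma)  # clones with no detected overlap
    return {
        "coverage": c,
        "islands": islands,
        "singletons": singletons,
        "contigs": islands - singletons,         # islands of two or more clones
        "clones_in_contigs": n_clones - singletons,
    }
```

Calling `lander_waterman(n, 40_000, 100_000_000, 0.25)` for increasing `n` (illustrative values, not taken from any real project) shows the expected number of contigs rising, passing through a maximum, and then falling slowly as contigs merge.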
Eventually the pure bottom-up approach must be abandoned if a complete ordered library is desired. The point at which a switch in strategy is profitable is said to be somewhere between 60% and 90% coverage, when almost all progress in typical bottom-up mapping stops. The early stages of bottom-up mapping are very efficient. DNA preparations, fingerprinting, and data analysis have all been completely automated for some of the schemes we have described. Contig assembly is also done by computer software. Once the saturation point is reached, a typical project will still have hundreds or thousands of separate contigs. The challenge is to close the gaps between them in an efficient way. Several different approaches are useful at this stage. The contigs can be ordered by FISH localization of individual clone representatives from each contig. Once one knows that two contigs are very close to each other, frequently overlap data that were marginal before can now be used to fuse the contigs. The easiest way to fill major gaps where they are suspected is to switch to another library. Here regional assignment of clones from that library (or microdissection, Chapter 7) can help to focus on clones most likely to lie in regions where contiguity is not yet established.

Figure 8.44 Progress in a pure bottom-up clone-ordering strategy, as calculated from the Lander-Waterman model. Plotted is the number of contigs as a function of randomly chosen clones examined.

Figure 8.45 Two strategies for finishing the construction of contig maps. (a) Walking by probing existing or new libraries of clones with the ends of existing contigs. (b) Attempting to PCR across the gaps between two contigs suspected of being adjacent on the map (generally hinted at from other data such as FISH results).

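The persistence of many separate contigs late in a bottom-up project can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not taken from the text: a linear genome, clones of identical length, uniformly random clone positions, and an overlap that is detected whenever two clones share at least `min_overlap` base pairs. The function and parameter names are hypothetical.

```python
import random

def simulate_islands(n_clones, clone_len, genome_len, min_overlap, seed=0):
    """Place clones at random and count the resulting islands.

    Returns (contigs, singletons), where contigs are islands of two or
    more clones and singletons are clones with no detected overlap.
    Requires genome_len >= clone_len.
    """
    rng = random.Random(seed)
    # Uniform random start positions: the key assumption of the model.
    starts = sorted(rng.randrange(genome_len - clone_len + 1)
                    for _ in range(n_clones))
    island_sizes = []
    for i, start in enumerate(starts):
        # With equal-length clones, only the previous clone in map order
        # can provide the largest overlap with this one.
        if i > 0 and starts[i - 1] + clone_len - start >= min_overlap:
            island_sizes[-1] += 1      # detectable overlap: extend the island
        else:
            island_sizes.append(1)     # no detectable overlap: new island
    contigs = sum(1 for size in island_sizes if size >= 2)
    singletons = sum(1 for size in island_sizes if size == 1)
    return contigs, singletons
```

Running the function for a series of increasing `n_clones` values gives counts that can be compared with the analytical Lander-Waterman expectations; even at several-fold redundancy a noticeable number of islands typically remains.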
A generally useful endgame strategy is to use existing contigs to screen a library of clones and subtract out those that have already been found. This greatly improves the odds of finding new, useful clones, once additional random picking from the remainder is reinitiated. Perhaps the single most useful method, once a dense set of contigs exists, is walking (Fig. 8.45a). Here one takes clones from the ends of existing tiling path contigs and uses them to screen libraries. Both the original library and totally new libraries can be used. The goal is to identify new clones that allow the contig to be extended. It is often particularly useful to change from one type of library to another in the walking process. Frequently a gap will exist because the sequence within it is not cloneable, say in cosmids, but it may be easily cloneable in YACs, and vice versa. Multiplex walking methods have been described that allow the simultaneous walking from many contig ends.

A final useful endgame strategy is to sequence the ends of contigs. Sequence information is much more robust than any other kind of fingerprinting. Even if two clones overlap by as few as 15 base pairs, sequence information can determine that they actually overlap. Sequence information at the ends of contigs can also be used to design PCR primers that face outward from the contigs (Fig. 8.45b). These primers can be used to test systematically whether two contigs suspected of lying near each other are actually within a few kb of each other. This technique turns out to be extremely powerful in practice because, in actual projects thus far, many of the hardest-to-close gaps turn out to be very small, and PCR can be carried out across them.
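Both uses of contig-end sequence can be sketched in Python. The functions below are illustrative assumptions, not a published procedure: overlap detection uses exact matching only (real reads contain errors), and the primer picker simply takes the terminal bases without checking melting temperature, repeats, or self-complementarity.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def end_overlap(left, right, min_len=15):
    """Length of the longest suffix of `left` equal to a prefix of `right`.

    Returns 0 if no exact overlap of at least `min_len` bases exists
    (min_len must be >= 1).
    """
    for k in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-k:] == right[:k]:
            return k
    return 0

def outward_primers(left_contig_end, right_contig_start, k=20):
    """Primers that face outward from two contigs toward the gap between them.

    The forward primer is the last k bases of the left contig; the reverse
    primer is the reverse complement of the first k bases of the right contig.
    Both are written 5' to 3'.
    """
    forward = left_contig_end[-k:]
    reverse = revcomp(right_contig_start[:k])
    return forward, reverse
```

If the two contigs really are separated by only a few kb, a PCR with this primer pair should yield a product spanning the gap; the absence of a product is not proof that the contigs are far apart.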

SURVEY OF RESTRICTION MAP AND ORDERED LIBRARY CONSTRUCTION

Complete macrorestriction maps have been produced for a number of prokaryotic genomes, some simpler eukaryotic genomes, and sections of complex genomes. The first of these maps, a Not I map of E. coli, is shown in Figure 8.46. The most complex of all these maps, that for human chromosome 21q, is shown in Figure 8.47. A number of features of this map are of interest. Note that small Not I fragments and large Not I fragments tend to cluster. This must reflect wide oscillations in the density of HTF islands along the chromosome, since Not I sites occur almost exclusively in these islands.

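The relation between site density and fragment size can be made concrete with a short in silico digest. This is a minimal sketch that assumes a linear, fully unmethylated sequence and complete digestion; the function name is hypothetical. Not I recognizes GCGGCCGC and cuts after the second base on the top strand.

```python
NOT_I_SITE = "GCGGCCGC"   # Not I recognition sequence
CUT_OFFSET = 2            # top-strand cut position within the site: GC^GGCCGC

def not_i_fragments(seq):
    """Fragment lengths from a complete Not I digest of a linear sequence."""
    seq = seq.upper()
    cuts = []
    pos = seq.find(NOT_I_SITE)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = seq.find(NOT_I_SITE, pos + 1)   # step by 1 to catch overlapping sites
    edges = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(edges, edges[1:])]
```

Regions where sites are dense give runs of short fragments, and regions where sites are sparse give runs of long ones, which is the clustering seen in the chromosome 21 map. In real genomic DNA, methylation blocks cutting at many sites, so an actual digest will generally contain fewer and larger fragments than this sketch predicts.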
The Not I map of human chromosome 21 was actually constructed not in a single cell line but in a set of eight cell lines. Polymorphisms among these lines were helpful in establishing the map as described earlier in the chapter. The full pattern of polymorphisms is illustrated in Figure 8.48. While the extent of polymorphism is considerable, almost all of it is consistent with varying degrees of methylation in the cell lines studied. There is little or no compelling evidence for major shifts in the lengths of DNA between existing Not I sites. Most important, there is no evidence that any significant amounts of DNA have been rearranged or lost in these cell lines.

Figure 8.46 Not I restriction map of E. coli. (Adapted from Smith et al., 1987.)

Figure 8.47 Not I restriction map of the long arm of human chromosome 21. (Taken from Wang and Smith, 1994.)

Figure 8.48 Polymorphisms seen in the Not I map of human chromosome 21q in nine different cell lines (lanes 1 to 9). (Taken from Wang and Smith, 1994.)

A number of successful projects have been reported that have produced complete, or almost complete, ordered clone libraries. The first of these was the ordered bacteriophage lambda library covering the E. coli genome. Other model organisms now mapped include the yeasts S. cerevisiae and S. pombe, and the nematode C. elegans. Extensive map data also exist for Drosophila and for the human genome. A relatively complete YAC map covering the informative part of the Y chromosome has been reported, and complete YAC maps exist that cover most human chromosomes. Extensive cosmid ordering projects on chromosomes 16 and 19 are virtually complete. Gaps not covered in cosmids are mostly covered in YACs or BACs.

An example of some of the data used to construct the chromosome 21 YAC map is shown in Figure 8.49. It is apparent that at the present stage some of the overlap evidence would be strengthened by interpolating results from additional clones, and some YACs used show evidence of rearrangements that are potential sources of error. Indeed, when the YAC contig for chromosome 21 is compared with the Not I restriction map, several YACs appear to be assigned to the wrong locations on the chromosome (Fig. 8.50). This is almost certainly partly the result of YAC chimeras, which can seriously confuse clone ordering (see Chapter 9). Other discrepancies appear to result from the use of several probes with confused identities. Nevertheless, a remarkable amount of information and a goodly number of useful clones are now available for this chromosome.

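A comparison of marker order between two maps, such as a restriction map and a YAC contig map, can be automated in a simple way. The sketch below is one possible approach and not the method used for chromosome 21: it assumes both maps are read in the same orientation and that each marker appears once, and it reports a smallest set of markers that would have to be moved to make the two orders agree (found with a longest-increasing-subsequence calculation). The function name is hypothetical.

```python
from bisect import bisect_left

def discordant_markers(reference_order, test_order):
    """Markers in test_order that break agreement with reference_order.

    Returns one minimal list of markers whose removal leaves the remaining
    shared markers in the same relative order in both maps. The minimal
    set is not necessarily unique.
    """
    rank = {marker: i for i, marker in enumerate(reference_order)}
    shared = [m for m in test_order if m in rank]
    ranks = [rank[m] for m in shared]

    tails = []      # smallest tail rank of an increasing run of each length
    tail_idx = []   # index in `ranks` of each of those tails
    prev = [None] * len(ranks)
    for i, r in enumerate(ranks):
        j = bisect_left(tails, r)
        if j == len(tails):
            tails.append(r)
            tail_idx.append(i)
        else:
            tails[j] = r
            tail_idx[j] = i
        prev[i] = tail_idx[j - 1] if j > 0 else None

    # Walk back through the longest consistent run of markers.
    keep = set()
    i = tail_idx[-1] if tail_idx else None
    while i is not None:
        keep.add(shared[i])
        i = prev[i]
    return [m for m in shared if m not in keep]
```

For example, with a reference order A, B, C, D, E, F and a test order A, B, E, C, D, F, the function returns the single marker E. Markers flagged in this way are candidates for probe mix-ups or chimeric clones, but deciding which map is wrong requires independent evidence.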
A complete YAC map and three complete cosmid maps are available for the yeast S. pombe. The tiling path YACs from this map are shown in Figure 8.51, alongside the Not I restriction map of this organism and a sketch of the genetic map. This view, which presents a very simple looking map, hides the complex process that actually went into the construction of the map. Figure 8.52 illustrates the actual YAC clones
|
Figure 8.49 A contiguous section of YACs from human chromosome 21. The contig is about 2.3 Mb long; 18 probes (STSs) were needed to assemble it. Note that several of the YACs appear to have internal deletions.

Figure 8.50 Comparison of marker order in the Not I restriction map of human chromosome 21 and the chromosome 21 YAC contig map. (Taken from Wang and Smith, 1994.)

[Figure 8.51 graphic: one panel each for Chromosome I (5.7 Mbp), Chromosome II (4.6 Mbp), and Chromosome III (3.5 Mbp), with tracks labeled Not I and YAC clones and genetic markers such as cen1, cen2, cen3, ura4, ade6, and mat3.]

Figure 8.51 Three maps of the fission yeast S. pombe. Plotted are the Not I restriction map, the 26-clone tiling set of a YAC contig map, and markers from the genetic map. Dotted lines indicate genetic markers and cosmids, which were hybridized to Not I digests of S. pombe and to YACs. (Taken from Maier et al., 1992.)

and probes studied along the way to map completion, and the selection of a simple tiling set. The large number of samples required, even for a simple organism, can barely be displayed as a legible figure. This should make it clear that any mapping project, with contemporary technology, is not to be undertaken lightly. The cosmid maps of S. pombe are even more complex and hard to display visually. Some details about the procedures that were used to construct one of these maps will be given in Chapter 9.

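Once clone positions on a finished contig map are known, choosing a small tiling set is a classic interval-covering problem. The following sketch uses the standard greedy solution; it assumes each clone can be summarized as a (name, start, end) tuple in map coordinates, which is a simplification, and it is not claimed to be the procedure used for the S. pombe map.

```python
def tiling_path(clones, region_start, region_end):
    """Greedy minimum set of clones covering [region_start, region_end).

    clones -- iterable of (name, start, end) tuples with start < end
    Returns the chosen clone names in map order, or None if the clones
    leave a gap in the region.
    """
    ordered = sorted(clones, key=lambda clone: clone[1])
    chosen = []
    covered = region_start
    i = 0
    while covered < region_end:
        best = None
        # Among clones that start at or before the covered point,
        # take the one that reaches furthest to the right.
        while i < len(ordered) and ordered[i][1] <= covered:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered:
            return None            # nothing extends the coverage: a gap
        chosen.append(best[0])
        covered = best[2]
    return chosen
```

In practice one might prefer clones with larger overlaps or with no evidence of rearrangement over the strictly smallest set, so a real tiling set may contain more clones than this greedy minimum.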
An issue that still leads to considerable debate is when to end a mapping project. How important is it to close the last gap, that is, to confirm the relative order within a contig beyond any doubt? The simplest way to deal with this question is to recall the purpose of maps. We need them to access the genome, both for biological studies and for eventual DNA sequencing. A map that is 70% complete has seen only the beginning of the effort required to make a fully finished map, but it already provides access to 70% of the chromosome. A 90% map is, frankly, almost as useful for most purposes as a fully complete map, unless one is so unfortunate as to need clones or sequence data in some of the regions that are still in small fragments or contigs. In general, the usefulness of mapping projects grows very rapidly in the early stages and then begins to increase much more slowly as the maps near completion. It is important to consider this in deciding how much effort should be devoted to fitting in the last contig, as opposed to breaking out into new, uncharted territory on another chromosome or in another genome.

Figure 8.52 Actual sets of YACs and probes needed to generate the YAC tiling set in Figure 8.51. YAC clones are shown on the vertical axis, where a subset of 26 clones spanning the entire genome is indicated. Probes are drawn on the horizontal axis; some of the genetic markers used are identified. Vertical gray bars separate the three chromosomes. Positive signal outside the constructed contigs indicates the locations of repeats. (Taken from Maier et al., 1992.)