
Thus the average contribution from one child to the ELOD is the sum of these two cases weighted by their expected frequency. Since recombination across 10 cM occurs only 10% of the time,

$$\mathrm{ELOD}(0.1) = 0.9 \log_{10}\frac{0.9}{0.5} + 0.1 \log_{10}\frac{0.1}{0.5}$$

$$\mathrm{ELOD}(0.1) = 0.23 - 0.07 = 0.16$$

Thus observation of cosegregation of A and D adds to the evidence for linkage, while observation of separation of A and D subtracts from the evidence for linkage.

What we need to do is develop the tools to assess the statistical significance of a particular ELOD score. Since some markers will appear to cosegregate by chance in any study with a relatively small number of affected individuals, there is always a chance of seeing a significantly positive LOD score simply because of random fluctuations. A near consensus in human genetics is that an observed LOD of 3.0 or higher is required before the probability of purely accidental linkage can be reduced to the point where few errors are made. For the example just described, the number of individuals segregating D with unambiguous pedigrees that would have to be combined to generate a LOD score of 3.0 can be estimated as 3/0.16 ≈ 19. For common inherited diseases this is not a problem, but for very rare diseases it may be extremely difficult to find 19 genetically informative individuals for a particular marker with an unambiguous diagnosis.

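The two-point calculation above can be checked numerically. The following is a minimal sketch, assuming a phase-known parent, base-10 logarithms, and a null hypothesis of free recombination (probability 0.5 for each outcome); the function names `elod_two_point` and `children_needed` are ours and do not come from any library.

```python
import math


def elod_two_point(theta: float) -> float:
    """Expected LOD contributed by one informative child when the true
    recombination fraction between marker and disease locus is theta."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    # Nonrecombinant child: probability (1 - theta) if linked, 0.5 if unlinked.
    non_recombinant = (1 - theta) * math.log10((1 - theta) / 0.5)
    # Recombinant child: probability theta if linked, 0.5 if unlinked.
    recombinant = theta * math.log10(theta / 0.5)
    return non_recombinant + recombinant


def children_needed(elod: float, target_lod: float = 3.0) -> int:
    """Whole number of informative children whose summed ELOD reaches target_lod."""
    return math.ceil(target_lod / elod)


elod = elod_two_point(0.1)
print(round(elod, 2))         # 0.16
print(children_needed(elod))  # 19
```

Rounding 3/ELOD up to a whole individual is a choice made here; the ratio itself is about 18.8.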
Note that several constraints apply to the linkage analysis described above. One must have access to a parent with known phase between A and D. The marker A to be tested for linkage must have useful heterozygosity. The diagnosis of D must be unambiguous in all the individuals tested. Note that failing to diagnose an individual who is carrying D (a false negative) does not hurt the analysis, since in this case the individual and the parent are not scored. However, misclassifying an individual as carrying D instead of d (a false positive) causes serious problems because it will weaken the evidence about which alleles at other loci are cosegregating with D.

INTERVAL MAPPING

Once a genetic map is available for a region of interest, the process of linkage analysis can be made more powerful by examining several markers simultaneously. We will consider the simplest possible case, illustrated in Figure 6.19a. As in the previous discussion of simple linkage analysis, we will calculate the average contribution to the LOD score from a single, informative individual inheriting a disease allele D. We wish to test a region of the genome containing two linked loci with markers A and B to see if the disease allele D lies between them or is unlinked. (Here we ignore the case that it might be linked to A and B but lie outside them rather than between them.)

Suppose that the loci containing A and B are 20 cM apart. This is a reasonable model for how human genetic maps are used in average regions of the genome. Thus $\theta_{AB} = 0.2$. First we calculate the possible contributions from a parent carrying D to a child, also carrying D, if there is no linkage between A and B with D (Fig. 6.19b). Since A and B are on the same chromosome, D, if unlinked, must lie on a different chromosome. Assuming that the parent is heterozygous and informative at all these loci, there are four possible contributions from the parent to the child (Fig. 6.19c).


If no recombination between A and B occurs (80% probability for markers 20 cM apart), the child will inherit either ABD (probability 0.4) or abD (probability 0.4). If recombination between A and B occurs (20% probability), the child will inherit either AbD (probability 0.1) or aBD (probability 0.1).

If D is linked and located between A and B, assuming the phase of the parent is known, the two homologous chromosomes of the parent carry alleles ADB and adb, as shown in Figure 6.19d. In principle, D may lie anywhere between A and B, and the actual position of D is a variable that must be included in the calculations. Here we will consider the simple case where D lies midway between A and B. Assuming that the recombination frequency is uniform in this region of the chromosome, we can then place D 10 cM from A and 10 cM from B (Fig. 6.19e). There are four possible sets of alleles that can be passed from this parent to a child who inherits D (Fig. 6.19f). These are as follows:

ADB: resulting from no recombination between A and D, and no recombination between D and B (probability 0.9 × 0.9 = 0.81).

ADb: resulting from no recombination between A and D, but recombination between D and B (probability 0.9 × 0.1 = 0.09).

aDB: resulting from recombination between A and D, but no recombination between D and B (probability 0.1 × 0.9 = 0.09).

aDb (a double crossover event): resulting from recombination both between A and D and between D and B (probability 0.1 × 0.1 = 0.01).

Thus the same four possible genotypes can arise either with or without linkage. However, the odds of particular genotypes vary considerably in the two cases. For the four possible offspring:

Alleles                   ADB        ADb        aDB        aDb
Odds (linked/unlinked)    0.81/0.4   0.09/0.1   0.09/0.1   0.01/0.4

The ELOD for a single statistically representative child can be calculated from these results by realizing that if there is linkage, the probabilities of seeing the four patterns of alleles are 0.81, 0.09, 0.09, and 0.01, respectively. Thus the ELOD is given by

$$\mathrm{ELOD}(0.2) = 0.81 \log_{10}\frac{0.81}{0.4} + 0.09 \log_{10}\frac{0.09}{0.1} + 0.09 \log_{10}\frac{0.09}{0.1} + 0.01 \log_{10}\frac{0.01}{0.4}$$

$$\mathrm{ELOD}(0.2) = 0.25 - 0.004 - 0.004 - 0.02 \approx 0.22$$

Figure 6.19 Interval mapping to test the hypothesis that a disease allele D is located equidistant between two linked markers A and B, separated by 20 cM. (a) Map of the test region. (b) Parental chromosomes if D is unlinked to A and B. (c) Possible chromosomes inherited by an offspring carrying the disease allele in the absence of linkage. (d) Parental chromosomes if D lies between A and B. (e) Map location assumed for D for the example calculated in the text. (f) Possible parental contributions to an offspring inheriting the disease allele.

Note that this ELOD is larger in the case of interval mapping than in the simple case of linkage analysis we considered earlier. The number of informative individuals that would have to be examined to achieve a LOD score of 3 would be 3/0.22 ≈ 14.

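The interval-mapping ELOD can be reproduced with a few lines of code. This is a sketch under the same assumptions as the worked example: D midway between A and B, independent crossovers in the two subintervals, a phase-known parent, and base-10 logarithms. The function name `interval_elod` and its parameter names are ours.

```python
import math


def interval_elod(theta_ad: float = 0.1, theta_db: float = 0.1,
                  theta_ab: float = 0.2) -> float:
    """Expected LOD per informative child for D lying between A and B,
    compared against the hypothesis that D is unlinked to both markers."""
    # Haplotype probabilities if D lies between A and B (no interference assumed).
    linked = {
        "ADB": (1 - theta_ad) * (1 - theta_db),
        "ADb": (1 - theta_ad) * theta_db,
        "aDB": theta_ad * (1 - theta_db),
        "aDb": theta_ad * theta_db,
    }
    # If D is unlinked, only the A-B recombination fraction matters.
    unlinked = {
        "ADB": (1 - theta_ab) / 2,
        "ADb": theta_ab / 2,
        "aDB": theta_ab / 2,
        "aDb": (1 - theta_ab) / 2,
    }
    return sum(p * math.log10(p / unlinked[h]) for h, p in linked.items())


elod = interval_elod()
print(round(elod, 2))       # 0.22
print(math.ceil(3 / elod))  # 14
```

Carrying full precision through the four terms gives about 0.224, so roughly 14 informative individuals are needed to reach a LOD of 3.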
FINDING GENES BY GENETIC MAPPING

What is done, in practice, is to repeat the kinds of calculations previously described with all possible values of θ as a variable, using actual genotype data from real families. For simple linkage analysis the sorts of results obtained are shown schematically in Figure 6.20. These yield the expected LOD score as a function of θ. The critical results are the maximum LOD value and the confidence limits on possible values of θ. With interval mapping, the results are more complex, but the basic kind of information obtained is similar, as shown by the example in Figure 6.21. For details, see Ott (1991) and Lalouel and White (1966).

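A LOD curve of the kind described here can be sketched for the simplest situation, in which every meiosis is phase-known and can be scored as recombinant or nonrecombinant. The two-point LOD is then the base-10 log of the likelihood at a trial recombination fraction divided by the likelihood at 0.5. The counts used below (2 recombinants in 20 meioses) are invented for illustration, and the function names are ours.

```python
import math


def lod(theta: float, recombinants: int, meioses: int) -> float:
    """Two-point LOD score at a trial recombination fraction theta."""
    non_recombinants = meioses - recombinants
    log_likelihood_linked = (recombinants * math.log10(theta)
                             + non_recombinants * math.log10(1 - theta))
    log_likelihood_unlinked = meioses * math.log10(0.5)
    return log_likelihood_linked - log_likelihood_unlinked


def max_lod(recombinants: int, meioses: int) -> tuple[float, float]:
    """Scan theta = 0.01, 0.02, ..., 0.50 and return (best theta, LOD there)."""
    grid = [i / 100 for i in range(1, 51)]
    best = max(grid, key=lambda t: lod(t, recombinants, meioses))
    return best, lod(best, recombinants, meioses)


theta_hat, z_max = max_lod(recombinants=2, meioses=20)
print(theta_hat, round(z_max, 2))  # 0.1 3.2
```

Real pedigree data usually include phase-unknown and partially informative meioses, which need the fuller likelihood methods treated in the references cited above; the grid scan here only illustrates the shape of the calculation.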
In a typical case, no a priori information exists about the putative location of a gene of interest. To have a reasonable chance of finding it, one must test the hypothesis that it lies near (or between) any of about 150 informative markers. This will subdivide the genome into intervals spaced about 20 cM apart. Each marker must be tested with a sufficient number of informative individuals to achieve a LOD score of 3.0 or higher if that particular marker is linked to the gene. With present technology this search is often carried out one marker and one individual at a time. It is easy to estimate that around 150 markers × 40 to 60 individuals (parents and offspring) must be tested in ideal cases where parental phase is known and markers are very informative. If the analysis is carried out by ordinary Southern blotting (Chapter 3) of DNA bands, 6000 to 12,000 gel electrophoresis lanes have to be examined by hybridization to afford a reasonable chance of finding a gene, and this is an ideal case! Schemes have recently been described that can reduce the workload by an order of magnitude through the use of pools of samples (Churchill et al., 1993; see also Chapter 9 for examples of the power of pooling).

If a LOD score of 3.0 or greater is achieved, there is a reasonable chance that the correct location of the disease gene of interest has been found. What is usually done is to celebrate, publish a preliminary report, and fend off overoptimistic members of the press or families segregating the disease of interest who confuse the first sighting of a gene location with the identification of the actual disease gene itself. Knowing the location of a disease gene does provide improved diagnostics for the disease but, initially, only in those families where the phase of the disease allele and nearby markers is known.

Figure 6.20 LOD score for linkage of two genes, with a particular recombination frequency, that would be seen in a typical set of family inheritance data.

Figure 6.21 Example of interval mapping data. Shown is the expected LOD score [log(odds)] as a function of the possible location of a gene within the interval mapped by two known linked genes.
(Adapted from Leppert et al., 1987).
Furthermore, at a 10 cM distance, the amount of recombination between the marker and the disease allele in each meiosis is still 10%, so the accuracy of any genetic testing is quite limited. More accurate approaches are outlined in Box 6.3 and Box 6.4.
BOX 6.3
MULTIPOINT MAPPING
More accurate genetic maps can be constructed by considering all the loci simultaneously rather than just dealing with pairs of loci. In this case what one establishes, primarily, is the order of the loci and the relative odds in favor of that order based on the sum of all the available data. In principle, one can write down all possible genetic maps and calculate the relative likelihood of each being correct in the context of the available data. In practice, it is usually quite tedious to do this. Instead, as shown in Figure 6.22, one usually plots the most likely map, and gives the relative odds that the order of each successive pair of markers is reversed from the true order.
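To give a feeling for why writing down every possible map is tedious, the sketch below enumerates the distinct linear orders of a set of markers, treating an order and its mirror image as the same map (an assumption made here, since a linkage map read from either end is the same map). The helper name `distinct_orders` is ours.

```python
import math
from itertools import permutations


def distinct_orders(markers: str) -> list[tuple[str, ...]]:
    """List each linear marker order once, counting a map and its reverse as one."""
    seen = set()
    orders = []
    for order in permutations(markers):
        if order[::-1] in seen:
            continue  # mirror image of an order already listed
        seen.add(order)
        orders.append(order)
    return orders


print(len(distinct_orders("ABCD")))  # 12
print(math.factorial(10) // 2)       # 1814400 candidate orders for 10 markers
```

The count grows as n!/2, so even a modest number of loci rules out exhaustive comparison by hand.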
Figure 6.22 Typical map data by multipoint analysis. Shown are the relative odds in favor of two orderings of the markers A, B, C, and D.

The next goals are to strengthen the evidence for linkage and narrow the putative location of the gene. Additional examples of affected individuals can be examined using only the closest known markers. If this increases the LOD score, there is little doubt that the gene location has been correctly identified. Once the interval containing the gene is known, one can look for additional markers in the region of interest. Various methods to find polymorphic markers in selected DNA regions will be described later. These methods are quite powerful so long as the region is actually polymorphic in the population. Note that once the approximate gene location is found, the markers used to refine that location need not be informative in all patients in the sample. What is key is to find particular individuals who demonstrate recombination between the disease gene and nearby markers. Until linkage was established such individuals actually weakened the search because there was no way of knowing a priori that they were recombinants, and thus, as shown in earlier examples, they subtracted from the expected LOD score. Once the gene is known to be nearby, such individuals can be recognized as recombinants and properly scored as shown by the example in Figure 6.23. Just two informative individuals with recombination events defined by their haplotypes (patterns of alleles on a single chromosome) are sufficient to pinpoint the location of the disease gene, barring the unlikely occurrence of a gene conversion or double crossover.

MOVING FROM WEAK LINKAGE CLOSER TO A GENE

Failure to find a linked marker in an initial test does not mean that no marker is linked to the gene. A disease gene must lie somewhere in the genome. A possibility is that the model for inheritance used in the linkage study was wrong. One must consider dominant and recessive inheritance as well as more complex cases where multiple alleles or even multiple genes are involved. It is very tempting in cases where the maximum LOD score obtained is less than 3.0 to review individual families contributing to the LOD score and ask if the score can be improved by dropping some of the families. This implicitly challenges the diagnosis in these families or presumes that the disease is heterogeneous, that is, that it is influenced by other factors in addition to the particular gene in question.
Figure 6.23 Examples of two recombinant genotypes seen from a parent with known phase. Once the disease allele D is known to lie in this region, the genotype of the two recombinants restricts the possible location of the disease gene to between markers b and c.

This is a very dangerous practice, since if one starts with a sufficient number of families, it will almost always be possible to achieve an alluring LOD score by selectively choosing among them. Clearly the appropriate statistical tests must be employed to discount the resulting LOD score against such selective manipulation of the data. The real issue is not whether one can increase a LOD score by dropping a family with a negative contribution. The issue is whether the magnitude of the increase in LOD is sufficient to justify the additional parameterization implicit in dropping this family. A much safer procedure is to collect more families and try additional markers near the ones that have already shown a hint of linkage if not yet compelling evidence for linkage. When this has been done, some LODs of 2.0 eventually have produced the desired gene; others have faded into oblivion.

Eventually genetic linkage studies may narrow down the location of a gene to a 2 cM region. However, in such an interval of the genome, there may be a single gene or more than 80. It is very difficult to use conventional linkage analysis to narrow the location further. The available families are likely to have only a limited number of recombination events in the region of interest because they represent just a few generations, which means any recombinations seen must have occurred recently. A 2 cM localization means that already 50 informative meioses have been found. It is usually not efficient to keep gathering more families at this point, although it is efficient to keep trying to find additional informative markers, since these can narrow down the location of any recombination events.

LINKAGE DISEQUILIBRIUM

In fortunate cases, a variant on linkage analysis can be used to home in on the likely location of a disease gene once it has been assigned to a mapped region of a chromosome. Suppose that most affected individuals have the same disease allele D. This is the case, for example, with the Huntington's disease individuals who live near Lake Maracaibo in Venezuela; it is also the case with individuals affected with sickle cell disease, and with most individuals of northern European descent afflicted with severe cystic fibrosis. In such cases it is possible that the disease is the result of a founder effect: all affected individuals have inherited the same disease allele-carrying chromosome from a common progenitor. (When no evidence for a single disease allele exists, but phenotypic variation in the disease is evident, one can try to subtype the disease by severity, age of onset, or particular symptoms, and test the presumption that, for this subtype, a founder effect may exist.)
If a disease allele arose once by mutation on a single chromosome, it will be created in the context of a particular haplotype (Fig. 6.24). The chromosome that first carries the disease will have a particular set of polymorphic markers. It will have a particular genetic background. As this chromosome is passed through many generations of offspring, it will suffer frequent meiotic recombination events. These will tend to blur the memory of the original haplotype of the chromosome; they will average out the original genetic background with the general distribution of markers in the human population. However, those markers very close to the disease gene will tend, more likely than average, to retain the haplotype of the original chromosome because, as the distance to the disease gene shrinks, it becomes less likely that recombination events will have occurred in this particular location.

Figure 6.24 Generation of a disease allele by a mutation on a founder haplotype sets the stage for linkage disequilibrium.

Humans are an outbred population. Most alleles were established when the species was established, and a sufficient number of generations has passed since then that frequent recombination events have occurred between any pair of neighboring loci resolved on our genetic maps. For this reason the distribution of particular haplotypes in neighboring loci in the population (as opposed to particular families) should be close to random. Consider the case shown in Figure 6.25, for two neighboring loci with two alleles each. Within the population, the frequencies X of the alleles at a particular locus must sum to 1.0.

$$X_a + X_A = 1.0 \qquad X_b + X_B = 1.0$$

The frequencies of particular haplotypes, f, should be given by simple binomial statistics:

$$f_{AB} = X_A X_B \qquad f_{Ab} = X_A X_b \qquad f_{aB} = X_a X_B \qquad f_{ab} = X_a X_b$$

Deviations from these results, measured, for example, as $f_{AB}^{\mathrm{observed}} - f_{AB}^{\mathrm{calculated}}$, are evidence for linkage disequilibrium, and they indicate that the individuals examined are not a random sample of the population. Note, however, that deviation of allele frequencies from those expected by binomial statistics may have other causes besides genetic linkage. Deviations can reflect improper sampling of the population, or they can reflect actual functional association between specific alleles. The latter process could occur, for example, if the protein products of the two genes in question actually interacted biochemically. (For further discussion see Ott, 1991.)

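The comparison of observed and calculated haplotype frequencies can be sketched as follows. The haplotype counts are hypothetical, chosen only to illustrate one sample that departs from equilibrium and one that does not; the function names are ours, and the deviation is taken as the simple difference between the observed and calculated frequency of the AB haplotype.

```python
def expected_haplotype_freqs(x_A: float, x_B: float) -> dict[str, float]:
    """Haplotype frequencies expected at equilibrium from allele frequencies."""
    x_a, x_b = 1 - x_A, 1 - x_B
    return {"AB": x_A * x_B, "Ab": x_A * x_b, "aB": x_a * x_B, "ab": x_a * x_b}


def ab_deviation(counts: dict[str, int]) -> float:
    """Observed minus calculated frequency of the AB haplotype."""
    total = sum(counts.values())
    observed = {h: n / total for h, n in counts.items()}
    # Allele frequencies are the marginal sums of the haplotype frequencies.
    x_A = observed["AB"] + observed["Ab"]
    x_B = observed["AB"] + observed["aB"]
    return observed["AB"] - expected_haplotype_freqs(x_A, x_B)["AB"]


# Hypothetical sample with an excess of AB chromosomes.
print(round(ab_deviation({"AB": 60, "Ab": 10, "aB": 10, "ab": 20}), 2))  # 0.11
# Hypothetical sample at equilibrium.
print(round(ab_deviation({"AB": 49, "Ab": 21, "aB": 21, "ab": 9}), 2))   # 0.0
```

Both samples have the same allele frequencies (0.7 for A and for B); only the way the alleles are paired on chromosomes differs.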
To search for a gene by linkage disequilibrium, one does not examine families segregating a disease allele D. Instead, one looks across a broad spectrum of the population for unrelated individuals who have the disease allele D. If evidence for linkage disequilibrium is found, it reflects recombinations along the chromosome all the way back in time to the original founder. Since this may extend back hundreds of years, more than ten generations may be involved, and thus the number of recombination events viewed will be much greater than possible with any contemporary family. In the case of linkage disequilibrium, we expect to see the general results shown in Figure 6.26. There will be a gradient of increasing deviation from equilibrium as the neighborhood of the disease gene is reached because of the diminishing likelihood of recombination events occurring in an ever-shrinking region.

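The gradient can be illustrated with a standard approximation that is not derived in the text: if a marker lies at recombination fraction θ from the disease gene, the chance that no recombination has separated them along a line of descent spanning n generations is (1 − θ)^n. The sketch below assumes 20 generations since the founder and treats 1 cM as θ = 0.01; both are illustrative assumptions, as is the function name.

```python
def fraction_never_recombined(theta: float, generations: int) -> float:
    """Probability that a marker and the disease allele have stayed together
    through every meiosis along a lineage of the given length."""
    return (1 - theta) ** generations


GENERATIONS = 20  # hypothetical number of generations since the founder
for distance_cm in (0.1, 1, 5, 10):
    theta = distance_cm / 100  # small-distance approximation: 1 cM ~ 1% recombination
    kept = fraction_never_recombined(theta, GENERATIONS)
    print(f"{distance_cm:>4} cM: {kept:.3f}")
```

Under these assumptions a marker 0.1 cM away is still on the founder haplotype in about 98% of lineages, whereas a marker 10 cM away retains it in only about 12%, which is the qualitative shape of the gradient described above.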
Figure 6.25 Possible haplotypes in a two-allele system used to examine whether loci are at equilibrium.

Figure 6.26 Gradient of linkage disequilibrium seen near a disease allele in a case where a founder effect occurred.
COMPLICATIONS IN LINKAGE DISEQUILIBRIUM AND GENETIC MAPS IN GENERAL
The human genome is a potential minefield of uncharted genetic events, hidden rearrangements, new mutations, and genetic heterogeneity. Failure to see linkage disequilibrium near a gene does not mean that the gene is far away. Two of the most plausible potential complications are the existence of more than one founder or the existence of a significant fraction of alleles in the population that have arisen by new mutations. For example, in the case of dominant lethal diseases (those in which, nominally, the affected individuals have no offspring), one must expect that most disease alleles will be new mutations. Multiple founders can occur in distinct geographical populations, and they can be tested for by subdividing the linkage disequilibrium analysis accordingly. However, our increasingly mobile population, at least in developed countries, will make such analyses increasingly difficult.
Two other reasonable explanations for a failure to see linkage disequilibrium near a disease gene of interest are shown in Figure 6.27. The first of these is the possible presence of recombination hot spots. If the recombination pattern in the region of interest is punctate, then an even gradient of linkage disequilibrium will not be seen. Instead, markers that lie within a pair of hot spots will appear to be in disequilibrium, while those that lie on opposite sides of a hot spot will appear to have equilibrated. The occurrence of any disequilibrium in the region is presumptive evidence that a disease gene is there, since this is the basis for selection of the particular set of individuals to be examined. However, the complex pattern of allele statistics in the region will make it difficult to narrow in on the location of the disease gene.
Figure 6.27 Complications that can obscure evidence for linkage disequilibrium. (a) Recombination hot spots near the disease gene. (b) Mutation hot spots near the disease gene.

A second potential source of confusion is the presence of mutation hotspots. These are quite common in the human genome. For example, the sequence CpG is quite mutagenic in those regions of the genome where the C is methylated, as discussed in Chapter 1. When mutation hotspots are present, the alleles at these sites appear to have equilibrated with their neighbors, while more distant pairs of alleles may still show deviations from equilibrium. As in the case of recombination hotspots, disequilibrium indicates that one has not sampled the population randomly. This is presumptive evidence for a disease gene nearby, but mutation hotspots weaken the power of the disequilibrium approach to actually focus in on the location of the desired gene.

DISTORTIONS IN THE GENETIC MAP
We have already discussed briefly the occurrence of recombination hot spots and their deleterious effect on attempts to find genes by linkage disequilibrium. Some hot spots are inherited; in the mouse Major Histocompatibility Complex (MHC), a set of genes that regulates immune response, a hot spot allele has been found that raises the local frequency of recombination by a hundredfold. All of the recombination events caused by this hot spot have been mapped within the second intron of the Eβ gene, 4.3 kb in size. While we are not sure what has caused this hot spot, the region has been sequenced, and one peculiarity is the occurrence of four sequences with 9/11 bases equal to a consensus sequence TGGAAATCCCC. Such sequences have also been found in regions associated with other recombination hot spots.

The genetic map of humans and other organisms is not uniform. Recombination is generally higher near the telomeres and lower near centromeres. The map is strikingly different in males and females; that is, meiosis in males and females appears to display a very different pattern of recombination hot spots. A typical example is shown for a selected region of human chromosome 1 in Figure 6.28. Note that some regions that have short genetic distances in the female have long distances in the male, and vice versa. Genetic linkage analysis is more powerful in regions where recombination is prevalent because, the more recombinants per Mb, the more finely the genetic data will serve to subdivide the region. In general, genetic maps based on female meioses are considerably longer than those based on male meioses. This is summarized in Table 6.1. A frequent practice is to pool data and show a sex-averaged genetic map. It is not very clear that this is a reasonable thing to do. Instead, it would seem that once a region of interest has been selected, meioses should be chosen from either the female or the male depending on which set produces the most expanded and informative map of the region. At present it does not appear that most workers pay much attention to this.
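The 9-of-11 match criterion mentioned for the mouse MHC hot spot can be expressed as a simple sliding-window scan. The sketch below is only an illustration of that criterion: it searches one strand, the test sequence is invented, and the function name `consensus_hits` is ours.

```python
CONSENSUS = "TGGAAATCCCC"  # consensus sequence quoted in the text


def consensus_hits(sequence: str, consensus: str = CONSENSUS,
                   min_matches: int = 9) -> list[tuple[int, str, int]]:
    """Return (start, window, matches) for every window of the sequence that
    agrees with the consensus at min_matches or more positions."""
    sequence = sequence.upper()
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        matches = sum(1 for s, c in zip(window, consensus) if s == c)
        if matches >= min_matches:
            hits.append((start, window, matches))
    return hits


# Invented sequence: one perfect copy and one copy with two mismatches.
toy = "ACGT" + "TGGAAATCCCC" + "GGCA" + "TGGTAATCACC" + "TT"
for hit in consensus_hits(toy):
    print(hit)
```

A search of real genomic sequence would also need to scan the reverse complement, and short degenerate motifs of this kind are expected to turn up by chance in long sequences, so hits alone would not establish a hot spot.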
Figure 6.28 Comparison of low-resolution genetic maps in female and male meiosis. Shown is a portion of the map of human chromosome 1.