
Here m/(M-1) is the probability that there is an invasion by members or sequences from another patch that carry the same novel sequence.

The number m will increase by one only if this invasion takes place in one of the M-m patches that do not carry that novel sequence. Furthermore, a change occurs only if the invasion is successful. In this case the invader actually takes over a patch with probability $P_{\mathrm{fix}}^{\mathrm{inv}}(s)$. Here, s is the selective advantage, or disadvantage if s < 0, of the new sequence. If the new sequence is infective, i.e. can be transferred between individuals of the population, s should be replaced by s+γ here and everywhere below, as shown in equations (7) and (9). Similarly, the sequence can be lost from a patch with rate

$$k_m^- = \lambda\,\frac{M-m}{M-1}\,m\,P_{\mathrm{fix}}^{\mathrm{inv}}(-s) + u N_p\,m\,P_{\mathrm{fix}}^{\mathrm{mut}}(s) + d\,\frac{m\,(M-m)}{M} \tag{23}$$

The first term is just the converse of the corresponding term in equation (22). Here, an individual from a patch not carrying the novel gene invades one of the m patches that carry the novel sequence in question at a rate given by λ(M-m)/(M-1) and is fixed with probability $P_{\mathrm{fix}}^{\mathrm{inv}}(-s)$, since relative to the residents the invader has selection coefficient -s.

The second term in equation (23) is the rate of mutational inactivation: an expressed sequence can be inactivated with the mutation rate u in any of the Np cells of a patch carrying the novel sequence. This change will become fixed in the patch with probability $P_{\mathrm{fix}}^{\mathrm{mut}}(s)$.

Standard models can calculate the fixation probabilities in a single patch, as for example from equations (2) and (7) – (9). If each invasion occurs with n0 individuals entering a patch, the fixation probability in the patch is

$$P_{\mathrm{fix}}^{\mathrm{inv}}(s) = \frac{1-e^{-s n_0}}{1-e^{-s N_{pe}}}\,\frac{N_{pe}}{N_p} \approx \frac{s\,n_0}{1-e^{-s N_{pe}}}\,\frac{N_{pe}}{N_p} \tag{24}$$

This is the same as equation (9) when n0 = 1. The extra factor Npe/Np accounts for the difference in real and effective population size of the patch (Crow and Kimura, 1970). The expression is valid if there is little or no interference by inactivating mutations during the takeover process. Thus, it is required that u << s, or that u < 2/Npe if s = 0. Each inactivating mutation that occurs with rate u per individual will be fixed in the patch with probability

$$P_{\mathrm{fix}}^{\mathrm{mut}}(s) = \frac{e^{s}-1}{e^{s N_{pe}}-1}\,\frac{N_{pe}}{N_p} \approx \frac{s}{e^{s N_{pe}}-1}\,\frac{N_{pe}}{N_p} \tag{25}$$

Equation (25) is a special case of equation (24) when s → -s and n0 = 1.
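The two single-patch fixation probabilities can be evaluated numerically. The following is a minimal Python sketch, assuming the reading of equations (24) and (25) in which the factor Npe/Np multiplies the ratio of exponentials; the function and argument names (`p_fix_inv`, `p_fix_mut`, `n_pe`, `n_p`) are illustrative and do not come from the source.

```python
import math


def p_fix_inv(s, n0, n_pe, n_p):
    """Equation (24): fixation probability of n0 invaders with advantage s."""
    if s == 0.0:
        # Neutral limit: (1 - e^{-s n0}) / (1 - e^{-s Npe}) -> n0 / Npe.
        return n0 / n_p
    # expm1(x) = e^x - 1 keeps precision when |s| is small.
    return math.expm1(-s * n0) / math.expm1(-s * n_pe) * n_pe / n_p


def p_fix_mut(s, n_pe, n_p):
    """Equation (25): fixation probability of an inactivating mutation."""
    if s == 0.0:
        return 1.0 / n_p
    return math.expm1(s) / math.expm1(s * n_pe) * n_pe / n_p
```

With these definitions, `p_fix_mut(s, n_pe, n_p)` and `p_fix_inv(-s, 1, n_pe, n_p)` return the same value, which is the special-case relation stated above.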

The probability of global takeover in all patches can be calculated from the scheme in the same way as in equation (2)

$$P_{\mathrm{fix}}^{\mathrm{global}} = \frac{1}{1+\displaystyle\sum_{m=1}^{M-1}\prod_{i=1}^{m}\frac{k_i^-}{k_i^+}} \tag{26}$$

This expresses the probability that the gene will be fixed in all patches at least once before eventual loss, if initially it was fixed in one patch. Thus, the fixation probability is determined by the ratios of backwards and forwards rates

$$K_m = \frac{k_m^-}{k_m^+} = K\left[1 + V\,\frac{M-1}{M-m}\right] \tag{27}$$
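The sum over products in equation (26) can be accumulated in a single loop. The sketch below is a minimal Python illustration, assuming equation (27) is read as K_m = K[1 + V(M-1)/(M-m)] with K and V constant in m; the names `p_fix_global` and `k_ratio_from_k_v` are hypothetical helpers, not part of the source.

```python
def p_fix_global(big_m, k_ratio):
    """Equation (26) for a population of M patches.

    k_ratio(m) must return K_m = k_m^- / k_m^+ for m = 1 .. M-1.
    """
    total = 1.0      # the leading 1 in the denominator
    product = 1.0    # running product K_1 * ... * K_m
    for m in range(1, big_m):
        product *= k_ratio(m)
        total += product
    return 1.0 / total


def k_ratio_from_k_v(big_m, k, v):
    """Assumed form of equation (27): K_m = K * (1 + V * (M - 1) / (M - m))."""
    return lambda m: k * (1.0 + v * (big_m - 1) / (big_m - m))
```

For K = 1 and V = 0 (neutral, no inactivation) the function returns 1/M, as expected for a neutral variant that starts fixed in one of M equivalent patches.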

 

From equations (22) and (23) we obtain:

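$$K = \frac{\lambda\,(e^{s n_0}-1)\,N_{pe} + d\,(e^{s N_{pe}}-1)\,N_p}{\lambda\,(1-e^{-s n_0})\,N_{pe}\,e^{s N_{pe}} + d\,(e^{s N_{pe}}-1)\,N_p} \approx 1 - \frac{\Lambda}{\Lambda+d}\,s N_{pe} \tag{28}$$

$$V = \frac{u\,N_p\,N_{pe}\,s}{\lambda\,(e^{s n_0}-1)\,N_{pe} + d\,(e^{s N_{pe}}-1)\,N_p} \approx \frac{u}{\Lambda+d} \tag{29}$$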

The approximations are relevant in the limit of weak selection, |s|n0 << |s|Npe << 1. In these equations, the migration rate Λ has been introduced as follows:

$$\Lambda = \lambda n_0 / N_p \tag{30}$$
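To see how close the weak-selection approximations are, the exact and approximate forms of K and V can be compared numerically. This is a minimal Python sketch under the reading of equations (28) and (29) given by the surrounding definitions (minus signs restored from the derivation); the function names and the argument `lam` for λ are illustrative assumptions.

```python
import math


def k_and_v(s, u, lam, d, n0, n_pe, n_p):
    """Exact K and V of equations (28) and (29); s must be nonzero."""
    # Shared expression: numerator of K and denominator of V.
    a = lam * math.expm1(s * n0) * n_pe + d * math.expm1(s * n_pe) * n_p
    # Denominator of K.
    b = (-lam * math.expm1(-s * n0) * n_pe * math.exp(s * n_pe)
         + d * math.expm1(s * n_pe) * n_p)
    return a / b, u * n_p * n_pe * s / a


def k_and_v_weak(s, u, lam, d, n0, n_pe, n_p):
    """Weak-selection approximations, with Lambda from equation (30)."""
    big_lambda = lam * n0 / n_p
    k = 1.0 - big_lambda / (big_lambda + d) * s * n_pe
    return k, u / (big_lambda + d)
```

For parameters with |s|Npe well below 1 the two functions should agree closely; the agreement degrades as |s|Npe approaches 1, where only `k_and_v` applies.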

For a neutral sequence both invasion and extinction are random processes and their effects are additive. In contrast, for a selected gene, as seen in equation (28), the random extinction process (with rate d) decreases the effective selection and thereby the fixation probability.

The migration rate between patches can be defined as the fraction of individuals that are new immigrants in any generation and this rate, as given in equation (30), corresponds to the effective replacement rate if the immigrants are neutral. This calculation assumes that the diversity within a patch is negligible. This can be so if the population is strongly clonal so that the effective population size (Npe) of a patch is small. In this case, new immigrants will either be purged rapidly or take over rapidly.

6. Effective population size

A comparison between equations (27) – (29) in the limit of weak selection, |s|n0 << |s|Npe << 1, with equation (6) for a single homogeneous population shows that the global patchy population can be described in an analogous way by the following replacements:

$$n \to m,\quad N \to M,\quad s+\gamma \to (s+\gamma)\,\frac{N_{pe}\Lambda}{\Lambda+d},\quad u \to \frac{u}{\Lambda+d} \tag{31}$$

Thus, for a novel sequence that initially has taken over a single patch, the total fixation probability and the total residence times can be calculated from equations (7), (8) and (10) after the replacements indicated in equation (31). The persistence times calculated in this way will be in units of 1/(Λ+d). Similarly, the heterogeneity of the distribution of novel genes between the patches can be calculated from equations (17), (18) or (21) using these parameter replacements.

Furthermore, one finds that the number, xm, of transient genes that are present in a total of m patches is given by equations (10) and (16) if, in addition to the replacements in equation (31), the following substitution is made:

$$c_s \to \frac{c_s N_p P_{\mathrm{fix}}}{\Lambda+d} \tag{32}$$

If cs denotes the rate of incorporation in a single individual, then csNpPfix would denote the rate of novel sequence incorporation in a single patch. While the effective mutation rate, uN, in the patchy population is replaced by uM/(Λ+d), the effective selection, sN, is replaced by sNpeMΛ/(Λ+d). The effective population size can be identified (Wright, 1931) as the size of the idealized population that would have the same gene frequencies (equation 10) as the one considered. While there exist a number of formal definitions of the effective population size (Crow and Kimura, 1970; Ewens, 1979), here we are concerned only with the practical choice(s) required to put the gene frequency distribution of the patchy population on the same footing as the homogeneous one. Thus, the effective population size, Ne, associated with the mutation rate is determined by

$$N_e = \frac{M}{\Lambda+d} \tag{33}$$

When Λ=0, this agrees with previous results (Maruyama and Kimura, 1980; Berg, 1996). However, the corresponding effective population size associated with the selection coefficient must be chosen differently:

$$N_e' = \frac{N_{pe} M \Lambda}{\Lambda+d} = N_e N_{pe} \Lambda \tag{34}$$

When d=0, Ne' is simply the product of the number of patches and the effective population size for a patch.
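The replacements (31) and the two effective sizes (33) and (34) are simple enough to tabulate directly. The following Python sketch is illustrative only; the function names and the argument `big_lambda` for Λ are assumptions introduced here.

```python
def effective_sizes(big_m, n_pe, big_lambda, d):
    """Return (Ne, Ne') from equations (33) and (34)."""
    n_e = big_m / (big_lambda + d)
    n_e_prime = n_e * n_pe * big_lambda
    return n_e, n_e_prime


def homogeneous_equivalents(s, gamma, u, n_pe, big_lambda, d):
    """Replacements (31): effective selection and inactivation rate,
    both expressed in time units of 1 / (Lambda + d)."""
    scale = big_lambda + d
    return (s + gamma) * n_pe * big_lambda / scale, u / scale
```

For example, with M = 100, Npe = 50, Λ = 0.01 and d = 0, `effective_sizes` gives Ne' = M·Npe = 5000, in line with the statement for d = 0.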

It can be noted that in both a panmictic global population and a patchy population the total number of neutral genes in an individual can be described by $Y_{\mathrm{tr}}^{0} = c_0/u$, as in equation (19).

For a novel sequence that starts in a single individual in any one patch, the total fixation probability is the product of $P_{\mathrm{fix}}$ (equation 24 with n0 = 1) and $P_{\mathrm{fix}}^{\mathrm{global}}$ (equation 26). Thus the total fixation probability for a neutral sequence starting in a single individual in the patchy population can be calculated as

$$P_{\mathrm{fix}}^{0} = \frac{1}{N_p}\,P_{\mathrm{fix}}^{\mathrm{global}} = \frac{1}{N}\,\frac{1-uN_e}{1-M^{\,uN_e-1}} \tag{35}$$

using equation (8) for $P_{\mathrm{fix}}^{\mathrm{global}}$ after the replacements indicated by equation (31) and identifying MNp with the total population size N.
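Equation (35) can be evaluated as follows. This is a minimal Python sketch, assuming the form (1/N)(1 - uNe)/(1 - M^(uNe-1)) with Ne = M/(Λ+d) from equation (33); the special case uNe = 1, where numerator and denominator both vanish, is handled by its limit 1/(N ln M). The function name is hypothetical.

```python
import math


def p_fix_neutral_total(u, n_p, big_m, big_lambda, d):
    """Equation (35): neutral sequence starting in a single individual."""
    n_total = big_m * n_p                 # N = M * Np
    a = u * big_m / (big_lambda + d)      # u * Ne, with Ne from equation (33)
    if math.isclose(a, 1.0):
        # Limit of (1 - a) / (1 - M**(a - 1)) as a -> 1 is 1 / ln(M).
        return 1.0 / (n_total * math.log(big_m))
    return (1.0 - a) / (n_total * (1.0 - big_m ** (a - 1.0)))
```

With u = 0 the result reduces to M/((M-1)N), close to the familiar neutral value 1/N when M is large; increasing u lowers the fixation probability.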

The average infiltration for a novel gene in the patchy population is determined from equation (17) with the replacements discussed above as

$$F = \frac{\displaystyle\int_{1/M}^{1} e^{sN_e' x}\,(1-x)^{uN_e-1}\,dx}{\displaystyle\int_{1/M}^{1} e^{sN_e' x}\,x^{-1}\,(1-x)^{uN_e-1}\,dx} \tag{36}$$
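The ratio of integrals in equation (36) has no simple closed form in general, but it is easy to estimate numerically. The sketch below is one possible approach, not the authors' method: it assumes the selection term enters as exp(sNe'x), takes the products sNe' and uNe as inputs, and uses a midpoint rule after the substitution y = (1-x)^(uNe), which removes the integrable singularity at x = 1 when uNe < 1. All names are illustrative.

```python
import math


def infiltration(s_ne_prime, u_ne, big_m, steps=20000):
    """Midpoint-rule estimate of F in equation (36).

    s_ne_prime is the product s * Ne'; u_ne is the product u * Ne (> 0).
    """
    # With y = (1 - x)**u_ne, (1 - x)**(u_ne - 1) dx becomes dy / u_ne.
    # The constant 1/u_ne and the step width cancel in the ratio.
    y_max = (1.0 - 1.0 / big_m) ** u_ne
    h = y_max / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        x = 1.0 - y ** (1.0 / u_ne)   # stays above 1/M because y < y_max
        weight = math.exp(s_ne_prime * x)
        numerator += weight
        denominator += weight / x
    return numerator / denominator
```

A quick check is the neutral case with uNe = 1, where both integrals are elementary and F = (1 - 1/M)/ln M.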

7. Heterogeneous Patches.

We consider next patches that are distributed over two sorts of ecological niches: In M1 patches, a certain gene, or set of genes, are required for survival. In M2 patches the same gene(s) are “non-selected”, i.e. neutral or weakly selected or counterselected with selection coefficient s2. How many patches within the global population will carry these genes? If m2 denotes the expected number of non-selected patches where the gene is present, this number will change according to (cf. equations 22 and 23)

$$\frac{dm_2}{dt} = \frac{\lambda}{M}\,(M_1+m_2)(M_2-m_2)\,P_{\mathrm{fix}}^{\mathrm{inv}}(s_2) - \frac{\lambda}{M}\,m_2\,(M_2-m_2)\,P_{\mathrm{fix}}^{\mathrm{inv}}(-s_2) - u N_p\,m_2\,P_{\mathrm{fix}}^{\mathrm{mut}}(s_2) \tag{37}$$

The first term describes the invasion/replacement of patches by cells with the novel gene. The second term describes invasion/replacement of patches by cells not carrying the same novel gene. The third term describes the fixation of alleles of the novel gene that have been inactivated or deleted by mutation. For simplicity, the contributions from extinction/recolonization events have been neglected, d=0. In the stationary state, the time derivative is zero and, using the fixation probabilities from equations (24) and (25), m2 can be found as the solution to a second-degree equation, from which a mutation-invasion-selection balance is obtained. Some representative results are displayed in Figure 4. For a neutral or nearly neutral gene (with |exp(-s2Np)-1| < M1/M2) this gives the fraction, F2 = m2/M2, of non-selected patches that are infiltrated as

$$F_2 = \frac{M_1/M}{M_1/M + u/\Lambda} = \frac{M_1}{M_1 + uN_e} \tag{38}$$

This pattern is displayed as the solid line in Figure 4A. Unless counter-selection (s2<0) is sufficiently strong, or migration sufficiently slow (u/Λ large), the gene(s) will be present in all patches due to recurrent invasions from the subset (M1) where they are always present.

When M1/M << 1 and there is no strong positive selection (s2Np < u/Λ), this fraction (m2/M2) will be small, i.e. the novel sequence will be contained in a small number of the non-selected patches (Figure 4B). Otherwise, the gene may be present in the whole population due to recurrent invasion-replacement from the M1 patches where it is required.
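The stationary state of equation (37) can also be located numerically. The sketch below is a minimal Python illustration under stated assumptions: M = M1 + M2, the loss term uses the invader's fixation probability at -s2, and the root is found by bisection rather than by the quadratic formula (the right-hand side is positive at m2 = 0 and negative at m2 = M2, so exactly one root lies in between). Function names are hypothetical.

```python
import math


def p_fix_inv(s, n0, n_pe, n_p):
    """Equation (24), with the neutral limit n0 / Np at s = 0."""
    if s == 0.0:
        return n0 / n_p
    return math.expm1(-s * n0) / math.expm1(-s * n_pe) * n_pe / n_p


def dm2_dt(m2, m1, big_m2, s2, u, lam, n0, n_pe, n_p):
    """Right-hand side of equation (37), with d = 0 and M = M1 + M2."""
    big_m = m1 + big_m2
    gain = lam / big_m * (m1 + m2) * (big_m2 - m2) * p_fix_inv(s2, n0, n_pe, n_p)
    loss = lam / big_m * m2 * (big_m2 - m2) * p_fix_inv(-s2, n0, n_pe, n_p)
    # Equation (25) equals equation (24) with s -> -s and n0 = 1.
    decay = u * n_p * m2 * p_fix_inv(-s2, 1, n_pe, n_p)
    return gain - loss - decay


def stationary_f2(m1, big_m2, s2, u, lam, n0, n_pe, n_p, iterations=200):
    """Bisection for the root of equation (37) in [0, M2]; returns m2 / M2."""
    lo, hi = 0.0, float(big_m2)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if dm2_dt(mid, m1, big_m2, s2, u, lam, n0, n_pe, n_p) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) / big_m2
```

In the neutral case (s2 = 0) the result should coincide with equation (38); for instance M1 = 10, M2 = 90 and u/Λ = 0.1 give F2 = 0.5, and counter-selection (s2 < 0) lowers this value.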

The simplicity of this result clearly reflects the simplicity of the space (the island model) that we have chosen for these calculations. Thus, we assume that all patches are in contact with all other patches. However, in a more realistic "space" a complex pattern of accessibility and isolation of patches would be operative. Such complexity would certainly put constraints on the frequencies of recurrent invasions. The expected consequence of these constraints would be that in natural populations the invasion-selection balance would be a less likely outcome.

8. Neutral gene flux.

Clearly, the addition and removal of gene sequences are complicated processes that involve distributed lengths of DNA that are inserted into or deleted from genomes (Mira, Ochman, and Moran, 2001). If, for simplicity, it is assumed that individual neutral genes are added with a constant rate c0 and removed with rate kdel, the average number of neutral genes, LP, per genome will be determined by:

$$\frac{dL_P}{dt} = c_0 - k_{\mathrm{del}}\,L_P \tag{39}$$

Thus, in the steady state, the number of neutral genes or pseudogenes equals:

$$L_P = c_0 / k_{\mathrm{del}} \tag{40}$$

From equation (19) c0/u is the expected number of functioning neutral imported genes. Thus for every functioning gene in the genome there would be u/kdel –1 pseudogenes in various stages of decay.
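Equation (39) is a linear first-order equation, so its relaxation toward the steady state (40) can be written in closed form. The following Python sketch illustrates this; the initial value `l0` and the function names are assumptions added for the example.

```python
import math


def neutral_gene_count(t, c0, k_del, l0=0.0):
    """Closed-form solution of equation (39) starting from L_P(0) = l0."""
    steady = c0 / k_del                       # equation (40)
    return steady + (l0 - steady) * math.exp(-k_del * t)


def pseudogenes_per_functioning_gene(u, k_del):
    """Ratio u / kdel - 1 quoted in the text (meaningful when u > kdel)."""
    return u / k_del - 1.0
```

The approach to the steady state takes place on a time scale of 1/kdel, so slow deletion implies both a large pseudogene load and a slow equilibration.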

If we consider gene acquisition and inactivation as stochastic processes with constant rates, c0 and kdel, the number of neutral transient genes per genome would have a Poisson distribution with average and variance equal to c0/kdel. However, genes will not always be added or removed singly. We have simulated the stochastic acquisition and loss assuming that the sizes of imports and deletions are exponentially distributed with expectation values himp and hdel, respectively. In this case we find that the average and variance of the number of neutral novel genes per individual will be

$$N_{\mathrm{imp}} = c_0 / k_{\mathrm{del}} \tag{41a}$$

$$\sigma_{\mathrm{imp}}^2 = N_{\mathrm{imp}}\,(h_{\mathrm{imp}} + h_{\mathrm{del}}) \tag{41b}$$

This would be valid only in a sufficiently large population. In a small population, the variations are constrained. It can be shown that in a finite population, the average remains the same, equation (41a), while the variance is reduced by a factor given by the heterogeneity in neutral gene content, H0 of equation (21). In this case we are interested also in inactivated genes, as long as they are not deleted, so that H0 = kdelNe/(1+kdelNe). Thus, the expected variance in genome size due to random acquisition and loss in a finite population of effective size Ne will be

$$\sigma_{\mathrm{imp}}^2 = \frac{c_0 N_e}{k_{\mathrm{del}} N_e + 1}\,(h_{\mathrm{imp}} + h_{\mathrm{del}}) \tag{42}$$

When kdelNe >> 1, this relationship is reduced to equation (41b).
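The relation between equations (41b) and (42) can be checked with a few lines of Python. This is an illustrative sketch; the function name and the convention that `n_e=None` selects the large-population form are assumptions made here.

```python
def genome_size_variance(c0, k_del, h_imp, h_del, n_e=None):
    """Equation (42); with n_e=None, the large-population form (41b)."""
    mean_count = c0 / k_del                   # equation (41a)
    if n_e is None:
        return mean_count * (h_imp + h_del)
    h0 = k_del * n_e / (1.0 + k_del * n_e)    # heterogeneity factor H0
    return mean_count * h0 * (h_imp + h_del)
```

For kdel·Ne = 1 the variance is half of the large-population value, and it approaches equation (41b) as kdel·Ne grows.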

The variances can be much larger in models involving duplications, since the rate of creating new duplications is proportional to the total number of genes in the genome. Also, insertion of imported sequences may be facilitated if there already are a number of neutral genes present as they could provide insertion points that would not disrupt the functioning of essential genes. Furthermore, the rates cannot be expected to be constant over evolutionary time and in all ecological niches. Thus, the natural variation in gene content is expected to be larger than that given in equation (42).

Table 1. List of symbols

| Symbol | Description | Equation |
|---|---|---|
| cs (c0) | Rate of acquisition of new sequences with growth advantage s | 14, 32 |
| d | Rate of random patch extinction and recolonization | 22, 23 |
| F | Infiltration, fraction of population that carries a certain gene | 17, 36 |
| H | Heterogeneity in gene content | 18, 21 |
| kdel | Rate of gene deletion | 39 |
| kn+ (kn-) | Rates by which the number of individuals that carry a certain gene increase (decrease) in the population | 1, 4, 5 |
| km+ (km-) | Rates by which the number of patches that carry a certain gene increase (decrease) in the population | 22, 23 |
| Kn, Km | Ratio of the rates of decrease and increase in the population | 6, 27 |
| M | Total number of patches in the patchy population | |
| M1 (M2) | Number of selective (non-selective) patches | 37 |
| m | Number of patches that carry a certain sequence | |
| N | Total number of individuals in the homogeneous population | |
| Ne, Ne' | Effective population size for the whole population | 33, 34 |
| Np, Npe | Real and effective population size for each patch | |
| n | Number of individuals in a homogeneous population that carry a certain gene | |
| n0 | Initial number of individuals that carry a certain gene; number of invading individuals in a patch | 24, 30 |
| Pfix | Fixation probability in a homogeneous population or patch | 2, 7, 8, 9 |
| $P_{\mathrm{fix}}^{\mathrm{global}}$ | Fixation probability in the patchy population if the gene is initially fixed in a single patch | 26 |
| s | Selection coefficient; relative growth advantage | |
| u | Inactivation rate of a gene | 4, 5 |
| xn | Gene frequency distribution | 16, 20 |
| Xtr | Total number of transient genes in the population | 14 |
| Ytr | Number of transient genes in a single individual | 15 |
| γ | Rate of infection | 4 |
| λ | Rate of invasion between patches | 22, 23 |
| Λ | Migration rate; effective invasion rate | 30 |
