Lecture 2: The neutral theory
Neutral theory of molecular evolution
In 1968 Motoo Kimura published the neutral theory of molecular evolution. This
was followed by a similar proposal by King and Jukes in 1969. This was the starting
point of an ongoing debate about the causes of genetic variation within and
between species. Kimura was convinced that most polymorphisms that occur in a
population do not influence the fitness of an individual and consequently are not
subjected to selection. He proposed that neutral mutations are the main source of
genetic variation within species. Under this view a mutation can be neutral even if
it occurs at a non-degenerate site, i.e. even if it changes the encoded amino acid.
Another important statement of the neutral theory is that most changes in allele
frequencies in a population can be attributed to genetic drift. Once a neutral
mutation arises in a gamete of an individual it is subjected to many forces: the
carrier of the gamete must survive to reach the reproductive age, this same gamete
must be fertilized and develop into an embryo. Furthermore, the embryo has to be
viable to be part of the next generation. Thereby neutral mutations create new
alleles which then can rise in frequency and become widely spread in the
population or they can get lost by chance. For this reason another term for the
'neutral theory of molecular evolution' is the 'mutation-drift hypothesis'. Although
Kimura considered genetic drift to be the main force in evolution, he never denied
"... the role of natural selection in determining the course of adaptive evolution"
(Kimura 1986).
To model the impact of genetic drift on a population of finite size N we use the
Wright-Fisher model. In general, models without selection are much easier to treat
because we can neglect the influence of genotype dependent fitness effects. For this
reason the Wright-Fisher model is one of the most elementary models in population
genetics.
Lecture SS
Population Genetics I
Assuming the Wright-Fisher model, i.e. a population of diploid organisms with
1) constant population size
2) discrete (non-overlapping) generations
3) random mating
4) and an equal sex ratio,
we can model the changes of allele frequencies over generations due to genetic
drift. The finite population size, which remains unchanged at N individuals over
generations, consists of N1 females and N2 males (N1 = N2). We will observe one
locus with two possible alleles A and a to follow the frequency of one allele over
one generation (A = white and a = blue, Figure 1). In the parental generation both
alleles have the same allele frequency, f(A) = f(a) = 4/8.
Figure 1: The Wright-Fisher model describes the random transfer of gametes from
one generation to the next.
Due to the assumption of random mating we can combine all parental alleles in one
common gene pool (Figure 1). By sampling randomly from this gene pool we can
pick alleles to build up the next generation (F1) with size N. In other words, the
parents form gametes of a certain genotype, which are pooled together. Afterwards
these gametes are combined to form zygotes, thus forming the next generation.
The frequencies of both alleles changed over one generation (Figure 1). The white
allele increased in frequency to f(A) = 5/8, while the blue allele decreased in
frequency to f(a) = 3/8.
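The sampling step described above can be sketched in Python. This is a minimal illustration, not part of the original lecture: it treats the gene pool as 2N gene copies and draws the next generation with replacement. The function names, the `seed` parameter and the number of generations are assumptions made for the example.

```python
import random

def wright_fisher_generation(count_A, two_n, rng):
    """Draw the next generation's 2N gene copies from the parental gene pool.

    count_A: number of A alleles among the 2N parental gene copies
    two_n:   total number of gene copies (2N), constant over generations
    rng:     a random.Random instance, so that runs are reproducible
    """
    freq_A = count_A / two_n
    # Each offspring gene copy is an independent draw (with replacement)
    # from the parental gene pool.
    return sum(1 for _ in range(two_n) if rng.random() < freq_A)

def simulate(count_A, two_n, generations, seed=0):
    """Return the list of A-allele counts, starting with the parental count."""
    rng = random.Random(seed)
    trajectory = [count_A]
    for _ in range(generations):
        count_A = wright_fisher_generation(count_A, two_n, rng)
        trajectory.append(count_A)
    return trajectory

if __name__ == "__main__":
    # 2N = 8 gene copies, starting at f(A) = 4/8 as in Figure 1.
    print(simulate(count_A=4, two_n=8, generations=10))
```

Once the count reaches 0 or 2N the allele is lost or fixed, and the trajectory stays there because no mutation is modelled in this sketch.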
From the analytical point of view we can make the model more convenient if we
treat the N diploid individuals as 2N haploid individuals. Consequently we do not
have to think about mating, the gene pool, or the fusion of gametes because every
individual is the immediate genetic descendant of one parental individual.
To calculate homozygosity of the daughter generation, we sample from the gene
pool as follows. First we can sample two alleles which are identical by descent.
This means we pick one allele, e.g. a blue one, put it back and sample exactly the
same one again. The probability that this happens is 1/2N.
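Second we can sample two alleles which are not identical by descent but identical
by state (= DNA sequence). This happens with probability (1 - 1/2N) G, where G is
the homozygosity of the parental generation. Together, we have for the homozygosity
of the daughter generation:
$$ G' = \frac{1}{2N} + \left(1 - \frac{1}{2N}\right) G. \tag{1} $$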
Next we consider the heterozygosity: H = 1 - G. The change of heterozygosity from
one generation H to the next H' can be described by
$$ H' = \left(1 - \frac{1}{2N}\right) H = 1 - G'. \tag{2} $$
If the heterozygosity of the parental generation H is compared to the
heterozygosity of the next generation H' we can document the change of
heterozygosity as ΔH, where
$$ \Delta H = H' - H = -\frac{1}{2N} H. \tag{3} $$
The important impact of the population size N can be seen in equation (3). The
smaller a population is, the larger the fraction of heterozygosity that is lost from
one generation to the next.
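To make the effect of N concrete, the recursion in equation (2) can be applied repeatedly. The following is a small sketch under the assumption of drift alone (no mutation); the function name and the example values are illustrative and not taken from the lecture.

```python
def heterozygosity_after(h0, n, generations):
    """Expected heterozygosity after repeated application of
    H' = (1 - 1/(2N)) H, i.e. under genetic drift only."""
    h = h0
    for _ in range(generations):
        h = (1 - 1 / (2 * n)) * h
    return h

if __name__ == "__main__":
    # Same starting heterozygosity, different population sizes.
    for n in (10, 100, 1000):
        print(n, round(heterozygosity_after(0.5, n, 100), 4))
```

The loop is equivalent to the closed form H_t = H_0 (1 - 1/2N)^t, so heterozygosity decays geometrically and faster in small populations.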
Genetic drift removes genetic variation from the population. This has several
consequences, namely non-directional changes in allele frequency, a permanent
loss of alleles and evolution which cannot be repeated. If genetic variation is
removed by genetic drift, where does the measurable genetic variation come from?
Genetic drift and mutations are two opposing forces. While drift removes variation
from a population, the appearance of new mutations increases variation. The
frequency at which new mutations enter the population is described by the
mutation rate u. This rate is the probability that an organism acquires a new
mutation at a given site during its lifetime, expressed per site per generation. It
can vary between species and even between genomic regions within a single species.
Here are some average values of u per generation:
Organism group    u (per site per generation)
Eukaryotes        10^-8 to 10^-9
Bacteria          10^-8
DNA viruses       10^-6 to 10^-8
RNA viruses       10^-3 to 10^-5
As mentioned in the previous lecture, mutations do not necessarily change the
protein structure. Such mutations are synonymous changes in the DNA sequence
and thus may not be affected by selective forces.
All individuals in the population have the same probability of gaining a mutation,
given by the mutation rate u. Mutations are assumed to follow the infinite-sites
model, i.e. each new mutation occurs at a nucleotide site that has not mutated
before.
The two mechanisms mutation and genetic drift determine heterozygosity H.
ΔH can be approximated by
$$ \Delta H \approx -\frac{1}{2N} H + 2u (1 - H). \tag{4} $$
The right-hand side of equation (4) is the sum of both mechanisms that contribute
to the change of the genetic variation of a population.
The left term
$$ \Delta_N H = -\frac{1}{2N} H $$
is the change of genetic variation due to random genetic drift. It will always have a
negative value. The right term
$$ \Delta_m H = 2u (1 - H) $$
is the change of genetic variation due to new mutations. It will always have positive
values. At equilibrium ΔH = 0. Therefore the equilibrium value of heterozygosity
$\tilde{H}$ has to be:
$$ \tilde{H} = \frac{4Nu}{1 + 4Nu}. \tag{5} $$
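Equation (5) follows from setting the right-hand side of equation (4) to zero and solving for H. As a numerical cross-check, the sketch below iterates the approximate recursion of equation (4) and compares the result with the closed form. The function names and the parameter values N = 1000 and u = 10^-4 are arbitrary choices for illustration.

```python
def equilibrium_heterozygosity(n, u):
    """Closed form of equation (5): 4Nu / (1 + 4Nu)."""
    four_n_u = 4 * n * u
    return four_n_u / (1 + four_n_u)

def iterate_heterozygosity(h0, n, u, generations):
    """Repeatedly apply equation (4): H <- H - H/(2N) + 2u(1 - H)."""
    h = h0
    for _ in range(generations):
        h += -h / (2 * n) + 2 * u * (1 - h)
    return h

if __name__ == "__main__":
    n, u = 1000, 1e-4  # illustrative values only
    print(iterate_heterozygosity(0.0, n, u, 50_000))
    print(equilibrium_heterozygosity(n, u))
```

Starting from H = 0 or from H = 1 gives the same limit, which illustrates that the equilibrium is a balance between the loss through drift and the gain through mutation.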
The substitution rate k
One very important quantity in population genetics and molecular evolution is the
rate at which new neutral mutations go to fixation due to genetic drift. This
so-called substitution rate k can be obtained by comparing two sequences from
different species. We use the proportion of nucleotide differences K between the
sequences and the divergence time T of the two species to get an estimate of k as
$$ k = \frac{K}{2T}. $$
The factor 2 accounts for the two lineages, each of which has accumulated
substitutions since the species split.
Every new neutral mutation that occurs in a population has the same probability of
going to fixation, namely 1/2N. The substitution rate k for a population is the
number of new mutations that arise in the population per generation (2Nu) times
their probability of going to fixation (1/2N). Therefore,
$$ k = 2Nu \cdot \frac{1}{2N} = u. $$
The population size N cancels out, leaving k = u.
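The cancellation of N can be checked with exact arithmetic. The following is only an illustrative sketch; the function name and the chosen values of N and u are assumptions, and `Fraction` is used so that no rounding error hides the result.

```python
from fractions import Fraction

def substitution_rate(n, u):
    """Number of new neutral mutations per generation (2N * u)
    times the fixation probability of each one (1 / (2N))."""
    new_mutations = 2 * n * u
    fixation_probability = Fraction(1, 2 * n)
    return new_mutations * fixation_probability

if __name__ == "__main__":
    u = Fraction(1, 10**9)  # illustrative mutation rate
    for n in (100, 10_000, 1_000_000):
        # Prints True for every N: the rate equals u regardless of N.
        print(n, substitution_rate(n, u) == u)
```

A large population produces more new mutations per generation, but each one is proportionally less likely to fix, so the two effects offset each other exactly.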
This is one of the most striking results in population genetics. Against all intuition
the substitution rate is independent of population size and equals the mutation rate
u . Consequently it is possible to measure the mutation rate of a species by looking
at the number of nucleotide differences at neutral sites compared to a closely
related species.
Application:
The divergence between Drosophila melanogaster and D. simulans is roughly 5 %,
i.e. K ≈ 0.05. Phylogenetic analysis shows that the divergence time T is about
2.3 × 10^6 years, which corresponds to about 2.3 × 10^7 generations.
Therefore, using k = u in the equation k = K/2T, i.e.
$$ 2Tu = K, $$
we find
$$ u = \frac{K}{2T} \approx 1.09 \times 10^{-9} $$
per generation.
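The arithmetic of this application can be reproduced in a few lines. This is a minimal sketch using the values quoted in the text; the function name is an assumption made for the example.

```python
def mutation_rate_from_divergence(divergence, generations):
    """Estimate u from K = 2 T u, i.e. u = K / (2 T).

    divergence:  proportion of nucleotide differences K between the species
    generations: divergence time T, measured in generations
    """
    return divergence / (2 * generations)

if __name__ == "__main__":
    K = 0.05   # roughly 5 % divergence
    T = 2.3e7  # about 2.3 x 10^7 generations
    u = mutation_rate_from_divergence(K, T)
    print(f"u = {u:.3g} per generation")
```

Note that T must be given in generations rather than years for the result to be a per-generation rate.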
The nearly neutral theory of molecular evolution (proposed by Tomoko Ohta in
1973) posits that mutations are neutral or slightly deleterious. Within populations
the slightly deleterious mutations show similar dynamics as neutral alleles;
however, over long time scales differences in substitution rates may become
observable.