essay (here) for details.

Classical Population Genetics 2: J. B. S. Haldane (1924) and
“Familial Selection”
John D. Nagy
Introduction
In the first of a series of papers called “A Mathematical Theory of Natural and Artificial Selection”
[2], J. B. S. Haldane reconciles Darwin’s notion of natural selection with Mendel’s idea of particulate
inheritance. Therefore this paper is a central document in the Modern Synthesis. It is also important because it lays much of the foundation of modern population genetics, introducing the selection
coefficient and basic Mendelian selection model (as discussed in [5]). It also introduces nuances not
often explored in modern textbooks, including selection on one sex only and selection when family
sizes are strictly regulated. In this essay we take up the latter situation, what Haldane called “familial
selection,” and save the former (selection in one sex) for a follow-up article. Haldane was not always
transparent regarding the assumptions he was making—he played especially fast-and-loose when linking discrete biological processes with mathematical representations in the continuum—so this essay
develops the model carefully and brings to it a more modern expression and analysis.
Setting and problem statement
In his basic model of selection in a sexually reproducing population, Haldane considered selection
on a recessive Mendelian trait. He introduced the selection coefficient, denoted k, which he used
to represent both positive (k < 0) and negative (k > 0) selection on the recessive phenotype. For reasons explained elsewhere
(see [5] and below), these biological assumptions limit k to the interval (−∞, 1]. Intuitively, k is like
an epidemiologist’s “risk” of being recessive relative to being dominant—that is, starting with equal
numbers of dominant and recessive zygotes, for every dominant that survives to reproduce, 1 − k
recessives do. From a recessive individual’s perspective, “positive risk” is bad, while “negative risk”
means recessives are favored.
In the basic model, Haldane assumed random mating: every female is equally likely to mate with
any male in the population, regardless of genotype. In addition, there is no migration,
no mutation, nonoverlapping generations, and selection acts equally on both males and females. We
retain all those assumptions here. We also keep the assumption of “an infinitely large population,”
which is an unfortunate way of saying that we ignore statistical fluctuations due to sampling error
in finite populations. That assumption can (usually) be justified if the population is big enough.
However, we also assume that the number of offspring per female is finite, so statistical fluctuations
here cannot be ignored. This is a point Haldane finessed, seemingly assuming that families are also
“infinitely large.”
Here’s the new part. Haldane imagines a situation in which family size is strictly regulated.
Regardless of selection, all families have the same number of offspring, say m. Selection acts only
within sibling groups; that is, the trait confers an advantage or disadvantage only when an individual
competes against its siblings. So, families in which the parents are both AA or both aa have the same
number of offspring, namely m. But, families from the mating Aa × aa do not have on average half
Aa and half aa. Instead, the expected ratio is 1 : 1 − k. The question is, how does such a scenario
alter the results of selection relative to the basic model?
The model
While Haldane essentially barfed the model out whole, we’ll be more careful. Let f1 (n), f2 (n) and
f3 (n) be the population frequencies, or population proportions, of AA, Aa and aa, respectively, in
generation n. These frequencies can also be understood as probabilities—fi is the probability that a
uniform random draw from the population produces an individual of the genotype corresponding with
i. Our model will be a machine that pumps out an expected number for each genotype in generation
n + 1 given the genotype frequencies in generation n.
Here are the technical details using standard modern terminology (see e.g., [1]). Let fn be the vector
of genotype frequencies in generation n; that is, fn ∈ {[f1 f2 f3 ] ; fi ∈ [0, 1] ∀i, f1 + f2 + f3 = 1}. (Here
and from now on, genotype frequencies fi (n) will be expressed without the n arguments, even though
they are still understood to be there.) The vector fn essentially represents a probability distribution
over the set Ω = {AA, Aa, aa}. The stochastic process of interest is a sequence of two random
experiments. The first draws two independent genotypes from the breeding population in generation n to
represent a single mating. The assumption here is that all offspring in a family have the same father.
The second determines the offspring’s genotype in generation n + 1 given the parents’ in generation
n. Let Xn be a random variable representing the randomly chosen mating pair (first event). Then Xn
has state space
EM = {{AA × AA}, {AA × Aa}, {AA × aa}, {Aa × Aa}, {Aa × aa}, {aa × aa}}
(identical to the sample space). Next, let Yn be the genotype of an offspring in generation n; therefore
Yn has state space Ω with a probability distribution that depends on the parental genotypes. Our
stochastic process is the ordered pair (Xn , Yn+1 ), for which the state space is the Cartesian product
EM × Ω. The model we seek calculates P {Yn+1 = y|fn } for every y ∈ Ω and n ∈ {0, 1, . . .}. Since
matings are disjoint,
$$P_{f_n}\{Y_{n+1} = y\} = \sum_{x \in E_M} P_{f_n}\{Y_{n+1} = y \mid X_n = x\}\, P\{X_n = x\}. \tag{1}$$
(Interpret PX {Y } as P {Y |X}.)
The biological assumptions of random mating and “infinite” population size imply that parental
genotypes are obtained from two independent random draws (with replacement) from Ω with probability distribution fn for each one. Therefore, by the definition of independence, the probability
distribution over EM in generation n, namely P {Xn = x} for all x ∈ EM, can be calculated by expanding
$$\bigl(f_1(n) + f_2(n) + f_3(n)\bigr)^2.$$
So it becomes a straightforward exercise to calculate P {Xn = x} for all x ∈ EM (Table 1, second
column).
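As a minimal sketch of that expansion (in Python; the helper name `mating_frequencies` is illustrative and does not come from Haldane or the references), the six mating probabilities can be tabulated as follows. Like pairs contribute one term of the square and unlike pairs contribute two.

```python
from itertools import combinations_with_replacement

GENOTYPES = ("AA", "Aa", "aa")

def mating_frequencies(f):
    """Return P{X_n = x} for each unordered mating x, given f = (f1, f2, f3)."""
    freq = dict(zip(GENOTYPES, f))
    table = {}
    for g, h in combinations_with_replacement(GENOTYPES, 2):
        # Like pairs appear once in the expansion; unlike pairs appear twice.
        weight = 1 if g == h else 2
        table[(g, h)] = weight * freq[g] * freq[h]
    return table

# Illustrative genotype frequencies (an assumption, not data from the essay).
matings = mating_frequencies((0.25, 0.5, 0.25))
for pair, prob in matings.items():
    print(pair, prob)
```

Because the six values are the terms of a squared sum that equals one, they should always add to one, which is a convenient sanity check.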
Table 1: Probabilities required to calculate Haldane’s “familial selection” model, expression (1). “Frequency” (column 2) is the probability of the mating listed in column 1. “Prob of Z” is the probability
the offspring is genotype Z given its parents were the genotypes listed in column 1.

Mating (x)   Frequency   Prob of AA   Prob of Aa   Prob of aa
AA × AA      f1^2        1            0            0
AA × Aa      2 f1 f2     1/2          1/2          0
AA × aa      2 f1 f3     0            1            0
Aa × Aa      f2^2        1/(4 − k)    2/(4 − k)    (1 − k)/(4 − k)
Aa × aa      2 f2 f3     0            1/(2 − k)    (1 − k)/(2 − k)
aa × aa      f3^2        0            0            1

Next we calculate the probability of the offspring’s genotype given the parents’, i.e., P {Yn+1 = y|Xn = x}. Haldane’s
thinking appears to be something like the following. Families in which all offspring are the same
phenotype (AA × AA, AA × Aa, AA × aa, and aa × aa) experience no selection because litter sizes
are all the same. Selection only operates in families that can have mixed phenotypes (Aa × Aa and
Aa × aa). Consider the mating Aa × aa. Before selection we expect half the offspring to be heterozygous
(Aa) and half to be homozygous (aa). However, selection alters this ratio from 1 : 1 to 1 : 1 − k, where
−∞ < k ≤ 1. But the number of offspring does not change. So let m be the fixed number of offspring.
Since the ratio of phenotypes is 1 : 1 − k, then the ratio of numbers of offspring is m : (1 − k)m.
Therefore, the proportion of the litter with genotype Aa would be
$$\frac{m}{m + (1-k)m} = \frac{1}{2-k},$$
and the proportion that is aa would be
$$\frac{1-k}{2-k}.$$
Similarly, for matings between heterozygotes the expected ratio among offspring after selection is not
1 : 2 : 1 but 1 : 2 : 1 − k. Therefore, the proportions of litters that are AA, Aa and aa are
$$\frac{1}{4-k}, \qquad \frac{2}{4-k}, \qquad \frac{1-k}{4-k},$$
respectively (Table 1). Haldane assumes (implicitly) that these proportions are obtained exactly, which
is highly dubious unless litter sizes are huge. Here we avoid this problem by defining these proportions
as probabilities. (Indeed, this is probably closer to Haldane’s implicit assumption; he seems at times to
regard these as statistical expectations, although he also deals with them as obtained proportions.)1
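To make the last three columns of Table 1 concrete, here is a small sketch that normalizes the post-selection ratios for each mating. The function name `offspring_probabilities`, the use of exact fractions, and the example values k = 1/2 and m = 6 are illustrative assumptions, not part of Haldane’s paper.

```python
from fractions import Fraction

def offspring_probabilities(k):
    """Columns 3-5 of Table 1: P{offspring genotype | mating} as (AA, Aa, aa)."""
    def normalize(weights):
        # Turn a post-selection ratio such as 1 : 2 : 1 - k into probabilities.
        weights = [Fraction(w) for w in weights]
        total = sum(weights)
        return tuple(w / total for w in weights)

    return {
        ("AA", "AA"): normalize((1, 0, 0)),
        ("AA", "Aa"): normalize((1, 1, 0)),
        ("AA", "aa"): normalize((0, 1, 0)),
        ("Aa", "Aa"): normalize((1, 2, 1 - k)),
        ("Aa", "aa"): normalize((0, 1, 1 - k)),
        ("aa", "aa"): normalize((0, 0, 1)),
    }

k = Fraction(1, 2)   # example selection coefficient (assumed for illustration)
m = 6                # example fixed litter size (assumed for illustration)
table = offspring_probabilities(k)
for mating, probs in table.items():
    expected_counts = [m * p for p in probs]
    print(mating, [str(p) for p in probs], [str(c) for c in expected_counts])
```

With these example values the expected number of AA offspring from Aa × Aa is 12/7, which is not a whole number. That is one way to see why treating the proportions as probabilities, rather than as exactly realized fractions of a litter, is the safer reading.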
Footnote 1: Haldane’s presentation also differs from the one used here because he used the same notation Hardy did in his famous
letter to Science [4], which for this purpose is unnecessarily complicated. This also may be the reason for what appears
to be either an error or unstated assumption in one of Haldane’s calculations.

Now that Table 1 is filled in we have everything needed to calculate the terms of model (1):
multiply values in columns 3, 4 and 5 by those in the same row in column 2 and then sum each of the last 3
columns over the rows. Specifically,
\begin{align}
P_{f_n}\{Y_{n+1} = \mathrm{AA}\} &= f_1^2 + f_1 f_2 + \frac{f_2^2}{4-k}, \tag{2a}\\
P_{f_n}\{Y_{n+1} = \mathrm{Aa}\} &= f_1 f_2 + 2 f_1 f_3 + \frac{2 f_2^2}{4-k} + \frac{2 f_2 f_3}{2-k}, \tag{2b}\\
P_{f_n}\{Y_{n+1} = \mathrm{aa}\} &= \frac{(1-k) f_2^2}{4-k} + \frac{2(1-k) f_2 f_3}{2-k} + f_3^2, \tag{2c}
\end{align}
where, again, all frequencies fi depend on n. Haldane wrote equations (2) in a slightly different, but
very nice, form that allows one to see at a glance how selection affects genotype dynamics. If there
were no selection and all the so-called Hardy-Weinberg assumptions hold, then the Hardy-Weinberg
result can be written as follows:
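$$\left[\left(f_1 + \tfrac{1}{2} f_2\right) + \left(\tfrac{1}{2} f_2 + f_3\right)\right]^2 = \left(f_1 + \tfrac{1}{2} f_2\right)^2 + 2\left(f_1 + \tfrac{1}{2} f_2\right)\left(\tfrac{1}{2} f_2 + f_3\right) + \left(\tfrac{1}{2} f_2 + f_3\right)^2. \tag{3}$$
(Note: this is just a fancy way of writing (pn + qn)^2, where pn and qn = 1 − pn are the allele frequencies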
of A and a, respectively, in generation n.) The first term on the right-hand side is the Hardy-Weinberg
equilibrium frequency of AA, the middle term that of Aa and the last term that of aa. In comparison,
one can, after a bit of algebra, write equations (2) as follows:
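\begin{align}
P_{f_n}\{Y_{n+1} = \mathrm{AA}\} &= \left(f_1 + \tfrac{1}{2} f_2\right)^2 + \frac{k f_2^2}{4(4-k)}, \tag{4a}\\
P_{f_n}\{Y_{n+1} = \mathrm{Aa}\} &= 2\left(f_1 + \tfrac{1}{2} f_2\right)\left(\tfrac{1}{2} f_2 + f_3\right) + k f_2\left(\frac{f_2}{2(4-k)} + \frac{f_3}{2-k}\right), \tag{4b}\\
P_{f_n}\{Y_{n+1} = \mathrm{aa}\} &= \left(\tfrac{1}{2} f_2 + f_3\right)^2 - k f_2\left(\frac{3 f_2}{4(4-k)} + \frac{f_3}{2-k}\right). \tag{4c}
\end{align}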
Finally, one waves a hand at infinity a couple of times to intuitively justify the assumption that
Pfn {Yn+1 } = fn+1 . In reality, the assumption’s validity relies on an unstated law of large numbers
that is complicated somewhat by finite family sizes. The intuition is essentially correct but not really
axiomatic, and this style of argument obscures information about variation among other things.
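The law-of-large-numbers intuition can be probed with a rough Monte Carlo sketch. It assumes many independent families, each of fixed size m, with every offspring genotype drawn independently from the Table 1 probabilities. The function names, the seed, and the example values (k = 0.6, m = 4, 20000 families) are illustrative assumptions, not anything specified by Haldane.

```python
import random

GENOTYPES = ("AA", "Aa", "aa")

def offspring_weights(mother, father, k):
    """Post-selection ratio AA : Aa : aa within one family (Table 1, unnormalized)."""
    pair = tuple(sorted((mother, father), key=GENOTYPES.index))
    table = {
        ("AA", "AA"): (1, 0, 0),
        ("AA", "Aa"): (1, 1, 0),
        ("AA", "aa"): (0, 1, 0),
        ("Aa", "Aa"): (1, 2, 1 - k),
        ("Aa", "aa"): (0, 1, 1 - k),
        ("aa", "aa"): (0, 0, 1),
    }
    return table[pair]

def expected_next_generation(f, k):
    """Expected genotype frequencies from the sums over Table 1."""
    f1, f2, f3 = f
    e1 = f1**2 + f1 * f2 + f2**2 / (4 - k)
    e2 = f1 * f2 + 2 * f1 * f3 + 2 * f2**2 / (4 - k) + 2 * f2 * f3 / (2 - k)
    return (e1, e2, 1 - e1 - e2)

def simulate_next_generation(f, k, families, litter_size, rng):
    """Draw random matings and fixed-size litters; return observed frequencies."""
    counts = {g: 0 for g in GENOTYPES}
    for _ in range(families):
        # Note: the keyword k= of choices() is the number of draws, not the
        # selection coefficient held in the local variable k.
        mother, father = rng.choices(GENOTYPES, weights=f, k=2)
        weights = offspring_weights(mother, father, k)
        for child in rng.choices(GENOTYPES, weights=weights, k=litter_size):
            counts[child] += 1
    total = families * litter_size
    return tuple(counts[g] / total for g in GENOTYPES)

rng = random.Random(1)
f_now = (0.3, 0.4, 0.3)
simulated = simulate_next_generation(f_now, 0.6, families=20000, litter_size=4, rng=rng)
expected = expected_next_generation(f_now, 0.6)
print("simulated:", simulated)
print("expected: ", expected)
```

With many families the simulated frequencies should sit close to the expected ones even though each litter is small. Reducing `families` makes the sampling noise that the deterministic model ignores visible again.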
Ignored details notwithstanding, Haldane now has his dynamical system:
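\begin{align}
f_1(n+1) &= \left(f_1(n) + \tfrac{1}{2} f_2(n)\right)^2 + \frac{k\,(f_2(n))^2}{4(4-k)}, \tag{5a}\\
f_2(n+1) &= 2\left(f_1(n) + \tfrac{1}{2} f_2(n)\right)\left(\tfrac{1}{2} f_2(n) + f_3(n)\right) + k f_2(n)\left(\frac{f_2(n)}{2(4-k)} + \frac{f_3(n)}{2-k}\right), \tag{5b}\\
f_3(n+1) &= 1 - f_1(n+1) - f_2(n+1). \tag{5c}
\end{align}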
In this form it is easy to see that the frequencies of AA and Aa, and therefore the allele frequency of
A, tend to increase if k > 0 and decrease if k < 0; we add a positive (respectively, negative) term to
what otherwise would be Hardy-Weinberg equilibrium for both genotypes. If k = 0, then the model
reverts to Hardy-Weinberg (see expression (3)).
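A short numerical sketch can confirm both the algebra and the direction of change. It iterates the system in its Hardy-Weinberg-plus-correction form and compares one step against the direct sums over Table 1. The function names and the starting frequencies are illustrative assumptions.

```python
def step_table_form(f, k):
    """One generation computed directly as sums over Table 1."""
    f1, f2, f3 = f
    n1 = f1**2 + f1 * f2 + f2**2 / (4 - k)
    n2 = f1 * f2 + 2 * f1 * f3 + 2 * f2**2 / (4 - k) + 2 * f2 * f3 / (2 - k)
    n3 = (1 - k) * f2**2 / (4 - k) + 2 * (1 - k) * f2 * f3 / (2 - k) + f3**2
    return (n1, n2, n3)

def step_system5(f, k):
    """One generation in the Hardy-Weinberg-plus-correction form of system (5)."""
    f1, f2, f3 = f
    n1 = (f1 + f2 / 2) ** 2 + k * f2**2 / (4 * (4 - k))
    n2 = (2 * (f1 + f2 / 2) * (f2 / 2 + f3)
          + k * f2 * (f2 / (2 * (4 - k)) + f3 / (2 - k)))
    return (n1, n2, 1 - n1 - n2)

def allele_frequency(f):
    """Frequency p of allele A from genotype frequencies (AA, Aa, aa)."""
    return f[0] + f[1] / 2

def iterate(f, k, generations):
    """Return the list of genotype-frequency vectors over the given generations."""
    trajectory = [f]
    for _ in range(generations):
        f = step_system5(f, k)
        trajectory.append(f)
    return trajectory

start = (0.25, 0.5, 0.25)  # assumed starting point (Hardy-Weinberg at p = 0.5)
for k in (0.6, 0.0, -1.0):
    path = iterate(start, k, 20)
    print(k, [round(allele_frequency(f), 4) for f in path[:5]])
```

For k = 0.6 the printed allele frequencies rise, for k = −1 they fall, and for k = 0 they stay at 0.5, matching the qualitative statement in the text.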
In this model, unlike Haldane’s simple Mendelian selection model, pn+1 is not uniquely determined
by pn , so we analyze our one-parameter family of models, viz. system (5), as a planar discrete map. It
is planar because there are only 2 degrees of freedom—f3 is completely determined by f1 and f2 since
genotype frequencies must sum to unity.
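The claim that pn+1 is not determined by pn alone can be illustrated with two populations that share the same allele frequency but differ in genotype composition. This is only a sketch: the function name and the two example populations are assumptions chosen for illustration.

```python
def next_allele_frequency(f1, f2, k):
    """Allele frequency of A after one generation of system (5)."""
    f3 = 1 - f1 - f2
    n1 = (f1 + f2 / 2) ** 2 + k * f2**2 / (4 * (4 - k))
    n2 = (2 * (f1 + f2 / 2) * (f2 / 2 + f3)
          + k * f2 * (f2 / (2 * (4 - k)) + f3 / (2 - k)))
    return n1 + n2 / 2

k = 0.6
hardy_weinberg = (0.25, 0.5)    # (f1, f2) with p = 0.5
no_heterozygotes = (0.5, 0.0)   # (f1, f2) with p = 0.5 as well

p_next_hw = next_allele_frequency(*hardy_weinberg, k)
p_next_nohet = next_allele_frequency(*no_heterozygotes, k)
print(p_next_hw, p_next_nohet)
```

In the second population there are no heterozygous parents, so no family contains mixed phenotypes and selection has nothing to act on for that generation. The allele frequency stays at 0.5, while the first population moves toward A.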
Remember that the range for our parameter, k, is the interval (−∞, 1], and note that the phase
space of model (5) is the triangular region {(f1 , f2 ); f1 ≥ 0, f2 ≥ 0, f1 + f2 ≤ 1} (unshaded portions of Fig. 1).
The nullcline for f1 comprises all points in phase space that satisfy
$$f_2 = \frac{-(4-k) f_1 + \sqrt{(4-k)(4-k f_1) f_1}}{2} \tag{6}$$
(blue curves in Fig. 1), while the nullcline for f2 is the set of points satisfying
$$f_2 = \frac{4-k}{8}\bigl(k - (4-k) f_1\bigr) + \frac{1}{2}\sqrt{\left(\frac{4-k}{4}\right)^{2}\bigl(k - (4-k) f_1\bigr)^{2} + 2(2-k)(4-k)(1-f_1) f_1} \tag{7}$$
(red curves in Fig. 1).

[Figure 1: two phase-plane panels, (A) “Dominant allele favored” and (B) “Recessive allele favored”. Horizontal axis: Frequency of AA (f1); vertical axis: Frequency of Aa (f2); both run from 0 to 1.]
Figure 1: Nullclines (colored curves) and example solutions (black lines) of Haldane’s “familial selection” model (system (5)). The model lives only on the unshaded portion of the plane. Arrows indicate direction of solutions in positive time. (A) Selection favors the dominant allele; i.e.,
k > 0. Specifically, k = 0.6, and example initial conditions are {(0, 0.99), (0, 0.4), (0, 0.01), (0.5, 0)}.
(B) Selection favors the recessive (k < 0). Specifically, k = −8, and example initial conditions are
{(0.99, 0), (0.4, 0), (0, 0.9), (0.4, 0.6)}.

It is easy to see that in the Hardy-Weinberg case (k = 0) these nullclines are
coincident and represent Hardy-Weinberg equilibria. Since we already know this situation, we now
assume k ≠ 0. In that case, there are two fixed points: f̄A = [1 0 0] (extinction of a allele) and
f̄B = [0 0 1] (extinction of A allele). To analyze the local stability properties of f̄B, which sits at (f1 , f2 ) = (0, 0) in the phase plane, we calculate the
Jacobi matrix at that point:
$$J_B = \begin{bmatrix} 0 & 0 \\ 2 & \dfrac{2}{2-k} \end{bmatrix}. \tag{8}$$
One eigenvalue of JB is 0 and the other is 2/(2−k). So, by the linear stability theorem for discrete maps
(see e.g., [3, Thm 3.8, pg.73]), f̄B is locally asymptotically stable if k < 0 and unstable if k > 0. (And
we already know from the Hardy-Weinberg result that it is neutrally stable if k = 0.) Linearization
at the other generic fixed point, f̄A, which sits at (f1 , f2 ) = (1, 0), doesn’t help much. At that point,
$$J_A = \begin{bmatrix} 2 & 1 \\ -2 & -1 \end{bmatrix}, \tag{9}$$
and again one eigenvalue is 0 but the other is 1. So we cannot reach an immediate conclusion. However,
it is not hard to see intuitively that f̄A is stable whenever f̄B is unstable and vice versa (Fig. 1). (I
won’t bother to prove this here, but it would be a good exercise to practice analyzing a degenerate
case.) Also, both are neutrally stable when k = 0.
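The linearization can be checked numerically. The sketch below approximates the Jacobi matrix of the planar map by central differences at the two corners of the phase plane: (f1, f2) = (0, 0), where the population is entirely aa, and (f1, f2) = (1, 0), where it is entirely AA. The function names, step size, and the example value k = 0.6 are assumptions made for illustration.

```python
import math

def familial_map(f1, f2, k):
    """Planar form of system (5): (f1, f2) in generation n -> generation n + 1."""
    f3 = 1 - f1 - f2
    n1 = (f1 + f2 / 2) ** 2 + k * f2**2 / (4 * (4 - k))
    n2 = (2 * (f1 + f2 / 2) * (f2 / 2 + f3)
          + k * f2 * (f2 / (2 * (4 - k)) + f3 / (2 - k)))
    return (n1, n2)

def jacobian(f1, f2, k, h=1e-6):
    """Central-difference Jacobi matrix; rows are outputs, columns are inputs."""
    columns = []
    for d1, d2 in ((h, 0.0), (0.0, h)):
        plus = familial_map(f1 + d1, f2 + d2, k)
        minus = familial_map(f1 - d1, f2 - d2, k)
        columns.append([(p - m) / (2 * h) for p, m in zip(plus, minus)])
    return [[columns[0][0], columns[1][0]],
            [columns[0][1], columns[1][1]]]

def eigenvalues_2x2(matrix):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    (a, b), (c, d) = matrix
    trace, det = a + d, a * d - b * c
    root = math.sqrt(max(trace**2 - 4 * det, 0.0))
    return sorted(((trace - root) / 2, (trace + root) / 2))

k = 0.6
j_all_aa = jacobian(0.0, 0.0, k)   # population fixed for a
j_all_AA = jacobian(1.0, 0.0, k)   # population fixed for A
print(j_all_aa, eigenvalues_2x2(j_all_aa))
print(j_all_AA, eigenvalues_2x2(j_all_AA))
```

Because the map is quadratic in (f1, f2), central differences are exact up to rounding. The nonzero eigenvalue at the all-aa corner is 2/(2 − k), while at the all-AA corner it is 1, the degenerate case.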
[Figure 2: line plot with axes Time (generations), 0 to 50, and Frequency of A (p), 0 to 1.]
Figure 2: Comparative rates of evolution in Haldane’s basic Mendelian (blue) and familial (red)
selection models. Solid curves: k = 0.8. Dashed curves: k = −1. Initial frequency of A allele (i.e., p)
is 0.5 in all cases.

Biologically this model and the basic selection model are qualitatively the same. The allele destined for extinction depends on the sign of k: if positive, then the recessive disappears; if negative,
the dominant does. However, the rates of evolution are not the same in the two models: typically,
evolution proceeds faster under standard selection than under familial selection (Fig. 2). This conclusion seemingly contradicts Haldane’s claim that “. . . recessives are eliminated far more quickly [in
the familial selection model] than in the ordinary type of selection.” But this passage in Haldane’s
paper has a number of problems, including an apparent arithmetic error. So if Haldane is not incorrect
here, he is certainly obscure.
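The rate comparison can be sketched numerically under two assumptions that are not spelled out in this essay. First, the basic model is taken to be the standard recursion for selection against a recessive with relative fitness 1 − k, namely p' = p/(1 − k q^2). Second, the familial model is started from Hardy-Weinberg genotype frequencies at p = 0.5. The function names are illustrative.

```python
def basic_step(p, k):
    """Assumed basic model: recessives survive at rate 1 - k relative to dominants."""
    q = 1 - p
    return p / (1 - k * q * q)

def familial_step(f, k):
    """One generation of familial selection, system (5)."""
    f1, f2, f3 = f
    n1 = (f1 + f2 / 2) ** 2 + k * f2**2 / (4 * (4 - k))
    n2 = (2 * (f1 + f2 / 2) * (f2 / 2 + f3)
          + k * f2 * (f2 / (2 * (4 - k)) + f3 / (2 - k)))
    return (n1, n2, 1 - n1 - n2)

def compare(k, generations=50, p0=0.5):
    """Return allele-frequency trajectories (basic, familial) for the same k."""
    p = p0
    f = (p0**2, 2 * p0 * (1 - p0), (1 - p0) ** 2)  # assumed Hardy-Weinberg start
    basic, familial = [p], [f[0] + f[1] / 2]
    for _ in range(generations):
        p = basic_step(p, k)
        f = familial_step(f, k)
        basic.append(p)
        familial.append(f[0] + f[1] / 2)
    return basic, familial

for k in (0.8, -1.0):
    basic, familial = compare(k)
    print(k, round(basic[1], 4), round(familial[1], 4))
```

Under these assumptions the first generation already shows the pattern described above: for k = 0.8 the basic model moves p from 0.5 to 0.625 while the familial model reaches only about 0.573, and for k = −1 the basic model drops to 0.4 while the familial model drops to about 0.454.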
Finally, Haldane relaxes the assumption that all siblings have the same father. Instead, every male
in the population has the same probability of fathering any given offspring from any given female.
(Think of sessile sea creatures that release gametes freely into the ocean, for example.) He retains
all other assumptions. It would be an interesting exercise to develop and analyze that model along
modern lines as we have done here.
References
[1] E. Çinlar. Introduction to Stochastic Processes. Prentice-Hall, Englewood Cliffs, N.J., 1975.
[2] J. B. S. Haldane. A mathematical theory of natural and artificial selection. Part I. Trans. Camb.
Phil. Soc., 22:19–41, 1924.
[3] Jack K. Hale and Hüseyin Koçak. Dynamics and Bifurcations. Springer-Verlag, New York, 1991.
[4] G. H. Hardy. Mendelian proportions in a mixed population. Science, 28:49–50, 1908.
[5] John D. Nagy. Classical population genetics 1: The selection coefficient and early selection models
of J. B. S. Haldane. Unreviewed essay published online, November 2014.