The average position of the first maximum in a
sample of geometric random variables
Margaret Archibald$^1$ and Arnold Knopfmacher$^2$
$^1$ Department of Mathematics and Applied Mathematics, University of Cape Town, South Africa
$^2$ School of Mathematics, University of the Witwatersrand, Johannesburg, South Africa
International Conference on Infinity in Logic & Computation,
University of Cape Town, South Africa, 3–5 November 2007
AofA07
The average position of the first maximum
Introduction
Samples: $(\Gamma_1, \Gamma_2, \ldots, \Gamma_n)$ where $P\{\Gamma_j = i\} = pq^{i-1}$, for $1 \le j \le n$, with $p + q = 1$.
Example
If $p = q = \frac12$, then
the probability of a 1 is $\frac12\left(\frac12\right)^0 = \frac12$,
the probability of a 4 is $\frac12\left(\frac12\right)^3 = \frac1{16}$.
I.e., 1 is twice as likely to occur as 2 and so on.
The alphabet is infinite.
E.g., 2113131212121
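As a small illustration of the model, the following sketch draws a sample of geometric random variables and locates the first maximum. The helper names `sample_geometric` and `first_max_position` and the seed are assumptions made for this example; the position is counted as the number of places before the first maximum, the convention used later in the talk.

```python
import random

def sample_geometric(p, rng):
    """Draw i >= 1 with P(i) = p * q**(i-1): count trials up to the first success."""
    i = 1
    while rng.random() >= p:  # failure, which happens with probability q = 1 - p
        i += 1
    return i

def first_max_position(sample):
    """Number of places before the first (left-most) occurrence of the maximum."""
    return sample.index(max(sample))

rng = random.Random(2007)  # fixed seed, chosen arbitrarily for reproducibility
sample = [sample_geometric(0.5, rng) for _ in range(13)]
print(sample, first_max_position(sample))

# The example word 2113131212121 from the slide: the first 3 has three letters before it.
print(first_max_position([2, 1, 1, 3, 1, 3, 1, 2, 1, 2, 1, 2, 1]))  # 3
```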
Previous Work
Szpankowski and Rego (1990): Maximum order statistic.
\[ E_n = \log_Q n + \frac{\gamma}{L} + \frac12 + P_0(\log_Q n) + O\!\left(\frac1n\right), \]
\[ V_n = \frac{\pi^2}{6L^2} + \frac1{12} + P_1(\log_Q n), \]
($Q = \frac1q$, $L = \log Q$).
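The expectation of the maximum can be checked numerically. The sketch below is not taken from the cited paper: it uses the standard tail-sum identity $E[\max] = \sum_{k\ge0}\big(1-(1-q^k)^n\big)$ and compares it with the main terms $\log_Q n + \gamma/L + \frac12$, ignoring the fluctuation $P_0$ and the $O(1/n)$ term. The function names are hypothetical.

```python
import math

def expected_maximum(n, q, tol=1e-18):
    """Exact E[max] = sum_{k>=0} P(max > k) = sum_{k>=0} (1 - (1 - q**k)**n)."""
    total = 1.0  # k = 0 term: P(max > 0) = 1
    k = 1
    while True:
        # 1 - (1 - q^k)^n, computed stably with log1p / expm1
        term = -math.expm1(n * math.log1p(-q ** k))
        total += term
        if term < tol:
            return total
        k += 1

def asymptotic_maximum(n, q):
    """Main terms log_Q n + gamma/L + 1/2 (fluctuation and O(1/n) dropped)."""
    L = math.log(1.0 / q)
    gamma = 0.5772156649015329  # Euler's constant
    return math.log(n) / L + gamma / L + 0.5

n, q = 10_000, 0.5
print(expected_maximum(n, q), asymptotic_maximum(n, q))
```

For $q = \frac12$ the two values should agree closely, since the periodic term $P_0$ has a very small amplitude in that case.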
Kirschenhofer and Prodinger (1993): Average dth largest element.
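With $H_n = \sum_{k=1}^n \frac1k$ and $H_n^{(2)} = \sum_{k=1}^n \frac1{k^2}$, as $n \to \infty$,
\[ E_n^{(d)} \sim \log_Q n + \frac{\gamma}{L} + \frac12 - \frac1L H_{d-1} + P_2(\log_Q n), \]
\[ V_n^{(d)} \sim \frac{\pi^2}{6L^2} + \frac1{12} - \frac1{L^2} H_{d-1}^{(2)} - [P_2^2]_0 + P_3(\log_Q n). \]
Note: $[P_2^2]_0$ is the mean of the square of the fluctuations of the expectation.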
Previous Work
Problem: Råde (1991); Solution: Griffin and Lossers (1994)
Toss coins until all show heads. ‘Heads’ occurs with probability p.
Also on this topic...
Eisenberg, Stengle and Strang (1993)
Brands, Steutel and Wilms (1994)
Baryshnikov, Eisenberg and Stengle (1995)
Kirschenhofer and Prodinger (1996)
Number of winners.
(Also generalised to d below max.)
Previous Work
Skip lists (see Pugh 1990)
Kirschenhofer and Prodinger (1994): Path length of skip lists.
Prodinger (1996)
Left-to-right maxima (strict and weak, e.g., 11213321).
Louchard and Prodinger (2006)
Found the $s$th factorial moment for the number of elements below (resp. above/at) the level of the $(k+1)$th maximum.
Method
Symbolic Expression
\[ \{1, 2, \ldots, k-1\}^{*}\; k\; \{1, 2, \ldots, k\}^{*} \]
Bivariate generating function
\[ F(z,u) := \sum_{k\ge1} \frac{pq^{k-1}z}{\big(1 - zu(1-q^{k-1})\big)\big(1 - z(1-q^k)\big)} \]
“Position” ≡ Number of places before the first maximum occurs.
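The symbolic expression can be read off directly as a probability: $m$ letters smaller than $k$, then the letter $k$, then $n-1-m$ letters at most $k$. The sketch below sums these probabilities over $k$ to get the distribution of the position, which is the coefficient $[z^n u^m]F(z,u)$. The function name and the truncation tolerance are assumptions for illustration.

```python
def position_pmf(n, q, tol=1e-18):
    """pmf[m] = P(exactly m letters precede the first maximum) for a sample of length n."""
    p = 1.0 - q
    pmf = [0.0] * n
    k = 1
    while True:
        weight = p * q ** (k - 1)   # P(letter = k)
        below = 1.0 - q ** (k - 1)  # P(letter < k)
        at_most = 1.0 - q ** k      # P(letter <= k)
        for m in range(n):
            pmf[m] += below ** m * weight * at_most ** (n - 1 - m)
        if weight < tol:            # remaining values of k contribute negligibly
            return pmf
        k += 1

pmf = position_pmf(2, 0.5)
print(pmf)  # approximately [2/3, 1/3]
```

For $n = 2$ and $q = \frac12$ the second letter is strictly larger than the first with probability $\frac13$, which matches `pmf[1]`.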
Definition - Mean and variance of PGFs
Given a PGF $P(u) = \sum_{k\ge0} p_k u^k$ ($p_k = \Pr\{X = k\}$) for a random variable $X$:
$P'(1)$ gives the expected value of $X$
$P''(1) + P'(1) - P'(1)^2$ gives the variance.
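A minimal sketch of the definition, assuming the PGF is given as a finite list of coefficients $p_0, p_1, \ldots$ (the function name is hypothetical). The example distribution, uniform on $\{0,1,2,3\}$, is the position of the maximum in a random permutation of four elements.

```python
from fractions import Fraction

def pgf_mean_variance(probs):
    """Mean and variance of X from p_k = Pr{X = k}, via P'(1) and P''(1) + P'(1) - P'(1)^2."""
    d1 = sum(k * pk for k, pk in enumerate(probs))            # P'(1)
    d2 = sum(k * (k - 1) * pk for k, pk in enumerate(probs))  # P''(1)
    return d1, d2 + d1 - d1 ** 2

mean, var = pgf_mean_variance([Fraction(1, 4)] * 4)
print(mean, var)  # 3/2 5/4
```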
Method - The Expected Value
Partial fractions on...
\[ \frac{\partial}{\partial u}F(z,u)\Big|_{u=1} = \sum_{k\ge1} \frac{pq^{k-1}(1-q^{k-1})z^2}{\big(1 - z(1-q^k)\big)\big(1 - z(1-q^{k-1})\big)^2} \]
...leads to...
\[ [z^n]\frac{\partial}{\partial u}F(z,u)\Big|_{u=1} = \frac1p \sum_{k\ge2}\Big[(q^{1-k}-1)(1-q^k)^n - (q^{1-k} - qn + n - 1)(1-q^{k-1})^n\Big]. \]
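The partial-fraction step can be checked term by term. The sketch below compares, for a fixed $k$, the coefficient of $z^n$ obtained by expanding the two geometric series directly with the closed summand $\frac1p\big[(q^{1-k}-1)(1-q^k)^n - (q^{1-k}-qn+n-1)(1-q^{k-1})^n\big]$. The function names are illustrative, and $n \ge 2$ is assumed.

```python
def coeff_direct(n, k, q):
    """[z^n] of p q^(k-1) a z^2 / ((1 - b z)(1 - a z)^2), expanded as a convolution."""
    p = 1.0 - q
    a, b = 1.0 - q ** (k - 1), 1.0 - q ** k
    # [z^(n-2)] 1/((1-bz)(1-az)^2) = sum_j (j+1) a^j b^(n-2-j)
    conv = sum((j + 1) * a ** j * b ** (n - 2 - j) for j in range(n - 1))
    return p * q ** (k - 1) * a * conv

def coeff_closed(n, k, q):
    """The k-th summand of the partial-fraction form."""
    p = 1.0 - q
    a, b = 1.0 - q ** (k - 1), 1.0 - q ** k
    return ((q ** (1 - k) - 1) * b ** n - (q ** (1 - k) - q * n + n - 1) * a ** n) / p

print(coeff_direct(2, 2, 0.5), coeff_closed(2, 2, 0.5))  # both 0.125
```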
Method - Expectation
Use the Binomial Theorem and get rid of the sum on $k$:
\[ [z^n]\frac{\partial}{\partial u}F(z,1) = \sum_{i=2}^{n}\binom{n}{i}(-1)^i\,\frac{q^{i-1}(q^i - 1 + nq^i - nq)}{(1-q^{i-1})(1-q^i)} + \frac{qn(n-1)}{p} \]
Consequently, ($Q = q^{-1}$)
\[ [z^n]\frac{\partial}{\partial u}F(z,u)\Big|_{u=1} = -\underbrace{\sum_{i=2}^{n}\binom{n}{i}(-1)^i\frac{1}{Q^{i-1}-1}}_{\alpha} \;-\; n\underbrace{\sum_{i=2}^{n}\binom{n}{i}(-1)^i\frac{1}{Q^{i}-1}}_{\beta} \;+\; \frac{n(n-1)}{Q-1}. \]
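For small $n$ the alternating-sum form $-\alpha - n\beta + \frac{n(n-1)}{Q-1}$ can be compared with a direct summation over the value $k$ of the maximum and its position $m$. This is only a sketch with hypothetical function names; in floating point the alternating sums lose accuracy quickly as $n$ grows, which is one reason an asymptotic treatment is needed.

```python
from math import comb

def mean_by_binomial_sums(n, q):
    """-alpha - n*beta + n(n-1)/(Q-1), with alpha and beta summed over i = 2..n."""
    Q = 1.0 / q
    alpha = sum(comb(n, i) * (-1) ** i / (Q ** (i - 1) - 1) for i in range(2, n + 1))
    beta = sum(comb(n, i) * (-1) ** i / (Q ** i - 1) for i in range(2, n + 1))
    return -alpha - n * beta + n * (n - 1) / (Q - 1)

def mean_by_letter_sum(n, q, kmax=200):
    """Direct mean: sum of m * P(position = m, maximum = k), truncated at kmax."""
    p = 1.0 - q
    total = 0.0
    for k in range(2, kmax + 1):
        a, b = 1.0 - q ** (k - 1), 1.0 - q ** k
        total += p * q ** (k - 1) * sum(m * a ** m * b ** (n - 1 - m) for m in range(n))
    return total

print(mean_by_binomial_sums(4, 0.5), mean_by_letter_sum(4, 0.5))
```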
Method - Expectation - Rice’s Integrals
Let $C$ be a curve surrounding the points $1, 2, \ldots, n$ in the complex plane, and let $f(z)$ be analytic inside $C$. Then
\[ \sum_{k=1}^{n}\binom{n}{k}(-1)^k f(k) = -\frac{1}{2\pi i}\int_C [n;z]\,f(z)\,dz, \]
where
\[ [n;z] = \frac{(-1)^{n-1}\,n!}{z(z-1)\cdots(z-n)} = \frac{\Gamma(n+1)\Gamma(-z)}{\Gamma(n+1-z)}. \]
By extending the contour of integration, we can express (see Flajolet and Sedgewick (1995)) the asymptotic expansion as
\[ \sum \operatorname{Res}\big([n;z]f(z)\big) + \text{smaller order terms}, \]
where the sum is taken over all poles different from $1, \ldots, n$.
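The two forms of the kernel $[n;z]$ and its residues at $z = k$ can be checked numerically. The sketch below assumes real, non-integer $z$ so that `math.gamma` applies; the function names are illustrative. The residue at $z = k$ should equal $(-1)^{k-1}\binom{n}{k}$, which is what turns the contour integral into the alternating sum.

```python
import math

def kernel_product(n, z):
    """[n; z] = (-1)^(n-1) n! / (z (z-1) ... (z-n))."""
    denom = 1.0
    for j in range(n + 1):
        denom *= z - j
    return (-1) ** (n - 1) * math.factorial(n) / denom

def kernel_gamma(n, z):
    """[n; z] = Gamma(n+1) Gamma(-z) / Gamma(n+1-z), for real non-integer z."""
    return math.gamma(n + 1) * math.gamma(-z) / math.gamma(n + 1 - z)

def residue_at(n, k):
    """Residue of [n; z] at the simple pole z = k: omit the factor (z - k)."""
    denom = 1.0
    for j in range(n + 1):
        if j != k:
            denom *= k - j
    return (-1) ** (n - 1) * math.factorial(n) / denom

print(kernel_product(5, 2.5), kernel_gamma(5, 2.5))
print([residue_at(5, k) for k in range(1, 6)])
```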
Method - Expectation - Calculating Residues
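As an example, consider ($Q := \frac1q$)
\[ \alpha = \sum_{i=2}^{n}\binom{n}{i}(-1)^i\,\frac{1}{Q^{i-1}-1}. \]
Integrand in question: $[n;z]f(z) = \dfrac{\Gamma(n+1)\Gamma(-z)}{\Gamma(n+1-z)}\,\dfrac{1}{Q^{z-1}-1}$.
Double pole: $z = 1$;
Simple poles: $z = 0$ and $z = \chi_k + 1$ $\left(\chi_k = \dfrac{2k\pi i}{\log Q} \text{ for } k \in \mathbb{Z}\setminus\{0\}\right)$.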
Result - Expectation
Theorem 1.
The average position $E_n$ of the first (left-most) occurrence of the maximum in a sample of geometric random variables is given by
\[ E_n = n\left(\frac1L + \frac{1}{1-Q} + \delta_{E1}(\log_Q n)\right) + \frac1L + \frac{Q}{1-Q} - \delta_{E2}(\log_Q n) + o(1), \]
where $Q = \frac1q$; $L = \log Q$, $\chi_k = 2k\pi i/L$,
\[ \delta_{E1}(x) := \frac1L\sum_{k\ne0}\chi_k\,\Gamma(-1-\chi_k)\,e^{2k\pi i x} \]
and
\[ \delta_{E2}(x) := \frac{1}{2L}\sum_{k\ne0}\chi_k(1+\chi_k)(\chi_k-2)\,\Gamma(-1-\chi_k)\,e^{2k\pi i x}. \]
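A quick numerical sanity check of Theorem 1, keeping only the non-fluctuating terms $n\big(\frac1L + \frac1{1-Q}\big) + \frac1L + \frac{Q}{1-Q}$. The exact mean is computed by brute-force summation over the maximum $k$ and the position $m$; function names and the truncation `kmax` are assumptions of this sketch.

```python
import math

def exact_mean_position(n, q, kmax=200):
    """Mean number of places before the first maximum, summed over maximum k and position m."""
    p = 1.0 - q
    total = 0.0
    for k in range(2, kmax + 1):
        a, b = 1.0 - q ** (k - 1), 1.0 - q ** k
        total += p * q ** (k - 1) * sum(m * a ** m * b ** (n - 1 - m) for m in range(n))
    return total

def theorem1_main_terms(n, q):
    """n(1/L + 1/(1-Q)) + 1/L + Q/(1-Q), with the delta terms and o(1) dropped."""
    Q = 1.0 / q
    L = math.log(Q)
    return n * (1 / L + 1 / (1 - Q)) + 1 / L + Q / (1 - Q)

n, q = 200, 0.5
print(exact_mean_position(n, q), theorem1_main_terms(n, q))
```

For $q = \frac12$ the fluctuating terms are very small, so the two numbers should differ only slightly.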
Method - Variance
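Differentiating partially twice gives
\[ \frac{\partial^2}{\partial u^2}F(z,u) = \sum_{k\ge1}\frac{2pq^{k-1}z^3(1-q^{k-1})^2}{\big(1 - z(1-q^k)\big)\big(1 - zu(1-q^{k-1})\big)^3} \]
and hence
\[
\begin{aligned}
\frac{\partial^2}{\partial u^2}F(z,1) = 2\sum_{k\ge1}\Bigg[\;&\frac{(q^{k-1}-1)^2}{p^2q^{2k-2}\big(1 - z(1-q^k)\big)} - \frac{1}{\big(1 - z(1-q^{k-1})\big)^3} \\
&- \frac{q^{1-k}(1 - 3q^{k-1} + 2q^k)}{p\big(1 - z(1-q^{k-1})\big)^2} \\
&- \frac{q^{2-2k}(1 + 3q^{2k-2} - 3q^{k-1} + q^k - 3q^{2k-1} + q^{2k})}{p^2\big(1 - z(1-q^{k-1})\big)}\Bigg].
\end{aligned}
\]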
Method - Variance
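\[
\begin{aligned}
[z^n]\frac{\partial^2}{\partial u^2}F(z,1) ={}& \underbrace{\frac{2}{(Q-1)^2}\sum_{i=3}^{n}\binom{n}{i}(-1)^i\frac{Q^i-1}{1-Q^{i-2}}}_{(a)}
+ \underbrace{\frac{4Q}{(Q-1)^2}\sum_{i=3}^{n}\binom{n}{i}(-1)^i\frac{Q^i-1}{Q^{i-1}-1}}_{(b)} \\
&+ \underbrace{\frac{2n}{Q-1}\sum_{i=3}^{n}\binom{n}{i}(-1)^i\frac{Q^i}{1-Q^{i-1}}}_{(c)}
+ \underbrace{\left(\frac{n(3Q-1)}{Q-1} - n^2\right)\sum_{i=3}^{n}\binom{n}{i}(-1)^i\frac{Q^i}{Q^i-1}}_{(d)} \\
&+ \underbrace{\frac{2Q^2}{(Q-1)^2} - \frac{2nQ(Q+1)}{(Q-1)^2} - \frac{n^2Q(3Q^2-7Q-6)}{2(Q-1)^2(Q+1)} + \frac{n^3Q(2Q^2-2Q-1)}{(Q-1)^2(Q+1)} - \frac{n^4Q^2}{2(Q^2-1)}}_{(e)}.
\end{aligned}
\]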
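The second factorial moment written as the five groups (a) to (e) can be evaluated exactly with rational arithmetic for small $n$ and compared with a direct summation. The sketch below assumes the grouping $(a)+(b)+(c)+(d)+(e)$ with alternating sums over $i = 3, \ldots, n$; function names are illustrative. For $n = 2$ the position is 0 or 1, so the second factorial moment must vanish.

```python
from fractions import Fraction
from math import comb

def second_factorial_moment_formula(n, Q):
    """(a) + (b) + (c) + (d) + (e); exact when Q is a Fraction."""
    def alt(f):
        return sum(comb(n, i) * (-1) ** i * f(i) for i in range(3, n + 1))
    a = 2 / (Q - 1) ** 2 * alt(lambda i: (Q ** i - 1) / (1 - Q ** (i - 2)))
    b = 4 * Q / (Q - 1) ** 2 * alt(lambda i: (Q ** i - 1) / (Q ** (i - 1) - 1))
    c = 2 * n / (Q - 1) * alt(lambda i: Q ** i / (1 - Q ** (i - 1)))
    d = (n * (3 * Q - 1) / (Q - 1) - n ** 2) * alt(lambda i: Q ** i / (Q ** i - 1))
    e = (2 * Q ** 2 / (Q - 1) ** 2
         - 2 * n * Q * (Q + 1) / (Q - 1) ** 2
         - n ** 2 * Q * (3 * Q ** 2 - 7 * Q - 6) / (2 * (Q - 1) ** 2 * (Q + 1))
         + n ** 3 * Q * (2 * Q ** 2 - 2 * Q - 1) / ((Q - 1) ** 2 * (Q + 1))
         - n ** 4 * Q ** 2 / (2 * (Q ** 2 - 1)))
    return a + b + c + d + e

def second_factorial_moment_direct(n, q, kmax=200):
    """E[X(X-1)] by summing over the maximum k and the position m (floating point)."""
    p = 1.0 - q
    total = 0.0
    for k in range(2, kmax + 1):
        a, b = 1.0 - q ** (k - 1), 1.0 - q ** k
        total += p * q ** (k - 1) * sum(m * (m - 1) * a ** m * b ** (n - 1 - m) for m in range(n))
    return total

print(second_factorial_moment_formula(3, Fraction(2)))  # 10/21
print(second_factorial_moment_direct(3, 0.5))
```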
Method - Variance
Hence the main terms of the second factorial moment are:
\[ n^2\,\frac{2L + 3 - 4Q + Q^2}{2L(Q-1)^2} + n\,\frac{8QL - 2L - 5Q^2 + 4Q + 1}{2L(Q-1)^2} + \frac{2Q^2L - 3Q^2 - 1 + 4Q}{L(Q-1)^2}. \]
We also consider (among other things) the constant terms which arise from squaring the fluctuations from the expectation. Recall
\[ \delta_{E1}(x) := \frac1L\sum_{k\ne0}\chi_k\,\Gamma(-1-\chi_k)\,e^{2k\pi i x}, \qquad\text{where}\quad \chi_k = 2k\pi i/L. \]
Hence, we require
\[ \frac{1}{L^2}\sum_{k\ne0}\chi_k(-\chi_k)\,\Gamma(-1-\chi_k)\,\Gamma(-1+\chi_k). \]
Method - Variance - Squaring the Fluctuations
To find the ‘zeroth’ Fourier coefficient of the square, we use a method devised by Prodinger in 2004. Consider the function
\[ I_1 := \frac{1}{2\pi i}\int_{\frac12 - i\infty}^{\frac12 + i\infty} F(z)\,dz \]
where
\[ F(z) := \frac{-L}{e^{Lz}-1}\; z^2\,\Gamma(-1-z)\,\Gamma(-1+z). \]
Express $I_1$ in two ways:
Shift contour line left and collect residues
Sum negative residues right of the line $\Re z = \frac12$
Method - Variance - Squaring the Fluctuations
The residues for the simple poles at $z = 0$ and $z = \chi_k$, $k \ne 0$, can be calculated in order to write
\[ I_1 = \frac{1}{2\pi i}\int_{(-\frac12)} F(z)\,dz + 1 + \sum_{k\ne0}\chi_k(-\chi_k)\,\Gamma(-1-\chi_k)\,\Gamma(-1+\chi_k). \]
Now rewrite $-\dfrac{1}{e^{Lz}-1}$ as $1 + \dfrac{1}{e^{-Lz}-1}$ and use the change of variable $z := -z$ to get
\[ I_1 = LI_2 - I_1 + 1 + \sum_{k\ne0}\chi_k(-\chi_k)\,\Gamma(-1-\chi_k)\,\Gamma(-1+\chi_k), \]
where
\[ I_2 = \frac{1}{2\pi i}\int_{(-\frac12)} z^2\,\Gamma(-1-z)\,\Gamma(-1+z)\,dz = -\log 2 + \frac12. \]
Method - Variance - Squaring the Fluctuations
Shifting $I_1$ right gives
\[ I_1 = \frac{L}{4(1-Q)} + \frac{QL^2}{2(Q-1)^2} + L\sum_{l\ge2}\frac{l(-1)^l}{(Q^l-1)(l+1)(l-1)}. \]
Equating the two expressions for $I_1$ leaves us with
\[
\begin{aligned}
\sum_{k\ne0}\chi_k(-\chi_k)\,\Gamma(-1-\chi_k)\,\Gamma(-1+\chi_k)
={}& \frac{L}{2(1-Q)} + \frac{QL^2}{(Q-1)^2} + 2L\sum_{l\ge2}\frac{l(-1)^l}{(Q^l-1)(l+1)(l-1)} \\
&+ L\log 2 - \frac{L}{2} - 1,
\end{aligned}
\]
which is what we need apart from a factor of $L^{-2}$, ($L = \log Q$).
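The identity between the sum over $\chi_k$ and the closed form $\frac{L}{2(1-Q)} + \frac{QL^2}{(Q-1)^2} + 2L\sum_{l\ge2}\frac{l(-1)^l}{(Q^l-1)(l+1)(l-1)} + L\log 2 - \frac L2 - 1$ can be tested numerically. Python's standard library has no complex Gamma function, so this sketch first simplifies each term with the reflection formula $\Gamma(1+z)\Gamma(1-z) = \pi z/\sin(\pi z)$: for $\chi_k = iy$ the term becomes $\pi y / \big((1+y^2)\sinh(\pi y)\big)$. That simplification, the function names and the truncation limits are choices made here, not part of the talk.

```python
import math

def lhs_fluctuation_sum(Q, kmax=50):
    """sum_{k != 0} chi_k (-chi_k) Gamma(-1-chi_k) Gamma(-1+chi_k), via the reflection formula."""
    L = math.log(Q)
    total = 0.0
    for k in range(1, kmax + 1):
        y = 2 * math.pi * k / L          # chi_k = i*y
        x = math.pi * y
        inv_sinh = 2 * math.exp(-x) / (1 - math.exp(-2 * x))  # 1/sinh(x), overflow-free
        total += 2 * x / (1 + y * y) * inv_sinh               # terms for k and -k coincide
    return total

def rhs_closed_form(Q, lmax=100):
    """Closed form from equating the two expressions for I_1 (series truncated at lmax)."""
    L = math.log(Q)
    series = sum(l * (-1) ** l / ((Q ** l - 1) * (l + 1) * (l - 1)) for l in range(2, lmax + 1))
    return (L / (2 * (1 - Q)) + Q * L ** 2 / (Q - 1) ** 2 + 2 * L * series
            + L * math.log(2) - L / 2 - 1)

for Q in (2.0, 10.0):
    print(Q, lhs_fluctuation_sum(Q), rhs_closed_form(Q))
```

For $Q = 2$ both sides are of order $10^{-13}$, which illustrates how small the contribution of the squared fluctuations is; for larger $Q$ it becomes more visible.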
Method - Variance - Fluctuations
It is again possible to find the fluctuations explicitly. For the fluctuations involving the largest term $n^2$ of the variance, we consider two sources:
The usual method seen in the expectation, applied to the second factorial moment.
Squaring the expectation fluctuations.
Result - Variance
Theorem 2.
The variance of the position of the first occurrence of the maximum in a sample of geometric random variables is given by
\[
\mathrm{Var} = n^2\left(\frac{Q+1}{2L(Q-1)} - \frac{1}{L^2}\right) + n\left(\frac{Q}{(Q-1)^2} - \frac{2}{L^2} + \frac{1+Q}{2L(Q-1)}\right) + \frac{Q}{(Q-1)^2} - \frac{1}{L^2} + o(1).
\]
There are also negligibly small contributions from the fluctuating
terms.
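A numerical sanity check of Theorem 2, using only the non-fluctuating terms of the variance. The exact variance is obtained by brute-force summation over the maximum $k$ and the position $m$; function names and the truncation `kmax` are assumptions of this sketch.

```python
import math

def exact_mean_and_variance(n, q, kmax=200):
    """Mean and variance of the number of places before the first maximum."""
    p = 1.0 - q
    m1 = m2 = 0.0
    for k in range(2, kmax + 1):
        a, b, w = 1.0 - q ** (k - 1), 1.0 - q ** k, p * q ** (k - 1)
        for m in range(1, n):
            prob = a ** m * w * b ** (n - 1 - m)  # P(position = m, maximum = k)
            m1 += m * prob
            m2 += m * m * prob
    return m1, m2 - m1 * m1

def theorem2_main_terms(n, q):
    """Non-fluctuating terms of the variance, with o(1) dropped."""
    Q = 1.0 / q
    L = math.log(Q)
    return (n ** 2 * ((Q + 1) / (2 * L * (Q - 1)) - 1 / L ** 2)
            + n * (Q / (Q - 1) ** 2 - 2 / L ** 2 + (1 + Q) / (2 * L * (Q - 1)))
            + Q / (Q - 1) ** 2 - 1 / L ** 2)

n, q = 100, 0.5
print(exact_mean_and_variance(n, q)[1], theorem2_main_terms(n, q))
```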
Result - Variance
Theorem 2 continued.
The contributions from fluctuating terms to order $n^2$ are
\[ \frac{Qn^2}{2L(1-Q)} + \frac{Qn^2}{(Q-1)^2} + \frac{2n^2}{L}\sum_{l\ge2}\frac{l(-1)^l}{(Q^l-1)(l+1)(l-1)} + \frac{\log 2\; n^2}{L} - \frac{n^2}{L^2} \]
and the fluctuations are
\[ \delta_v(x) := \frac{-n^2}{L}\sum_{k\ne0}V_k\,e^{2k\pi i x}, \]
where $V_k$ is given by
\[
\frac{\Gamma(-2-\chi_k)}{L(Q-1)}\Big(-L(Q+1) - 4\chi_k(L+Q-1) + \chi_k^2(2 - 2Q + QL - L)\Big)
- \sum_{l\ge1}\frac{l(-1)^l}{(l+1)!}\,(l-\chi_k)\,\Gamma(l-1-\chi_k)\,\frac{Q^l+1}{Q^l-1}.
\]
Note - Variance
As Q → 1
As $q \to 1$, samples of geometric variables tend in behaviour to that of a permutation of $n$ numbers.
For permutations, the average number of places before the maximum and the second moment are
\[ \frac1n\sum_{k=1}^{n}(k-1) = \frac{n-1}{2} \qquad\text{and}\qquad \frac1n\sum_{k=1}^{n}(k-1)^2 = \frac{n^2}{3} - \frac n2 + \frac16 \]
respectively.
Thus the variance is $\frac{n^2}{12} - \frac1{12}$, which is also obtained by taking the limit as $Q \to 1$ of the main term in the variance in Theorem 2.
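The permutation moments can be verified exactly, and the limit $Q \to 1$ of the $n^2$ coefficient $\frac{Q+1}{2L(Q-1)} - \frac1{L^2}$ from Theorem 2 can be checked numerically. This is a small sketch with hypothetical function names; the value $Q = 1.001$ is an arbitrary choice close to 1.

```python
import math
from fractions import Fraction

def permutation_position_moments(n):
    """Mean, second moment and variance of the position (0-based) of the maximum, uniform on 0..n-1."""
    mean = Fraction(sum(k - 1 for k in range(1, n + 1)), n)
    second = Fraction(sum((k - 1) ** 2 for k in range(1, n + 1)), n)
    return mean, second, second - mean ** 2

def leading_variance_coefficient(Q):
    """Coefficient of n^2 in the variance of Theorem 2."""
    L = math.log(Q)
    return (Q + 1) / (2 * L * (Q - 1)) - 1 / L ** 2

print(permutation_position_moments(10))
print(leading_variance_coefficient(1.001), 1 / 12)
```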
Thank you.