The average position of the first maximum in a sample of geometric random variables

Margaret Archibald¹ and Arnold Knopfmacher²
¹ Department of Mathematics and Applied Mathematics, University of Cape Town, South Africa
² School of Mathematics, University of the Witwatersrand, Johannesburg, South Africa

International Conference on Infinity in Logic & Computation, University of Cape Town, South Africa, 3–5 November 2007

AofA07 The average position of the first maximum

Introduction

Samples: $(\Gamma_1, \Gamma_2, \ldots, \Gamma_n)$ where $P\{\Gamma_j = i\} = pq^{i-1}$, for $1 \le j \le n$, with $p + q = 1$.

Example: If $p = q = \frac12$, then
the probability of a 1 is $\frac12\left(\frac12\right)^0 = \frac12$,
the probability of a 4 is $\frac12\left(\frac12\right)^3 = \frac1{16}$.
I.e., 1 is twice as likely to occur as 2 and so on. The alphabet is infinite. E.g. 2113131212121

Previous Work

Szpankowski and Rego (1990): Maximum order statistic.
\[
E_n = \log_Q n + \frac{\gamma}{L} + \frac12 + P_0(\log_Q n) + O\!\left(\frac1n\right),
\qquad
V_n = \frac{\pi^2}{6L^2} + \frac1{12} + P_1(\log_Q n),
\]
where $Q = \frac1q$ and $L = \log Q$.

Kirschenhofer and Prodinger (1993): Average $d$th largest element.
With $H_n = \sum_{k=1}^n \frac1k$ and $H_n^{(2)} = \sum_{k=1}^n \frac1{k^2}$, as $n \to \infty$,
\[
E_n^{(d)} \sim \log_Q n + \frac{\gamma}{L} + \frac12 - \frac1L H_{d-1} + P_2(\log_Q n),
\]
\[
V_n^{(d)} \sim \frac{\pi^2}{6L^2} + \frac1{12} - \frac1{L^2} H_{d-1}^{(2)} - [P_2^2]_0 + P_3(\log_Q n).
\]
Note: $[P_2^2]_0$ is the mean of the square of the fluctuations of the expectation.

Previous Work

Problem: Råde (1991); Solution: Griffin and Lossers (1994)
Toss coins until all show heads. 'Heads' occurs with probability $p$.
Also on this topic...
Eisenberg, Stengle and Strang (1993)
Brands, Steutel and Wilms (1994)
Baryshnikov, Eisenberg and Stengle (1995)
Kirschenhofer and Prodinger (1996): Number of winners. (Also generalised to $d$ below the maximum.)

Previous Work

Skip lists (see Pugh 1990)
Kirschenhofer and Prodinger (1994): Path length of skip lists.
Prodinger (1996): Left-to-right maxima (strict and weak, e.g., 11213321).
Louchard and Prodinger (2006): Found the $s$th factorial moment for the number of elements a below (resp. above/at) the level of the $(k+1)$th maximum.

Method

Symbolic Expression
\[
\{1, 2, \ldots, k-1\}^{*}\; k\; \{1, 2, \ldots, k\}^{*}
\]
Bivariate generating function
\[
F(z,u) := \sum_{k \ge 1} \frac{pq^{k-1} z}{\bigl(1 - zu(1 - q^{k-1})\bigr)\bigl(1 - z(1 - q^{k})\bigr)}
\]
"Position" ≡ Number of places before the first maximum occurs.

Definition - Mean and variance of PGFs
Given a PGF $P(u) = \sum_{k \ge 0} p_k u^k$ ($p_k = \Pr\{X = k\}$) for a random variable $X$:
$P'(1)$ gives the expected value of $X$;
$P''(1) + P'(1) - P'(1)^2$ gives the variance.

Method - The Expected Value

Partial fractions on
\[
\frac{\partial}{\partial u} F(z,u)\Big|_{u=1} = \sum_{k \ge 1} \frac{pq^{k-1}(1 - q^{k-1}) z^2}{\bigl(1 - z(1 - q^{k})\bigr)\bigl(1 - z(1 - q^{k-1})\bigr)^2}
\]
leads to
\[
[z^n] \frac{\partial}{\partial u} F(z,u)\Big|_{u=1}
= \frac1p \sum_{k \ge 2} \Bigl[ (q^{1-k} - 1)(1 - q^{k})^n - (q^{1-k} - nq + n - 1)(1 - q^{k-1})^n \Bigr].
\]

Method - Expectation

Use the Binomial Theorem and get rid of the sum on $k$:
\[
[z^n] \frac{\partial}{\partial u} F(z,1)
= \sum_{i=2}^{n} \binom{n}{i} (-1)^i \frac{q^{i-1}(q^i - 1 + nq^i - nq)}{(1 - q^{i-1})(1 - q^i)} + \frac{qn(n-1)}{p}.
\]
Consequently, with $Q = q^{-1}$,
\[
[z^n] \frac{\partial}{\partial u} F(z,u)\Big|_{u=1}
= - \underbrace{\sum_{i=2}^{n} \binom{n}{i} (-1)^i \frac{1}{Q^{i-1} - 1}}_{\alpha}
\; - \; n \underbrace{\sum_{i=2}^{n} \binom{n}{i} (-1)^i \frac{1}{Q^{i} - 1}}_{\beta}
\; + \; \frac{n(n-1)}{Q - 1}.
\]

Method - Expectation - Rice's Integrals

Let $C$ be a curve surrounding the points $1, 2, \ldots, n$ in the complex plane, and let $f(z)$ be analytic inside $C$. Then
\[
\sum_{k=1}^{n} \binom{n}{k} (-1)^k f(k) = -\frac{1}{2\pi i} \int_C [n; z]\, f(z)\, dz,
\]
where
\[
[n; z] = \frac{(-1)^{n-1}\, n!}{z(z-1)\cdots(z-n)} = \frac{\Gamma(n+1)\Gamma(-z)}{\Gamma(n+1-z)}.
\]
By extending the contour of integration, we can express (see Flajolet and Sedgewick (1995)) the asymptotic expansion as
\[
\sum \operatorname{Res}\bigl([n; z] f(z)\bigr) + \text{smaller order terms},
\]
where the sum is taken over all poles different from $1, \ldots, n$.

Method - Expectation - Calculating Residues

As an example, consider (with $Q := \frac1q$)
\[
\alpha = \sum_{i=2}^{n} \binom{n}{i} (-1)^i \frac{1}{Q^{i-1} - 1}.
\]
Integrand in question:
\[
[n; z] f(z) = \frac{\Gamma(n+1)\Gamma(-z)}{\Gamma(n+1-z)} \cdot \frac{1}{Q^{z-1} - 1}.
\]
Double pole: $z = 1$.
Simple poles: $z = 0$ and $z = \chi_k + 1$, where $\chi_k = \frac{2k\pi i}{\log Q}$ for $k \in \mathbb{Z} \setminus \{0\}$.

Result - Expectation

Theorem 1. The average position $E_n$ of the first (left-most) occurrence of the maximum in a sample of geometric random variables is given by
\[
E_n = n \left( \frac1L + \frac{1}{1-Q} + \delta_{E_1}(\log_Q n) \right) + \frac1L + \frac{Q}{1-Q} - \delta_{E_2}(\log_Q n) + o(1),
\]
where $Q = \frac1q$, $L = \log Q$, $\chi_k = 2k\pi i/L$,
\[
\delta_{E_1}(x) := \frac1L \sum_{k \ne 0} \chi_k\, \Gamma(-1 - \chi_k)\, e^{2k\pi i x}
\]
and
\[
\delta_{E_2}(x) := \frac{1}{2L} \sum_{k \ne 0} \chi_k (1 + \chi_k)(\chi_k - 2)\, \Gamma(-1 - \chi_k)\, e^{2k\pi i x}.
\]

Method - Variance

Differentiating partially twice gives
\[
\frac{\partial^2}{\partial u^2} F(z,u) = \sum_{k \ge 1} \frac{2pq^{k-1} z^3 (1 - q^{k-1})^2}{\bigl(1 - z(1 - q^{k})\bigr)\bigl(1 - zu(1 - q^{k-1})\bigr)^3}
\]
and hence
\[
\frac{\partial^2}{\partial u^2} F(z,1) = 2 \sum_{k \ge 1} \Biggl[
\frac{(q^{k-1} - 1)^2}{p^2 q^{2k-2}\bigl(1 - z(1 - q^{k})\bigr)}
- \frac{1}{\bigl(1 - z(1 - q^{k-1})\bigr)^3}
- \frac{q^{1-k}(1 - 3q^{k-1} + 2q^{k})}{p\bigl(1 - z(1 - q^{k-1})\bigr)^2}
- \frac{q^{2-2k}(1 + 3q^{2k-2} - 3q^{k-1} + q^{k} - 3q^{2k-1} + q^{2k})}{p^2\bigl(1 - z(1 - q^{k-1})\bigr)}
\Biggr].
\]

Method - Variance

\[
\begin{aligned}
[z^n] \frac{\partial^2}{\partial u^2} F(z,1)
={}& \underbrace{\frac{2}{(Q-1)^2} \sum_{i=3}^{n} \binom{n}{i} (-1)^i \frac{Q^i - 1}{1 - Q^{i-2}}}_{(a)}
+ \underbrace{\frac{4Q}{(Q-1)^2} \sum_{i=3}^{n} \binom{n}{i} (-1)^i \frac{Q^i - 1}{Q^{i-1} - 1}}_{(b)} \\
&+ \underbrace{\frac{2n}{Q-1} \sum_{i=3}^{n} \binom{n}{i} (-1)^i \frac{Q^i}{1 - Q^{i-1}}}_{(c)}
+ \underbrace{\left( \frac{n(3Q-1)}{Q-1} - n^2 \right) \sum_{i=3}^{n} \binom{n}{i} (-1)^i \frac{Q^i}{Q^i - 1}}_{(d)} \\
&\underbrace{{} - \frac{2nQ(Q+1)}{(Q-1)^2} + \frac{2Q^2}{(Q-1)^2}
- \frac{n^2 Q(3Q^2 - 7Q - 6)}{2(Q-1)^2(Q+1)} + \frac{n^3 Q(2Q^2 - 2Q - 1)}{(Q-1)^2(Q+1)} - \frac{n^4 Q^2}{2(Q^2 - 1)}}_{(e)}.
\end{aligned}
\]

Method - Variance

Hence the main terms of the second factorial moment are:
\[
n^2\, \frac{2L + 3 - 4Q + Q^2}{2L(Q-1)^2}
+ n\, \frac{8QL - 2L - 5Q^2 + 4Q + 1}{2L(Q-1)^2}
+ \frac{2Q^2 L - 3Q^2 - 1 + 4Q}{L(Q-1)^2}.
\]
We also consider (among other things) the constant terms which arise from squaring the fluctuations from the expectation. Recall
\[
\delta_{E_1}(x) := \frac1L \sum_{k \ne 0} \chi_k\, \Gamma(-1 - \chi_k)\, e^{2k\pi i x}, \qquad \text{where } \chi_k = 2k\pi i/L.
\]
Hence, we require
\[
\frac{1}{L^2} \sum_{k \ne 0} \chi_k (-\chi_k)\, \Gamma(-1 - \chi_k)\, \Gamma(-1 + \chi_k).
\]

Method - Variance - Squaring the Fluctuations

To find the 'zeroth' Fourier coefficient of the square, we use a method devised by Prodinger in 2004. Consider the integral
\[
I_1 := \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} F(z)\, dz,
\qquad \text{where} \qquad
F(z) := \frac{-L}{e^{Lz} - 1}\, z^2\, \Gamma(-1 - z)\, \Gamma(-1 + z).
\]
Express $I_1$ in two ways:
Shift the contour line left and collect residues.
Sum the negative residues right of the line $\Re z = \frac12$.

Method - Variance - Squaring the Fluctuations

The residues for the simple poles at $z = 0$ and $z = \chi_k$, $k \ne 0$, can be calculated in order to write
\[
I_1 = \frac{1}{2\pi i} \int_{(-\frac12)} F(z)\, dz + 1 + \sum_{k \ne 0} \chi_k (-\chi_k)\, \Gamma(-1 - \chi_k)\, \Gamma(-1 + \chi_k).
\]
Now rewrite $-\frac{1}{e^{Lz} - 1}$ as $1 + \frac{1}{e^{-Lz} - 1}$ and use the change of variable $z := -z$ to get
\[
I_1 = L I_2 - I_1 + 1 + \sum_{k \ne 0} \chi_k (-\chi_k)\, \Gamma(-1 - \chi_k)\, \Gamma(-1 + \chi_k),
\]
where
\[
I_2 = \frac{1}{2\pi i} \int_{(-\frac12)} z^2\, \Gamma(-1 - z)\, \Gamma(-1 + z)\, dz = -\log 2 + \frac12.
\]

Method - Variance - Squaring the Fluctuations

Shifting $I_1$ right gives
\[
I_1 = \frac{L}{4(1 - Q)} + \frac{Q L^2}{2(Q-1)^2} + L \sum_{l \ge 2} \frac{l(-1)^l}{(Q^l - 1)(l+1)(l-1)}.
\]
Equating the two expressions for $I_1$ leaves us with
\[
\sum_{k \ne 0} \chi_k (-\chi_k)\, \Gamma(-1 - \chi_k)\, \Gamma(-1 + \chi_k)
= \frac{L}{2(1 - Q)} + \frac{Q L^2}{(Q-1)^2} + 2L \sum_{l \ge 2} \frac{l(-1)^l}{(Q^l - 1)(l+1)(l-1)} + L \log 2 - \frac{L}{2} - 1,
\]
which is what we need apart from a factor of $L^{-2}$ ($L = \log Q$).

Method - Variance - Fluctuations

It is again possible to find the fluctuations explicitly. For the fluctuations involving the largest term $n^2$ of the variance, we consider two sources:
The usual method seen in the expectation, applied to the second factorial moment.
Squaring the expectation fluctuations.

Result - Variance

Theorem 2. The variance of the position of the first occurrence of the maximum in a sample of geometric random variables is given by
\[
\mathrm{Var} = n^2 \left( \frac{1 + Q}{2L(Q-1)} - \frac{1}{L^2} \right)
+ n \left( \frac{Q}{(Q-1)^2} - \frac{2}{L^2} + \frac{Q + 1}{2L(Q-1)} \right)
+ \frac{Q}{(Q-1)^2} - \frac{1}{L^2} + o(1).
\]
There are also negligibly small contributions from the fluctuating terms.

Result - Variance

Theorem 2 continued.
The contributions from fluctuating terms to order $n^2$ are
\[
\frac{Q n^2}{2L(1 - Q)} + \frac{Q n^2}{(Q-1)^2} + \frac{2n^2}{L} \sum_{l \ge 2} \frac{l(-1)^l}{(Q^l - 1)(l+1)(l-1)} + \frac{\log 2\; n^2}{L} - \frac{n^2}{L^2}
\]
and the fluctuations are
\[
\delta_v(x) := -\frac{n^2}{L} \sum_{k \ne 0} V_k\, e^{2k\pi i x},
\]
where $V_k$ is given by
\[
\frac{\Gamma(-2 - \chi_k)}{L(Q-1)} \Bigl( -L(Q+1) - 4\chi_k (L + Q - 1) + \chi_k^2 (2 - 2Q + QL - L) \Bigr)
- \sum_{l \ge 1} \frac{l(-1)^l}{(l+1)!}\, (l - \chi_k)\, \Gamma(l - 1 - \chi_k)\, \frac{Q^l + 1}{Q^l - 1}.
\]

Note - Variance As Q → 1

As $q \to 1$, samples of geometric variables tend in behaviour to that of a permutation of $n$ numbers.
For permutations, the average number of places before the maximum and the second moment are
\[
\frac1n \sum_{k=1}^{n} (k - 1) = \frac{n-1}{2}
\qquad \text{and} \qquad
\frac1n \sum_{k=1}^{n} (k - 1)^2 = \frac{n^2}{3} - \frac{n}{2} + \frac16
\]
respectively.
Thus the variance is $\frac{n^2}{12} - \frac1{12}$, which is also obtained by taking the limit as $Q \to 1$ of the main term in the variance in Theorem 2.

Thank you.
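
Numerical check (illustrative sketch, not part of the talk). The main terms of Theorems 1 and 2 can be compared with exact values for a finite sample. The Python sketch below reads the exact distribution off the symbolic expression {1, ..., k−1}* k {1, ..., k}*: the first maximum has value k and is preceded by j places with probability (1 − q^(k−1))^j · p q^(k−1) · (1 − q^k)^(n−1−j). The function names, the truncation tolerance `tol`, and the inverse-transform sampler are assumptions made here for illustration. The fluctuating terms are ignored, so only approximate agreement should be expected.

```python
import math
import random


def exact_moments(n, p, tol=1e-15):
    """Exact mean and variance of the number of places before the first maximum.

    Sums over the value k of the maximum and the number j of places before it;
    the sum over k stops once P(maximum > k) drops below `tol` (an assumption
    made for this sketch).
    """
    q = 1.0 - p
    mean = second = 0.0
    k = 1
    while True:
        below = 1.0 - q ** (k - 1)   # P(letter < k)
        upto = 1.0 - q ** k          # P(letter <= k)
        weight = p * q ** (k - 1)    # P(letter = k)
        for j in range(n):
            prob = below ** j * weight * upto ** (n - 1 - j)
            mean += j * prob
            second += j * j * prob
        if 1.0 - upto ** n < tol:    # remaining probability mass is negligible
            break
        k += 1
    return mean, second - mean * mean


def main_terms(n, p):
    """Non-fluctuating terms of Theorem 1 (mean) and Theorem 2 (variance)."""
    Q = 1.0 / (1.0 - p)
    L = math.log(Q)
    mean = n * (1 / L + 1 / (1 - Q)) + 1 / L + Q / (1 - Q)
    var = (n * n * ((1 + Q) / (2 * L * (Q - 1)) - 1 / L ** 2)
           + n * (Q / (Q - 1) ** 2 - 2 / L ** 2 + (Q + 1) / (2 * L * (Q - 1)))
           + Q / (Q - 1) ** 2 - 1 / L ** 2)
    return mean, var


def simulate_mean(n, p, trials, rng):
    """Monte Carlo estimate of the mean position of the left-most maximum."""
    log_q = math.log(1.0 - p)
    total = 0
    for _ in range(trials):
        # Inverse-transform sampling so that P(value = i) = p q^(i-1), i >= 1.
        sample = [1 + int(math.log(1.0 - rng.random()) / log_q) for _ in range(n)]
        total += sample.index(max(sample))  # places before the first maximum
    return total / trials


if __name__ == "__main__":
    n, p = 100, 0.5
    print("exact (mean, variance):     ", exact_moments(n, p))
    print("main terms (mean, variance):", main_terms(n, p))
    print("simulated mean:             ", simulate_mean(n, p, 20000, random.Random(1)))
```

For p = q = 1/2 the two printed pairs should agree closely, in line with the remark that the fluctuating contributions are small. For n = 2 the exact mean reduces to q/(1 + q), the probability that the second letter is strictly larger than the first.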