ACTA MATHEMATICA VIETNAMICA Volume 25, Number 2, 2000, pp. 161–179 161 RATE OF CONVERGENCE IN BOOTSTRAP APPROXIMATIONS WITH RANDOM SAMPLE SIZE NGUYEN VAN TOAN Abstract. We give the convergence rates of the random sample size bootstrap approximations £¡ N of¢2the ¤ distribution of the sample mean. Using 2 n the quantity αn =E , where Nn is the random sample size, we n −1 1 1 ¡ 1 1 ¢ obtain a convergence rate O(n− 2 (log log n) 2 ) if αn =O n− 2 (log log n) 2 . 1. Introduction Efron [2] introduced a very general resampling procedure, called the bootstrap, for estimating the distributions of statistics based on independent observations. It may be explained briefly as follows: Let X1 , . . . , Xn be independent identically distributed (i.i.d.) observations with distribution F ; let θ = θ(F ) be a parameter of interest, and let θn = θn (X1 , . . . , Xn ) be an estimator of θ. The bootstrap principle is to estimate the unknown ∗ ∗ , . . . , Xnn ) and distribution of θn by θˆn where θˆn is distributed as θn (Xn1 ∗ ∗ Xn1 , . . . , Xnn are i.i.d. from the empirical distribution of (X1 , . . . , Xn ). In sufficiently regular cases, the bootstrap approximation to an unknown distribution function has been established as an improvement over the simpler normal approximation (see [1], [3-4]). In the case where the bootstrap sample size N is in itself a random variable, Mammen [7] has considered bootstrap with a Poisson random sample size which is independent of the sample. Our work on this problem is limited to [8-11] and [13-14]. In these references we find sufficient conditions for random sample size that random sample size bootstrap distribution can be used to approximate the sampling distribution. The purpose of this paper is to study the convergence rates to zero of the discrepancy between the actual Received January 26, 1999; in revised form November 2, 1999. 1991 Mathematics Subject Classification. Primary: 62E20; Secondary: 62G05, 62G15. Key words and phrases. Bootstrap, Berry-Esseen bound, Liapounov’s inequality, random sample size. This research is supported by the National Basic Research Program in Natural Sciences, Vietnam. 162 NGUYEN VAN TOAN distribution of the sample mean and the random sample size bootstrap approximations of it. We derive our results by studying the rates of convergence of the bootstrap approximations with a random sample size. Before giving more detail, it is necessary to introduce some notations. Let X, X1 , X2 , . . . be i.i.d. random variables with finite variance. For any fixed natural number n, denote by Fn the empirical distribution of (X1 , . . . , Xn ) and by ∗ ∗ Xn1 , Xn2 , . . . i.i.d. random variables with the distribution Fn . Let Nn be a positive integer-valued random variable independent of X1 , X2 , . . . . Set µ = EX is the expectation of X, 0 < σ 2 = DX is the variance of X, ¯ n = n−1 X n X Xi , i=1 s2n = n−1 n X ¯ n )2 , (Xi − X i=1 ¯ n∗Nn = n−1 X Nn X ∗ , Xni i=1 ∗ ¯N X = Nn−1 n Nn X ∗ Xni . i=1 In the case where Nn →p ∞ as n → ∞, where →p denotes convergence in probability,√the random sample size bootstrap distribution of √ ∗ ¯∗ − X ¯ n ) and Nn (X ¯N ¯ n ) can be used to approximate the Nn (X −X Nn n sn √ √ ¯ n ¯ sampling distributions of n(Xn − µ) and (Xn − µ), respectively (see σ Nn →p 1 as n → ∞, the random sample size bootstrap [9], [13]). When n √ √ ¡ ¯ ∗Nn Nn ¯ ¢ n ¡ ¯ ∗Nn Nn ¯ ¢ distribution of n Xn − Xn and Xn − Xn can be used n sn n as given in [8] and [14]. Singh [12] has studied the uniform convergence to zero of the discrep√ ¯ ancy between the actual distribution of n(X − µ) and the bootstrap n approximation of it. The same convergence problem for the distribution of √ n ¯ (Xn −µ) was also studied by Singh in [12]. In this paper we extend the σ results of Singh [12] to the random sample size bootstrap approximations. 163 RATE OF CONVERGENCE Namely, we study the uniform and non-uniform convergence √ ¯ rates to zero of the discrepancy between the actual distribution of n(X n − µ) and its random sample size bootstrap approximations. We √ also investigate the n ¯ same convergence problem for the distribution of (Xn − µ). σ 2. Results For simplicity we let n ∆∗N 1 ¯ £√ h√ ³ i¯ ¤ Nn ¯ ´ ¯ ¯ ∗ ∗Nn ¯ ¯ n Xn − = sup X n < x ¯, ¯P n(Xn − µ) < x − P n −∞<x<+∞ n ∆∗N 2 ¯ h √n i h √n ³ ´ i¯ ¯ ∗ ¯ n − µ) < x − P ¯ ∗Nn − Nn X ¯ n < x ¯¯, = sup (X X ¯P n σ sn n −∞<x<+∞ δ1∗Nn (x) ¯ x ¯3 ´¯ £√ ³ h√ ³ i¯ ¤ Nn ¯ ´ ¯ ¯ ¯ ¯ ∗Nn ∗ ¯ ¯ = 1 + ¯ ¯ ¯P n(Xn − µ) < x − P n Xn − X n < x ¯, σ n ∗Nn δ2 (x) ¯ x ¯3 ´¯ h √n ³ ´ i¯ i h √n ³ ¯ ¯ ¯ ∗ ¯ n < x ¯¯, ¯ ¯ n∗Nn − Nn X = 1 + ¯ ¯ ¯P (Xn − µ) < x − P X σ σ sn n where P ∗ denotes conditional probability P (. . . |X1 , . . . , Xn ), and let h³ N ´2 i 1 n 2 −1 , βn = max(n− 2 , αn ), αn = E n 1 1 γn = max(n− 2 (log log n) 2 , αn ). Our main result are presented in the next two theorems. In Theorem n n 2.1, we study the convergence rates to zero of ∆∗N , δ1∗Nn (x), ∆∗N and 1 2 ∗Nn δ2 (x). Theorem 2.1. Suppose that the sequence Nn of positive integer-valued random variables satisfies the condition (a) αn → 0 as n → ∞. A) If EX 4 < ∞, then (2.1) n ∆∗N = O(γn ) a.s., 1 (2.2) δ1∗Nn (x) = O(γn ) a.s.. 164 NGUYEN VAN TOAN B) If E|X 3 | < ∞, then (2.3) n = O(βn ) a.s., ∆∗N 2 (2.4) δ2∗Nn (x) = O(βn ) a.s.. In particular, we have the following theorem which is analogous to Theorem 2.1 of [10]. Theorem 2.1’. A) If EX 4 < ∞ and if the sequence Nn of positive integer-valued random variables satisfies the condition 1 1 (b) αn = O(n− 2 (log log n) 2 ), then 1 1 1 1 (2.1’) n ∆∗N = O(n− 2 (log log n) 2 ) a.s., 1 (2.2’) δ1∗Nn (x) = O(n− 2 (log log n) 2 ) a.s.. B) If E|X 3 | < ∞ and if the sequence Nn of positive integer-valued random variables satisfies the condition 1 (c) αn = O(n− 2 ), then 1 (2.3’) n = O(n− 2 ) a.s., ∆∗N 2 (2.4’) δ2∗Nn (x) = O(n− 2 ) a.s.. 1 Let ¯ £√ hp i¯ ¤ ¯ ¯ ∗ ∗ ¯ ¯ ¯ = sup Nn (XNn − Xn ) < x ¯, ¯P n(Xn − µ) < x − P −∞<x<+∞ ¯ h √n i h √N i¯ ¯ n ¯∗ ∗2 ∗ ¯ ¯ n ) < x ¯¯, RNn = sup (Xn − µ) < x − P (XNn − X ¯P σ sn −∞<x<+∞ ¯ £√ p ¡ ¢ ¤ £ ¤¯¯ 3 ¯ ∗ ∗ ∗1 ¯ ¯ ¯ Nn (XNn − Xn ) < x ¯, rNn (x) = 1 + |x| ¯P n(Xn − µ) < x − P √ √ i h N ³ i¯ ¡ ¢¯ h n Nn ¯ ´ ¯ n ∗ ∗ ∗2 3 ¯ ¯ ¯ XNn − X n < x ¯, (Xn − µ) < x − P rNn (x) = 1 + |x| ¯P σ sn n ∗1 RN n and −1 1 1 1 ηn = E[Nn 2 ], ζn = max(n− 2 , ηn ), θn = max(n− 2 (log log n) 2 , ηn ). 165 RATE OF CONVERGENCE ∗1 ∗1 ∗2 The result on the same convergence problem for RN , rN (x), RN and n n n ∗2 rNn (x) is the following theorem. Theorem 2.2. A) If EX 4 < ∞, then (2.5) ∗1 RN = O(θn ) a.s., n (2.6) ∗1 rN (x) = O(θn ) a.s.. n B) If E|X 3 | < ∞, then ∗2 RN = O(ζn ) a.s., n (2.7) ∗2 (x) = O(ζn ) a.s. rN n (2.8) In particular, we have the following theorem which is analogous to Theorem 2.2 of [10]. Theorem 2.2’. Suppose that E[Nn−1 ] = O(n−1 ). A) If EX 4 < ∞, then p (2.5’) 1 2 − 12 lim sup n (log log n) n→∞ (2.6’) 1 ∗1 RN ≤ n D[(X − µ)2 ] √ , 2σ 2 πe 1 ∗1 rN (x) = O(n− 2 (log log n) 2 ) a.s.. n B) If E|X 3 | < ∞, then 1 (2.7’) ∗2 RN = O(n− 2 ) a.s., n (2.8’) ∗2 rN (x) = O(n− 2 ) a.s.. n 1 3. Proofs In this section we give the proofs of the above theorems. Towards the end some remarks concerning these conclusions will be given. For the simplicity of the proofs we state some facts and easily derived results will be given. 166 NGUYEN VAN TOAN Lemma 3.1. For every c > 0 we have (3.1) 1 |xφ(cx)| = √ , −∞<x<+∞ c 2πe (3.2) √ 1 2 2 √ , sup |(1 + |x| )xφ(cx)| ≤ √ + −∞<x<+∞ c 2πe e2 c4 π (3.3) |Φ(x) − Φ(cx)| ≤ min{1, |x|φ(min(1, c)x)|1 − c|}, sup 3 where Φ(x) and φ(x) are the standard normal distribution function and density, respectively. By the proof of Theorem 1 of [12] we have Lemma 3.2. Let X, X1 , X2 , . . . be i.i.d. random variables. If EX 4 < ∞, then (3.4) 1 1 lim sup n 2 (log log n)− 2 |s2n − σ 2 | = n→∞ p 2D[(X − µ)2 ] a.s., (3.5) ¯ ³x´ ³x´ ³ 1 1 ´ ³ x ´¯¯ ¯ sup −Φ − − xφ ¯Φ ¯ = O(n−1 log log n) a.s. sn σ sn σ σ −∞<x<+∞ We will also need the following results of Kruglov and Korolev [6]. Lemma 3.3 [6, Lemma 6.3.1]. Let X, X1 , X2 , . . . be i.i.d. random variables with EX = 0. If E|X 3 | < ∞, then √ (1 + |x|3 )|P (Sn < xσ n) − Φ(x)| ≤ where Sn = n P i=1 cρ √ , n σ3 Xi , ρ = E|X 3 | and c is absolute constant, c ≤ 30.5378. Theorem 3.1 [6, Theorem 6.2.1]. Let N, X, X1 , X2 , . . . be independent random variables, where N takes values among the natural numbers and X, X1 , X2 , . . . are identically distributed with EX = 0. If E|X 3 | < ∞, then for all a ∈ (0, 1) RATE OF CONVERGENCE 167 ¯ ¯ N ¯ ³ x ´¯ Kρ ¯ ¯ ¯ ¯ sup + Q(a)E ¯ − 1¯ , ¯P (SN < x) − Φ √ ¯≤ √ EN −∞<x<+∞ σ EN σ 3 a3 EN where K is the universal appearing in the Berry-Esseen bound, n 1 N P 1 1 o √ . ,√ SN = Xi and Q(a) = max , 1−a 2πea 1 + a i=1 Theorem 3.2 [6, Theorem 6.3.1]. With N, X, X1 , X2 , . . . as in Theorem 3.1, if E|X 3 | < ∞ and EN 2 < ∞, then for all a ∈ (0, 1), b ∈ (1, ∞) ¯ ³ x ´¯ ¯ ¯ (1 + |x|3 )¯P (SN < x) − Φ √ ¯≤ σ EN ¯ (DN ) 34 o n ¯ N ρ ¯ ¯ − 1¯ , + K2 (a, b, 3) max E ¯ K1 (a, 3) √ , 3 3 EN σ EN (EN ) 2 where 3 K1 (a, 3) = c + 0.7655a− 2 , c ≤ 30.5378, n w(b, 3) v(3) o b2 u(3) 1 √ , K2 (a, b, 3) = max + + , (b − 1)2 1−a a+ a 1−a ¯ ³ x ´¯ ¯ ¯ w(b, 3) = sup ¯(1 + |x|3 )xφ √ ¯, −∞<x<+∞ b ¯ n φ(x) o¯ ¯ ¯ v(3) = sup ¯ < 1.2936, ¯(1 + |x|3 ) min 1, |x| −∞<x<+∞ ¯ n r 2 Γ(2) o¯ ¯ ¯ 3 u(3) = sup < 2.5958. ¯(1 + |x| ) min 1, 3 ¯ π |x| −∞<x<+∞ Proof of Theorem 2.1. We first note that (3.6) ENn ∼ n as n → ∞, by the inequality (3.7) p |ENn − n| ≤ E|Nn − n| ≤ E[(Nn − n)2 ] (by Liapounov’s inequality), and condition (a). 168 NGUYEN VAN TOAN Part A. It is worth noting that n ∆∗N ≤ A1 + A2 + A3 + A4 + A5 , 1 where ¯ £√ ³ x ´¯ ¤ ¯ ¯ ¯ P n( X − µ) < x − Φ ¯, ¯ n σ −∞<x<+∞ ¯ ³ x´ ³ x ´¯ ¯ ¯ = sup −Φ ¯Φ t n ¯, σ σ −∞<x<+∞ ¯ ³ x´ ³ x´ ³1 1 ´ ³ x ´¯¯ ¯ = sup − Φ tn − tn − xφ tn ¯, ¯Φ t n sn σ sn σ σ −∞<x<+∞ ¯ ³1 ´ ³ ´¯ 1 x ¯ ¯ = sup − xφ tn ¯, ¯t n sn σ σ −∞<x<+∞ ¯ h√ ³ ´¯ ´ i ³ ¯ ∗ ¯ n∗Nn − Nn X ¯ n < x − Φ tn x ¯¯ = sup n X ¯P n sn −∞<x<+∞ A1 = A2 A3 A4 A5 sup and r tn = n . ENn Firstly, due to the Berry-Esseen bound, we have (3.8) A1 ≤ Kρ √ , σ3 n where K is defined as in Theorem 3.1. By Theorem 3.1, ¯ N ¯ K ρˆn ¯ n ¯ (3.9) A5 ≤ 3 √ 3 + Q(a)E ¯ − 1¯, EN sn a ENn n n 1X ¯ n |3 . where ρˆn = |Xi − X n i=1 The condition EX 4 < ∞ implies s2n → σ 2 and ρˆn → ρ a.s., and hence K ρˆ − 12 √ n = O(n ) a.s., s3n a3 ENn (3.10) because of (3.6). Further, by (3.1), (3.6) and Lemma 3.2 we see that under the condition EX 4 < ∞ we get p (3.11) 1 2 lim sup n (log log n) n→∞ (3.12) − 12 A4 = D[(X − µ)2 ] √ a.s., 2σ 2 πe A3 = O(n−1 log log n) a.s.. RATE OF CONVERGENCE Now, by Lemma 3.1 we see that A2 ≤ √ (3.13) |tn − 1| . 2πe min(1, tn ) But we have |tn − 1| ≤ |t2n − 1| = (3.14) |n − ENn | . ENn Therefore, (2.1) follows from (3.6)-(3.14) and the inequality (3.15) E|Nn − ENn | ≤ 2E|Nn − n|. To show (2.2) we note that δ1∗Nn (x) ≤ B1 (x) + B2 (x) + B3 (x) + B4 (x), where ¯ x ¯3 ´¯ £√ ³ ³ x ´¯ ¤ ¯ ¯ ¯ ¯ B1 (x) = 1 + ¯ ¯ ¯P n(Xn − µ) < x − Φ ¯, σ σ ¯ x ¯3 ´¯ ³ x ´ ³ x ´¯ ³ ¯ ¯ ¯ ¯ −Φ B2 (x) = 1 + ¯ ¯ ¯Φ tn ¯, σ σ σ ¯ x ¯3 ´¯ ³ x ´ ³ x ´¯ ³ ¯ ¯ ¯ ¯ −Φ B3 (x) = 1 + ¯ ¯ ¯Φ tn ¯, σ sn σ ¯ x ¯3 ´¯ h√ ³ i ³ x ´¯ ³ Nn ¯ ´ ¯ ¯ ¯ ∗ ¯ ∗Nn ¯ n Xn − Xn < x − Φ tn B4 (x) = 1 + ¯ ¯ ¯P ¯. σ n sn Since EX 4 < ∞, Lemma 3.3 implies (3.16) B1 (x) ≤ cρ √ . σ3 n We now apply Lemma 3.1 to get ¯ x ¯3 ´ n ¯x¯ ³ o x´ ¯ ¯ ¯ ¯ B2 (x) ≤ 1 + ¯ ¯ min 1, ¯ ¯φ min(1, tn ) |1 − tn | σ σ σ √ h i 2 2 1 ≤ √ + √ |1 − tn | 2πe min(1, tn ) e2 π[min(1, tn )]4 ³ (3.17) 169 170 NGUYEN VAN TOAN and ¯ x ¯3 ´ ³ n ¯ x¯ ³ ¡ σ ¢ x ´¯¯ σ ¯¯o ¯ ¯ ¯ ¯ (3.18) B3 (x) ≤ 1 + ¯ ¯ min 1, ¯tn ¯φ min 1, t n ¯1 − ¯ σ σ sn σ sn  ¯ x ¯3  √ ¯ 1+¯ ¯ 1 σ ¯¯ 2 2  ¯ σ ≤ ¯ x ¯3  √ ¡ σ¢+ √ h ¡ σ ¢i4  ¯1 − sn ¯. 2 π min 1, 1 + ¯tn ¯ 2πe min 1, e σ sn sn By (3.7) and (3.14) we get (3.19) B2 (x) = O(αn ). Since s2n → σ 2 a.s. and because of (3.6), a repeated application of Lemma 3.2 enables us to write 1 1 B3 (x) = O(n− 2 (log log n) 2 ) a.s.. (3.20) Theorem 3.2 now shows that (3.21) ¯ x ¯3 ¯ ¯ 1+¯ ¯ n ρˆn √ B4 (x) ≤ K (a, 3) ¯σ ¯ 1 ¯ x ¯3 s3n ENn 1+¯ ¯ sn ¯ (DN ) 34 oo n ¯ N ¯ n ¯ n + K2 (a, b, 3) max E ¯ − 1¯, 3 ENn (EN ) 2 for all a ∈ (0, 1), b ∈ (1, ∞). From (3.7), (3.10), (3.15) and the inequality (3.22) £ ¤ DNn ≤ E (Nn − n)2 , it follows that (3.23) B4 (x) = O(βn ) a.s.. Combining (3.16),(3.19),(3.20) and (3.23) we obtain (2.2). Part B. In order to prove (2.3) we note that n ∆∗N ≤ A1 + A2 + A5 , 2 171 RATE OF CONVERGENCE where A1 , A2 and A5 are defined as in Part A, and if EX 4 < ∞ is replaced by E|X 3 | < ∞. Then (3.8)-(3.10), (3.13)-(3.15) also hold. Hence (2.3) follows from (3.7). To derive (2.4) note that δ2∗Nn (x) ≤ C1 (x) + C2 (x) + C3 (x), where ¯ ¯ ³ √n(X ´ ¯ n − µ) ¯ ¯ < x − Φ(x)¯, C1 (x) = (1 + |x| )¯P σ C2 (x) = (1 + |x|3 ) |Φ(tn x) − Φ(x)| , µ ¶ √ Nn ¯ ∗Nn ¯ ¯ h n Xn − n Xn ¯ i ¯ 3 ¯ ∗ C3 (x) = (1 + |x| )¯P < x − Φ (tn x) ¯. sn 3 Since E|X 3 | < ∞, by Lemma 3.3 we immediately have C1 (x) ≤ cρ √ , n σ3 and by applying Theorem 3.2, ¯ (DN ) 43 o n ¯ N ρˆn ¯ n ¯ n √ C3 (x) ≤ K1 (a, 3) 3 + K2 (a, b, 3) max E ¯ − 1¯, 3 ENn sn ENn (ENn ) 2 for all a ∈ (0, 1), b ∈ (1, ∞). From (3.7), (3.10), (3.15) and (3.22), we deduce that C3 (x) = O(βn ) a.s.. Finally, applying Lemma 3.1, C2 (x) is bounded above by the right-hand side of (3.17), and hence, by (3.7), it follows that C2 (x) = O(αn ). This with the previous estimates prove (2.4). The proof of Theorem 2.1 is now complete. Proof of Theorem 2.1’. Part A. If condition (b) holds, then 1 1 γn = O(n− 2 (log log n) 2 ). 172 NGUYEN VAN TOAN Therefore, (2.1’) and (2.2’) immediately follow from (2.1) and (2.2), respectively. Part B. In the case where condition (c) holds, we have 1 βn = O(n− 2 ), and hence, by (2.3) and (2.4), we get (2.3’) and (2.4’), respectively. Proof of Theorem 2.2. Part A. It is easy to check that ∗1 RN ≤ A1 + A3 + A4 + A05 , n where A1 , A3 and A4 are defined as in the proof of Theorem 2.1, and A05 ¯ £p ³ x ´¯ ¡ ∗ ¢ ¤ ¯ ¯ ∗ ¯ ¯ = sup Nn XNn − Xn < x − Φ ¯P ¯. sn −∞<x<+∞ Because of EX 4 < ∞ one gets (3.8), (3.11) and (3.12). By the law of total probability and the fact that Nn and X1 , X2 , . . . are independent, we get A05 ≤ ≤ ∞ X m=1 ∞ X P [Nn = m] ¯ £√ ¡ ³ ´¯ ¢ ¤ ¯ ∗ ∗ ¯ n < x − Φ x ¯¯ ¯m −X m X ¯P sn −∞<x<+∞ P [Nn = m] K ρˆn √ s3n m m=1 = sup (by the Berry-Esseen bound) K ρˆn −1 E[Nn 2 ]. 3 sn Note that if EX 4 < ∞, s2n → σ 2 and ρˆn → ρ a.s., and hence (3.24) A05 = O(ηn ) a.s.. Combining theses results we obtain (2.5). Using the argument leading to (2.2) and (3.24) and noting that ¯ x ¯3 ¯ ¯ ³ ´ 1+¯ ¯ ∗1 σ ¯ B x + B 0 (x), rN (x) ≤ B (x) + ¯ 1 3 4 n tn ¯ x ¯3 1 + ¯t n ¯ σ RATE OF CONVERGENCE 173 where B1 (x) and B3 (x) are defined as in the proof of Theorem 2.1, and ¯ x ¯3 ´¯ £p ³ ³ x ´¯ ¡ ∗ ¢ ¤ ¯ ¯ ¯ ¯ ∗ 0 ¯ ¯ B4 (x) = 1 + ¯ ¯ ¯P Nn XNn − Xn < x − Φ ¯, σ sn we obtain (2.6). Part B. If E|X 3 | < ∞, s2n → σ 2 and ρˆn → ρ a.s. Therefore, (2.7) follows from the inequality ∗2 RN ≤ A1 + A05 , n and (3.8) and (3.24). To obtain (2.8) notice that ∗2 rN (x) ≤ B1 (σx) + n 1 + |x|3 0 ¯ s ¯3 B4 (sn x), ¯ n ¯ 1 + ¯ x¯ σ and arguing as in the proof of result (2.6). This completes the proof of Theorem 2.2. Proof of Theorem 2.2’. If E[Nn−1 ] = O(n−1 ), then by Liapounov’ inequal−1 1 ity we have E[Nn 2 ] = O(n− 2 ). Using the proof of Theorem 2.2 we obtain Theorem 2.2’. Remark 3.1. Theorem 2.1 can be stated more precisely as follows: Suppose that the sequence Nn of positive integer-valued random variables satisfies the condition: (a) αn → 0 as n → ∞. A) If EX 4 < ∞, then for all a ∈ (0, 1), b ∈ (1, ∞) p D[(X − µ)2 ] 1 n √ ≤ lim sup γn−1 ∆∗N +√ + 2Q(a) a.s., 1 2 2σ πe n→∞ 8πe e 2 + 4 ³p ≤ 2√ 2 lim sup D[(X − µ)2 ] 2e πσ n→∞ √ 2 2´ + σ + 2K2 (a, b, 3) a.s.. 2 B) If E|X 3 | < ∞, then for all a ∈ (0, 1), b ∈ (1, ∞) 3 γn−1 δ1∗Nn (x) 3 (a 2 + 1)Kρ 1 + 2Q(a) a.s., n→∞ 8πe a σ3 3 ¡ ¢ρ e2 + 4 −1 ∗Nn lim sup βn δ2 (x) ≤ c + K1 (a, 3) 3 + √ + 2K2 (a, b, 3) a.s.. σ n→∞ e2 8π lim sup n βn−1 ∆∗N 2 ≤ 3 2 +√ 174 NGUYEN VAN TOAN Remark 3.2. Theorem 2.2 can be stated more precisely as follows: A) If EX 4 < ∞, then p D[(X − µ)2 ] Kρ ∗1 √ + 3 a.s., ≤ lim sup θn−1 RN n σ 2σ 2 πe n→∞ p 3 (e 2 + 4) D[(X − µ)2 ] cρ ∗1 √ lim sup θn−1 rN (x) ≤ + a.s.. n σ3 2σ 2 e2 π n→∞ B) If E|X 3 | < ∞, then 2Kρ a.s., σ3 n→∞ 2cρ ∗2 lim sup ζn−1 rN (x) ≤ 3 a.s.. n σ n→∞ ∗2 lim sup ζn−1 RN ≤ n Remark 3.3. We can describe a general model that allows formulae such as the expansion of Edgeworth type. Let X, X1 , X2 , . . . be independent ¯n = and identically distributed random variables with mean µ, and put X n X 1 Xi . We assume that the parameter of interest has form θ0 = g(µ), n i=1 where g : IR → IR is a smooth function. The parameter estimate is 2 ¯ n ), which is assumed to have asymptotic variance σ = h(µ) , θˆ = g(X n n where h is a known, smooth function. It is supposed that σ 2 6= 0. The ¯ n )2 . sample estimate of σ 2 is σ ˆ 2 = h(X θˆ − θ0 θˆ − θ0 We wish to estimate the distribution of either or · Note σ σ ˆ ¯ n ), where that both these statistics may be written in the form A(X (3.25) A(x) = g(x) − g(µ) h(µ) A(x) = g(x) − g(µ) h(x) in the former case and (3.26) ∗ in the later. To construct the bootstrap distribution estimates, let Xn1 ,..., ∗ Xnm denote a sample drawn randomly, with replacement, from X1 , . . . , RATE OF CONVERGENCE 175 m X ¯∗ = 1 Xni ∗ for the resample mean. Define Xn , and write X nm m i=1 ∗ ∗ ¯ nm θˆnm = g(X ), ∗ ∗ ¯ nm σ ˆnm = h(X ). Both (3.27) (3.28) ∗ − θˆ θˆ∗ − θˆ θˆnm ˆX ¯ ∗ ), where and nm ∗ may be expressed as A( nm σ ˆ σ ˆ ¯n) g(x) − g(X ˆ A(x) = , ¯n) h(X ¯n) g(x) − g(X ˆ A(x) = h(x) in the respective cases. The bootstrap estimate of the distribution of ∗ ¯ n ) is given by the distribution of A( ˆX ¯ nm A(X ), conditional on X1 , . . . , Xn . Let Nn be a positive integer-valued random variable independent of X1 , X2 , . . . . The random sample size bootstrap estimate of the distribu¯ n ) is given by the distribution of A( ˆX ¯ ∗ ), conditional on tion of A(X nNn X1 , . . . , Xn . Theorem 2.2 of Hall [5] gave an Edgeworth expansion of the distribution ¯ n ), which may be stated as follows. Assume that for an integer of A(X ν ≥ 1, the function A has ν + 2 continuous derivatives in a neighbourhood of µ, that A(µ) = 0, that E|X|ν+2 < ∞, that the characteristic function ϕ of X satisfies (3.29) lim sup|ϕ(t)| < 1, |t|→∞ and that the asymptotic variance of √ ¯ n ) equals 1. Then nA(X ν X √ j ν ¯ n ) ≤ x} = Φ(x) + P { nA(X n− 2 πj (x)φ(x) + o(n− 2 ) j=1 uniformly in x, where πj is a polynomial of degree of 3j − 1, odd for even j and even for odd j, with coefficients depending on moments of X up to order j + 2. The form of πj depends very much on the form of A. We denote πj by pj if A is given by (3.25), and by qj if A is given by (3.26). 176 NGUYEN VAN TOAN The bootstrap version of this result is described in Theorem 5.1 of Hall [5]. Let Aˆ be defined by either (27) or (28), put πj = pj in the former case and πj = qj in the latter, and let π ˆj be the polynomial obtained from πj on replacing population moments by the corresponding sample moments. Arguing as in the proof of Theorem 5.1 [5], we can show the following result: Let λ > 0 be given, and let l = l(λ) denote a sufficiently large positive number (whose value we do not specify). Assume that g and h each have ν + 3 bounded derivatives in a neighbourhood of µ, that E|X|l < ∞, and that Cramer’s condition (3.29) holds. Then there exists a constant C > 0 such that (3.30) ν h° i X ° √ ∗ ∗ − ν+1 − 2j ˆ ¯ ° ° 2 P P { mA(Xnm ) ≤ x} − Φ(x) − ˆj (x)φ(x) > Cm m π j=1 = O(n−λ ), © P max sup 1≤j≤ν −∞<x<∞ ª (1 + |x|)−(3j−1) |ˆ πj (x)| > C = O(n−λ ). If we take λ > 1 in this result and apply the Borel-Cantelli lemma, we may deduce from (3.30) that ν ° ° X ν+1 ° ∗ √ ˆ ¯∗ ° − 2j (3.31) °P { mA(Xnm ) ≤ x}−Φ(x)− m π ˆj (x)φ(x)° = O(m− 2 ) j=1 with probability one. And from (3.31) we deduce that P ∗ ©p ν ³ £ ´ X ª £ −j ¤ − ν+1 ¤ ∗ ˆ ¯ Nn A(XnNn ) ≤ x = Φ(x)+ E Nn 2 π ˆj (x)φ(x)+O E Nn 2 j=1 with probability one. Remark 3.4. If Nn is a Poisson variable with ENn = n, then r ¯ N ¯ r 2 n+1 2n ¯ n ¯ −n n DNn = n, E|Nn − ENn | = 2e ∼ , E¯ − 1¯ ∼ , n! π ENn πn ¯ ³ x´ ³ x ´¯ ¯ ¯ −Φ ¯Φ t n ¯ = 0. σ σ Therefore, if EX 4 < ∞, applying results (3.8), (3.9), (3.11)-(3.14), we obtain 177 RATE OF CONVERGENCE − 12 1 2 lim sup n (log log n) n→∞ n ∆∗N 1 p D[(X − µ)2 ] √ ≤ a.s., 2σ 2 πe and from (3.16)-(3.18) and (3.21) we have lim sup n 1 2 ¡ n→∞ log log n ¢− 12 √ ip D[(X − µ)2 ] 1 2 2 δ1∗Nn (x) ≤ √ + 2√ a.s. 2σ 2 2πe e π h However, if E|X 3 | < ∞, by the proof of Part B of Theorem 2.1 we only have 1 2 n lim sup n ∆∗N 2 n→∞ 1 2 lim sup n (1 + |x| n→∞ 3 r Kρ ³ 1 ´ 2 + Q(a) a.s. ∀a ∈ (0, 1), ≤ 3 1+ √ σ π a3 )δ2∗Nn (x) ρ ≤ 3 (c + K1 (a, 3)) + K2 (a, b, 3) σ r 2 a.s. π for all a ∈ (0, 1), b ∈ (1, ∞). Remark 3.5. Also, in the case where Nn is a Poisson random variable with ENn = n, we have ∞ X nm 1 − e−n = , (m + 1)! n m=1 r q 1 − e−n − 21 −1 E[Nn ] ≤ E[Nn ] = · n E[Nn−1 ] = e−n (where Nn−1 = 0 if Nn = 0). Hence, from the proof of Theorem 2.2 it follows that A) If EX14 < ∞, then p D[(X1 − µ)2 ] √ a.s., 2σ 2 πe n→∞ √ ip h 1 D[(X − µ)2 ] 1 1 2 2 ∗1 √ √ lim sup n 2 (log log n)− 2 rN (x) ≤ + a.s.. n 2σ 2 n→∞ 2πe e2 π 1 2 lim sup n (log log n) − 12 ∗1 RN n ≤ 178 NGUYEN VAN TOAN B) If E|X 3 | < ∞, then 2Kρ a.s., σ3 n→∞ 1 2cρ ∗2 lim sup n 2 rN (x) ≤ 3 a.s.. n σ n→∞ 1 ∗2 ≤ lim sup n 2 RN n Remark 3.6. If the positive integer-valued random variable Nn has the uniform distribution function on [1, n], then n 1 X 1 ln n ∼ , n m=1 m n q 1 1 −1 E[Nn 2 ] ≤ E[Nn−1 ] ∼ n− 2 (ln n) 2 . E[Nn−1 ] = Hence, from the proof of Theorem 2.2 we get A) If EX 4 < ∞, then Kρ a.s., σ3 n→∞ 1 1 cρ ∗1 (x) ≤ 3 a.s.. lim sup n 2 (ln n)− 2 rN n σ n→∞ 1 1 ∗1 lim sup n 2 (ln n)− 2 RN ≤ n B) If E|X 3 | < ∞, then Kρ a.s., σ3 n→∞ 1 1 cρ ∗2 (x)x ≤ 3 a.s.. lim sup n 2 (ln n)− 2 rN n σ n→∞ 1 1 ∗2 lim sup n 2 (ln n)− 2 RN ≤ n Acknowledgements The author owes thanks to Professor T. M. Tuan and Professor T. H. Thao for their guidance and continuous support. The author is grateful to the anonymous referee who gave a carefully and thoroughly compiled list of places where presentation could be improved. References 1. R. Beran, Bootstrap method in statistics, Jahsesber. Deutsch. Math.-Verein 86 (1984), 14-30. RATE OF CONVERGENCE 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14. 179 B. Efron, Bootstrap methods: Another look at the Jackknife, Ann. Statist. 7 (1979), 1-26. B. Efron, Nonparametric standard errors and confidence intervals (with discussion), Canad. J. Statist. 9 (1981), 139-172. B. Efron and R. Tibshirani, Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy (with discussion), Statist. Sci. 1 (3) (1986), 54-47. P. Hall, The Bootstrap and Edgeworth Expansion, Springer-Verlag, New York, 1992. V. M. Kruglov, V. Yu. Korolev, Limit Theorems for Random Sums (in Russian), Moscow University Edition, 1990. E. Mammen, Bootstrap, wild bootstrap, and asymptotic normality, Prob. Theory Relat. Fields 93 (1992), 439-455. Nguyen Van Toan, Wild bootstrap and asymptotic normality, Bulletin, College of Science, University of Hue, 10 (1) (1996), 48-52. Nguyen Van Toan, On the bootstrap estimate with random sample size, Scientific Bulletin of Universities (1998), 31-34. Nguyen Van Toan, On the asymptotic accuracy of the bootstrap with random sample size, Vietnam Journal of Mathematics 26 (4) (1998), 351-356. Nguyen Van Toan, On the asymptotic accuracy of the bootstrap with random sample size, Pakistan Journal of Statist. 14 (3) (1998), to appear. K. Singh, On the asymptotic accuracy of Efron’s bootstrap, Ann. Statist. 9 (6) (1981), 1187-1195. Tran Manh Tuan and Nguyen Van Toan, On the asymptotic theory for the bootstrap with random sample size, Proceed. National Centre for Science and Technology of Vietnam 10 (2) (1998), 3-8. Tran Manh Tuan and Nguyen Van Toan, An asymptotic normality theorem of the bootstrap sample with random sample size, VNU Journal of Science, Nat. Sci. tXIV, No. 1 (1998), 1-7. Department of Mathematics, College of Science, Hue University