Sankhyā: The Indian Journal of Statistics
2008, Volume 70-A, Part 1, pp. 109-123
© 2008, Indian Statistical Institute

Expansions for the Joint Distribution of the Sample Maximum and Sample Estimate

Christopher S. Withers
Industrial Research Limited, New Zealand

Saralees Nadarajah
University of Manchester, UK

Abstract

Let $F_n$ be the empirical distribution of a random sample in $R^p$ from a distribution $F$. Let $M_n$ be the componentwise sample maximum and $T(F)$ a smooth functional in $R^q$. Let $\hat\theta = T(F_n)$. We use the conditional Edgeworth expansion for $\hat\theta \mid (M_n \le y)$ to obtain expansions for the joint distribution of $(\hat\theta, M_n)$. For $T(F) = \mu$ and $\mu_2$, their degree of dependence as measured by the strong-mixing coefficient $\alpha(\hat\theta, M_n)$ is shown to be $O(n^{-1/2})$ for a class of distributions associated with the EV3 (Weibull), $O(n^{-1/2}\log^{i\nu} n)$ for two classes associated with the EV1 (Gumbel), and $O(n^{i/\theta - 1/2})$ for a class associated with the EV2($\theta$) (Fréchet), where $i$ is the degree of $T(F)$, that is, $i = 1$ for $\mu$ and $i = 2$ for $\mu_2$; $\nu = 1$ for a class that includes the gamma, and $\nu = 1/2$ for a class that includes the normal.

AMS (2000) subject classification. Primary.
Keywords and phrases. Edgeworth expansions, extreme value distributions, strong mixing coefficient.

1 Introduction and Summary

The asymptotic joint distribution of the sample mean and maximum was studied by Chow and Teugels (1978). They proved asymptotic independence except for a special case in which the mean has a non-normal limit; when the mean has a normal limit, the mean and maximum are asymptotically independent. Their method does not extend to more general statistics. (Asymptotic independence for stationary sequences was proved under various conditions by Anderson and Turkman (1991), McCormick and Sun (1993), Hsing (1995a, 1995b) and Ho and Hsing (1996).) Here we show how to obtain expansions for the joint distribution of $\hat\theta$ and $M_n$, the sample maximum, for a large class of asymptotically normal sample statistics $\hat\theta$. Both $\hat\theta$ and $M_n$ may be multivariate. We illustrate the method for the univariate case with $\hat\theta$ the sample mean or sample variance, and four classes of distributions whose domains of attraction for $M_n$ correspond to the three extreme-value distributions: the EV1 (Gumbel), the EV2 (Fréchet), and the EV3 (Weibull). These four classes of distributions are dealt with in Sections 3-5.

We show that the degree of dependence of $\hat\theta$ and $M_n$, as measured by the strong-mixing coefficient

  $\alpha_n = \alpha(\hat\theta, M_n) = \sup_{x,y} |P(\hat\theta \le x, M_n \le y) - P(\hat\theta \le x)\,P(M_n \le y)|$,

is

  $O(n^{-1/2}\log^{i\nu} n)$ for the two limiting EV1 classes of distributions,
  $O(n^{i/\theta - 1/2})$ for the limiting EV2 class,
  $O(n^{-1/2})$ for the limiting EV3 class,

where $i = 1$ for $T(F) = \mu(F)$ and $i = 2$ for $T(F) = \mu_2(F)$. (Here $\nu = 1$ for a class that includes the gamma distribution, and $\nu = 1/2$ for a class that includes the normal distribution.) So, only for the limiting EV3 class does the second term in the Edgeworth expansion (that corresponding to the bias and skewness of $\hat\theta$) influence the joint expansion of $(\hat\theta, M_n)$. In no case does the second term of bias or skewness of $\hat\theta$ influence the asymptotic value of $\alpha_n$.

Section 2 gives the basic Edgeworth expansion for $\hat\theta \mid (M_n \le y)$ and gives conditions under which our standardisation of $\hat\theta$ may be replaced by the usual one. (For univariate $\hat\theta$, the usual standardisation is $n^{1/2}(\hat\theta - \theta)/\sigma$, where $\sigma^2/n$ is the asymptotic variance of $\hat\theta$.) We then show how to expand the joint distribution of $n^{1/2}(\hat\theta - \theta)$ and $M_n^*$ (a transformation of $M_n$ with non-degenerate limit).
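The strong-mixing coefficient $\alpha_n$ defined above can be approximated by simulation. The sketch below is our illustration, not part of the paper; the function name, sample size, replication count and grid are arbitrary choices. It estimates $\alpha(\hat\theta, M_n)$ for $\hat\theta$ the sample mean of normal data by replacing the two probabilities with Monte Carlo frequencies over a grid of $(x, y)$ pairs:

```python
import numpy as np

def alpha_hat(sample_stat, rng, n=50, reps=20000, grid=15):
    """Monte Carlo estimate of sup_{x,y} |P(T<=x, M<=y) - P(T<=x)P(M<=y)|
    for T = sample_stat and M = the componentwise sample maximum,
    with the supremum taken over a finite quantile grid."""
    X = rng.standard_normal((reps, n))
    T = sample_stat(X)                 # one statistic per replicate
    M = X.max(axis=1)                  # one maximum per replicate
    xs = np.quantile(T, np.linspace(0.05, 0.95, grid))
    ys = np.quantile(M, np.linspace(0.05, 0.95, grid))
    worst = 0.0
    for x in xs:
        for y in ys:
            joint = np.mean((T <= x) & (M <= y))
            prod = np.mean(T <= x) * np.mean(M <= y)
            worst = max(worst, abs(joint - prod))
    return worst

rng = np.random.default_rng(0)
a = alpha_hat(lambda X: X.mean(axis=1), rng)
print(a)
```

For normal data, Section 5 gives the rate $O(n^{-1/2}\log^{1/2} n)$, so the estimate should be small but nonzero; note that a finite grid only lower-bounds the supremum, and Monte Carlo noise of order $\mathrm{reps}^{-1/2}$ remains.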
As an application we obtain in (2.17) an asymptotic form for $\alpha(\hat\theta, M_n)$. For $p = 1$, our main result is roughly as follows. Suppose $\hat\theta = T(F_n)$ is a smooth functional in $R$ of the empirical distribution $F_n$ of a random sample of size $n$ from a distribution $F$ on $R^p$. Let $T_F(x)$ be the first derivative of $T(F)$, the "influence function". Set $\theta = T(F)$, $\sigma^2 = \int T_F(x)^2\, dF(x)$ and $y_0 = \sup\{x : F(x) < 1\}$ componentwise in $R^p$, so that $\int_{-\infty}^{y} T_F(x)\, dF(x) = -\int_y^{y_0} T_F(x)\, dF(x) \to 0$ as $y \uparrow y_0$. Then as $y_n \uparrow y_0$, $Y_n = n^{1/2}(\hat\theta - \theta)/\sigma$ satisfies

  $P(Y_n \le x \mid M_n \le y_n) - P(Y_n \le x) \approx \phi(x)\,\sigma^{-1} n^{1/2} \int_{y_n}^{y_0} T_F(y)\, dF(y)$,

where $\phi$ is the density of a unit normal random variable. So, if $y_n(v)$ is a one-to-one function from $R^p$ onto $R^p$ such that $P(M_n \le y_n(v)) \to G(v)$, non-degenerate, and if $\int_{y_n(v)}^{y_0} T_F(x)\, dF(x) \approx \lambda_n a(v)$, then $\alpha(\hat\theta_n, M_n) \approx C n^{1/2}\lambda_n$, where $C = (2\pi)^{-1/2}\sigma^{-1}\sup_v G(v)|a(v)|$.

Section 6 gives an expansion for $P(n^{1/2}(\hat\theta_n - \theta) \le x \mid M_n^* = y)$ and suggests some extensions.

2 Conditional and Joint Expansions

In this section we give the Edgeworth expansion for $\hat\theta \mid (M_n \le y)$. We then derive expansions for the joint distribution of $n^{1/2}(\hat\theta - \theta)$ and $M_n$, and an asymptotic value for $\alpha(\hat\theta, M_n)$.

Suppose we observe a random sample $X_1, \ldots, X_n$ in $R^p$ with empirical distribution $F_n(x)$, from some distribution $F(x)$. Define the maximum $M_n$ in $R^p$ as the vector with $i$th element $M_{ni} = \max_{j=1}^n (X_j)_i$ for $1 \le i \le p$. Let $T(F)$ be some smooth functional in $R^q$ with functional derivatives $T_F(x_1, \ldots, x_r)$; see, for example, Withers (1983). Set $\theta = T(F)$ and $\hat\theta = T(F_n)$. By Withers (1983), the $r$th order cumulants have magnitude $n^{1-r}$ and expansions in $n^{-1}$. So, by the analogue of the case for the mean given by Bhattacharya and Rao (1976), given in Withers (2007a), $P_n(x, F) = P(n^{1/2}(\hat\theta - \theta) \le x)$ has an Edgeworth expansion in $n^{-1/2}$ about the multivariate normal $N_q(0, V)$, where $V = \int T_F(x) T_F(x)'\, dF(x)$.
Now fix $y$ in $R^p$. Then

  $\mathcal{L}(F_n \mid M_n \le y) = \mathcal{L}(F_n \mid X_1, \ldots, X_n \text{ i.i.d. } F_y)$,

where $M_n \le y$ is interpreted componentwise and $F_y(x) = P(X \le x \mid X \le y) = F(x) F(y)^{-1} I(x \le y)$. (Here i.i.d. means independent and identically distributed.) So,

  $P(n^{1/2}(\hat\theta - T(F_y)) \le x \mid M_n \le y) = P_n(x, F_y)$.

Similarly, if $m_n$ is the componentwise minimum, then

  $\mathcal{L}(F_n \mid m_n > y) = \mathcal{L}(F_n \mid X_1, \ldots, X_n \text{ i.i.d. } F^y)$,  (2.1)

where $1 - F^y(x) = P(X > x \mid X > y) = (1 - F(x))(1 - F(y))^{-1} I(x > y)$. So, $P(n^{1/2}(\hat\theta - T(F^y)) \le x \mid m_n > y) = P_n(x, F^y)$. Similarly, if $T_n(F) = E\,T(F_n)$, then $E[T(F_n) \mid M_n \le y] = T_n(F_y)$ and $E[T(F_n) \mid m_n > y] = T_n(F^y)$.

For exposition, consider the case $q = 1$. Then the cumulants of $\hat\theta$ can be expanded as

  $\kappa_r(\hat\theta) = \sum_{i=r-1}^{\infty} a_{ri}(F)\, n^{-i}$

for $r \ge 1$, with the leading coefficients given in Withers (1983): $a_{10}(F) = T(F)$; $a_{21}(F) = [1^2]$, where $[1^r] = \int T_F(x)^r\, dF(x)$; $a_{32}(F) = [1^3] + 3[1, 2, 12]$, where $[1, 2, 12] = \int\!\int T_F(x_1) T_F(x_2) T_F(x_1, x_2)\, dF(x_1)\, dF(x_2)$; and so on. Assume $T_F(x) \ne 0$ a.e. $F$, so that $a_{21}(F) > 0$. Set

  $Y_n(F) = n^{1/2}(\hat\theta - T(F))\, a_{21}(F)^{-1/2} = n^{1/2}(\hat\theta - \theta)/\sigma$, say.

Then

  $P(Y_n(F) \le x) = \Phi(x) - \phi(x)\sum_{r=1}^{\infty} n^{-r/2} h_r(x, F) = Q_n(x, F)$,  (2.2)

say, where $\Phi$ and $\phi$ are the distribution and density of a standard normal random variable in $R$, $h_r(x, F) = \sum\{h_{ri}(F) H_i(x) : 0 \le i \le 3r - 1,\ r - i \text{ odd}\}$, $H_i(x)$ is the $i$th Hermite polynomial, that is, $H_i(x) = \phi(x)^{-1}(-d/dx)^i\phi(x)$, and $h_{ri}(F)$ is a certain polynomial in $A_{ri}(F) = a_{ri}(F)\, a_{21}(F)^{-r/2}$. For example, $h_{10}(F) = A_{11}(F)$ and $h_{12}(F) = A_{32}(F)/6$, so $h_1(x, F) = A_{11}(F) + A_{32}(F)(x^2 - 1)/6$. Differentiating (2.2) gives the density

  $p(x : Y_n(F)) = \phi(x)\sum_{r=0}^{\infty} n^{-r/2} h_r^*(x, F)$,

where $h_0^*(x, F) = 1$ and $h_r^*(x, F) = \sum\{h_{ri}(F) H_{i+1}(x) : 0 \le i \le 3r - 1,\ r - i \text{ odd}\}$. So, $h_1^*(x, F) = A_{11}(F)x + A_{32}(F)(x^3 - 3x)/6$.

Note 2.1. (2.2) is an asymptotic expansion that usually diverges.
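The building blocks of (2.2) are the Hermite polynomials $H_i$ and the coefficients $h_r$. As a small illustration (ours, not the paper's), the $H_i$ can be generated by the standard three-term recurrence $H_{i+1}(x) = x H_i(x) - i H_{i-1}(x)$, and the one-term approximation $\Phi(x) - \phi(x)\, n^{-1/2} h_1(x, F)$ coded directly:

```python
import math

def hermite(i, x):
    """Probabilists' Hermite polynomial H_i(x) = phi(x)^{-1}(-d/dx)^i phi(x),
    computed by the recurrence H_{i+1}(x) = x H_i(x) - i H_{i-1}(x)."""
    h0, h1 = 1.0, x
    if i == 0:
        return h0
    for k in range(1, i):
        h0, h1 = h1, x * h1 - k * h0
    return h1

def edgeworth_cdf_1term(x, n, A11, A32):
    """One-term Edgeworth approximation Phi(x) - phi(x) n^{-1/2} h_1(x, F),
    with h_1(x, F) = A11 + A32 (x^2 - 1)/6 as in (2.2)."""
    Phi = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    phi = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    h1 = A11 + A32 * (x * x - 1.0) / 6.0
    return Phi - phi * h1 / math.sqrt(n)

print(hermite(3, 2.0))   # H_3(x) = x^3 - 3x, so H_3(2) = 2
```

With $A_{11} = A_{32} = 0$ the approximation reduces to the normal limit $\Phi(x)$, as it should.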
However, under regularity conditions (see Corollary 3.1 of Withers (1983)),

  $\sup_x |\text{LHS (2.2)} - \text{first } I \text{ terms of RHS (2.2)}| = O(n^{-I/2})$.

By (2.1),

  $P(Y_n(F_y) \le x \mid M_n \le y) = Q_n(x, F_y)$.  (2.3)

Now let $r_n(y) : R^p \to R^p$ be any transformation such that $y \le z$ in $R^p$ if and only if $r_n(y) \le r_n(z)$. Fix $v$ and set $y_n = r_n^{-1}(v)$, $M_n^* = r_n(M_n)$ and $Y_n^* = Y_n(F_{y_n})$. By (2.3),

  $P(Y_n^* \le x, M_n^* \le v) = Q_n(x, F_{y_n})\, P(M_n^* \le v)$.  (2.4)

So, $Y_n$ and $M_n$ are asymptotically independent (that is, $Y_n^*$ and $M_n^*$ are asymptotically independent) as $n \to \infty$ provided that $P(Y_n^* \le x) \to \Phi(x)$. By (2.2) this holds provided that

  $\Delta_{n1} = n^{1/2}(T(F_{y_n}) - T(F)) \to 0$,  (2.5)
  $\Delta_{n2} = a_{21}(F_{y_n}) - a_{21}(F) \to 0$.

For, $Y_n^* \le x$ if and only if $Y_n(F) \le x^*$, where

  $x^* = x\,\sigma(y_n)/\sigma + n^{1/2}\{T(F_{y_n}) - T(F)\}/\sigma$,  (2.6)

where $\sigma^2(y) = a_{21}(F_y)$ and $\sigma^2 = a_{21}(F)$. Now fix $x^*$ and set

  $\Delta_n = x^* - x = x^* - \sigma\,\sigma(y_n)^{-1}\{x^* - \Delta_{n1}/\sigma\}$.

If (2.5) and (2.6) hold, then $\Delta_n \to 0$, so by (2.3), expanding $x$ about $x^*$,

  $P(Y_n(F) \le x^* \mid M_n \le y_n) = Q_n(x, F_{y_n}) = \Phi(x^*) + \phi(x^*)\sum_{r=1}^{\infty}\{(-\Delta_n)^r/r! - n^{-r/2} h_r(x, F_{y_n})\} + \cdots = \Phi^* - \phi^*\{\Delta_n + n^{-1/2} h^*\} + \cdots$,

where $\Phi^* = \Phi(x^*)$, $\phi^* = \phi(x^*)$ and $h^* = h_1(x^*, F)$. This is a weaker and more complex expansion than (2.3). In particular, $\Delta_n$ need not be $O(n^{-1/2})$.

Note 2.2. So far we have not assumed that $M_n^*$ has a limit. If $G_n(v) = P(M_n^* \le v) = F(y_n)^n \to G(v)$ as $n \to \infty$, where $G(v)$ is non-degenerate, then we can expand

  $P(Y_n^* \le x, M_n^* \le v) = P(Y_n(F) \le x^*, M_n^* \le v) = Q_n(x, F_{y_n})\, G_n(v)$  (2.7)

about $\Phi(x)G(v)$ or $\Phi(x^*)G(v)$.
By Fisher and Tippett (1928), if (2.7) holds with $p = 1$, $r_n(M_n) = b_n M_n - a_n$ and $b_n > 0$, then $y_n = b_n^{-1}(v + a_n)$ and $G$ can be taken as

  $G_0(x) = \exp(-e^{-x})$ on $R$ (EV1 or Gumbel),  (2.8)
  $G_\theta(x) = \exp(-x^{-\theta})$ on $(0, \infty)$ (EV2 or Fréchet),  (2.9)

or

  $H_\theta(x) = \exp(-(-x)^{\theta})$ on $(-\infty, 0)$ (EV3 or Weibull).  (2.10)

We now give a method for obtaining or approximating $\Delta_{n1}$ of (2.5) and $\sigma/\sigma(y_n)$, needed for the expansion of $x$ about $x^*$ of (2.6). Recall that for $G$ any distribution on $R^p$,

  $T(G) - T(F) = \sum_{r=1}^{\infty}\int\cdots\int T_F(x_1, \ldots, x_r)\, dG(x_1)\cdots dG(x_r)/r!$.

So,

  $T(F_y) - T(F) = \sum_{r=1}^{\infty} t_r(y, T)\, F(y)^{-r}/r!$,

where

  $t_r(y, T) = \int^y\cdots\int^y T_F(x_1, \ldots, x_r)\, dF(x_1)\cdots dF(x_r)$.

We shall see that in our examples below, for large $y$, $t_r(y, T)$ decreases in magnitude as $r$ increases, so that

  $T(F_y) - T(F) \approx t_1(y, T) = -\int_y^{y_0} T_F(x)\, dF(x)$.

Also, in our examples, $a_{21}(F_y) - a_{21}(F) = O(t_1(y, T)^2)$. Suppose that, as $y$ approaches its upper limit,

  $a_{21}(F_y) - a_{21}(F) = O(t_1(y, a_{21}))$;  (2.11)

then $\sigma/\sigma(y) = 1 + O(a_{21}(F_y) - a_{21}(F)) = 1 + O(t_1(y, a_{21}))$. Also

  $\Delta_{n1} = n^{1/2} t_1(y_n, T) + O(n^{1/2} t_2(y_n, T))$,

assuming that

  $T(F_y) - T(F) = t_1(y, T) + O(t_2(y, T))$.  (2.12)

Note 2.3. If $T(F)$ is a polynomial in $F$ of degree $I$ (for example, the $I$th central moment $\mu_I$ or cumulant $\kappa_I$), then $t_r(y, T) = 0$ for $r > I$, so (2.11) holds if $t_i(y, a_{21}) = O(t_1(y, a_{21}))$ for $2 \le i \le 2I$, and (2.12) holds if $t_i(y, T) = O(t_2(y, T))$ for $3 \le i \le I$.

So,

  $\Delta_n = \Delta_{n0}/\sigma + O(e_{n1})$,  (2.13)

where $\Delta_{n0} = n^{1/2} t_1(y_n, T)$ and $e_{n1} = n^{1/2}|t_2(y_n, T)| + |t_2(y_n, a_{21})|$. Setting $e_{n2} = e_{n1} + \Delta_{n0}^2 + n^{-1}$, we obtain

  $P(Y_n(F) \le x^* \mid M_n^* \le v) = \Phi^* - \phi^*(\Delta_{n0}\sigma^{-1} + n^{-1/2} h^*) + O(n^{1/2} e_{n0} + e_{n2})$,  (2.14)

  $P(Y_n(F) \le x^*, M_n^* \le v) = G_n(v) \times \text{RHS (2.14)} = \{\Phi^* - \phi^*(\Delta_{n0}\sigma^{-1} + n^{-1/2} h^*)\}\, G(v) + O(e_{n3})$,

where $e_{n0} = h_1(x^*, F_{y_n}) - h_1(x^*, F) \to 0$ and $e_{n3} = n^{1/2} e_{n0} + e_{n2} + \sup_v |G_n(v) - G(v)|$. Also

  $\nabla_n(x^*, v) = P(Y_n(F) \le x^*, M_n^* \le v) - P(Y_n(F) \le x^*)\, P(M_n^* \le v)$

satisfies

  $\nabla_n(x^*, v) = -\phi^* G_n(v)\,\Delta_{n0}\sigma^{-1} + O(e_{n2})$.

In our examples we find that there exists a function $a(v)$ such that

  $t_1(y_n, T) = \lambda_{n1}\{a(v) + O(\lambda_{n2})\}$,  (2.15)

where $\lambda_{n1}$ and $\lambda_{n2}$ do not depend on $v$. So $\Delta_{n0} = n^{1/2}\lambda_{n1}\{a(v) + O(\lambda_{n2})\}$ and

  $P(Y_n(F) \le x^* \mid M_n^* \le v) = \Phi^* - \phi^*\{n^{1/2}\lambda_{n1} a(v)\sigma^{-1} + n^{-1/2} h^*\} + O(e_{n4})$,  (2.16)

  $P(Y_n(F) \le x^*, M_n^* \le v) = G_n(v) \times \text{RHS (2.16)} = G(v)[\Phi^* - \phi^*\{n^{1/2}\lambda_{n1} a(v)\sigma^{-1} + n^{-1/2} h^*\}] + O(e_{n5})$,  (2.17)

  $\nabla_n(x^*, v) = -\phi^* n^{1/2}\lambda_{n1} G_n(v) a(v)\sigma^{-1} + O(e_{n4}) = -\phi^* n^{1/2}\lambda_{n1} G(v) a(v)\sigma^{-1} + O(e_{n6})$,  (2.18)

where

  $e_{n4} = e_{n2} + n^{1/2}\lambda_{n1}\lambda_{n2} + n^{-1/2} e_{n0}$,
  $e_{n5} = e_{n4} + \sup_v |G_n(v) - G(v)|$,
  $e_{n6} = e_{n4} + n^{1/2}\lambda_{n1}\sup_v |G_n(v) - G(v)|$.

So, the strong-mixing coefficient is

  $\alpha(\hat\theta_n, M_n) = n^{1/2}\lambda_{n1} C(T) + O(e_{n6})$,  (2.19)

where $C(T) = (2\pi)^{-1/2}\sigma^{-1} K(T)$ and $K(T) = \sup_v G(v)|a(v)|$. This gives an asymptotic value for $\alpha_n = \alpha(\hat\theta_n, M_n)$, provided that

  $\sup_v |G_n(v) - G(v)| \to 0$ and $n^{-1} + e_{n1} = o(n^{1/2}\lambda_{n1})$,  (2.20)

since then $e_{n6} = o(n^{1/2}\lambda_{n1})$.

We now give further details for $T(F) = \mu$ and $T(F) = \mu_2$, where $\mu = \mu(F) = EX_1$, $\mu_r = \mu_r(F) = E(X_1 - \mu)^r$ and $m_r = EX_1^r$. Set

  $\bar p_i(y) = \int_y^{y_0} x^i\, dF(x)$,  $p_i(y) = \int_{-\infty}^y x^i\, dF(x) = m_i - \bar p_i(y)$,
  $\bar q_r(y) = \int_y^{y_0} (x - \mu)^r\, dF(x) = \sum_{i=0}^r \binom{r}{i}\bar p_{r-i}(y)(-\mu)^i$,  $q_r(y) = \int_{-\infty}^y (x - \mu)^r\, dF(x) = \mu_r - \bar q_r(y)$.

By Example 5.3 of Withers (2007b), with $h_j = x_j - \mu$,

  $\mu_F(x) = x - \mu$,
  $\mu_{rF}(x) = (x - \mu)^r - \mu_r - r\mu_{r-1}(x - \mu)$,
  $\mu_{rF}(x_1, \ldots, x_p) = (-1)^p\Big\{(r)_p\,\mu_{r-p} - (r)_{p-1}\sum_{i=1}^p\big(h_i^{r-p} - \mu_{r-p+1} h_i^{-1}\big)\Big\}\prod_{j=1}^p h_j$ for $p \ge 2$,

where $(r)_p = r(r-1)\cdots(r-p+1) = r!/(r-p)!$.

Example 2.1. Suppose $T(F) = \mu$. Then $\sigma^2 = \mu_2$ and

  $T(F_y) - T(F) = t_1(y, \mu)F(y)^{-1}$, with $t_1(y, \mu) = -\bar q_1(y) = -\bar p_1(y) + \mu\bar p_0(y)$.

Also $a_{21}(F) = \mu_2$ and $\mu_{2F}(x_1, x_2) = -2h_1 h_2$, so

  $t_1(y, \mu_2) = -\bar q_2(y) + \mu_2\bar q_0(y)$,
  $t_2(y, \mu_2) = -2 t_1(y, \mu)^2 = -2\bar q_1(y)^2$.

By Note 2.3, (2.12) holds, and (2.11) holds if

  $\bar q_1(y)^2 = O(\bar q_2(y) - \mu_2\bar q_0(y))$.  (2.21)

Also $t_2(y, \mu) = 0$, so $\Delta_{n0} = n^{1/2} t_1(y_n, \mu) = -n^{1/2}\bar q_1(y_n)$, $e_{n1} = |t_2(y_n, a_{21})| = 2\bar q_1(y_n)^2$ and $e_{n2} = (n + 2)\bar q_1(y_n)^2 + n^{-1}$.

Example 2.2. Suppose $T(F) = \mu_2$. Then $\Delta_{n0} = n^{1/2} t_1(y_n, \mu_2)$. By Note 2.3, (2.12) holds. Also (2.11) holds if $t_i(y, a_{21}) = O(t_1(y, a_{21}))$ for $2 \le i \le 4$, where $\sigma^2 = a_{21}(F) = \mu_4 - \mu_2^2$. So,

  $a_{21F}(x) = \mu_{4F}(x) - 2\mu_2\mu_{2F}(x)$,
  $t_1(y, a_{21}) = -\bar q_4(y) + \mu_4\bar q_0(y) + 4\mu_3\bar q_1(y) + 2\mu_2\{\bar q_2(y) - \mu_2\bar q_0(y)\}$.

Also,

  $a_{21F}(x_1, x_2) = \mu_{4F}(x_1, x_2) - 2\mu_{2F}(x_1)\mu_{2F}(x_2) - 2\mu_2\mu_{2F}(x_1, x_2)$,
  $\mu_{4F}(x_1, x_2) = 12\mu_2 h_1 h_2 - 4\Big(\sum_{i=1}^2 h_i^2\Big)h_1 h_2 + 4\mu_3\sum_{i=1}^2 h_i$,

so

  $t_2(y, a_{21}) = 12\mu_2 q_1(y)^2 - 8 q_1(y) q_3(y) + 8\mu_3 q_1(y) - 2 t_1(y, \mu_2)^2 + 4\mu_2 t_1(y, \mu)^2 = -8\bar q_1(y)\bar q_3(y) - 2\{\bar q_2(y) - \mu_2\bar q_0(y)\}^2 + 16\mu_2\bar q_1(y)^2$.

Set $T[12\ldots] = T_F(x_1, x_2, \ldots)$. By the chain rule and (A4) and (A5) of Withers (2007b), $a_{21}[123] = \mu_4[123] - 2\sum_3\mu_2[1]\mu_2[23]$ and $a_{21}[1234] = \mu_4[1234] - 2\sum_3\mu_3[12]\mu_2[34]$, where $\sum_3$ sums over the three distinct assignments of indices. Also $\mu_4[123] = 3\sum_3 h_1^2 h_2 h_3 - 12\mu_2\sum_3 h_2 h_3$, so

  $t_3(y, a_{21}) = 36 q_2(y) q_1(y)^2 - 36\mu_2 q_1(y)^2 = -36\bar q_2(y)\bar q_1(y)^2$.

Also $\mu_4[1234] = -72\prod_{i=1}^4 h_i$, so

  $t_4(y, a_{21}) = -72 q_1(y)^4 - 6 t_2(y, \mu_2)^2 = -96\bar q_1(y)^4$.

Note 2.4. In the univariate examples that follow we have chosen convenient location and scale parameters. This does not restrict generality since, if $X' = \mu + \tau X$ with $\tau > 0$, then $M_n(X') = \mu + \tau M_n(X)$, where $M_n(X) = M_n$.
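For $T(F) = \mu$ the expansion of $T(F_y) - T(F)$ terminates at $r = 1$, so $\mu(F_y) - \mu = t_1(y, \mu)/F(y) = -\bar q_1(y)/F(y)$ exactly. A quick numerical check (ours, not the paper's; the unit exponential is used only because its truncated moments have closed forms, with $\bar q_1(y) = y e^{-y}$):

```python
import math

def trunc_mean_exp(y):
    """E[X | X <= y] for X ~ Exp(1): (1 - (y+1)e^{-y}) / (1 - e^{-y})."""
    return (1.0 - (y + 1.0) * math.exp(-y)) / (1.0 - math.exp(-y))

def qbar1_exp(y):
    """qbar_1(y) = int_y^inf (x - mu) dF(x) = y e^{-y} for Exp(1), mu = 1."""
    return y * math.exp(-y)

# Identity mu(F_y) - mu = t_1(y, mu)/F(y) = -qbar_1(y)/F(y), exact for T = mu.
for y in (1.0, 3.0, 6.0):
    lhs = trunc_mean_exp(y) - 1.0
    rhs = -qbar1_exp(y) / (1.0 - math.exp(-y))
    print(y, lhs, rhs)   # the two columns agree
```

As $y \to \infty$ both sides behave like $-\bar q_1(y) = -y e^{-y}$, the quantity that drives $\Delta_{n0}$ in Example 2.1.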
So, if $M_n^* = b_n M_n - a_n$, then $M_n^* = b_n' M_n(X') - a_n'$, where $b_n' = b_n/\tau$ and $a_n' = a_n + b_n\mu/\tau$. If $\theta(X) = \theta = T(F)$ is a location parameter (for example, the mean or median), then $\theta(X') = \mu + \tau\theta(X)$ and $a_{21}(X') = \tau^2 a_{21}(X)$. If $\theta(X) = T(F)$ is the $r$th power of a scale parameter (for example, $\mu_r$), then $\theta(X') = \tau^r\theta(X)$ and $a_{21}(X') = \tau^{2r} a_{21}(X)$.

3 An EV2 Example

Here we illustrate our expansion for $\mathcal{L}(Y_n(F), M_n^*)$ for $p = 1$ and $F$ having upper tail

  $1 - F(y) = K y^{-\theta}\{1 + O(y^{-\beta})\}$ as $y \to \infty$,

where $K > 0$, $\theta > 0$ and $\beta > 0$. We also need the slightly stronger condition $d/dy\{1 - F(y) - Ky^{-\theta}\} = O(y^{-\theta - \beta - 1})$ as $y \to \infty$.

Note 3.1. This holds with $\beta = \theta$ for the EV2, and also for the stable law of index $\theta < 1$.

Take $M_n^* = b_n M_n - a_n$ with $a_n = 0$ and $b_n = (Kn)^{-1/\theta}$. Then (2.7) holds with $G = G_\theta$ of (2.9). Also $y_n = v(Kn)^{1/\theta}$, so

  $G_n(v) = G_\theta(v) + O(n^{-\epsilon_0})$  (3.1)

for $\epsilon_0 = \min(1, \beta/\theta)$,

  $\bar p_i(y) = K\theta(\theta - i)^{-1} y^{i - \theta}\{1 + O(y^{-\beta})\}$ for $\theta > i$,
  $\bar q_r(y) = K\theta(\theta - r)^{-1} y^{r - \theta}\{1 + O(y^{-\beta}) + O(y^{-1})\}$ for $\theta > r$.

Set $a_i = \theta/(\theta - i)$, $\beta_i = a_i^{-1}(1 + \log a_i)$ and $k_i = K^{i/\theta} a_i e^{-\beta_i}$.

Example 3.1. Suppose $T(F) = \mu$. So $\sigma^2 = \mu_2$, and (2.11) holds, so (2.12) and (2.13) hold. Also (2.15) holds with $\lambda_{n1} = n^{1/\theta - 1}$, $\lambda_{n2} = n^{-\epsilon_0}$ and $a(v) = -K^{1/\theta}\theta(\theta - 1)^{-1} v^{1 - \theta}$. So, assuming $\theta > 2$, (2.16)-(2.18) hold with $e_{n4} = O(n^{-\epsilon_1})$ for $\epsilon_1 = \min(1/2,\ 1 - 2\theta^{-1},\ 1/2 + (\beta - 1)\theta^{-1})$, $e_{n5} = O(n^{-\epsilon_2})$ for $\epsilon_2 = \min(1/2,\ 1 - 2\theta^{-1},\ \beta\theta^{-1})$, $e_{n6} = O(n^{-\epsilon_1})$ and $K(\mu) = k_1$.

Example 3.2. Suppose $T(F) = \mu_2$. So $\sigma^2 = \mu_4 - \mu_2^2$. By Note 2.3, (2.12) holds since $t_i(y, a_{21}) = O(y^{4 - i\theta})$, assuming $\theta > 4$. Also $e_{n1} = O(n^{4/\theta - 1})$, and (2.15) holds with $\lambda_{n1} = n^{2/\theta - 1}$, $\lambda_{n2} = n^{-\epsilon_0}$ and $a(v) = -K^{2/\theta}\theta(\theta - 2)^{-1} v^{2 - \theta}$. So $\Delta_{n0}$ and $e_{n2}$ are $O(n^{2/\theta - 1/2})$, and (2.16)-(2.19) hold with $e_{n4} = O(n^{2/\theta - 1/2})$, $e_{n5} = O(n^{-\epsilon_3})$ for $\epsilon_3 = \min(1/2 - 2/\theta,\ \beta/\theta)$, $e_{n6} = O(n^{-\epsilon_4})$ for $\epsilon_4 = \min(1/2,\ 1 - 2/\theta,\ 1/2 + (\beta - 2)/\theta)$ and $K(\mu_2) = k_2$.
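The tail-moment rate $\bar p_i(y) = K\theta(\theta - i)^{-1} y^{i - \theta}\{1 + O(y^{-\beta})\}$ can be checked against the pure Pareto law $1 - F(y) = y^{-\theta}$ on $(1, \infty)$ (so $K = 1$), for which $\bar p_i(y) = \theta(\theta - i)^{-1} y^{i - \theta}$ holds exactly when $\theta > i$. A sketch (ours) comparing the closed form with midpoint-rule quadrature on a logarithmic grid:

```python
import math

def pbar_pareto(i, y, theta, n_steps=50000, cap=1e6):
    """Tail moment pbar_i(y) = int_y^inf x^i dF(x) for the Pareto density
    theta * x^(-theta - 1) on (1, inf), by the midpoint rule in log-space;
    the integral is truncated at the large cutoff `cap`."""
    lo, hi = math.log(y), math.log(cap)
    h = (hi - lo) / n_steps
    total = 0.0
    for j in range(n_steps):
        x = math.exp(lo + (j + 0.5) * h)
        total += theta * x ** (i - theta) * h   # x^i * theta x^{-theta-1} * x du
    return total

theta = 5.0
for i in (0, 1, 2):
    exact = theta / (theta - i) * 2.0 ** (i - theta)
    print(i, exact, pbar_pareto(i, 2.0, theta))
```

For $i = 0$ this recovers $1 - F(y) = y^{-\theta}$ itself, matching $a_0 = 1$ in the notation above.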
In both cases (2.20) holds, so (2.19) gives the asymptotic value of $\alpha_n$. We have shown, among other things, that for $T_1 = \mu$ and $T_2 = \mu_2$,

  $\alpha_n = \alpha(\hat\theta_n, M_n) = C(T_i)\, n^{i/\theta - 1/2} + O(e_{n6}(T_i))$,

where, for $i = 1, 2$, $C(T_i) = (2\pi)^{-1/2} k_i\, a_{21}(T_i)^{-1/2}$, $e_{n6}(T_i) = O(n^{-\nu_i})$ and $\nu_i = \min(1/2,\ 1 - 2/\theta,\ 1/2 + (\beta - i)/\theta)$.

4 An EV3 Example

Suppose

  $1 - F(y) = k(-y)^{\theta}\{1 + O((-y)^{\beta})\}$ as $y \uparrow 0$,

where $k > 0$, $\theta > 0$ and $\beta > 0$. We also need the slightly stronger condition $d/dy\{1 - F(y) - k(-y)^{\theta}\} = O((-y)^{\theta + \beta - 1})$ as $y \uparrow 0$. This holds for the EV3 with $\beta = \theta$.

Note 4.1. Take $M_n^* = b_n M_n - a_n$ with $a_n = 0$ and $b_n = (kn)^{1/\theta}$, so that $y_n = v(kn)^{-1/\theta}$; then (2.7) holds with $G = H_\theta$ of (2.10). Also $G_n(v) = H_\theta(v) + O(n^{-\epsilon_0})$ for $\epsilon_0$ of (3.1),

  $\bar p_i(y) = (-1)^i k\theta(\theta + i)^{-1}(-y)^{\theta + i}\{1 + O((-y)^{\beta})\}$

and

  $\bar q_r(y) = k(-y)^{\theta}\{(-\mu)^r + O(-y) + O((-y)^{\beta})\}$.

So $t_1(y, \mu_2) = k(-y)^{\theta}\{\mu_2 - \mu^2 + O(-y) + O((-y)^{\beta})\}$.

Example 4.1. Suppose $T(F) = \mu$. So (2.21), (2.11) and (2.12) hold. Also (2.15) holds with $\lambda_{n1} = n^{-1}$, $\lambda_{n2} = n^{-\epsilon_5}$ for $\epsilon_5 = \min(\beta, 1)/\theta$ and $a(v) = \mu(-v)^{\theta}$. So $e_{n1} = O(n^{-1})$ and $e_{n2} = O(n^{-1})$. So (2.16)-(2.19) hold with $e_{n4} = O(n^{-\epsilon_6})$ for $\epsilon_6 = \min(1,\ 1/2 + 1/\theta,\ 1/2 + \beta/\theta)$, $e_{n5} = O(n^{-\epsilon_7})$ for $\epsilon_7 = \min(1,\ 1/2 + 1/\theta,\ \beta/\theta)$, $e_{n6} = O(n^{-\epsilon_6})$, $K(\mu) = e^{-1}|\mu|$ and $C(\mu) = (2\pi)^{-1/2} e^{-1}|\mu|\mu_2^{-1/2}$.

Example 4.2. Suppose $T(F) = \mu_2$. Then

  $t_1(y, a_{21}) = k(-y)^{\theta}\{k_6 + O(-y) + O((-y)^{\beta})\}$,

where $k_6 = \mu_4 - 4\mu_3\mu + 2\mu_2\mu^2 - \mu^4 - 2\mu_2^2$, and $t_i(y, a_{21}) = O((-y)^{i\theta})$. So, for $k_6 \ne 0$, (2.11) holds by Note 2.3. Also $e_{n1} = O(n^{-1})$, and (2.15) holds with $\lambda_{n1} = n^{-1}$, $\lambda_{n2} = n^{-\epsilon_5}$ and $a(v) = (\mu_2 - \mu^2)(-v)^{\theta}$. So $\Delta_{n0} = O(n^{-1/2})$ and $e_{n2} = O(n^{-1})$. So (2.16)-(2.19) hold with $e_{n4} = O(n^{-\epsilon_6})$, $e_{n5} = O(n^{-\epsilon_7})$, $e_{n6} = O(n^{-\epsilon_6})$ and $K(\mu_2) = e^{-1}|\mu_2 - \mu^2|$.
To summarise, for $T(F) = \mu$ and $\mu_2$, (2.17)-(2.19) hold with $e_{n4} = O(n^{-\epsilon_6})$, $e_{n5} = O(n^{-\epsilon_7})$, $e_{n6} = O(n^{-\epsilon_6})$ and $C(T) = (2\pi)^{-1/2} e^{-1} C_0(T)$, where $C_0(\mu) = |\mu|\mu_2^{-1/2}$ and $C_0(\mu_2) = |\mu_2 - \mu^2|(\mu_4 - \mu_2^2)^{-1/2}$. Also (2.20) holds, so (2.19) gives an asymptotic value for $\alpha_n$.

5 Two EV1 Examples

Our examples here include the gamma and normal distributions. First, suppose

  $1 - F(y) = k y^d\exp(-y)\{1 + O(y^{-\beta})\}$  (5.1)

as $y \to \infty$. We also need the slightly stronger condition $d/dy\{1 - F(y) - k y^d\exp(-y)\} = O(y^{d - \beta}\exp(-y))$ as $y \to \infty$.

Note 5.1. This holds for the gamma distribution with density $x^{\gamma - 1}e^{-x}/\Gamma(\gamma)$ on $(0, \infty)$, with $\beta = 1$, $d = \gamma - 1$ and $k = \Gamma(\gamma)^{-1}$. It also holds for the EV1 with $k = 1$, $d = 0$ and any $\beta$ (in fact with $y^{-\beta}$ replaced by $e^{-y}$).

Set $n_1 = \log n$, $n_2 = \log\log n$, $k_\delta = \log k$, and $M_n^* = b_n M_n - a_n$ with $b_n = 1$ and $a_n = n_1 + d n_2 + k_\delta$. Then (2.7) holds with $G = G_0$ of (2.8), and $y_n = v + a_n$. Also $G_n(v) = G_0(v) + O(e_n)$, where $e_n = n_2 n_1^{-1} + n_1^{-\beta}$, and

  $\bar p_i(y) = k y^{d + i}\exp(-y)\{1 + O(\epsilon_y)\}$, where $\epsilon_y = y^{-\beta} + y^{-1}$,
  $\bar q_r(y) = k y^{d + r}\exp(-y)\{1 + O(\epsilon_y)\}$.

So $\mu(F_y) - \mu \approx t_1(y, \mu) = -k y^{d + 1}\exp(-y)\{1 + O(\epsilon_y)\}$, $t_1(y, \mu_2) = -k y^{d + 2}\exp(-y)\{1 + O(\epsilon_y)\}$ and $t_2(y, \mu_2) = O(y^{2d + 2} e^{-2y})$.

Example 5.1. Suppose $T(F) = \mu$. So (2.21), (2.11) and (2.12) hold. Also (2.15) holds with $\lambda_{n1} = n_1/n$, $\lambda_{n2} = e_n$ and $a(v) = -e^{-v}$. So $e_{n1} = O(n_1^2/n)$ and $e_{n2} = O(n_1^2/n)$. So (2.17)-(2.19) hold with $e_{n5} = O(e_n)$, $e_{n6} = O(n^{-1/2} n_1 e_n)$ and $K(\mu) = e^{-1}$.

Example 5.2. Suppose $T(F) = \mu_2$. Then $t_1(y, a_{21}) = -k y^{d + 4}\exp(-y)\{1 + O(\epsilon_y)\}$ and $t_i(y, a_{21}) = O(y^{id + 4}\exp(-iy))$. So (2.11) holds by Note 2.3. Also $e_{n1} = O(n_1^4/n)$, $e_{n2} = O(n_1^4/n)$, and (2.15) holds with $\lambda_{n1} = n_1^2/n$, $\lambda_{n2} = e_n$ and $a(v) = -e^{-v}$. So (2.17)-(2.19) hold with $e_{n4} = O(n^{-1/2} n_1^2 e_n)$, $e_{n5} = O(e_n)$, $e_{n6} = O(n^{-1/2} n_1^2 e_n)$ and $K(\mu_2) = e^{-1}$.

We now turn to a class of distributions that includes the normal.
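Before doing so, note that the norming above can be seen numerically in the simplest case: for the unit exponential (the gamma case with $\gamma = 1$, so $d = 0$ and $k = 1$), $a_n = \log n$ and $b_n = 1$, and $G_n(v) = F(v + \log n)^n \to \exp(-e^{-v})$, the Gumbel law of (2.8). A sketch (ours, not the paper's):

```python
import math

def gumbel_cdf(v):
    """Limiting EV1 (Gumbel) law G(v) = exp(-exp(-v)) of (2.8)."""
    return math.exp(-math.exp(-v))

def Gn(v, n):
    """G_n(v) = P(M_n - log n <= v) = F(v + log n)^n for Exp(1) samples,
    using the norming a_n = log n, b_n = 1 (gamma case with d = 0, k = 1)."""
    y = v + math.log(n)
    return (1.0 - math.exp(-y)) ** n if y > 0 else 0.0

for n in (10, 1000, 100000):
    v = 0.5
    print(n, Gn(v, n), gumbel_cdf(v))   # G_n(v) approaches G(v) as n grows
```

Here the discrepancy $G_n(v) - G(v)$ is of order $n^{-1}$ for fixed $v$, well inside the $O(e_n)$ bound above.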
Replace (5.1) by the assumption

  $1 - F(y) = k y^d\exp(-y^2)\{1 + O(y^{-\beta})\}$  (5.2)

as $y \to \infty$, or rather the slightly stronger condition $d/dy\{1 - F(y) - k y^d\exp(-y^2)\} = O(y^{d + 1 - \beta}\exp(-y^2))$ as $y \to \infty$.

Note 5.2. This holds for $F(y) = \Phi(2^{1/2} y)$, the distribution of $X \sim N(0, 1/2)$, with $d = -1$, $\beta = 2$ and $k = \pi^{-1/2}/2$.

Set $M_n^* = b_n M_n - a_n$ with $a_n = 2n_1 + d n_2/2 + k_\delta$ and $b_n = 2n_1^{1/2}$. By (5.2),

  $G_n(v) = G_0(v) + O(e_n)$  (5.3)

for $e_n = n_2^2 n_1^{-1} + n_1^{-\beta/2}$. Also $y_n = (v + a_n)/b_n$,

  $\bar p_i(y) = 2k y^{i + d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$ for $\epsilon = \min(2, \beta)$,
  $\bar q_r(y) = 2k y^{r + d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$,
  $\mu(F_y) - \mu \approx t_1(y, \mu) = -2k y^{1 + d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$,
  $t_2(y, a_{21}) = O(y^{2 + 2d}\exp(-2y^2))$ and $t_1(y, \mu_2) = -2k y^{2 + d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$.

Example 5.3. Suppose $T(F) = \mu$. So (2.21), (2.11) and (2.12) hold. Also (2.15) holds with $\lambda_{n1} = n_1^{1/2}/n$, $\lambda_{n2} = e_n$ and $a(v) = -2\exp(-v)$. So $e_{n1} = O(n_1/n)$, $e_{n2} = O(n_1/n)$, and (2.17)-(2.19) hold with $e_{n4} = O(n^{-1/2} n_1^{1/2} e_n)$, $e_{n5} = O(e_n)$, $e_{n6} = O(n^{-1/2} n_1^{1/2} e_n)$ and $K(\mu) = 2e^{-1}$.

Example 5.4. Suppose $T(F) = \mu_2$. Then $t_1(y, a_{21}) = -2k y^{4 + d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$ and $t_i(y, a_{21}) = O(y^{4 + id}\exp(-iy^2))$. So (2.11) holds by Note 2.3. Also (2.15) holds with $\lambda_{n1} = n_1/n$, $\lambda_{n2} = e_n$ and $a(v) = -2\exp(-v)$. Also $e_{n1} = O(n_1^2/n)$, $e_{n2} = O(n_1^2/n)$, and (2.17)-(2.19) hold with $e_{n4} = O(n^{-1/2} n_1 e_n)$, $e_{n5} = O(e_n)$, $e_{n6} = O(n^{-1/2} n_1 e_n)$ and $K(\mu_2) = 2e^{-1}$.

So, putting $i = 1$ for $\mu$ and $i = 2$ for $\mu_2$,

  $\alpha_n = n^{-1/2} n_1^{i/2}\{C(T_i) + O(e_n)\}$,

where $C(T_i) = (2\pi)^{-1/2}\, 2e^{-1}\sigma_i^{-1}$ and $\sigma_i = a_{21}(T_i)^{1/2}$. For $X \sim N(0, 1/2)$, $e_n = O(n_2^2/n_1)$ and $\sigma_1 = \sigma_2 = 2^{-1/2}$, so $C(T_i) = 2\pi^{-1/2} e^{-1}$. For $X \sim N(\mu, \mu_2)$, apply Note 2.4.

6 Extensions

Here we mention a number of ways in which these results may be extended. First, consider the case of two samples of sizes $n_1$ and $n_2$ with empirical distributions $F_{1n_1}$ and $F_{2n_2}$ from distributions $F_1, F_2$ on $R^{p_1}, R^{p_2}$.
Set $\hat\theta = T(F_{1n_1}, F_{2n_2})$, where $\theta = T(F_1, F_2)$ is a smooth functional in $R^q$. Let $M_{1n_1}$ and $M_{2n_2}$ be the sample maxima. Then

  $\mathcal{L}(\hat\theta \mid M_{1n_1} \le y_1,\ M_{2n_2} \le y_2) = \mathcal{L}(\hat\theta \mid \text{the samples are from } F_{1y_1}, F_{2y_2})$,

where $F_{iy_i}$ is $F_y$ for $F_i$. The Edgeworth expansion for $\hat\theta$ is given for the case $q = 1$ by Withers (1988), and for general $q$ it may be obtained from the cumulant coefficients given there. In this way our previous results can be extended to two or more samples.

Secondly, if $q = 1$, then by the extended Cornish-Fisher expansions of Withers (1984) we can write the Edgeworth expansion (2.2) in the form

  $P_n(x) = P(Y_n(F) \le x) = \Phi\Big(x - \sum_{r=1}^{\infty} n^{-r/2} f_r(x, F)\Big)$

with quantile

  $P_n^{-1}(x) = z + \sum_{r=1}^{\infty} n^{-r/2} g_r(z, F)$,

where $z = \Phi^{-1}(x)$ is the unit normal quantile. So, we can do likewise with $P_n(x \mid y) = P(Y_n(F_y) \le x \mid M_n \le y)$.

Thirdly, if $q = 1$, we can if desired apply the nonparametric confidence interval of Withers (1983) to obtain a random interval $I_{nh}$ such that $P(\theta \in I_{nh} \mid M_n \le y) = .95 + O(n^{-h/2})$ for any fixed $h$. This may be useful for obtaining more accurate p-values for hypotheses on $\theta = T(F)$ conditional on $M_n \le y$, or equivalently on $m_n \ge y$.

Fourthly, writing the results of our examples in the form

  $P(Y_n(F) \le x^* \mid M_n^* \le v) = \Phi(x^*) + \phi(x^*)\epsilon_{n1} a(x^*, v) + O(\epsilon_{n1}\epsilon_{n2})$,

where $\epsilon_{n1}, \epsilon_{n2}$ are known, if as is usual we have $\hat a(x^*, v) = a(x^*, v) + O_p(n^{-1/2})$, then typically

  $P(Y_n(F) \le x_n^* \mid M_n^* \le v) = \Phi(x^*) + O(\epsilon_{n1}(n^{-1/2} + \epsilon_{n2}))$,

where $x_n^* = x^* - \epsilon_{n1}\hat a(x^*, v)$. This method was used in Withers (1982).

Finally, we illustrate a different approach: conditioning not on $M_n \le y$ but on $M_n = y$. With $\delta_y(x) = I(y \le x)$,

  $\mathcal{L}(F_n \mid M_n = y,\ X_i \sim F) = \mathcal{L}\big((1 - n^{-1})F_{n-1} + n^{-1}\delta_y \mid X_i \sim F_y\big)$,

where $F_{n-1}$ is the empirical distribution of $n - 1$ observations. So,

  $\mathcal{L}(T(F_n) \mid M_n = y,\ X_i \sim F) = \mathcal{L}\big(T((1 - n^{-1})F_{n-1} + n^{-1}\delta_y) \mid X_i \sim F_y\big) = T(F_{n-1}) + \sum_{r=1}^{\infty} n^{-r} T_{F_{n-1}}(y^r)/r!$,

where $T_F(y^r) = T_F(x_1, \ldots, x_r)$ with $x_1 = \cdots = x_r = y$.
One can now obtain the Edgeworth expansion for $P(T(F_n) \le x \mid M_n = y)$ by applying Corollary 5.2 of Withers (1983) with $n$ replaced by $n - 1$. This requires computing some of the cross-cumulant coefficients of $\{T_{F_{n-1}}(y^r),\ r \ge 1\}$ using Lemma 5.1 of Withers (1983). So,

  $P(Y_n(F_y) \le x \mid M_n = y) = \Phi(x) - \phi(x)\sum_{r=1}^{\infty}(n - 1)^{-r/2} h_r'(x, F_y) = \Phi(x) - \phi(x)\, n^{-1/2} h_1'(x, F_y) + O(n^{-1})$,

where, for example, $h_1'(x, F_y) = h_1(x, F_y) + T_F(y)\, a_{21}^{-1/2}$. So, $P(Y_n(F) \le x^* \mid M_n^* = v)$ is given by the RHS of (2.14) and of (2.16) with $e_{n0}$ replaced by $e_{n0}' = h_1'(x^*, F_{y_n}) - h_1'(x^*, F) = e_{n0} + T_F(y_n)\, a_{21}^{-1/2} \to 0$. So, for fixed $x^*$, (2.11), (2.12) and (2.15) imply

  $P(n^{1/2}(\hat\theta - \theta) \le x^* \mid M_n^* = v) = G(v)\big[\Phi(x^*) - \phi(x^*)\{n^{1/2}\lambda_{n1} a(v)\sigma^{-1} + n^{-1/2} h^*\} + O(e_{n5})\big] + O(n^{-1/2})$

and

  $P(n^{1/2}(\hat\theta - \theta)/\sigma \le x^* \mid M_n^* = v) - P(n^{1/2}(\hat\theta - \theta)/\sigma \le x^*) = -\phi(x^*)\, n^{1/2}\lambda_{n1} G(v) a(v)\sigma^{-1} + O(e_{n6}) + o(n^{-1/2})$

for $e_{n5}$ and $e_{n6}$ of Section 2.

Acknowledgements. The authors would like to thank the Editor and the referee for their helpful suggestions.

References

Anderson, C.W. and Turkman, K.F. (1991). The joint limiting distribution of sums and maxima of stationary sequences. J. Appl. Probab., 28, 33-44.

Bhattacharya, R.N. and Rao, R.R. (1976). Normal Approximation and Asymptotic Expansions. Wiley, New York.

Chow, T.L. and Teugels, J.L. (1978). The sum and the maximum of i.i.d. random variables. In Proceedings of the Second Prague Symposium on Asymptotic Statistics, 81-92.

Fisher, R.A. and Tippett, L.H.C. (1928). Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society, 24, 180-190.

Hsing, T. (1995a). A note on the asymptotic independence of the sum and maximum of strongly mixing stationary random variables. Ann. Probab., 23, 938-947.

Hsing, T. (1995b).
On the asymptotic independence of the sum and rare values of weakly dependent stationary random variables. Stochastic Processes and Their Applications, 60, 49-63.

Ho, H.-C. and Hsing, T. (1996). On the asymptotic joint distribution of the sum and maximum of a Gaussian sequence. J. Appl. Probab., 33, 138-145.

McCormick, W.P. and Sun, J. (1993). Sums and maxima of discrete stationary processes. J. Appl. Probab., 30, 863-876.

Withers, C.S. (1982). Second-order inference for asymptotically normal random variables. Sankhyā B, 44, 19-27.

Withers, C.S. (1983). Expansions for the distribution and quantiles of a regular functional of the empirical distribution with applications to nonparametric confidence intervals. Ann. Statist., 11, 577-587.

Withers, C.S. (1984). Asymptotic expansions for distributions and quantiles with power series cumulants. J. Roy. Statist. Soc. B, 46, 389-396. Corrigendum: B, 48, 258.

Withers, C.S. (1988). Nonparametric confidence intervals for functions of several distributions. Annals of the Institute of Statistical Mathematics, 40, 727-746.

Withers, C.S. (2007a). Transformations of multivariate Edgeworth-type expansions. Technical Report, Applied Mathematics Group, Industrial Research Ltd., Lower Hutt, New Zealand.

Withers, C.S. (2007b). Estimates of low bias for functionals of distribution functions. Technical Report, Applied Mathematics Group, Industrial Research Ltd., Lower Hutt, New Zealand.

Christopher S. Withers
Applied Mathematics Group
Industrial Research Limited
Lower Hutt, New Zealand
E-mail: [email protected]

Saralees Nadarajah
School of Mathematics
University of Manchester
Manchester M13 9PL, UK
E-mail: [email protected]

Paper received January 2008; revised May 2008.