
Sankhyā: The Indian Journal of Statistics
2008, Volume 70-A, Part 1, pp. 109–123
© 2008, Indian Statistical Institute
Expansions for the Joint Distribution of the Sample Maximum and Sample Estimate
Christopher S. Withers
Industrial Research Limited, NEW ZEALAND
Saralees Nadarajah
University of Manchester, UK
Abstract
Let $F_n$ be the empirical distribution of a random sample in $R^p$ from a distribution $F$. Let $M_n$ be the componentwise sample maximum and $T(F)$ a smooth functional in $R^q$. Let $\hat\theta = T(F_n)$. We use the conditional Edgeworth expansion for $\hat\theta \mid (M_n \le y)$ to obtain expansions for the joint distribution of $(\hat\theta, M_n)$. For $T(F) = \mu$ and $\mu_2$, their degree of dependence as measured by the strong-mixing coefficient $\alpha(\hat\theta, M_n)$ is shown to be $O(n^{-1/2})$ for a class of distributions associated with the EV3 (Weibull), $O(n^{-1/2}\log^{i\nu} n)$ for two classes associated with the EV1 (Gumbel), and $O(n^{i/\theta-1/2})$ for a class associated with the EV2($\theta$) (Fréchet), where $i$ is the degree of $T(F)$, that is, $i = 1$ for $\mu$ and $i = 2$ for $\mu_2$, $\nu = 1$ for a class that includes the gamma, and $\nu = 1/2$ for a class that includes the normal.
AMS (2000) subject classification. Primary .
Keywords and phrases. Edgeworth expansions, extreme value distributions,
strong mixing coefficient.
1 Introduction and Summary
The asymptotic joint distribution of the sample mean and maximum was studied by Chow and Teugels (1978). They proved asymptotic independence except in a special case where the mean has a non-normal limit; in particular, when the mean has a normal limit, the mean and maximum are asymptotically independent. Their method does not extend to more general statistics. (Asymptotic independence for stationary sequences was proved under various conditions by Anderson and Turkman (1991), McCormick and Sun (1993), Hsing (1995a, 1995b) and Ho and Hsing (1996).) Here we show how
to obtain expansions for the joint distribution of $\hat\theta$ and $M_n$, the sample maximum, for a large class of asymptotically normal sample statistics $\hat\theta$. Both $\hat\theta$ and $M_n$ may be multivariate. We illustrate the method for the univariate case with $\hat\theta$ the sample mean or sample variance, and four classes of distributions whose domains of attraction for $M_n$ correspond to the three extreme-value distributions, the EV1 (Gumbel), the EV2 (Fréchet), and the EV3 (Weibull). These four classes of distributions are dealt with in Sections 3–5. We show that the degree of dependence of $\hat\theta$ and $M_n$, as measured by the strong-mixing coefficient
$$\alpha_n = \alpha(\hat\theta, M_n) = \sup_{x,y}\left|P(\hat\theta \le x,\, M_n \le y) - P(\hat\theta \le x)P(M_n \le y)\right|$$
is
$$O(n^{-1/2}\log^{i\nu} n) \quad\text{for the two limiting EV1 classes of distributions},$$
$$O(n^{i/\theta-1/2}) \quad\text{for the limiting EV2 class},$$
$$O(n^{-1/2}) \quad\text{for the limiting EV3 class},$$
where $i = 1$ for $T(F) = \mu(F)$ and $i = 2$ for $T(F) = \mu_2(F)$. (Here $\nu = 1$ for a class that includes the gamma distribution, and $\nu = 1/2$ for a class that includes the normal distribution.) So, only for the limiting EV3 class does the second term in the Edgeworth expansion (that corresponding to the bias and skewness of $\hat\theta$) influence the joint expansion of $(\hat\theta, M_n)$. In no case does the second term of bias or skewness of $\hat\theta$ influence the asymptotic value of $\alpha_n$.
Section 2 gives the basic Edgeworth expansion for $\hat\theta \mid (M_n \le y)$ and gives conditions under which our standardisation of $\hat\theta$ may be replaced by the usual one. (For univariate $\hat\theta$, the usual standardisation is $n^{1/2}(\hat\theta - \theta)/\sigma$, where $\sigma^2/n$ is the asymptotic variance of $\hat\theta$.) We then show how to expand the joint distribution of $n^{1/2}(\hat\theta - \theta)$ and $M_n^*$ (a transformation of $M_n$ with a non-degenerate limit). As an application we obtain in (2.17) an asymptotic form for $\alpha(\hat\theta, M_n)$.
For $p = 1$, our main result is roughly as follows. Suppose $\hat\theta = T(F_n)$ is a smooth functional in $R$ of the empirical distribution $F_n$ of a random sample of size $n$ from a distribution $F$ on $R^p$. Let $T_F(x)$ be the first derivative of $T(F)$, the "influence function". Set $\theta = T(F)$, $\sigma^2 = \int T_F(x)^2\, dF(x)$ and $y_0 = \sup\{x : F(x) < 1\}$ componentwise in $R^p$, so that $\int_{-\infty}^{y} T_F(x)\, dF(x) = -\int_{y}^{y_0} T_F(x)\, dF(x) \to 0$ as $y \uparrow y_0$.
Then as $y_n \uparrow y_0$, $Y_n = n^{1/2}(\hat\theta - \theta)/\sigma$ satisfies
$$P(Y_n \le x \mid M_n \le y_n) - P(Y_n \le x) \approx \phi(x)\,\sigma^{-1} n^{1/2} \int_{y_n}^{y_0} T_F(x)\, dF(x),$$
where $\phi(x)$ is the density of a unit normal random variable. So, if $y_n(v)$ is a one-to-one function from $R^p$ onto $R^p$ such that $P(M_n \le y_n(v)) \to G(v)$, non-degenerate, and if $\int_{y_n}^{y_0} T_F(x)\, dF(x) \approx \lambda_n a(v)$, then $\alpha(\hat\theta_n, M_n) \approx C n^{1/2}\lambda_n$, where
$$C = (2\pi)^{-1/2}\sigma^{-1}\sup_v G(v)|a(v)|.$$
Section 6 gives an expansion for $P(n^{1/2}(\hat\theta_n - \theta) \le x \mid M_n^* = y)$ and suggests some extensions.
2 Conditional and Joint Expansions
In this section we give the Edgeworth expansion for $\hat\theta \mid (M_n \le y)$. We then derive expansions for the joint distribution of $n^{1/2}(\hat\theta - \theta)$ and $M_n$, and an asymptotic value for $\alpha(\hat\theta, M_n)$.
Suppose we observe a random sample $X_1, \dots, X_n$ in $R^p$ with empirical distribution $F_n(x)$, from some distribution $F(x)$. Define the maximum $M_n$ in $R^p$ as the vector with $i$th element $M_{ni} = \max_{j=1}^{n}(X_j)_i$ for $1 \le i \le p$. Let $T(F)$ be some smooth functional in $R^q$ with functional derivatives $T_F(x_1, \dots, x_r)$; see for example Withers (1983). Set $\theta = T(F)$ and $\hat\theta = T(F_n)$. By Withers (1983), the $r$th order cumulants have magnitude $n^{1-r}$ and expansions in $n^{-1}$. So, by the analogue of the case of the mean given by Bhattacharya and Rao (1976), given in Withers (2007a), $P_n(x, F) = P(n^{1/2}(\hat\theta - \theta) \le x)$ has an Edgeworth expansion in $n^{-1/2}$ about the multivariate normal $N_q(0, V)$, where $V = \int T_F(x) T_F(x)'\, dF(x)$. Now fix $y$ in $R^p$. Then
$$F_n \mid (M_n \le y) \;\stackrel{\mathcal L}{=}\; F_n \mid (X_1, \dots, X_n \text{ are i.i.d. } F_y),$$
where $M_n \le y$ is interpreted componentwise and
$$F_y(x) = P(X \le x \mid X \le y) = F(x) F(y)^{-1} I(x \le y).$$
(Here i.i.d. means independent and identically distributed.) So,
$$P(n^{1/2}(\hat\theta - T(F_y)) \le x \mid M_n \le y) = P_n(x, F_y).$$
Similarly, if $m_n$ is the componentwise minimum, then
$$F_n \mid (m_n > y) \;\stackrel{\mathcal L}{=}\; F_n \mid (X_1, \dots, X_n \text{ are i.i.d. } F^y), \qquad (2.1)$$
where
$$1 - F^y(x) = P(X > x \mid X > y) = (1 - F(x))(1 - F(y))^{-1} I(x > y).$$
So,
$$P(n^{1/2}(\hat\theta - T(F^y)) \le x \mid m_n > y) = P_n(x, F^y).$$
Similarly, if $T_n(F) = E\,T(F_n)$, then $E[T(F_n) \mid M_n \le y] = T_n(F_y)$ and $E[T(F_n) \mid m_n > y] = T_n(F^y)$. For exposition, consider the case $q = 1$.
Then the cumulants of $\hat\theta$ can be expanded as
$$\kappa_r(\hat\theta) = \sum_{i=r-1}^{\infty} a_{ri}(F)\, n^{-i}$$
for $r \ge 1$, with the leading coefficients given in Withers (1983): $a_{10}(F) = T(F)$, $a_{21}(F) = [1^2]$, where $[1^r] = \int T_F(x)^r\, dF(x)$, $a_{32}(F) = [1^3] + 3[1,2,12]$, where $[1,2,12] = \int\!\!\int T_F(x_1) T_F(x_2) T_F(x_1, x_2)\, dF(x_1)\, dF(x_2)$, and so on. Assume $T_F(x) \ne 0$ a.e. $F$, so that $a_{21}(F) > 0$. Set
$$Y_n(F) = n^{1/2}(\hat\theta - T(F))\, a_{21}(F)^{-1/2} = n^{1/2}(\hat\theta - \theta)/\sigma,$$
say. Then
$$P(Y_n(F) \le x) = \Phi(x) - \phi(x)\sum_{r=1}^{\infty} n^{-r/2} h_r(x, F) = Q_n(x, F), \qquad (2.2)$$
say, where $\Phi$ and $\phi$ are the distribution and density of a standard normal random variable in $R$, $h_r(x, F) = \sum\{h_{ri}(F) H_i(x) : 0 \le i \le 3r - 1,\ r - i \text{ odd}\}$, $H_i(x)$ is the $i$th Hermite polynomial, that is $H_i(x) = \phi(x)^{-1}(-d/dx)^i \phi(x)$, and $h_{ri}(F)$ is a certain polynomial in $A_{ri}(F) = a_{ri}(F) a_{21}(F)^{-r/2}$. For example, $h_{10}(F) = A_{11}(F)$ and $h_{12}(F) = A_{32}(F)/6$, so $h_1(x, F) = A_{11}(F) + A_{32}(F)(x^2 - 1)/6$. Differentiating (2.2) gives the density
$$p(x : Y_n(F)) = \phi(x)\sum_{r=0}^{\infty} n^{-r/2} h_r^*(x, F),$$
where $h_0^*(x, F) = 1$ and $h_r^*(x, F) = \sum\{h_{ri}(F) H_{i+1}(x) : 0 \le i \le 3r - 1,\ r - i \text{ odd}\}$. So, $h_1^*(x, F) = A_{11}(F)\, x + A_{32}(F)(x^3 - 3x)/6$.
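For instance (a quick check rather than part of the original text), when $T(F) = \mu$ these formulas reduce to the classical one-term Edgeworth correction for the standardized mean: $T_F(x) = x - \mu$ and $T_F(x_1, x_2) = 0$, so $\kappa_1(\hat\theta) = \mu$ exactly and $a_{11}(F) = 0$, while $a_{21}(F) = \mu_2$ and $a_{32}(F) = \mu_3$; hence $A_{11} = 0$, $A_{32} = \mu_3\mu_2^{-3/2}$ and
$$h_1(x, F) = \frac{\mu_3}{6\mu_2^{3/2}}(x^2 - 1), \qquad P\big(n^{1/2}(\bar X - \mu)/\mu_2^{1/2} \le x\big) = \Phi(x) - \phi(x)\,\frac{\mu_3}{6\mu_2^{3/2}}\,\frac{x^2 - 1}{n^{1/2}} + O(n^{-1}),$$
the last equality holding under the regularity conditions of Note 2.1 below.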
Note 2.1. (2.2) is an asymptotic expansion that usually diverges. However, under regularity conditions (see Corollary 3.1 of Withers (1983)), $\sup_x |\text{LHS of (2.2)} - \text{first } I \text{ terms of RHS}| = O(n^{-I/2})$. □
By (2.1),
$$P(Y_n(F_y) \le x \mid M_n \le y) = Q_n(x, F_y). \qquad (2.3)$$
Now let $r_n(y) : R^p \to R^p$ be any transformation such that $y \le z$ in $R^p$ if and only if $r_n(y) \le r_n(z)$. Fix $v$ and set $y_n = r_n^{-1}(v)$, $M_n^* = r_n(M_n)$ and $Y_n^* = Y_n(F_{y_n})$. By (2.3),
$$P(Y_n^* \le x,\, M_n^* \le v) = Q_n(x, F_{y_n})\, P(M_n^* \le v). \qquad (2.4)$$
So, $Y_n$ and $M_n$ are asymptotically independent (that is, $Y_n^*$ and $M_n^*$ are asymptotically independent) as $n \to \infty$ provided that $P(Y_n^* \le x) \to \Phi(x)$. By (2.2) this holds provided that
$$\Delta_{n1} = n^{1/2}(T(F_{y_n}) - T(F)) \to 0, \qquad \Delta_{n2} = a_{21}(F_{y_n}) - a_{21}(F) \to 0. \qquad (2.5)$$
For, $Y_n^* \le x$ if and only if $Y_n(F) \le x^*$, where
$$x^* = x\,\sigma(y_n)/\sigma + n^{1/2}\{T(F_{y_n}) - T(F)\}/\sigma, \qquad (2.6)$$
where $\sigma^2(y) = a_{21}(F_y)$ and $\sigma^2 = a_{21}(F)$. Now fix $x^*$ and set $\Delta_n = x^* - x = x^* - \sigma\sigma(y_n)^{-1}\{x^* - \Delta_{n1}/\sigma\}$. If (2.5) and (2.6) hold, then $\Delta_n \to 0$,
so by (2.3), expanding $x$ about $x^*$,
$$P(Y_n(F) \le x^* \mid M_n \le y_n) = Q_n(x, F_{y_n}) = \Phi(x^*) + \phi(x^*)\sum_{r=1}^{\infty}\{(-\Delta_n)^r/r! - n^{-r/2} h_r(x, F_{y_n})\} = \Phi^* - \phi^*\{\Delta_n + n^{-1/2} h^*\} + \cdots,$$
where $\Phi^* = \Phi(x^*)$, $\phi^* = \phi(x^*)$ and $h^* = h_1(x^*, F)$. This is a weaker and more complex expansion than (2.3). In particular, $\Delta_n$ need not be $O(n^{-1/2})$.
Note 2.2. So far we have not assumed that $M_n^*$ has a limit. If
$$G_n(v) = P(M_n^* \le v) = F(y_n)^n \to G(v)$$
as $n \to \infty$, where $G(v)$ is non-degenerate, then we can expand
$$P(Y_n^* \le x,\, M_n^* \le v) = P(Y_n(F) \le x^*,\, M_n^* \le v) = Q_n(x, F_{y_n})\, G_n(v) \qquad (2.7)$$
about $\Phi(x)G(v)$ or $\Phi(x^*)G(v)$. By Fisher and Tippett (1928), if (2.7) holds with $p = 1$, $r_n(M_n) = b_n M_n - a_n$ and $b_n > 0$, then $y_n = b_n^{-1}(v + a_n)$ and $G$ can be taken as
$$G_\theta(x) = \exp(-e^{-x}) \text{ on } R \quad \text{(EV1 or Gumbel)}, \qquad (2.8)$$
$$G_\theta(x) = \exp(-x^{-\theta}) \text{ on } (0, \infty) \quad \text{(EV2 or Fréchet)}, \qquad (2.9)$$
or
$$H_\theta(x) = \exp(-(-x)^{\theta}) \text{ on } (-\infty, 0) \quad \text{(EV3 or Weibull)}. \qquad (2.10)\ \Box$$
We now give a method of obtaining or approximating $\Delta_{n1}$ of (2.4) and $\sigma/\sigma(y_n)$ needed for the expansion of $x$ about $x^*$ of (2.6). Recall that for $G$ any distribution on $R^p$,
$$T(G) - T(F) = \sum_{r=1}^{\infty} \int \cdots \int T_F(x_1, \dots, x_r)\, dG(x_1) \cdots dG(x_r)/r!.$$
So,
$$T(F_y) - T(F) = \sum_{r=1}^{\infty} t_r(y, T)\, F(y)^{-r}/r!,$$
where
$$t_r(y, T) = \int^y \cdots \int^y T_F(x_1, \dots, x_r)\, dF(x_1) \cdots dF(x_r).$$
We shall see that in our examples below, for large $y$, $t_r(y, T)$ decreases in magnitude as $r$ increases, so that
$$T(F_y) - T(F) \approx t_1(y, T) = -\int_y T_F(x)\, dF(x).$$
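For a concrete illustration (a direct check, not in the original), take $F$ to be the unit exponential and $T(F) = \mu$, so $T_F(x) = x - 1$ and $T_F(x_1, \dots, x_r) = 0$ for $r \ge 2$ because $\mu$ is linear in $F$. Then
$$t_1(y, \mu) = -\int_y^{\infty}(x - 1)e^{-x}\, dx = -y e^{-y},$$
and indeed $E(X \mid X \le y) - 1 = -y e^{-y}/F(y)$ exactly, so here the approximation above is exact up to the factor $F(y)^{-1} \to 1$.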
Also in our examples, $a_{21}(F_y) - a_{21}(F) = O(t_1(y, T)^2)$. Suppose that as $y$ approaches its upper limit,
$$a_{21}(F_y) - a_{21}(F) = O(t_1(y, a_{21})); \qquad (2.11)$$
then $\sigma/\sigma(y) = 1 + O(a_{21}(F_y) - a_{21}(F)) = 1 + O(t_1(y, a_{21}))$. Also $\Delta_{n1} = n^{1/2} t_1(y_n, T) + O(n^{1/2} t_2(y_n, T))$, assuming that
$$T(F_y) - T(F) = t_1(y, T) + O(t_2(y, T)). \qquad (2.12)$$
Note 2.3. If $T(F)$ is a polynomial in $F$ of degree $I$, for example the $I$th central moment $\mu_I$ or cumulant $\kappa_I$, then $t_r(y, T) = 0$ for $r > I$, so (2.11)
holds if $t_i(y, a_{21}) = O(t_1(y, a_{21}))$ for $2 \le i \le 2I$, and (2.12) holds if $t_i(y, T) = O(t_2(y, T))$ for $3 \le i \le I$. □
So,
$$\Delta_n = \Delta_{n0}/\sigma + O(e_{n1}), \qquad (2.13)$$
where $\Delta_{n0} = n^{1/2} t_1(y_n, T)$, $e_{n1} = n^{1/2}|t_2(y_n, T)| + |t_1(y_n, a_{21})|$ and $e_{n2} = e_{n1} + \Delta_{n0}^2 + n^{-1}$. Then
$$P(Y_n(F) \le x^* \mid M_n^* \le v) = \Phi^* - \phi^*(\Delta_{n0}\sigma^{-1} + n^{-1/2} h^*) + O(n^{1/2} e_{n0} + e_{n2}), \qquad (2.14)$$
$$P(Y_n(F) \le x^*,\, M_n^* \le v) = G_n(v)\cdot\text{RHS (2.14)} = \{\Phi^* - \phi^*(\Delta_{n0}\sigma^{-1} + n^{-1/2} h^*)\}\, G(v) + O(e_{n3}),$$
where $e_{n0} = h_1(x^*, F_{y_n}) - h_1(x^*, F) \to 0$ and $e_{n3} = n^{1/2} e_{n0} + e_{n2} + \sup_v |G_n(v) - G(v)|$. Also
$$\nabla_n(x^*, v) = P(Y_n(F) \le x^*,\, M_n^* \le v) - P(Y_n(F) \le x^*)\, P(M_n^* \le v)$$
satisfies $\nabla_n(x^*, v) = -\phi^* G_n(v)\Delta_{n0}\sigma^{-1} + O(e_{n2})$. In our examples we find that there exists a function $a(v)$ such that
$$t_1(y_n, T) = \lambda_{n1}\{a(v) + O(\lambda_{n2})\}, \qquad (2.15)$$
where $\lambda_{n1}$ and $\lambda_{n2}$ do not depend on $v$. So,
$$\Delta_{n0} = n^{1/2}\lambda_{n1}\{a(v) + O(\lambda_{n2})\},$$
$$P(Y_n(F) \le x^* \mid M_n^* \le v) = \Phi^* - \phi^*\{n^{1/2}\lambda_{n1} a(v)\sigma^{-1} + n^{-1/2} h^*\} + O(e_{n4}), \qquad (2.16)$$
$$P(Y_n(F) \le x^*,\, M_n^* \le v) = G_n(v)\cdot\text{RHS (2.16)} = G(v)\big[\Phi^* - \phi^*\{n^{1/2}\lambda_{n1} a(v)\sigma^{-1} + n^{-1/2} h^*\}\big] + O(e_{n5}), \qquad (2.17)$$
$$\nabla_n(x^*, v) = -\phi^* n^{1/2}\lambda_{n1} G_n(v) a(v)\sigma^{-1} + O(e_{n4}) = -\phi^* n^{1/2}\lambda_{n1} G(v) a(v)\sigma^{-1} + O(e_{n6}), \qquad (2.18)$$
where
$$e_{n4} = e_{n2} + n^{1/2}\lambda_{n1}\lambda_{n2} + n^{-1/2} e_{n0}, \quad e_{n5} = e_{n4} + \sup_v |G_n(v) - G(v)|, \quad e_{n6} = e_{n4} + n^{1/2}\lambda_{n1}\sup_v |G_n(v) - G(v)|.$$
So, the strong-mixing coefficient is
$$\alpha(\hat\theta_n, M_n) = n^{1/2}\lambda_{n1} C(T) + O(e_{n6}), \qquad (2.19)$$
where $C(T) = (2\pi)^{-1/2}\sigma^{-1} K(T)$ and $K(T) = \sup_v G(v)|a(v)|$. This gives an asymptotic value for $\alpha_n = \alpha(\hat\theta_n, M_n)$ assuming that
$$\sup_v |G_n(v) - G(v)| \to 0 \quad\text{and}\quad n^{-1} + e_{n1} = o(n^{1/2}\lambda_{n1}), \qquad (2.20)$$
since then $e_{n6} = o(n^{1/2}\lambda_{n1})$.
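The suprema behind $K(T)$ are elementary in the cases used below; as a quick check, if $a(v) = -c\,e^{-v}$ and $G$ is the Gumbel law $G(v) = \exp(-e^{-v})$, then with $u = e^{-v}$,
$$K(T) = \sup_v G(v)|a(v)| = |c|\sup_{u>0} u e^{-u} = |c|\, e^{-1},$$
and similarly, if $a(v) = c(-v)^{\theta}$ and $G = H_\theta$, then with $u = (-v)^{\theta}$ again $K(T) = |c|\, e^{-1}$. This is the source of the $e^{-1}$ factors appearing in Sections 4 and 5.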
We now give further details for $T(F) = \mu$ and $T(F) = \mu_2$, where $\mu = \mu(F) = EX_1$, $\mu_r = \mu_r(F) = E(X_1 - \mu)^r$ and $m_r = EX_1^r$. Set
$$\bar p_i(y) = \int_y x^i\, dF(x), \qquad p_i(y) = \int^y x^i\, dF(x) = m_i - \bar p_i(y),$$
$$\bar q_r(y) = \int_y (x - \mu)^r\, dF(x) = \sum_{i=0}^{r}\binom{r}{i}\bar p_{r-i}(y)(-\mu)^i, \qquad q_r(y) = \int^y (x - \mu)^r\, dF(x) = \mu_r - \bar q_r(y).$$
By Example 5.3 of Withers (2007b),
$$\mu_F(x) = x - \mu, \qquad \mu_{rF}(x) = (x - \mu)^r - \mu_r - r\mu_{r-1}(x - \mu),$$
$$\mu_{rF}(x_1, \dots, x_p) = (-1)^p\Big\{(r)_p\,\mu_{r-p} - (r)_{p-1}\sum_{i=1}^{p}(h_i^{r-p+1} - \mu_{r-p+1})h_i^{-1}\Big\}\prod_{j=1}^{p} h_j,$$
where $h_j = x_j - \mu$ and $(r)_p = r(r-1)\cdots(r-p+1) = r!/(r-p)!$.
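As a check on the general formula, take $r = p = 2$: since $(2)_2 = 2$, $(2)_1 = 2$, $\mu_0 = 1$ and $\mu_1 = 0$,
$$\mu_{2F}(x_1, x_2) = (-1)^2\Big\{2 - 2\sum_{i=1}^{2}(h_i - \mu_1)h_i^{-1}\Big\} h_1 h_2 = (2 - 4)h_1 h_2 = -2 h_1 h_2,$$
which is the expression used in Example 2.1 below.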
Example 2.1. Suppose $T(F) = \mu$. Then $\sigma^2 = \mu_2$ and
$$T(F_y) - T(F) = t_1(y, \mu) = -\bar q_1(y) = -\bar p_1(y) + \mu\,\bar p_0(y).$$
Also $a_{21}(F) = \mu_2$ and $\mu_{2F}(x_1, x_2) = -2 h_1 h_2$, so
$$t_1(y, \mu_2) = -\bar q_2(y) + \mu_2\,\bar q_0(y) = -\bar p_2(y) + 2\mu\,\bar p_1(y) - (m_2 - 2\mu_2)\,\bar p_0(y),$$
$$t_2(y, \mu_2) = -2\, t_1(y, \mu)^2 = -2\,\bar q_1(y)^2.$$
By Note 2.3, (2.12) holds, and (2.11) holds if
$$\bar q_1(y)^2 = O(\bar q_2(y) - \mu_2\,\bar q_0(y)). \qquad (2.21)$$
Also $t_2(y, \mu) = 0$, so $\Delta_{n0} = -n^{1/2}\bar q_1(y_n)$, $e_{n1} = |t_2(y_n, \mu_2)| = 2\bar q_1(y_n)^2$ and $e_{n2} = (n+2)\bar q_1(y_n)^2 + n^{-1}$.
Example 2.2. Suppose $T(F) = \mu_2$. Then $\Delta_{n0} = n^{1/2} t_1(y_n, \mu_2)$. By Note 2.4, (2.12) holds. Also (2.11) holds if $t_i(y, a_{21}) = O(t_1(y, a_{21}))$ for $2 \le i \le 4$, where $\sigma^2 = a_{21}(F) = \mu_4 - \mu_2^2$. So,
$$a_{21F}(x) = \mu_{4F}(x) - 2\mu_2\,\mu_{2F}(x),$$
$$t_1(y, a_{21}) = -\bar q_4(y) + \mu_4\,\bar q_0(y) + 4\mu_3\,\bar q_1(y) + 2\mu_2\{\bar q_2(y) - \mu_2\,\bar q_0(y)\}.$$
Also,
$$a_{21F}(x_1, x_2) = \mu_{4F}(x_1, x_2) - 2\mu_{2F}(x_1)\mu_{2F}(x_2) - 2\mu_2\,\mu_{2F}(x_1, x_2),$$
$$\mu_{4F}(x_1, x_2) = 12\mu_2 h_1 h_2 - 4\Big(\sum_{i=1}^{2} h_i^2\Big) h_1 h_2 + 4\mu_3 \sum_{i=1}^{2} h_i,$$
so
$$t_2(y, a_{21}) = 12\mu_2\, q_1(y)^2 - 8\, q_1(y) q_3(y) + 8\mu_3\, q_1(y) - 2\, t_1(y, \mu_2)^2 + 4\mu_2\, t_1(y, \mu)^2 = -8\,\bar q_1(y)\bar q_3(y) - 2\{\bar q_2(y) - \mu_2\,\bar q_0(y)\}^2 + 16\mu_2\,\bar q_1(y)^2.$$
Set $T[12\cdots] = T_F(x_1, x_2, \dots)$. By the chain rule, (A4) and (A5) of Withers (2007b), $a_{21}[123] = \mu_4[123] - 2\sum_3 \mu_2[1]\mu_2[23]$ and $a_{21}[1234] = \mu_4[1234] - 2\sum_3 \mu_2[12]\mu_2[34]$. Also $\mu_4[123] = 12\sum_3 h_1^2 h_2 h_3 - 12\mu_2 \sum_3 h_2 h_3$, so
$$t_3(y, a_{21}) = 36\, q_2(y) q_1(y)^2 - 36\mu_2\, q_1(y)^2 = -36\,\bar q_2(y)\bar q_1(y)^2.$$
Also $\mu_4[1234] = -72\prod_{i=1}^{4} h_i$, so
$$t_4(y, a_{21}) = -72\, q_1(y)^4 - 6\, t_2(y, \mu_2)^2 = -96\,\bar q_1(y)^4.$$
Note 2.4. In the univariate examples that follow we have chosen convenient location and scale parameters. This does not restrict generality since, if $X_0 = \mu + \tau X$ with $\tau > 0$, then $M_n(X_0) = \mu + \tau M_n(X)$, where $M_n(X) = M_n$. So, if $M_n^* = b_n M_n - a_n$, then $M_n^* = b_n' M_n(X_0) - a_n'$, where $b_n' = b_n/\tau$ and $a_n' = a_n + b_n\mu/\tau$. If $\theta(X) = \theta = T(F)$ is a location parameter (for example the mean or median), then $\theta(X_0) = \mu + \tau\theta(X)$ and $a_{21}(X_0) = \tau^2 a_{21}(X)$. If $\theta(X) = T(F)$ is the $r$th power of a scale parameter (for example $\mu_r$), then $\theta(X_0) = \tau^r\theta(X)$ and $a_{21}(X_0) = \tau^{2r} a_{21}(X)$. □
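For example (this is how the remark "for $X \sim N(\mu, \mu_2)$ apply Note 2.4" at the end of Section 5 operates), write $X_0 = \mu + \tau X$ with $X \sim N(0, 1/2)$ and $\tau = (2\mu_2)^{1/2}$, so that $X_0 \sim N(\mu, \mu_2)$, $M_n(X_0) = \mu + \tau M_n(X)$, $\mu(X_0) = \mu + \tau\mu(X)$ and $\mu_2(X_0) = \tau^2\mu_2(X)$. Since both the statistic and the maximum undergo strictly increasing transformations, $\alpha(\hat\theta, M_n)$ is unchanged, and the mixing rates obtained below for $N(0, 1/2)$ carry over to $N(\mu, \mu_2)$.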
3 An EV2 Example
Here we illustrate our expansion for $\mathcal L(Y_n(F), M_n^*)$ for $p = 1$ and $F$ having upper tail $1 - F(y) = K y^{-\theta}\{1 + O(y^{-\beta})\}$ as $y \to \infty$, where $K > 0$, $\theta > 0$ and $\beta > 0$. We also need the slightly stronger condition $d/dy\{1 - F(y) - K y^{-\theta}\} = O(y^{-\theta-\beta-1})$ as $y \to \infty$.
Note 3.1. This holds with $\beta = \theta$ for the EV2 and also for the stable law of index $\theta < 1$. □
Take $M_n^* = b_n M_n - a_n$ with $a_n = 0$ and $b_n = (Kn)^{-1/\theta}$. Then (2.7) holds with $G = G_\theta$ of (2.9). Also $y_n = v(Kn)^{1/\theta}$, so
$$G_n(v) = G_\theta(v) + O(n^{-\epsilon_0}) \qquad (3.1)$$
for $\epsilon_0 = \min(1, \beta/\theta)$, $\bar p_i(y) = -K\theta(\theta - i)^{-1} y^{i-\theta}\{1 + O(y^{-\beta})\}$ for $\theta > i$, and $\bar q_r(y) = -K\theta(\theta - r)^{-1} y^{r-\theta}\{1 + O(y^{-\beta}) + O(y^{-1})\}$ for $\theta > r$. Set $a_i = \theta/(\theta - i)$, $\beta_i = a_i^{-1}(1 + \log a_i)$ and $K_i = K^{i/\theta} a_i e^{-\beta_i}$.
Example 3.1. Suppose $T(F) = \mu$. So, $\sigma^2 = \mu_2$ and (2.11) holds, so (2.12) and (2.13) hold. Also (2.15) holds with $\lambda_{n1} = n^{1/\theta - 1}$, $\lambda_{n2} = n^{-\epsilon_0}$ and $a(v) = K^{1/\theta}\theta(\theta-1)^{-1} v^{1-\theta}$. So, assuming $\theta > 2$, (2.16)–(2.18) hold with $e_{n4} = O(n^{-\epsilon_1})$ for $\epsilon_1 = \min(1/2,\ 1 - 2\theta^{-1},\ 1/2 + (\beta - 1)\theta^{-1})$, $e_{n5} = O(n^{-\epsilon_2})$ for $\epsilon_2 = \min(1/2,\ 1 - 2\theta^{-1},\ \beta\theta^{-1})$, $e_{n6} = O(n^{-\epsilon_1})$ and $K(\mu) = K_1$. □
Example 3.2. Suppose $T(F) = \mu_2$. So, $\sigma^2 = \mu_4 - \mu_2^2$. By Note 2.5, (2.12) holds since $t_i(y, a_{21}) = O(y^{4-i\theta})$, assuming $\theta > 4$. Also $e_{n1} = O(n^{4/\theta - 1})$ and (2.17) holds with $\lambda_{n1} = n^{2/\theta - 1}$, $\lambda_{n2} = n^{-\epsilon_0}$ and $a(v) = K^{2/\theta}\theta(\theta - 2)^{-1} v^{2-\theta}$. So, $\Delta_{n0}$ and $e_{n2}$ are $O(n^{2/\theta - 1/2})$, and (2.16)–(2.19) hold with $e_{n4} = O(n^{2/\theta - 1/2})$, $e_{n5} = O(n^{-\epsilon_3})$ for $\epsilon_3 = \min(1/2 - 2/\theta,\ \beta/\theta)$, $e_{n6} = O(n^{-\epsilon_4})$ for $\epsilon_4 = \min(1/2,\ 1 - 2/\theta,\ 1/2 + (\beta - 2)/\theta)$ and $K(\mu_2) = K_2$. □
In both cases (2.20) holds, so (2.19) gives the asymptotic value of $\alpha_n$. We have shown, among other things, that for $T_1 = \mu$ and $T_2 = \mu_2$, $\alpha_n = \alpha(\hat\theta_n, M_n) = C(T_i)\, n^{i/\theta - 1/2} + O(e_{n6}(T_i))$, where for $i = 1, 2$, $C(T_i) = (2\pi)^{-1/2} K_i\, a_{21}(T_i)^{-1/2}$, $e_{n6}(T_i) = O(n^{-\nu_i})$ and $\nu_i = \min(1/2,\ 1 - 2/\theta,\ 1/2 + (\beta - i)/\theta)$.
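For a concrete instance (not part of the original example), take the exact Pareto tail $1 - F(y) = y^{-\theta}$ for $y \ge 1$, so that $K = 1$ and the $O(y^{-\beta})$ term vanishes. Then for $\theta > 2$ the sample mean and the maximum decouple only at the rate
$$\alpha(\bar X, M_n) \asymp n^{1/\theta - 1/2};$$
for example, $\theta = 5$ gives the order $n^{-3/10}$, much slower than the $n^{-1/2}$ rate obtained for the light-tailed (EV3) class below.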
4 An EV3 Example
Suppose $1 - F(y) = k(-y)^{\theta}\{1 + O((-y)^{\beta})\}$ as $y \uparrow 0$, where $k > 0$, $\theta > 0$ and $\beta > 0$. We also need the slightly stronger condition $d/dy\{1 - F(y) - k(-y)^{\theta}\} = O((-y)^{\theta+\beta-1})$ as $y \uparrow 0$. This holds for the EV3 with $\beta = \theta$.
Note 4.1. Take $M_n^* = b_n M_n - a_n$ with $a_n = 0$ and $b_n = (kn)^{1/\theta}$, so $y_n = v(kn)^{-1/\theta}$; then (2.7) holds with $G = H_\theta$ of (2.10). Also $G_n(v) = H_\theta(v) + O(n^{-\epsilon_0})$ for $\epsilon_0$ of (3.1), $\bar p_i(y) = (-1)^i k\theta(\theta + i)^{-1}(-y)^{\theta+i}\{1 + O((-y)^{\beta})\}$ and $\bar q_r(y) = k(-y)^{\theta}\{(-\mu)^r + O((-y)^{\beta}) + O(y)\}$. So, $t_1(y, \mu_2) = k(-y)^{\theta}\{\mu_2 - \mu^2 + O((-y)^{\beta}) + O(y)\}$. □
Example 4.1. Suppose $T(F) = \mu$. So, (2.21), (2.11) and (2.12) hold. Also (2.15) holds with $\lambda_{n1} = n^{-1}$, $\lambda_{n2} = n^{-\epsilon_5}$ for $\epsilon_5 = \min(\beta, 1)/\theta$ and $a(v) = (-v)^{\theta}\mu$. So, $e_{n1} = O(n^{-1})$ and $e_{n2} = O(n^{-1})$. So, (2.16)–(2.19) hold with $e_{n4} = O(n^{-\epsilon_6})$ for $\epsilon_6 = \min(1,\ 1/2 + 1/\theta,\ 1/2 + \beta/\theta)$, $e_{n5} = O(n^{-\epsilon_7})$ for $\epsilon_7 = \min(1,\ 1/2 + 1/\theta,\ \beta/\theta)$, $e_{n6} = O(n^{-\epsilon_6})$, $K(\mu) = e^{-1}|\mu|$ and $C(\mu) = (2\pi)^{-1/2} e^{-1} |\mu| \mu_2^{-1/2}$. □
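As a quick check of Example 4.1 (an illustration, not in the original), let the observations be uniform on $(-1, 0)$, so that $1 - F(y) = -y$ on $(-1, 0)$, i.e. $k = \theta = 1$ exactly, with $\mu = -1/2$ and $\mu_2 = 1/12$. Then $M_n^* = n M_n$,
$$t_1(y, \mu) = -\int_y^0 (x - \mu)\, dx = \tfrac12 y(1 + y) \approx \tfrac12 y \quad\text{as } y \uparrow 0,$$
so $\lambda_{n1} = n^{-1}$ and $a(v) = (-v)\mu = v/2$, and (2.19) gives
$$\alpha(\bar X, M_n) \approx (2\pi)^{-1/2} e^{-1}\, |\mu|\,\mu_2^{-1/2}\, n^{-1/2} = (2\pi)^{-1/2} e^{-1}\sqrt{3}\; n^{-1/2} \approx 0.25\, n^{-1/2}.$$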
Example 4.2. Suppose $T(F) = \mu_2$. Then $t_1(y, a_{21}) = k(-y)^{\theta}\{k_6 + O((-y)^{\beta}) + O(-y)\}$, where $k_6 = \mu^4 - 4\mu_3\mu + 2\mu_2\mu^2 + \mu_4 - 2\mu_2^2 = E(X_1 - 2\mu)^4 - 2\mu_2^2 - 4\mu^2\mu_2$, and $t_i(y, a_{21}) = O((-y)^{i\theta})$. So, for $k_6 \ne 0$, (2.11) holds by Note 2.4. Also $e_{n1} = O(n^{-1})$ and (2.17) holds with $\lambda_{n1} = n^{-1}$, $\lambda_{n2} = n^{-\epsilon_7}$ for $\epsilon_7 = \min(1, \beta)/\theta$ and $a(v) = (\mu_2 - \mu^2)(-v)^{\theta}$. So, $\Delta_{n0} = O(n^{-1/2})$ and $e_{n2} = O(n^{-1})$. So, (2.17)–(2.19) hold with $e_{n4} = O(n^{-\epsilon_6})$, $e_{n5} = O(n^{-\epsilon_4})$, $e_{n6} = O(n^{-\epsilon_6})$ and $K(\mu_2) = e^{-1}|\mu_2 - \mu^2|$. □
To summarise, for $T(F) = \mu$ and $\mu_2$, (2.17)–(2.19) hold with $e_{n4} = O(n^{-\epsilon_6})$, $e_{n5} = O(n^{-\epsilon_7})$, $e_{n6} = O(n^{-\epsilon_6})$ and $C(T) = (2\pi)^{-1/2} e^{-1} C_0(T)$, where $C_0(\mu) = |\mu|\mu_2^{-1/2}$ and $C_0(\mu_2) = |\mu_2 - \mu^2|(\mu_4 - \mu_2^2)^{-1/2}$. Also (2.20) holds, so (2.19) gives an asymptotic value for $\alpha_n$.
5 Two EV1 Examples
Our examples here include the gamma and normal distributions. First, suppose
$$1 - F(y) = k y^d \exp(-y)\{1 + O(y^{-\beta})\} \qquad (5.1)$$
as $y \to \infty$. We also need the slightly stronger condition
$$d/dy\{1 - F(y) - k y^d \exp(-y)\} = O(y^{d-\beta}\exp(-y))$$
as $y \to \infty$.
Note 5.1. This holds for the gamma distribution with density $x^{\gamma-1} e^{-x}/\Gamma(\gamma)$ on $(0, \infty)$ with $\beta = 1$, $d = \gamma - 1$ and $k = \Gamma(\gamma)^{-1}$. It also holds for the EV1 with $k = 1$, $d = 0$ and any $\beta$ (in fact with $y^{-\beta}$ replaced by $e^{-y}$).
Set $n_1 = \log n$, $n_2 = \log\log n$, $k_\delta = \log k$, and $M_n^* = b_n M_n - a_n$ with $b_n = 1$ and $a_n = n_1 + d n_2 + k_\delta$. Then (2.7) holds with $G = G_\theta$ of (2.8), and $y_n = v + a_n$. Also $G_n(v) = G_\theta(v) + O(e_n)$, where $e_n = n_2 n_1^{-1} + n_1^{-\beta}$, $\bar p_i(y) = k y^{d+i}\exp(-y)\{1 + O(\epsilon_y)\}$, where $\epsilon_y = y^{-\beta} + y^{-1}$, and $\bar q_r(y) = k y^{d+r}\exp(-y)\{1 + O(\epsilon_y)\}$. So, $\mu(F_y) - \mu = t_1(y, \mu) = -k y^{d+1}\exp(-y)\{1 + O(\epsilon_y)\}$, $t_1(y, \mu_2) = -k y^{d+2}\exp(-y)\{1 + O(\epsilon_y)\}$ and $t_2(y, \mu_2) = O(y^{2d+2} e^{-2y})$. □
Example 5.1. Suppose $T(F) = \mu$. So, (2.21), (2.11) and (2.12) hold. Also (2.15) holds with $\lambda_{n1} = n^{-1} n_1$, $\lambda_{n2} = e_n$ and $a(v) = -e^{-v}$. So, $e_{n1} = O(n_1^2/n)$ and $e_{n2} = O(n_1^2/n)$. So, (2.17)–(2.19) hold with $e_{n5} = O(e_n)$, $e_{n6} = O(n^{-1/2} n_1 e_n)$ and $K(\mu) = e^{-1}$. □
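For instance (a check under the stated assumptions, not in the original), for the unit exponential we have $k = 1$, $d = 0$ and (5.1) holds exactly, so $a_n = \log n$ and $t_1(y_n, \mu) = -(v + \log n)e^{-v}/n$; thus $\lambda_{n1} = n^{-1}\log n$ and $a(v) = -e^{-v}$, and since $\sigma^2 = \mu_2 = 1$, (2.19) gives
$$\alpha(\bar X, M_n) \approx (2\pi)^{-1/2} e^{-1}\, n^{-1/2}\log n \approx 0.15\, n^{-1/2}\log n.$$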
Example 5.2. Suppose $T(F) = \mu_2$. Then $t_1(y, a_{21}) = -k y^{d+4}\exp(-y)\{1 + O(\epsilon_y)\}$ and $t_i(y, a_{21}) = O(y^{id+4}\exp(-iy))$. So, (2.11) holds by Note 2.4. Also $e_{n1} = O(n_1^4/n)$, $e_{n2} = O(n_1^4/n)$ and (2.15) holds with $\lambda_{n1} = n_1^2/n$, $\lambda_{n2} = e_n$ and $a(v) = -e^{-v}$. So, (2.17)–(2.19) hold with $e_{n4} = O(n^{-1/2} n_1^2 e_n)$, $e_{n5} = O(e_n)$, $e_{n6} = O(n^{-1/2} n_1^2 e_n)$ and $K(\mu_2) = e^{-1}$. □
We now turn to a class of distributions that includes the normal. Replace (5.1) by the assumption
$$1 - F(y) = k y^d \exp(-y^2)\{1 + O(y^{-\beta})\} \qquad (5.2)$$
as $y \to \infty$, or rather the slightly stronger condition $d/dy\{1 - F(y) - k y^d \exp(-y^2)\} = O(y^{d+1-\beta}\exp(-y^2))$ as $y \to \infty$.
Note 5.2. This holds for $F(y) = \Phi(2^{1/2} y)$, the distribution of $X \sim N(0, 1/2)$, with $d = -1$, $\beta = 2$ and $k = \pi^{-1/2}/2$. □
Set $M_n^* = b_n M_n - a_n$ with $a_n = 2 n_1 + d n_2/2 + k_8$ and $b_n = 2 n_1^{1/2}$. By (5.2),
$$G_n(v) = G_\theta(v) + O(e_n) \qquad (5.3)$$
for $e_n = n_2^2 n_1^{-1} + n_1^{-\beta/2}$. Also $y_n = (v + a_n)/b_n$, $\bar p_i(y) = 2k y^{i+d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$ for $\epsilon = \min(2, \beta)$, $\bar q_r(y) = 2k y^{r+d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$, $\mu(F_y) - \mu = t_1(y, \mu) = -2k y^{1+d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$, $t_2(y, \mu_2) = O(y^{2+2d}\exp(-2y^2))$ and $t_1(y, \mu_2) = -2k y^{2+d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$.
Example 5.3. Suppose $T(F) = \mu$. So, (2.21), (2.11) and (2.12) hold. Also (2.15) holds with $\lambda_{n1} = n_1^{1/2}/n$, $\lambda_{n2} = e_n$ and $a(v) = -2\exp(-v)$. So, $e_{n1} = O(n_1/n)$, $e_{n2} = O(n_1/n)$ and (2.17)–(2.19) hold with $e_{n4} = O(n^{-1/2} n_1^{1/2} e_n)$, $e_{n5} = e_n$, $e_{n6} = O(n^{-1/2} n_1^{1/2} e_n)$ and $K(\mu) = 2 e^{-1}$. □
Example 5.4. Suppose $T(F) = \mu_2$. Then $t_1(y, a_{21}) = -2k y^{4+d}\exp(-y^2)\{1 + O(y^{-\epsilon})\}$ and $t_i(y, a_{21}) = O(y^{4+id}\exp(-iy^2))$. So, (2.11) holds by Note 2.4. Also (2.15) holds with $\lambda_{n1} = n_1/n$, $\lambda_{n2} = e_n$ and $a(v) = -2\exp(-v)$. Also $e_{n1} = O(n_1^2/n)$, $e_{n2} = O(n_1^2/n)$ and (2.17)–(2.19) hold with $e_{n4} = O(n^{-1/2} n_1 e_n)$, $e_{n5} = e_n$, $e_{n6} = O(n^{-1/2} n_1 e_n)$ and $K(\mu_2) = 2 e^{-1}$. □
So, putting $i = 1$ for $\mu$ and $i = 2$ for $\mu_2$, $\alpha_n = n^{-1/2} n_1^{i/2}\{2 D(T_i) + O(e_n)\}$ for $D(T_i)$ of (5.2). For $X \sim N(0, 1/2)$, $e_n = O(n_2^2/n_1)$, $\sigma_1 = 2^{-1/2}$ and $\sigma_2 = 1$, so $2 D(T_i) = \pi^{-1/2} e^{-1} 2^{1/2}\sigma_i^{-1}$. For $X \sim N(\mu, \mu_2)$ apply Note 2.4.
6 Extensions
Here we mention a number of ways in which these results may be extended. First consider the case of two samples of sizes $n_1$ and $n_2$ with empirical distributions $F_{1n_1}$ and $F_{2n_2}$ from distributions $F_1$, $F_2$ on $R^{p_1}$, $R^{p_2}$. Set $\hat\theta = T(F_{1n_1}, F_{2n_2})$, where $\theta = T(F_1, F_2)$ is a smooth function in $R^q$. Let $M_{1n_1}$ and $M_{2n_2}$ be the sample maxima. Then $\hat\theta \mid (M_{1n_1} \le y_1, M_{2n_2} \le y_2) \stackrel{\mathcal L}{=} \hat\theta \mid$ (the samples are from $F_{1y_1}$, $F_{2y_2}$), where $F_{iy_i}$ is $F_y$ for $F_i$. The Edgeworth expansion for $\hat\theta$ is given for the case $q = 1$ by Withers (1988), and for general $q$ may be obtained from the cumulant coefficients given there. In this way our previous results can be extended to two or more samples.
Secondly, if $q = 1$, by the extended Cornish–Fisher expansions of Withers (1984), we can write the Edgeworth expansion (2.2) in the form
$$P_n(x) = P(Y_n(F) \le x) = \Phi\Big(x - \sum_{r=1}^{\infty} n^{-r/2} f_r(x, F)\Big)$$
with quantile
$$P_n^{-1}(x) = z + \sum_{r=1}^{\infty} n^{-r/2} g_r(z, F),$$
where $z = \Phi^{-1}(x)$ is the unit normal quantile. So, we can do likewise with $P_n(x \mid y) = P(Y_n(F_y) \le x \mid M_n \le y)$.
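For example, for $T(F) = \mu$ the leading correction in (2.2) is $n^{-1/2} h_1(x, F)$ with $h_1(x, F) = \mu_3\mu_2^{-3/2}(x^2 - 1)/6$, so inverting to first order gives the familiar Cornish–Fisher approximation to the $\alpha$-quantile of $Y_n(F)$,
$$z_\alpha + n^{-1/2}\,\mu_3\mu_2^{-3/2}(z_\alpha^2 - 1)/6 + O(n^{-1});$$
the conditional version simply replaces the moments of $F$ by those of $F_y$.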
Thirdly, if $q = 1$, we can if desired apply the nonparametric confidence interval of Withers (1983) to obtain a random interval $I_{nk}$ such that $P(\hat\theta \in I_{nk} \mid M_n \le y) = .95 + O(n^{-k/2})$ for any fixed $k$. This may be useful for obtaining more accurate $p$-values for hypotheses on $\theta = T(F)$ conditional on $M_n \le y$, or equivalently on $m_n \ge y$.
Fourthly, writing the results of our examples in the form $P(Y_n(F) \le x^* \mid M_n^* \le v) = \Phi(x^*) + \phi(x^*)\epsilon_{n1} a(x^*, v) + O(\epsilon_{n1}\epsilon_{n2})$, where $\epsilon_{n1}$, $\epsilon_{n2}$ are known, if as is usual we have $\hat a(x^*, v) = a(x^*, v) + O_p(n^{-1/2})$, then typically $P(Y_n(F) \le x_n^* \mid M_n^* \le v) = \Phi(x^*) + O(\epsilon_{n1}(n^{-1/2} + \epsilon_{n2}))$, where $x_n^* = x^* - \epsilon_{n1}\hat a(x^*, v)$. This method was used in Withers (1982).
Finally, we illustrate a different approach: conditioning not on $M_n \le y$ but on $M_n = y$:
$$F_n \mid (M_n = y,\ \sim F) \;\stackrel{\mathcal L}{=}\; (1 - n^{-1}) F_{n-1} \mid (\sim F_y) + n^{-1} I(y \le x).$$
So, putting $\delta_y(x) = I(y \le x)$,
$$T(F_n) \mid (M_n = y,\ \sim F) \;\stackrel{\mathcal L}{=}\; T\big((1 - n^{-1}) F_{n-1} + n^{-1}\delta_y\big) \mid (\sim F_y) = T(F_{n-1}) + \sum_{r=1}^{\infty} n^{-r} T_{F_{n-1}}(y^r)/r!,$$
where $T_F(y^r) = T_F(x_1, \dots, x_r)$ with $x_1 = \dots = x_r = y$. One can now obtain the Edgeworth expansion for $P(T(F_n) \le x \mid M_n = y)$ by applying Corollary 5.2 of Withers (1983) with $n$ replaced by $n - 1$. This requires computing some of the cross-cumulant coefficients of $\{T_{F_{n-1}}(y^r),\ r \ge 0\}$ using Lemma 5.1 of Withers (1983). So,
$$P(Y_n(F_y) \le x \mid M_n = y) = \Phi(x) - \phi(x)\sum_{r=1}^{\infty}(n-1)^{-r/2} h_r'(x, F_y) = \Phi(x) - \phi(x)\, n^{-1/2} h_1'(x, F_y) + O(n^{-1}),$$
where, for example, $h_1'(x, F_y) = h_1(x, F_y) + T_F(y)\, a_{21}^{-1/2}$. So, $P(Y_n(F) \le x^* \mid M_n^* = v) = $ RHS (2.14) $=$ RHS (2.18) with $e_{n0}$ replaced by $e_{n0}' = h_1'(x^*, F_{y_n}) - h_1'(x^*, F) = e_{n0} + T_F(y_n)\, a_{21}^{-1/2} \to 0$. So, for fixed $x^*$, (2.11), (2.12) and (2.15) imply
$$P(n^{1/2}(\hat\theta - \theta)/\sigma \le x^* \mid M_n^* = v) = G(v)\Big[\Phi(x^*) - \phi(x^*)\big\{n^{1/2}\lambda_{n1} a(v)\sigma^{-1} + n^{-1/2} h^*\big\}\Big] + O(e_{n5}) + O(n^{-1/2})$$
and
$$P(n^{1/2}(\hat\theta - \theta)/\sigma \le x^* \mid M_n^* = v) - P(n^{1/2}(\hat\theta - \theta)/\sigma \le x^*) = -\phi(x^*)\, n^{1/2}\lambda_{n1} G(v) a(v)\sigma^{-1} + O(e_{n6}) + o(n^{-1/2})$$
for $e_{n5}$ and $e_{n6}$ of Section 2.
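For instance, for $T(F) = \mu$ the extra term in $h_1'$ is just the standardized influence of the observed maximum, $T_F(y)\, a_{21}^{-1/2} = (y - \mu)\mu_2^{-1/2}$; so, to first order, conditioning on $M_n = y$ rather than on $M_n \le y$ changes the expansion by the additional term $-\phi(x)(y - \mu)\mu_2^{-1/2} n^{-1/2}$.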
Acknowledgments. The authors would like to thank the Editor and the
referee for their helpful suggestions.
References
Anderson, C.W. and Turkman, K.R. (1991). The joint limiting distribution of sums and maxima of stationary sequences. J. Appl. Probab., 28, 33–44.
Bhattacharya, R.N. and Rao, R.R. (1976). Normal Approximation and Asymptotic Expansions. Wiley, New York.
Chow, T.L. and Teugels, J.L. (1978). The sum and the maximum of i.i.d. random variables. In Proceedings of the Second Prague Symposium on Asymptotic Statistics, 81–92.
Fisher, R.A. and Tippett, L.H.C. (1928). Limiting forms of the frequency distributions of the largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society, 24, 180–190.
Hsing, T. (1995a). A note on the asymptotic independence of the sum and maximum of strongly mixing stationary random variables. Ann. Probab., 23, 938–947.
Hsing, T. (1995b). On the asymptotic independence of the sum and rare values of weakly dependent stationary random variables. Stochastic Processes and Their Applications, 60, 49–63.
Ho, H.-C. and Hsing, T. (1996). On the asymptotic joint distribution of the sum and maximum of a Gaussian sequence. J. Appl. Probab., 33, 138–145.
McCormick, W.P. and Sim, J. (1993). Sums and maxima of discrete stationary processes. J. Appl. Probab., 30, 863–876.
Withers, C.S. (1982). Second-order inference for asymptotically normal random variables. Sankhyā B, 44, 19–27.
Withers, C.S. (1983). Expansions for the distribution and quantiles of a regular functional of the empirical distribution with applications to nonparametric confidence intervals. Ann. Statist., 11, 577–587.
Withers, C.S. (1984). Asymptotic expansions for distributions and quantiles with power series cumulants. J. Roy. Statist. Soc. B, 46, 389–396. Corrigendum: B, 48, 258.
Withers, C.S. (1988). Nonparametric confidence intervals for functions of several distributions. Annals of the Institute of Statistical Mathematics, 40, 727–746.
Withers, C.S. (2007a). Transformations of multivariate Edgeworth-type expansions. Technical Report, Applied Mathematics Group, Industrial Research Ltd., Lower Hutt, New Zealand.
Withers, C.S. (2007b). Estimates of low bias for functionals of distribution functions. Technical Report, Applied Mathematics Group, Industrial Research Ltd., Lower Hutt, New Zealand.
Christopher S. Withers
Applied Mathematics Group
Industrial Research Limited
Lower Hutt, NEW ZEALAND.
E-mail: [email protected]
Saralees Nadarajah
School of Mathematics
University of Manchester
Manchester M13 9PL, UK
E-mail: [email protected]
Paper received January 2008; revised May 2008.