BOOTSTRAP INFERENCE FOR SAMPLE SURVEYS

J.N.K. Rao, Carleton University, and C.F.J. Wu, University of Wisconsin-Madison
The bootstrap method of inference is extended to stratified sampling, involving a large number of strata with relatively few (primary) units sampled within strata, when the parameter of interest, θ, is a nonlinear function of the population mean vector. The bootstrap estimate of bias of the estimate, θ̂, of θ and the estimate of variance of θ̂ are obtained. Bootstrap confidence intervals for θ are also given, utilizing the percentile method or the bootstrap histogram of the t-statistic. Extensions to unequal probability sampling without replacement and two-stage sampling are also obtained.

1. Introduction

Resampling methods, including the jackknife and the bootstrap, provide standard error estimates and confidence intervals for the parameters of interest. These methods are simple and straightforward but are computer-intensive, especially the bootstrap. Efron (1982) has given an excellent account of resampling methods in the case of an independent and identically distributed (i.i.d.) sample of fixed size n from an unknown distribution F, and the parameter of interest θ = θ(F). Limited empirical evidence (see Efron, 1982, p. 18) has indicated that the bootstrap standard error estimates are likely to be more stable than those based on the jackknife and also less biased than those based on the customary delta (linearization) method. Moreover, the bootstrap histogram of the t-statistic approximates the true distribution of t with a remainder term of O_p(n^{-1}) in the Edgeworth expansion (Abramovitch and Singh, 1984), unlike the usual normal approximation with a remainder term of O_p(n^{-1/2}). Empirical results for the ratio estimate (Hinkley and Wei, 1984) also indicate that the confidence intervals from the bootstrap histogram of t (using the linearization variance estimator) perform better than those based on the normal approximation.

The main purpose of this article is to propose an extension of the bootstrap method to stratified samples in the context of sample survey data, especially data obtained from stratified cluster samples involving a large number of strata, L, with relatively few primary sampling units (psu's) sampled within each stratum. For nonlinear statistics θ̂ that can be expressed as functions of estimated means of p (≥ 1) variables, Krewski and Rao (1981) established the asymptotic consistency of the variance estimators from the jackknife, the delta and the balanced repeated replication (BRR) methods as L → ∞, within the context of a sequence of finite populations {Π_L} with L strata in Π_L. Their result is valid for any multistage design in which the psu's are selected with replacement and in which independent subsamples are selected within those psu's sampled more than once. Rao and Wu (1983) obtained second order asymptotic expansions of these variance estimators under the above setup and made comparisons in terms of their biases.

The proposed bootstrap method for stratified samples is described in Section 2 and the properties of the resulting variance estimator are studied. The bootstrap estimate of bias of θ̂ is also obtained. Section 3 provides bootstrap confidence intervals for θ. The results are extended to stratified simple random sampling without replacement in Section 4. Finally, the method is extended to unequal probability sampling without replacement in Section 5 and to two-stage cluster sampling without replacement in Section 6.
2. The Bootstrap Method

The parameter of interest θ is a nonlinear function of the population mean vector Ȳ = (Ȳ_1,...,Ȳ_p)^T, say θ = g(Ȳ). This form of θ includes ratios, regression and correlation coefficients. If n_h (≥ 2) psu's are selected with replacement with probabilities p_hi in stratum h, then Krewski and Rao (1981) have shown that the natural estimator θ̂ of θ can be expressed as θ̂ = g(ȳ), where ȳ is a design unbiased linear estimator of Ȳ = Σ W_h Ȳ_h and ȳ = Σ W_h ȳ_h. Here W_h and Ȳ_h = (Ȳ_h1,...,Ȳ_hp)^T are the h-th stratum weight (Σ W_h = 1) and population mean vector respectively, and ȳ_h is the mean of the n_h i.i.d. random vectors y_hi = (y_hi1,...,y_hip)^T with E(y_hi) = Ȳ_h. For h ≠ h', y_hi and y_h'j are independent but not necessarily identically distributed.

2.1. The naive bootstrap. In the case of an i.i.d. sample {y_i}_1^n with E(y_i) = Ȳ, the bootstrap method is as follows: (i) Draw a simple random sample {y_i*}_1^n with replacement from the observed values y_1, y_2,...,y_n and calculate θ̂* = g(ȳ*), where ȳ* = Σ y_i*/n. (ii) Independently replicate step (i) a large number, B, of times and calculate the corresponding estimates θ̂*1,...,θ̂*B. (iii) The bootstrap variance estimator of θ̂ = g(ȳ) is given by

    v_b(a) = B^{-1} Σ_{b=1}^B (θ̂*b − θ̂*_a)^2,    (2.1)

where θ̂*_a = Σ_b θ̂*b/B. The Monte Carlo estimator v_b(a) is an approximation to

    v_b = var_*(θ̂*) = E_*(θ̂* − E_*θ̂*)^2,    (2.2)

where E_* denotes the expectation with respect to bootstrap sampling from a given sample y_1,...,y_n. No closed-form expression for var_*(θ̂*) generally exists in the nonlinear case, but in the linear case with p = 1, θ̂* = ȳ* and v_b reduces to

    var_*(ȳ*) = (n − 1)s^2/n^2 = [(n − 1)/n] var(ȳ),    (2.3)

where (n − 1)s^2 = Σ_i (y_i − ȳ)^2 and var(ȳ) = s^2/n is the unbiased estimator of variance of ȳ. The modified variance estimator [n/(n − 1)] var_*(θ̂*) exactly equals var(ȳ) in the linear case, but Efron (1982) found no advantage in this modification. In any case, n/(n − 1) ≈ 1 in most applications and var_*(θ̂*) is a consistent estimator of the variance of θ̂ as n → ∞ (Bickel and Freedman, 1981). The bootstrap histogram of θ̂*1,...,θ̂*B may be used to find confidence intervals for θ. This method (Efron, 1982) is called the percentile method.
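For illustration, a minimal Python sketch of steps (i)-(iii) is given below for a ratio estimator; the simulated data, the choice g(ȳ) = ȳ_1/ȳ_2 and B = 1000 are assumptions made only for the sketch.

    import numpy as np

    def naive_bootstrap_variance(y, g, B=1000, rng=None):
        """Monte Carlo approximation (2.1) to the bootstrap variance (2.2)
        of theta_hat = g(y_bar) for an i.i.d. sample y (an n x p array)."""
        rng = np.random.default_rng(rng)
        n = y.shape[0]
        theta_star = np.empty(B)
        for b in range(B):
            idx = rng.integers(0, n, size=n)          # SRS with replacement, step (i)
            theta_star[b] = g(y[idx].mean(axis=0))
        theta_a = theta_star.mean()                   # theta*_a = sum_b theta*b / B
        return np.mean((theta_star - theta_a) ** 2)   # v_b(a), equation (2.1)

    # Illustrative use: ratio estimator g(y_bar) = y_bar_1 / y_bar_2
    rng = np.random.default_rng(1)
    y = rng.normal(loc=[10.0, 5.0], scale=[2.0, 1.0], size=(50, 2))
    print(naive_bootstrap_variance(y, lambda m: m[0] / m[1], B=1000, rng=2))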
Noting the i.i.d. property of the y_hi's within each stratum, a straightforward extension of the usual bootstrap method to stratified samples (referred to as the "naive bootstrap") is as follows: (i) Take a simple random sample {y_hi*}_{i=1}^{n_h} with replacement from the given sample {y_hi}_{i=1}^{n_h} in stratum h, independently for each stratum. Calculate ȳ_h* = n_h^{-1} Σ_i y_hi*, ȳ* = Σ W_h ȳ_h* and θ̂* = g(ȳ*). (ii) Independently replicate step (i) a large number, B, of times and calculate the corresponding estimates θ̂*1,...,θ̂*B. (iii) The bootstrap variance estimator of θ̂ = g(ȳ) is given by

    v_b(a) = B^{-1} Σ_{b=1}^B (θ̂*b − θ̂*_a)^2,    (2.4)

where θ̂*_a = Σ_b θ̂*b/B. The Monte Carlo estimator v_b(a) is an approximation to

    v_b = var_*(θ̂*) = E_*(θ̂* − E_*θ̂*)^2,    (2.5)

where E_* denotes the expectation with respect to bootstrap sampling. In the linear case with p = 1, θ̂* = Σ W_h ȳ_h* = ȳ* and v_b reduces to

    var_*(ȳ*) = Σ_h (W_h^2/n_h)[(n_h − 1)/n_h] s_h^2,    (2.6)

where (n_h − 1) s_h^2 = Σ_i (y_hi − ȳ_h)^2. Comparing (2.6) to the unbiased estimator of variance of ȳ, var(ȳ) = Σ_h W_h^2 s_h^2/n_h, it immediately follows that var_*(ȳ*)/var(ȳ) does not converge to 1 in probability as n = Σ n_h → ∞, unless L is fixed and n_h → ∞ for each h. Hence, var_*(ȳ*) is not a consistent estimator of the variance of ȳ. It also follows that v_b is not a consistent estimator of the variance (or mean square error) of a general nonlinear statistic. There does not seem to be an obvious way to correct this scaling problem, except when n_h = k for all h, in which case k(k − 1)^{-1} var_*(θ̂*) will be consistent.
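A small numerical sketch makes the scaling problem concrete. Under the assumption of many strata with n_h = 2 units each and equal stratum weights (choices made only for this illustration), the naive bootstrap variance of ȳ* settles near (n_h − 1)/n_h = 1/2 of the unbiased estimator var(ȳ), exactly as (2.6) predicts:

    import numpy as np

    # Illustrative check of (2.6): with n_h = 2 in every stratum, the naive
    # stratified bootstrap variance of y_bar* is about half of var(y_bar).
    rng = np.random.default_rng(0)
    L, n_h = 500, 2                      # many strata, few units per stratum
    W = np.full(L, 1.0 / L)              # equal stratum weights (assumption)
    y = rng.normal(size=(L, n_h))        # y_hi, i = 1,...,n_h

    s2 = y.var(axis=1, ddof=1)           # s_h^2
    var_ybar = np.sum(W**2 * s2 / n_h)   # unbiased var(y_bar)

    B = 2000
    ybar_star = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n_h, size=(L, n_h))   # resample within each stratum
        ybar_star[b] = np.sum(W * np.take_along_axis(y, idx, axis=1).mean(axis=1))
    print(ybar_star.var() / var_ybar)    # close to (n_h - 1)/n_h = 0.5, not 1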
Bickel and Freedman (1984) also noticed the scaling problem, but they were mainly interested in bootstrap confidence intervals in the linear case (p = 1). They have established the asymptotic N(0,1) property of the distribution of t = (ȳ − Ȳ)/[var(ȳ)]^{1/2} and of the conditional distribution of (ȳ* − ȳ)/[var_*(ȳ*)]^{1/2} in stratified simple random sampling with replacement, and also proved that (Σ_h W_h^2 s_h^2/n_h)/var_*(ȳ*) converges to 1 in probability as n = Σ n_h → ∞. Their result implies that one could use the bootstrap histogram of t*1,...,t*B to find confidence intervals for Ȳ, where

    t*b = (ȳ*b − ȳ)/[Σ_h W_h^2 s_h*b^2/n_h]^{1/2}

is the value of t* for the b-th bootstrap sample (b = 1,...,B), ȳ*b = Σ W_h ȳ_h*b and (n_h − 1) s_h*b^2 = Σ_i (y_hi*b − ȳ_h*b)^2. In the nonlinear case, there does not seem to be a simple way to construct t*b-values similar to those of Bickel and Freedman, since v_b has no closed form. Moreover, the straightforward extension of the bootstrap (hereafter called the naive bootstrap) does not permit the use of the percentile method based on the bootstrap histogram of θ̂*1,...,θ̂*B. Although t*b is asymptotically N(0,1) in the linear case, it is not likely to provide as good an approximation to the distribution of t as a statistic whose denominator and numerator are both adequate approximations to their counterparts in t. Such statistics will be proposed in Section 3.2; they are also applicable to the nonlinear case.

Recognizing the scaling problem in a different context, Efron (1982) suggested drawing a bootstrap sample of size n_h − 1, instead of n_h, from stratum h (h = 1,...,L). In Section 2.2 we will instead propose a different method which includes his suggestion as a special case.
2.2. The proposed method. Our method is as follows: (i) Draw a simple random sample {y_hi*}_{i=1}^{m_h} of size m_h with replacement from {y_hi}_{i=1}^{n_h}. Calculate

    ỹ_hi = ȳ_h + m_h^{1/2} (n_h − 1)^{-1/2} (y_hi* − ȳ_h),    (2.7)

ỹ_h = m_h^{-1} Σ_i ỹ_hi, ỹ = Σ W_h ỹ_h and θ̃ = g(ỹ). (ii) Independently replicate step (i) a large number, B, of times and calculate the corresponding estimates θ̃1,...,θ̃B. (iii) The bootstrap estimator E_*(θ̃) of θ can be approximated by its Monte Carlo estimator θ̃_a = Σ_b θ̃b/B. The bootstrap variance estimator of θ̂ is given by

    σ̃_b^2 = v_b = var_*(θ̃) = E_*(θ̃ − E_*θ̃)^2,    (2.8)

with its Monte Carlo approximation

    σ̃_b^2(a) = v_b(a) = B^{-1} Σ_{b=1}^B (θ̃b − θ̃_a)^2,    (2.9)

where E_* denotes the expectation with respect to bootstrap sampling. One can replace E_*θ̃ in (2.8) by θ̂.
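A Python sketch of steps (i)-(iii) with the rescaling (2.7) is given below; the data layout (a list of per-stratum arrays), the default choice m_h = n_h − 1 and B = 1000 are assumptions made for the sketch, not prescriptions of the method.

    import numpy as np

    def rescaling_bootstrap(strata, W, g, m=None, B=1000, rng=None):
        """Proposed stratified bootstrap: resample m_h values per stratum with
        replacement, rescale as in (2.7), and return the Monte Carlo variance
        estimator (2.9) and the bias estimator (2.11)."""
        rng = np.random.default_rng(rng)
        ybar_h = np.array([y_h.mean(axis=0) for y_h in strata])
        theta_hat = g(W @ ybar_h)
        theta_tilde = np.empty(B)
        for b in range(B):
            yt_h = []
            for h, y_h in enumerate(strata):
                n_h = len(y_h)                                 # assumes n_h >= 2
                m_h = (n_h - 1) if m is None else m[h]
                star = y_h[rng.integers(0, n_h, size=m_h)]     # step (i)
                scale = np.sqrt(m_h / (n_h - 1.0))             # factor in (2.7)
                # stratum mean of the rescaled values in (2.7)
                yt_h.append(ybar_h[h] + scale * (star.mean(axis=0) - ybar_h[h]))
            theta_tilde[b] = g(W @ np.array(yt_h))
        v_b = np.mean((theta_tilde - theta_tilde.mean()) ** 2)  # (2.9)
        bias = theta_tilde.mean() - theta_hat                   # (2.11)
        return v_b, bias

In the linear case g(ȳ) = ȳ with p = 1, the returned v_b settles near Σ W_h^2 s_h^2/n_h, in line with the justification given next in Section 2.3.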
2.3. Justification of the method. In the linear case, θ̃ = ỹ and v_b reduces to the customary unbiased variance estimator var(ȳ) = Σ W_h^2 s_h^2/n_h for any choice of m_h. In the nonlinear case, it can be shown that

    v_b = v_L + O_p(n^{-2})    (2.10)

for any m_h, where v_L is the linearization variance estimator (Rao and Wu, 1983). In the linear case (p = 1), v_L reduces to var(ȳ). Under reasonable regularity conditions, v_L is a consistent estimator of the variance of θ̂, Var(θ̂). Hence, it follows from (2.10) that v_b is also consistent for Var(θ̂).

The asymptotic N(0,1) property of the conditional distribution of (θ̃ − θ̂)/σ̃_b can be established, assuming that 0 < δ_1 < m_h/(n_h − 1) < δ_2 < ∞ for all h, i.e. the bootstrap sample size m_h should be comparable to the original size n_h in each stratum.

2.4. Estimate of bias. Our bootstrap estimate of bias is

    B̂(θ̂) = E_*(θ̃) − θ̂,    (2.11)

which is approximated by θ̃_a − θ̂. It can be shown that B̂(θ̂) is a consistent estimator of the bias B(θ̂) (Rao and Wu, 1983). On the other hand, the bias estimate B̂(θ̂) = E_*(θ̂*) − θ̂, based on the naive bootstrap, is not a consistent estimator of B(θ̂) (Rao and Wu, 1983).
3. Confidence Intervals

We now consider different bootstrap methods of setting confidence intervals for θ.

3.1. Percentile method. For ready reference, we now give a brief account of the percentile method based on the bootstrap histogram of θ̃1,...,θ̃B. Define the cumulative bootstrap distribution function as

    CDF(t) = #{θ̃b < t; b = 1,...,B}/B.    (3.1)

For α < 0.5, define θ̂_UP(α) = CDF^{-1}(1 − α) and θ̂_LOW(α) = CDF^{-1}(α). Then the interval

    {θ̂_LOW(α), θ̂_UP(α)}    (3.2)

is an approximate (1 − 2α)-level confidence interval for θ. It has the central 1 − 2α portion of the bootstrap distribution (Efron, 1982, p. 78). One can also consider a bias-corrected percentile method, following Efron (1982, p. 82). This method leads to

    {CDF^{-1}(Φ(2z_0 − z_α)), CDF^{-1}(Φ(2z_0 + z_α))}    (3.3)

as an approximate (1 − 2α)-level confidence interval for θ, where Φ is the cumulative distribution function of a standard normal, z_0 = Φ^{-1}(CDF(θ̂)) and z_α = Φ^{-1}(1 − α). The advantage of the interval (3.3) over (3.2) has been demonstrated by Efron (1982) in the i.i.d. case.
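A minimal Python sketch of (3.1)-(3.3) is given below; it assumes the replicates theta_tilde come from the proposed method (for example, a routine like the sketch after Section 2.2, modified to return the replicates) and uses SciPy's normal distribution for Φ.

    import numpy as np
    from scipy.stats import norm

    def percentile_intervals(theta_tilde, theta_hat, alpha=0.05):
        """Percentile interval (3.2) and bias-corrected interval (3.3) from the
        bootstrap histogram theta_tilde of the proposed method."""
        theta_tilde = np.sort(theta_tilde)
        lo, up = np.quantile(theta_tilde, [alpha, 1.0 - alpha])        # (3.2)

        # Bias correction (3.3): z0 = Phi^{-1}(CDF(theta_hat)), za = Phi^{-1}(1-alpha)
        z0 = norm.ppf(np.mean(theta_tilde < theta_hat))
        za = norm.ppf(1.0 - alpha)
        bc_lo = np.quantile(theta_tilde, norm.cdf(2.0 * z0 - za))
        bc_up = np.quantile(theta_tilde, norm.cdf(2.0 * z0 + za))
        return (lo, up), (bc_lo, bc_up)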
3.2. Bootstrap t-statistics. Instead of approximating the distribution of θ̂ by the bootstrap distribution of θ̃, we can approximate the distribution of the t-statistic t = (θ̂ − θ)/σ̂_b by its bootstrap counterpart

    t* = (θ̃ − θ̂)/σ̃_b*,    (3.4)

where σ̃_b*^2(a) = v_b*(a) is the Monte Carlo bootstrap variance estimator obtained from (2.9) by bootstrapping the particular bootstrap sample {ỹ_hi}, i.e. by replacing y_hi by ỹ_hi in the proposed method. For the second phase bootstrapping one could choose values (m_h', B') different from (m_h, B). This double-bootstrap method thus leads to B values t*1,...,t*B of t*. Utilizing the bootstrap histogram of t*1,...,t*B, we define

    CDF_t(x) = #{t*b < x}/B,  t̂_LOW = CDF_t^{-1}(α),  t̂_UP = CDF_t^{-1}(1 − α),    (3.5)

and construct the approximate (1 − 2α)-level confidence interval

    {θ̂ − t̂_UP σ̂_b, θ̂ − t̂_LOW σ̂_b}.    (3.6)

We now provide an asymptotic justification for t*. The Monte Carlo quantity σ̃_b*^2(a) is an approximation to σ̃_b*^2 = v_b* = var_**(θ̃*), where θ̃* is the value of θ̃ obtained from bootstrapping the particular sample {ỹ_hi} and E_** denotes the second phase bootstrap expectation. In the linear case (θ̂ = ȳ), it is easily seen that E_*(σ̃_b*^2) = σ̃_b^2. In the nonlinear case, following Bickel and Freedman (1984), we can show that σ̃_b*^2/σ̃_b^2 converges to 1 in probability as n → ∞. Hence, it follows that the conditional distribution of t* is asymptotically N(0,1).

One could use a jackknife t-statistic t_J = (θ̂ − θ)/σ̂_J instead of t, where σ̂_J^2 is a jackknife variance estimator of θ̂ (see Krewski and Rao, 1981). The corresponding confidence interval is then given by

    {θ̂ − t̂_J,UP σ̂_J, θ̂ − t̂_J,LOW σ̂_J},    (3.7)

where t̂_J,LOW and t̂_J,UP are the lower and upper α-points of the statistic t_J* = (θ̃ − θ̂)/σ̃_J* obtained from the bootstrap histogram of t_J*1,...,t_J*B, and σ̃_J*^2 is obtained from σ̂_J^2 by jackknifing the particular bootstrap sample {ỹ_hi}. It can be shown that the confidence interval (3.7) is also asymptotically correct. A confidence interval of this type was considered by Efron (1981) in the case of an i.i.d. sample {y_i}. It is possible to replace σ̂_J by the BRR or the linearization variance estimator and obtain a confidence interval similar to (3.7).
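The double-bootstrap interval (3.4)-(3.6) can be sketched in Python as follows; the choice m_h = n_h for both phases and the values B = 500 and B' = 100 are assumptions for the sketch, not recommendations.

    import numpy as np

    def one_rescaled_sample(strata, rng):
        """One application of step (i) of Section 2.2 with the rescaling (2.7),
        using m_h = n_h (an illustrative choice; assumes n_h >= 2)."""
        out = []
        for y_h in strata:
            n_h = len(y_h)
            m_h = n_h
            ybar_h = y_h.mean(axis=0)
            star = y_h[rng.integers(0, n_h, size=m_h)]
            out.append(ybar_h + np.sqrt(m_h / (n_h - 1.0)) * (star - ybar_h))
        return out

    def theta_of(strata, W, g):
        return g(W @ np.array([y.mean(axis=0) for y in strata]))

    def bootstrap_t_interval(strata, W, g, alpha=0.05, B=500, B2=100, rng=None):
        """Bootstrap-t interval (3.6) via the double bootstrap of Section 3.2."""
        rng = np.random.default_rng(rng)
        theta_hat = theta_of(strata, W, g)
        # sigma_b: first-phase Monte Carlo standard error, from (2.9)
        theta_tilde = np.array([theta_of(one_rescaled_sample(strata, rng), W, g)
                                for _ in range(B)])
        sigma_b = theta_tilde.std()
        # t*b values: re-bootstrap each first-phase pseudo-sample to get sigma_b*
        t_star = np.empty(B)
        for b in range(B):
            pseudo = one_rescaled_sample(strata, rng)
            tt = np.array([theta_of(one_rescaled_sample(pseudo, rng), W, g)
                           for _ in range(B2)])
            t_star[b] = (theta_of(pseudo, W, g) - theta_hat) / tt.std()
        t_lo, t_up = np.quantile(t_star, [alpha, 1.0 - alpha])              # (3.5)
        return theta_hat - t_up * sigma_b, theta_hat - t_lo * sigma_b       # (3.6)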
3.3. Choice of m_h. The choice m_h = n_h is a natural one. The choice m_h = n_h − 1 gives ỹ_hi = y_hi*, and our method reduces to the naive bootstrap, except that in step (i) of the latter method a simple random sample of size n_h − 1 is selected from {y_hi}_{i=1}^{n_h} in stratum h. For n_h = 2, m_h = 1, the method reduces to the well-known random half-sample replication, and the resulting variance estimators are less stable than those obtained from BRR for the same number, B, of half samples (McCarthy, 1969). It may be worth considering a bootstrap sample size m_h larger than n_h = 2, say m_h = 3 or 4.

In the linear case (p = 1), it can be shown that the bootstrap third moment, E_*(ỹ − ȳ)^3, matches the unbiased estimate of the third moment of ȳ if m_h = (n_h − 2)^2/(n_h − 1), n_h > 4. This property is not enjoyed by the jackknife or the BRR. Similarly, it can also be shown that the same choice of m_h ensures that the bootstrap histogram of a t-statistic (see Section 3.2) approximates the true distribution of t with a remainder term of O_p(n^{-1}). Details will be provided in a separate paper.
4. Stratified Simple Random Sampling Without Replacement

All the previous results apply to the case of stratified simple random sampling without replacement by making a slight change in the definition of ỹ_hi:

    ỹ_hi = ȳ_h + m_h^{1/2} (n_h − 1)^{-1/2} (1 − f_h)^{1/2} (y_hi* − ȳ_h),    (4.1)

where f_h = n_h/N_h is the sampling fraction in stratum h. It is interesting to observe that, even by choosing m_h = n_h − 1, ỹ_hi ≠ y_hi*; hence the naive bootstrap using y_hi* will still have the problem of giving a wrong scale, as discussed before. In the special case of n_h = 2 for all h, McCarthy (1969) used a finite population correction similar to (4.1) in the context of BRR. It can be shown that the choice

    m_h = [(n_h − 2)^2/(n_h − 1)][(1 − f_h)/(1 − 2f_h)^2]    (4.2)

matches the third moments, but it provides an approximation to the true distribution of the t-statistic with a remainder term of O_p(n^{-1/2}) only, as in the case of a normal approximation.

Bickel and Freedman (1984) considered a different bootstrap sampling method in order to recover the finite population correction, 1 − f_h, in the variance formula. This method essentially creates populations consisting of copies of each y_hi, i = 1,...,n_h and h = 1,...,L, and then generates {y_hi*}_{i=1}^{n_h} as a simple random sample without replacement from the created population, independently in each stratum. This "blow-up" bootstrap was first proposed by Gross (1980) and also, independently, by Chao and Lo (1983). The variance estimator resulting from this method (by working directly with the y_hi*), however, remains inconsistent for estimating the true variance of θ̂. It is possible, however, to make the variance estimator consistent by reducing the bootstrap sample size to n_h − 1, as in Section 2. The "blow-up" bootstrap approximates the true distribution of the t-statistic with a remainder term of O_p(n^{-1}) (see Abramovitch and Singh, 1984), unlike our method.
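In terms of the rescaling sketch given after Section 2.2, only the scale factor changes under (4.1); a small illustrative helper, with the sampling fraction f_h = n_h/N_h supplied per stratum, could read:

    import numpy as np

    def fpc_scale(n_h, m_h, f_h):
        """Rescaling factor of (4.1) for stratified SRS without replacement:
        sqrt(m_h) * (n_h - 1)^{-1/2} * (1 - f_h)^{1/2}, with f_h = n_h / N_h."""
        return np.sqrt(m_h * (1.0 - f_h) / (n_h - 1.0))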
5. Unequal Probability Sampling Without Replacement

5.1. The Rao, Hartley and Cochran method. Rao, Hartley and Cochran (1962) proposed a simple method of sampling with unequal probabilities and without replacement. The population of N units is partitioned at random into n groups G_1,...,G_n of sizes N_1,...,N_n respectively (Σ N_k = N), and then one unit is drawn from each of the n groups, with probabilities p_t/P_k for the k-th group. Here p_t = x_t/X, P_k = Σ_{t ∈ G_k} p_t, x_t is a measure of size of the t-th unit (t = 1,...,N) that is approximately proportional to y_t (in the scalar case of p = 1), and X = Σ x_t. Their unbiased estimator of Y is given by

    Ŷ = Σ_{k=1}^n z_k P_k,    (5.1)

where z_k = y_k/(N p_k) and (y_k, p_k) denote the values for the unit selected from the k-th group (Σ P_k = 1). An unbiased estimator of variance of Ŷ is

    var(Ŷ) = λ^2 Σ_k P_k (z_k − Ŷ)^2,    (5.2)

where

    λ^2 = (Σ N_k^2 − N)/(N^2 − Σ N_k^2).    (5.3)

In the special case of equal group sizes, N_k = N/n, λ reduces to (1 − f)^{1/2}(n − 1)^{-1/2}, where f = n/N.

Our bootstrap sampling for the above method is as follows: (1) Attach the probability P_k to the sampled unit from G_k and then select a sample {y_i*, p_i*}_{i=1}^m of size m with replacement, with probabilities P_k, from {y_k, p_k}_{k=1}^n. Calculate z_i* = y_i*/(N p_i*) and

    Ỹ = m^{-1} Σ_{i=1}^m [Ŷ + λ m^{1/2}(z_i* − Ŷ)] = Ŷ + λ m^{1/2}(z̄* − Ŷ),    (5.4)

where z̄* = Σ_i z_i*/m. (2) Implement steps (ii) and (iii) of Section 2.2, using θ̃ obtained from (5.4). In the linear case with p = 1, θ̂ = Ŷ, we get

    E_*(Ỹ) = Ŷ + λ m^{1/2}(E_* z̄* − Ŷ) = Ŷ    (5.5)

and

    var_*(Ỹ) = λ^2 m var_*(z̄*) = λ^2 var_*(z_1*) = λ^2 Σ_{k=1}^n P_k (z_k − Ŷ)^2 = var(Ŷ).    (5.6)

Hence, the bootstrap variance estimator, var_*(Ỹ), reduces to var(Ŷ) in the linear case. The properties of var_*(θ̃) in the nonlinear case and the bootstrap methods of setting confidence intervals for θ, given in Section 3, are being investigated.
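A short Python sketch of this bootstrap for the RHC estimator, in the scalar case (p = 1) and following (5.1)-(5.4), is given below; the inputs y, p (selected-unit values and size measures), P and Nk (group probabilities and sizes) are assumed to come from an RHC sample, and m = n is an illustrative choice.

    import numpy as np

    def rhc_bootstrap(y, p, P, Nk, N, g=lambda v: v, m=None, B=1000, rng=None):
        """Bootstrap for the Rao-Hartley-Cochran estimator of Section 5.1:
        resample the n sampled units with probabilities P_k and rescale as in (5.4).
        y, p  : y-value and size measure p_k of the unit selected in each group
        P, Nk : group probabilities P_k and group sizes N_k; N = population size."""
        rng = np.random.default_rng(rng)
        n = len(y)
        m = n if m is None else m                        # illustrative choice m = n
        z = y / (N * p)                                  # z_k = y_k/(N p_k)
        Y_hat = np.sum(P * z)                            # (5.1)
        lam = np.sqrt((np.sum(Nk**2) - N) / (N**2 - np.sum(Nk**2)))   # (5.3)
        theta_hat = g(Y_hat)
        theta_tilde = np.empty(B)
        for b in range(B):
            idx = rng.choice(n, size=m, replace=True, p=P)             # step (1)
            Y_tilde = Y_hat + lam * np.sqrt(m) * (z[idx].mean() - Y_hat)  # (5.4)
            theta_tilde[b] = g(Y_tilde)
        v_b = np.mean((theta_tilde - theta_tilde.mean()) ** 2)          # (2.9)
        return theta_hat, v_b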
5.2. General results. A general linear unbiased estimator of the population total Y (in the scalar case of p = 1), based on a sample s of size n with associated probability of selection p(s), is given by

    Ŷ = Σ_{i ∈ s} d_i(s) y_i,    (5.7)

where the known weights d_i(s) can depend on s and the unit i (i ∈ s). Rao (1979) proved that a nonnegative unbiased quadratic estimator of variance of Ŷ is necessarily of the form

    var(Ŷ) = − Σ Σ_{i<j ∈ s} d_ij(s) w_i w_j (z_i − z_j)^2,    (5.8)

where z_i = y_i/w_i and the w_i are known constants such that Var(Ŷ) = 0 when y_i ∝ w_i. Here the known coefficients d_ij(s) satisfy the unbiasedness condition

    E d_ij(s) = Cov[d_i(s), d_j(s)],    (5.9)

where d_i(s) = 0 if i ∉ s, and d_ij(s) = 0 if s does not contain both i and j. As an example, the well-known Horvitz-Thompson estimator Ŷ_HT satisfies (5.7) with d_i(s) = 1/π_i, i ∈ s, and

    −d_ij(s) w_i w_j = (π_i π_j − π_ij)/π_ij,  w_i = π_i,    (5.10)

where π_i = Σ_{s ∋ i} p(s) and π_ij = Σ_{s ∋ i,j} p(s) are the first and second order inclusion probabilities. Let b_ij(s) = −d_ij(s)/2 and assume that b_ij(s) > 0 for all i < j ∈ s and all s. This is true for many well-known schemes, including the Rao-Hartley-Cochran (RHC) method.

Our bootstrap sampling is as follows: (1) Consider all the n(n − 1) pairs (i,j), i ≠ j, and select m pairs (i*, j*) with replacement, with probabilities λ_ij (= λ_ji) to be specified. Calculate

    Ỹ = Ŷ + m^{-1} Σ_{(i*,j*) ∈ s*} k_{i*j*} (z_{i*} − z_{j*}),    (5.11)

where the constants k_ij (= k_ji) are to be specified and s* denotes the set of m bootstrap pairs. (2) Implement steps (ii) and (iii) of Section 2.2 using θ̃ obtained from (5.11).

In the linear case with p = 1, θ̂ = Ŷ, we get

    E_*(Ỹ) = Ŷ + E_*{k_{i*j*}(z_{i*} − z_{j*})} = Ŷ,    (5.12)

noting that k_ij λ_ij = k_ji λ_ji. Similarly,

    var_*(Ỹ) = m^{-1} E_*{k_{i*j*}(z_{i*} − z_{j*})}^2 = m^{-1} Σ Σ_{i≠j ∈ s} k_ij^2 λ_ij (z_i − z_j)^2.    (5.13)

We now choose k_ij^2 λ_ij/m such that (5.13) is identical to var(Ŷ) given by (5.8), i.e.,

    var(Ŷ) = Σ Σ_{i≠j ∈ s} b_ij(s) w_i w_j (z_i − z_j)^2.

Thus

    m^{-1} k_ij^2 λ_ij = b_ij(s) w_i w_j.    (5.14)

The choice of k_ij and λ_ij satisfying (5.14) is not unique. One simple choice is λ_ij = 1/[n(n − 1)], i.e. equal probabilities for each of the n(n − 1) pairs i ≠ j. Another choice is

    λ_ij = π_ij/d_s,    (5.15)

where d_s = Σ Σ_{i≠j ∈ s} π_ij. For the Horvitz-Thompson estimator, using (5.10) in (5.14),

    k_ij^2 λ_ij = (m/2)(π_i π_j − π_ij)/π_ij.    (5.16)

If m = n(n − 1) and λ_ij = 1/[n(n − 1)], then (5.16) reduces to

    k_ij^2 = n^2 (n − 1)^2 (π_i π_j − π_ij)/(2 π_ij),    (5.17)

and (5.11) reduces to

    Ỹ_HT = Ŷ_HT + Σ_{(i*,j*)=1}^m [(π_{i*} π_{j*} − π_{i*j*})/(2 π_{i*j*})]^{1/2} (z_{i*} − z_{j*}).    (5.18)

It follows from (5.13) that var_*(Ỹ_HT) reduces to the well-known Yates-Grundy variance estimator:

    var(Ŷ_HT) = Σ Σ_{i<j ∈ s} [(π_i π_j − π_ij)/π_ij] (z_i − z_j)^2.    (5.19)

The properties of var_*(θ̃) in the nonlinear case and the bootstrap confidence intervals for θ are being investigated.

Remark. In the case of the RHC method, we now have two different bootstrap methods: (1) the method of Section 5.1, based on selecting a bootstrap sample with probabilities P_k and with replacement from {y_k, p_k}_{k=1}^n; and (2) the present method of selecting a bootstrap sample of pairs with replacement from the n(n − 1) pairs (i,j) ∈ s, i ≠ j, with probabilities λ_ij. We are investigating their relative merits, but intuitively method (1) is more appealing.
6. Two-stage Cluster Sampling Without Replacement

Suppose that the population is comprised of N clusters with M_t elements (subunits) in the t-th cluster (t = 1,...,N). The population size M_0 (= Σ M_t) is unknown in many applications. A simple random sample of n clusters is selected without replacement, and m_i elements are chosen, again by simple random sampling without replacement, from the M_i elements in the i-th cluster if the latter is selected. The customary unbiased estimator of the population total Y is

    Ŷ = (N/n) Σ_{i=1}^n M_i ȳ_i = (N/n) Σ_{i=1}^n Ŷ_i,    (6.1)

where ȳ_i is the sample mean for the i-th sample cluster and Ŷ_i = M_i ȳ_i. The corresponding estimator of θ is written as θ̂ = g(Ȳ̂), where

    Ȳ̂ = Ŷ/M_0 = (n M̄_0)^{-1} Σ_{i=1}^n m_i^{-1} Σ_{j=1}^{m_i} M_i y_ij,    (6.2)

and M̄_0 = M_0/N. For instance, if θ = Y/M_0 where M_0 is unknown, then θ̂ = Ŷ/X̂, with X̂ = (N/n) Σ_i M_i x̄_i and x_ij = 1 for all elements j in any cluster i, so that θ̂ = g(Ȳ̂, X̄̂) = Ȳ̂/X̄̂ with X̄̂ = X̂/M_0.

An unbiased estimator of variance of Ŷ (Cochran, 1977, p. 303) is given by

    var(Ŷ) = N^2 [(1 − f_1)/(n(n − 1))] Σ_{i=1}^n (Ŷ_i − n^{-1} Σ_i Ŷ_i)^2 + (N/n) Σ_{i=1}^n M_i^2 (1 − f_2i) Σ_{j=1}^{m_i} (y_ij − ȳ_i)^2/[m_i(m_i − 1)],    (6.3)

where y_ij is the y-value for the j-th sample element in the i-th sample cluster, f_1 = n/N and f_2i = m_i/M_i.

We employ two-stage bootstrap sampling to obtain {ỹ_ij} from {y_ij} as follows: (1) Select a simple random sample of n clusters with replacement from the n sample clusters and then draw a simple random sample of m_i elements with replacement from the m_i elements in the i-th sample cluster if the latter is chosen (with independent bootstrap subsampling for the same cluster chosen more than once). We use the following notation: y_ij** = y-value of the j-th bootstrap element in the i-th bootstrap cluster; m_i* = m_i-value of the i-th bootstrap cluster (similarly M_i*); and Ŷ_i* = Ŷ_i-value of the i-th bootstrap cluster. Calculate

    ỹ_ij = Ȳ̂ + λ_1 M̄_0^{-1} (Ŷ_i* − Ŷ/N) + λ_2i M̄_0^{-1} (M_i* y_ij** − Ŷ_i*),    (6.4)

so that the bootstrap analogue of (6.2), Ȳ̃ = n^{-1} Σ_i m_i*^{-1} Σ_j ỹ_ij, becomes

    Ȳ̃ = Ȳ̂ + (λ_1/n) Σ_{i=1}^n M̄_0^{-1} (Ŷ_i* − Ŷ/N) + n^{-1} Σ_{i=1}^n λ_2i M̄_0^{-1} (Ŷ_i** − Ŷ_i*),    (6.5)

where Ŷ_i** = M_i* ȳ_i**, ȳ_i** = Σ_j y_ij**/m_i*, and

    λ_1^2 = [n/(n − 1)](1 − f_1),  λ_2i^2 = f_1 (1 − f_2i) m_i/(m_i − 1),    (6.6)

with λ_2i evaluated at the m_i-, M_i-values of the i-th bootstrap cluster. (2) Implement steps (ii) and (iii) of Section 2.2 with θ̃ = g(Ȳ̃).

Let E_2* and var_2* respectively denote the conditional bootstrap expectation and variance for a given bootstrap sample of clusters. Similarly, E_1* and var_1* respectively denote the bootstrap expectation and variance for the sample clusters. Then E_2*(Ȳ̃) = Ȳ̂ + (λ_1/n) Σ_i M̄_0^{-1}(Ŷ_i* − Ŷ/N), and hence

    E_*(Ȳ̃) = E_1* E_2*(Ȳ̃) = Ȳ̂ + λ_1 (Ȳ̂ − Ȳ̂) = Ȳ̂.    (6.7)

Similarly,

    var_1* E_2*(Ȳ̃) = λ_1^2 var_1*[(n M̄_0)^{-1} Σ_{i=1}^n Ŷ_i*].    (6.8)

Also,

    E_1* var_2*(Ȳ̃) = (n^2 M̄_0^2)^{-1} Σ_{i=1}^n f_1 (1 − f_2i) M_i^2 s_i^2/m_i,    (6.9)

where (m_i − 1) s_i^2 = Σ_j (y_ij − ȳ_i)^2. Hence, combining (6.8) and (6.9) we get

    var_*(Ȳ̃) = var_1* E_2*(Ȳ̃) + E_1* var_2*(Ȳ̃) = var(Ŷ)/M_0^2 = var(Ȳ̂).

Hence, the bootstrap variance estimator, var_*(Ȳ̃), reduces to var(Ȳ̂) in the linear case. Thus the variability between bootstrap estimates correctly accounts for the between-cluster and within-cluster components of the variance, without the necessity of estimating them separately as in the case of BRR or the jackknife.
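A Python sketch of this two-stage bootstrap in the linear case is given below. It is written for the total Ŷ of (6.1), i.e. (6.5) multiplied through by M_0 so that M_0 itself is not needed; the cluster list format, B = 1000 and the requirement m_i ≥ 2 are assumptions for the sketch.

    import numpy as np

    def two_stage_bootstrap_total(clusters, M, N, B=1000, rng=None):
        """Two-stage rescaling bootstrap of Section 6 for the total Y_hat of (6.1).
        clusters : list of arrays of y_ij for the n sample clusters (each m_i >= 2)
        M        : cluster sizes M_i of the sample clusters
        N        : number of clusters in the population."""
        rng = np.random.default_rng(rng)
        n = len(clusters)
        m = np.array([len(c) for c in clusters], dtype=float)
        ybar = np.array([c.mean() for c in clusters])
        Yhat_i = M * ybar                                    # per-cluster totals M_i ybar_i
        Yhat = (N / n) * Yhat_i.sum()                        # (6.1)
        f1 = n / N
        lam1 = np.sqrt(n * (1.0 - f1) / (n - 1.0))           # lambda_1 of (6.6)
        Y_tilde = np.empty(B)
        for b in range(B):
            picks = rng.integers(0, n, size=n)               # clusters, SRS with replacement
            total = Yhat
            for i in picks:
                c, Mi, mi = clusters[i], M[i], m[i]
                f2i = mi / Mi
                lam2i = np.sqrt(f1 * (1.0 - f2i) * mi / (mi - 1.0))    # lambda_2i of (6.6)
                ystar2 = c[rng.integers(0, int(mi), size=int(mi))]     # elements, SRSWR
                Yi_ss = Mi * ystar2.mean()                   # Y_i** = M_i* ybar_i**
                total += (N / n) * (lam1 * (Yhat_i[i] - Yhat / N)
                                    + lam2i * (Yi_ss - Yhat_i[i]))
            Y_tilde[b] = total
        return Y_tilde.var()        # approximates var(Y_hat) of (6.3)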
Acknowledgement

This research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada. C.F.J. Wu is partially supported by NSF Grant No. MCS-83-00140. Our thanks are due to David Binder for useful discussions on the bootstrap for unequal probability sampling without replacement.

References

Abramovitch, L. and Singh, K. (1984). Edgeworth corrected pivotal statistics and the bootstrap. Preprint.

Bickel, P.J. and Freedman, D.A. (1981). Some asymptotic theory for the bootstrap. Ann. Statist., 9, 1196-1217.

Bickel, P.J. and Freedman, D.A. (1984). Asymptotic normality and the bootstrap in stratified sampling. Ann. Statist., 12, 470-482.

Chao, M.T. and Lo, S.H. (1983). A bootstrap method for finite population. Preprint.

Cochran, W.G. (1977). Sampling Techniques (3rd ed.). Wiley, New York.

Efron, B. (1981). Nonparametric standard errors and confidence intervals. Canadian J. Statist., 9, 139-172.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. SIAM, Philadelphia.

Gross, S. (1980). Median estimation in sample surveys. Proc. Amer. Statist. Assoc., Section on Survey Research Methods, 181-184.

Hinkley, D. and Wei, B. (1984). Improvements of jackknife confidence limit methods. Biometrika, 71, 331-339.

Krewski, D. and Rao, J.N.K. (1981). Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods. Ann. Statist., 9, 1010-1019.

McCarthy, P.J. (1969). Pseudoreplication: half samples. Rev. Int. Statist. Inst., 37, 239-264.

Rao, J.N.K. (1979). On deriving mean square errors and their non-negative unbiased estimators in finite population sampling. J. Indian Statist. Assoc., 17, 125-136.

Rao, J.N.K., Hartley, H.O. and Cochran, W.G. (1962). On a simple procedure of unequal probability sampling without replacement. J.R. Statist. Soc. B, 24, 482-491.

Rao, J.N.K. and Wu, C.F.J. (1983). Inference from stratified samples: second order analysis of three methods for nonlinear statistics. Technical Report Series of the Laboratory for Research in Statistics and Probability, Carleton University, Ottawa, No. 7.

Rao, J.N.K. and Wu, C.F.J. (1983). Bootstrap inference with stratified samples. Technical Report Series of the Laboratory for Research in Statistics and Probability, Carleton University, Ottawa, No. 19.