BOOTSTRAP INFERENCE FOR SAMPLE SURVEYS

J.N.K. Rao, Carleton University and C.F.J. Wu, University of Wisconsin-Madison

The bootstrap method of inference is extended to stratified sampling, involving a large number of strata with relatively few (primary) units sampled within strata, when the parameter of interest, θ, is a nonlinear function of the population mean vector. The bootstrap estimate of bias of the estimate, θ̂, of θ and the estimate of variance of θ̂ are obtained. Bootstrap confidence intervals for θ are also given, utilizing the percentile method or the bootstrap histogram of the t-statistic. Extensions to unequal probability sampling without replacement and two-stage sampling are also obtained.

1. Introduction

Resampling methods, including the jackknife and the bootstrap, provide standard error estimates and confidence intervals for the parameters of interest. These methods are simple and straightforward but are computer-intensive, especially the bootstrap. Efron (1982) has given an excellent account of resampling methods in the case of an independent and identically distributed (i.i.d.) sample of fixed size n from an unknown distribution F, the parameter of interest being θ = θ(F). Limited empirical evidence (see Efron, 1982, p. 18) has indicated that the bootstrap standard error estimates are likely to be more stable than those based on the jackknife and also less biased than those based on the customary delta (linearization) method. Moreover, the bootstrap histogram of the t-statistic approximates the true distribution of t with a remainder term of O_p(n^{-1}) in the Edgeworth expansion (Abramovitch and Singh, 1984), unlike the usual normal approximation with a remainder term of O_p(n^{-1/2}).
Empirical results for the ratio estimate (Hinkley and Wei, 1984) also indicate that the confidence intervals from the bootstrap histogram of t (using the linearization variance estimator) perform better than those based on the normal approximation.

The main purpose of this article is to propose an extension of the bootstrap method to stratified samples, in the context of sample survey data, especially data obtained from stratified cluster samples involving a large number of strata, L, with relatively few primary sampling units (psu's) sampled within each stratum. For nonlinear statistics θ̂ that can be expressed as functions of estimated means of p (≥ 1) variables, Krewski and Rao (1981) established the asymptotic consistency of the variance estimators from the jackknife, the delta and the balanced repeated replication (BRR) methods as L → ∞, within the context of a sequence of finite populations {Π_L} with L strata in Π_L. Their result is valid for any multistage design in which the psu's are selected with replacement and in which independent subsamples are selected within those psu's sampled more than once. Rao and Wu (1983) obtained second order asymptotic expansions of these variance estimators under the above set-up and made comparisons in terms of their biases.

The proposed bootstrap method for stratified samples is described in Section 2, and the properties of the resulting variance estimator are studied. The bootstrap estimate of bias of θ̂ is also obtained. Section 3 provides bootstrap confidence intervals for θ. The results are extended to stratified simple random sampling without replacement in Section 4. Finally, the method is extended to unequal probability sampling without replacement in Section 5 and to two-stage cluster sampling without replacement in Section 6.

2. The Bootstrap Method

The parameter of interest θ is a nonlinear function of the population mean vector Ȳ = (Ȳ_1, ..., Ȳ_p)^T, say θ = g(Ȳ). This form of θ includes ratios, regression and correlation coefficients. If n_h (≥ 2) psu's are selected with replacement with probabilities p_hi in stratum h, then Krewski and Rao (1981) have shown that the natural estimator of θ can be expressed as θ̂ = g(ȳ). Here ȳ = Σ_h W_h ȳ_h is a design-unbiased linear estimator of Ȳ = Σ_h W_h Ȳ_h, where W_h and Ȳ_h = (Ȳ_h1, ..., Ȳ_hp)^T are the h-th stratum weight (Σ W_h = 1) and population mean vector respectively, and ȳ_h is the mean of n_h i.i.d. random vectors y_hi = (y_hi1, ..., y_hip)^T with E(y_hi) = Ȳ_h. For h ≠ h', y_hi and y_h'j are independent, but not necessarily identically distributed.

2.1. The Naive bootstrap. In the case of an i.i.d. sample {y_i}_1^n with E(y_i) = Ȳ, the bootstrap method is as follows: (i) Draw a simple random sample {y*_i}_1^n with replacement from the observed values y_1, y_2, ..., y_n and calculate θ̂* = g(ȳ*), where ȳ* = Σ y*_i / n. (ii) Independently replicate step (i) a large number, B, of times and calculate the corresponding estimates θ̂*1, ..., θ̂*B. (iii) The bootstrap variance estimator of θ̂ = g(ȳ) is given by

    v_b(a) = B^{-1} Σ_{b=1}^{B} (θ̂*b - θ̂*·)²,                      (2.1)

where θ̂*· = Σ_b θ̂*b / B. The Monte Carlo estimator v_b(a) is an approximation to

    v_b = var_*(θ̂*) = E_*(θ̂* - E_* θ̂*)²,                           (2.2)

where E_* denotes the expectation with respect to bootstrap sampling from the given sample y_1, ..., y_n. No closed-form expression for var_*(θ̂*) generally exists in the nonlinear case, but in the linear case with p = 1, θ̂* = ȳ* and v_b reduces to

    var_*(ȳ*) = (n - 1) s² / n² = ((n - 1)/n) var(ȳ),               (2.3)

where (n - 1) s² = Σ (y_i - ȳ)² and var(ȳ) = s²/n is the unbiased estimator of the variance of ȳ. The modified variance estimator [n/(n - 1)] var_*(θ̂*) exactly equals var(ȳ) in the linear case, but Efron (1982) found no advantage in this modification. In any case, n/(n - 1) ≈ 1 in most applications, and var_*(θ̂*) is a consistent estimator of the variance of θ̂ as n → ∞ (Bickel and Freedman, 1981). The bootstrap histogram of θ̂*1, ..., θ̂*B may be used to find confidence intervals for θ. This method (Efron, 1982) is called the percentile method.
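As a computational illustration of steps (i)-(iii), the following minimal sketch (Python) computes the Monte Carlo bootstrap variance (2.1) in the i.i.d. case; the function name, the simulated data and the ratio example g(ȳ) = ȳ_1/ȳ_2 are illustrative assumptions, not part of the method itself.

```python
import numpy as np

def naive_bootstrap_variance(y, g, B=1000, rng=None):
    """Monte Carlo bootstrap variance v_b(a) of eq. (2.1) for an i.i.d. sample.

    y : (n, p) array of observations; g : function of the length-p mean vector.
    """
    rng = np.random.default_rng(rng)
    y = np.asarray(y)
    n = y.shape[0]
    theta_star = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, n, size=n)        # step (i): SRS with replacement
        theta_star[b] = g(y[idx].mean(axis=0))  # theta*_b = g(ybar*)
    return theta_star.var(ddof=0)               # B^{-1} sum_b (theta*_b - theta*.)^2

# Illustrative use with a ratio g(ybar) = ybar_1 / ybar_2:
data = np.random.default_rng(1).lognormal(size=(50, 2))
v_b = naive_bootstrap_variance(data, lambda mu: mu[0] / mu[1], B=2000, rng=2)
```

In the linear case (p = 1, g the identity) the exact bootstrap variance is (2.3), so the Monte Carlo value above should be close to ((n - 1)/n) s²/n for large B.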
Noting the i.i.d. property of the y_hi's within each stratum, a straightforward extension of the usual bootstrap method to stratified samples (referred to as the "naive bootstrap") is as follows: (i) Take a simple random sample {y*_hi}_{i=1}^{n_h} with replacement from {y_hi}_{i=1}^{n_h} in stratum h, independently for each stratum. Calculate ȳ*_h = n_h^{-1} Σ_i y*_hi, ȳ* = Σ_h W_h ȳ*_h and θ̂* = g(ȳ*). (ii) Independently replicate step (i) a large number, B, of times and calculate the corresponding estimates θ̂*1, ..., θ̂*B. (iii) The bootstrap variance estimator of θ̂ = g(ȳ) is given by

    v_b(a) = B^{-1} Σ_{b=1}^{B} (θ̂*b - θ̂*·)²,                      (2.4)

where θ̂*· = Σ_b θ̂*b / B. The Monte Carlo estimator v_b(a) is an approximation to

    v_b = var_*(θ̂*) = E_*(θ̂* - E_* θ̂*)²,                           (2.5)

where E_* denotes the expectation with respect to bootstrap sampling. In the linear case with p = 1, θ̂* = Σ_h W_h ȳ*_h = ȳ* and v_b reduces to

    var_*(ȳ*) = Σ_h (W_h²/n_h) ((n_h - 1)/n_h) s_h²,                (2.6)

where (n_h - 1) s_h² = Σ_i (y_hi - ȳ_h)². Comparing (2.6) to var(ȳ) = Σ_h W_h² s_h² / n_h, the unbiased estimator of the variance of ȳ, it immediately follows that var_*(ȳ*)/var(ȳ) does not converge to 1 in probability unless L is fixed and n_h → ∞ for each h. Hence, var_*(ȳ*) is not a consistent estimator of the variance of ȳ. It also follows that v_b is not a consistent estimator of the variance (or mean square error) of a general nonlinear statistic. There does not seem to be an obvious way to correct this scaling problem, except when n_h = k for all h, in which case k(k - 1)^{-1} var_*(θ̂*) will be consistent.

Bickel and Freedman (1984) also noticed the scaling problem, but they were mainly interested in bootstrap confidence intervals in the linear case (p = 1). They have established the asymptotic N(0,1) property of the distribution of t = (ȳ - Ȳ)/[var(ȳ)]^{1/2} and of the conditional distribution of (ȳ* - ȳ)/[var_*(ȳ*)]^{1/2} in stratified simple random sampling with replacement, and also proved that (Σ_h W_h² s_h^{b2}/n_h)/var_*(ȳ*) converges to 1 in probability as n = Σ n_h → ∞. Their result implies that one could use the bootstrap histogram of t̃*1, ..., t̃*B to find confidence intervals for Ȳ, where

    t̃*b = (ȳ*b - ȳ) / [Σ_h W_h² s_h^{b2} / n_h]^{1/2}

and (n_h - 1) s_h^{b2} = Σ_i (y*_hi - ȳ*_h)² for the b-th bootstrap sample (b = 1, ..., B). Although t̃*b is asymptotically N(0,1) in the linear case, it is not likely to provide as good an approximation to the distribution of t as a statistic whose denominator and numerator are both adequate approximations to their counterparts in t. Such statistics will be proposed in Section 3.2; they are also applicable to the nonlinear case. In the nonlinear case, there does not seem to be a simple way to construct t*b-values similar to those of Bickel and Freedman, since v_b has no closed form. Moreover, the naive bootstrap does not permit the use of the percentile method based on the bootstrap histogram of θ̂*1, ..., θ̂*B. Recognizing the scaling problem in a different context, Efron (1982) suggested drawing a bootstrap sample of size n_h - 1 instead of n_h from stratum h (h = 1, ..., L). In Section 2.2 we will instead propose a different method which includes his suggestion as a special case.
2.2. The proposed method. Our method is as follows: (i) Draw a simple random sample {y*_hi}_{i=1}^{m_h} with replacement from {y_hi}_{i=1}^{n_h}, independently for each stratum h. Calculate

    ỹ_hi = ȳ_h + m_h^{1/2} (n_h - 1)^{-1/2} (y*_hi - ȳ_h),          (2.7)

    ỹ̄_h = m_h^{-1} Σ_i ỹ_hi,   ỹ̄ = Σ_h W_h ỹ̄_h,   θ̃ = g(ỹ̄).

(ii) Independently replicate step (i) a large number, B, of times and calculate the corresponding estimates θ̃^1, ..., θ̃^B. (iii) The bootstrap estimator E_*(θ̃) of θ can be approximated by θ̃_a = Σ_b θ̃^b / B. The bootstrap variance estimator of θ̂ is given by

    σ̃_b² = v_b = var_*(θ̃) = E_*(θ̃ - E_* θ̃)²,                       (2.8)

with its Monte Carlo approximation

    σ̃_b²(a) = v_b(a) = B^{-1} Σ_{b=1}^{B} (θ̃^b - θ̃_a)².            (2.9)

One can replace E_* θ̃ in (2.8) by θ̂.
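The following sketch illustrates the rescaling (2.7) and the Monte Carlo quantities (2.8)-(2.9); the default choice m_h = n_h, the function name and the data layout (a list of per-stratum arrays of psu-level observations) are illustrative assumptions.

```python
import numpy as np

def rescaled_stratified_bootstrap(strata, W, g, m=None, B=1000, rng=None):
    """Sketch of the proposed method of Section 2.2.

    strata : list of (n_h, p) arrays of psu-level observations y_hi by stratum
    W      : stratum weights W_h (summing to 1)
    g      : function of the length-p vector of estimated means
    m      : bootstrap sample sizes m_h (default m_h = n_h)
    """
    rng = np.random.default_rng(rng)
    if m is None:
        m = [s.shape[0] for s in strata]              # natural choice m_h = n_h
    ybar_h = [s.mean(axis=0) for s in strata]
    theta_hat = g(sum(W[h] * ybar_h[h] for h in range(len(strata))))
    theta_tilde = np.empty(B)
    for b in range(B):
        ytilde_bar = 0.0
        for h, s in enumerate(strata):
            n_h, m_h = s.shape[0], m[h]
            idx = rng.integers(0, n_h, size=m_h)      # step (i): SRSWR of size m_h
            ystar_bar = s[idx].mean(axis=0)
            # stratum mean of the rescaled values (2.7):
            ytilde_bar = ytilde_bar + W[h] * (
                ybar_h[h] + np.sqrt(m_h / (n_h - 1.0)) * (ystar_bar - ybar_h[h]))
        theta_tilde[b] = g(ytilde_bar)
    v_b = theta_tilde.var(ddof=0)                     # Monte Carlo variance (2.9)
    return theta_hat, theta_tilde, v_b                # mean of theta_tilde approximates E_*(theta-tilde)
```

The returned array of replicates θ̃^1, ..., θ̃^B can be reused for the bias estimate and the confidence intervals discussed below.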
2.3. Justification of the method. In the linear case, θ̃ = ỹ̄ and v_b reduces to the customary unbiased variance estimator var(ȳ) = Σ_h W_h² s_h² / n_h for any choice of m_h. In the nonlinear case, it can be shown that

    v_b = v_L + O_p(n^{-2})                                         (2.10)

for any m_h, where v_L is the linearization variance estimator (Rao and Wu, 1983). In the linear case (p = 1), v_L reduces to var(ȳ). Under reasonable regularity conditions, v_L is a consistent estimator of the variance of θ̂, Var(θ̂). Hence, it follows from (2.10) that v_b is also consistent for Var(θ̂). The asymptotic N(0,1) property of the conditional distribution of (θ̃ - θ̂)/σ̃_b can be established, assuming that 0 < δ_1 < m_h/(n_h - 1) < δ_2 < ∞ for all h, i.e. the bootstrap sample size m_h should be comparable to the original sample size n_h in each stratum.

2.4. Estimate of bias. Our bootstrap estimate of the bias of θ̂ is

    B̂(θ̂) = E_*(θ̃) - θ̂,                                             (2.11)

which is approximated by θ̃_a - θ̂. It can be shown that B̂(θ̂) is a consistent estimator of the bias B(θ̂) (Rao and Wu, 1983). On the other hand, the bias estimate E_*(θ̂*) - θ̂, based on the naive bootstrap, is not a consistent estimator of B(θ̂) (Rao and Wu, 1983).

3. Confidence Intervals

We now consider different bootstrap methods of setting confidence intervals for θ.

3.1. Percentile method. For ready reference, we now give a brief account of the percentile method based on the bootstrap histogram of θ̃^1, ..., θ̃^B. Define the cumulative bootstrap distribution function as

    CDF(t) = #{θ̃^b ≤ t; b = 1, ..., B}/B.                           (3.1)

For α < 0.5, define θ̂_UP(α) = CDF^{-1}(1 - α) and θ̂_LOW(α) = CDF^{-1}(α). Then

    {θ̂_LOW(α), θ̂_UP(α)}                                            (3.2)

is an approximate (1 - 2α)-level confidence interval for θ. It covers the central 1 - 2α portion of the bootstrap distribution (Efron, 1982, p. 78). One can also consider a bias-corrected percentile method, following Efron (1982, p. 82). This method leads to

    {CDF^{-1}(Φ(2z_0 - z_α)), CDF^{-1}(Φ(2z_0 + z_α))}              (3.3)

as an approximate (1 - 2α)-level confidence interval for θ, where Φ is the cumulative distribution function of a standard normal variable, z_0 = Φ^{-1}(CDF(θ̂)) and z_α = Φ^{-1}(1 - α). The advantage of the interval (3.3) over (3.2) has been demonstrated by Efron (1982) in the i.i.d. case.
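A short sketch of the percentile interval (3.2) and its bias-corrected version (3.3) follows, assuming the replicates θ̃^1, ..., θ̃^B are already available (for example, as returned by the illustrative function sketched in Section 2.2); the simple empirical inversion of CDF used here is one of several possible conventions.

```python
import numpy as np
from scipy.stats import norm

def percentile_intervals(theta_tilde, theta_hat, alpha=0.05):
    """Percentile interval (3.2) and bias-corrected percentile interval (3.3)."""
    t = np.sort(np.asarray(theta_tilde, dtype=float))
    B = t.size

    def cdf_inv(q):
        # empirical inverse of CDF(t) = #{theta_tilde_b <= t}/B
        return t[min(B - 1, max(0, int(np.ceil(q * B)) - 1))]

    # (3.2): central 1 - 2*alpha portion of the bootstrap histogram
    basic = (cdf_inv(alpha), cdf_inv(1.0 - alpha))
    # (3.3): bias-corrected version, z0 = Phi^{-1}(CDF(theta_hat))
    z0 = norm.ppf(np.clip(np.mean(t <= theta_hat), 1.0 / B, 1.0 - 1.0 / B))
    za = norm.ppf(1.0 - alpha)
    bc = (cdf_inv(norm.cdf(2 * z0 - za)), cdf_inv(norm.cdf(2 * z0 + za)))
    return basic, bc
```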
3.2. Bootstrap t-statistics. Instead of approximating the distribution of θ̂ by the bootstrap distribution of θ̃, we can approximate the distribution of the t-statistic t = (θ̂ - θ)/σ̂_b by its bootstrap counterpart t* = (θ̃ - θ̂)/σ̃*_b. Here σ̃*_b² is the bootstrap variance estimator obtained from (2.9) by bootstrapping the particular bootstrap sample {ỹ_hi}, i.e. by replacing y_hi by ỹ_hi in the proposed method. For this second-phase bootstrapping one could choose values (m'_h, B') different from (m_h, B). This double-bootstrap method thus leads to B values t*1, ..., t*B of t*. Utilizing the bootstrap histogram of t*1, ..., t*B, we define

    CDF_t(x) = #{t*b ≤ x; b = 1, ..., B}/B,                         (3.4)

    t̂_LOW = CDF_t^{-1}(α),   t̂_UP = CDF_t^{-1}(1 - α),              (3.5)

and construct the approximate (1 - 2α)-level confidence interval

    {θ̂ - t̂_UP σ̂_b, θ̂ - t̂_LOW σ̂_b}.                                 (3.6)

We now provide an asymptotic justification for t*. Let E_** denote the second-phase bootstrap expectation, i.e. the expectation with respect to bootstrapping the particular bootstrap sample {ỹ_hi}; the second-phase Monte Carlo quantity σ̃*_b²(a) approximates σ̃*_b², just as v_b(a) approximates v_b = σ̃_b². In the linear case with θ = Ȳ it is easily seen that E_*(σ̃*_b²) = σ̃_b². In the nonlinear case, following Bickel and Freedman (1984), we can show that σ̃*_b²/σ̃_b² converges to 1 in probability as n → ∞. Hence, it follows that the conditional distribution of t* is asymptotically N(0,1).

One could use a jackknife t-statistic t_J = (θ̂ - θ)/σ̂_J instead of t, where σ̂_J² is a jackknife variance estimator of θ̂ (see Krewski and Rao, 1981). The corresponding confidence interval is then given by

    {θ̂ - t̂_J,UP σ̂_J, θ̂ - t̂_J,LOW σ̂_J},                             (3.7)

where t̂_J,LOW and t̂_J,UP are the lower and upper α-points of the statistic t*_J = (θ̃ - θ̂)/σ̃*_J obtained from the bootstrap histogram of t*_J^1, ..., t*_J^B, and σ̃*_J² is obtained from σ̂_J² by jackknifing the particular bootstrap sample {ỹ_hi}. It can be shown that the confidence interval (3.7) is also asymptotically correct. A confidence interval of this type was considered by Efron (1981) in the case of an i.i.d. sample {y_i}. It is possible to replace σ̂_J by the BRR or the linearization variance estimator and obtain a confidence interval similar to (3.7).

3.3. Choice of m_h. The choice m_h = n_h is a natural one. The choice m_h = n_h - 1 gives ỹ_hi = y*_hi, and our method then reduces to the naive bootstrap, except that in step (i) of the latter method a simple random sample of size n_h - 1 is selected from {y_hi}_{i=1}^{n_h} in stratum h. For n_h = 2, m_h = 1, the method reduces to the well-known random half-sample replication, and the resulting variance estimators are less stable than those obtained from BRR for the same number, B, of half-samples (McCarthy, 1969). It may be worth considering a bootstrap sample size m_h larger than n_h = 2, say m_h = 3 or 4. In the linear case (p = 1), it can be shown that the bootstrap third moment, E_*(ỹ̄ - ȳ)³, matches the unbiased estimate of the third moment of ȳ if m_h = (n_h - 2)²/(n_h - 1), n_h ≥ 4. This property is not enjoyed by the jackknife or the BRR. Similarly, it can also be shown that the same choice of m_h ensures that the bootstrap histogram of a t-statistic (see Section 3.2) approximates the true distribution of t with a remainder term of O_p(n^{-1}). Details will be provided in a separate paper.
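Returning to Section 3.2, the following sketch illustrates the double-bootstrap t interval (3.6) in the with-replacement case; the choices m_h = n_h, m'_h = n_h and B' = B2, as well as all names, are illustrative assumptions, and the code is a sketch of the construction rather than a tuned implementation.

```python
import numpy as np

def rescaled_pseudo_sample(strata, rng):
    """One draw of step (i), Section 2.2, with m_h = n_h: rescaled values (2.7)."""
    out = []
    for s in strata:
        n_h = s.shape[0]
        ybar_h = s.mean(axis=0)
        ystar = s[rng.integers(0, n_h, size=n_h)]
        out.append(ybar_h + np.sqrt(n_h / (n_h - 1.0)) * (ystar - ybar_h))
    return out

def theta_of(strata, W, g):
    return g(sum(W[h] * s.mean(axis=0) for h, s in enumerate(strata)))

def boot_variance(strata, W, g, B, rng):
    """Monte Carlo bootstrap variance (2.9) computed from the given sample."""
    reps = np.array([theta_of(rescaled_pseudo_sample(strata, rng), W, g)
                     for _ in range(B)])
    return reps.var(ddof=0)

def bootstrap_t_interval(strata, W, g, alpha=0.05, B=500, B2=50, rng=None):
    """Double-bootstrap t interval (3.6); B2 second-phase replicates per b."""
    rng = np.random.default_rng(rng)
    theta_hat = theta_of(strata, W, g)
    sigma_b = np.sqrt(boot_variance(strata, W, g, B, rng))
    t_star = np.empty(B)
    for b in range(B):
        pseudo = rescaled_pseudo_sample(strata, rng)        # {y~_hi}
        theta_tilde = theta_of(pseudo, W, g)
        sigma_star = np.sqrt(boot_variance(pseudo, W, g, B2, rng))
        t_star[b] = (theta_tilde - theta_hat) / sigma_star  # t* = (theta~ - theta^)/sigma~*
    t_low, t_up = np.quantile(t_star, [alpha, 1.0 - alpha])
    return theta_hat - t_up * sigma_b, theta_hat - t_low * sigma_b   # eq. (3.6)
```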
4. Stratified Simple Random Sampling Without Replacement

All the previous results apply to the case of stratified simple random sampling without replacement by making a slight change in the definition of ỹ_hi:

    ỹ_hi = ȳ_h + (1 - f_h)^{1/2} m_h^{1/2} (n_h - 1)^{-1/2} (y*_hi - ȳ_h),     (4.1)

where f_h = n_h/N_h is the sampling fraction in stratum h. It is interesting to observe that, even by choosing m_h = n_h - 1, ỹ_hi ≠ y*_hi. Hence the naive bootstrap using y*_hi will still have the problem of giving a wrong scale, as discussed before. In the special case of n_h = 2 for all h, McCarthy (1969) used a finite population correction similar to (4.1) in the context of BRR. It can be shown that the choice

    m_h = [(n_h - 2)²/(n_h - 1)] [(1 - f_h)/(1 - 2f_h)²]                       (4.2)

matches the third moments, but it provides an approximation to the true distribution of the t-statistic with a remainder term of O_p(n^{-1/2}) only, as in the case of a normal approximation.

Bickel and Freedman (1984) considered a different bootstrap sampling method in order to recover the finite population correction, 1 - f_h, in the variance formula. This method essentially creates populations consisting of copies of each y_hi, i = 1, ..., n_h and h = 1, ..., L, and then generates {y*_hi}_1^{n_h} as a simple random sample without replacement from the created population, independently in each stratum. This "blow-up" bootstrap was first proposed by Gross (1980) and also independently by Chao and Lo (1983). The variance estimator resulting from this method (by working directly with y*_hi), however, remains inconsistent for estimating the true variance of θ̂. It is possible, however, to make the variance estimator consistent by reducing the bootstrap sample size to n_h - 1, as in Section 2. The "blow-up" bootstrap approximates the true distribution of the t-statistic with a remainder term of O_p(n^{-1}) (see Abramovitch and Singh, 1984), unlike our method.

5. Unequal Probability Sampling Without Replacement

5.1. The Rao, Hartley and Cochran Method. Rao, Hartley and Cochran (1962) proposed a simple method of sampling with unequal probabilities and without replacement. The population of N units is partitioned at random into n groups G_1, ..., G_n of sizes N_1, ..., N_n respectively (Σ N_k = N), and then one unit is drawn from each of the n groups, with probabilities p_t/P_k within the k-th group. Here p_t = x_t/X, P_k = Σ_{t∈G_k} p_t, x_t is a measure of size of the t-th unit (t = 1, ..., N) that is approximately proportional to y_t (in the scalar case of p = 1), and X = Σ x_t. Their unbiased estimator of Ȳ is given by

    Ŷ = Σ_{k=1}^{n} P_k z_k,                                        (5.1)

where z_k = y_k/(N p_k) and (y_k, p_k) denote the values for the unit selected from the k-th group (Σ_k P_k = 1). An unbiased estimator of the variance of Ŷ is

    var(Ŷ) = λ² Σ_{k=1}^{n} P_k (z_k - Ŷ)²,                         (5.2)

where

    λ² = (Σ N_k² - N)/(N² - Σ N_k²).                                (5.3)

In the special case of equal group sizes, N_k = N/n, λ reduces to (1 - f)^{1/2}(n - 1)^{-1/2}, where f = n/N.

Our bootstrap sampling for the above method is as follows: (1) Attach the probability P_k to the sampled unit from G_k and then select a sample {y*_i, p*_i}_{i=1}^{m} of size m with replacement with probabilities P_k from {y_k, p_k}_{k=1}^{n}. Calculate z*_i = y*_i/(N p*_i) and

    z̃_i = Ŷ + λ m^{1/2} (z*_i - Ŷ),   Ỹ = m^{-1} Σ_{i=1}^{m} z̃_i = Ŷ + λ m^{1/2} (z̄* - Ŷ),   (5.4)

where z̄* = Σ_i z*_i / m, and set θ̃ = g(Ỹ). (2) Implement steps (ii) and (iii) of Section 2.2, using θ̃ obtained from (5.4). In the linear case with p = 1, θ = Ȳ, we get

    E_*(Ỹ) = Ŷ + λ m^{1/2} (E_* z̄* - Ŷ) = Ŷ                         (5.5)

and

    var_*(Ỹ) = m^{-1} var_*(z̃_i) = λ² var_*(z*_i) = λ² Σ_{k=1}^{n} P_k (z_k - Ŷ)² = var(Ŷ).   (5.6)

Hence, the bootstrap variance estimator var_*(Ỹ) reduces to var(Ŷ) in the linear case. The properties of var_*(θ̃) in the nonlinear case, and the bootstrap methods of setting confidence intervals for θ given in Section 3, are being investigated.
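In the linear case (p = 1) the rescaling (5.4) can be sketched as follows; the group sizes N_k are passed explicitly to form λ of (5.3), the default m = n is an illustrative choice, and for a nonlinear θ one would apply g to the replicates exactly as in Section 2.2.

```python
import numpy as np

def rhc_bootstrap(y, p, P, Nk, m=None, B=1000, rng=None):
    """Sketch of the rescaling bootstrap for the RHC estimator, eqs. (5.1)-(5.6).

    y, p : y-values and size-probabilities p_k of the n sampled units (one per group)
    P    : group probabilities P_k (summing to 1)
    Nk   : group sizes N_1, ..., N_n (summing to N)
    """
    rng = np.random.default_rng(rng)
    y, p, P, Nk = map(np.asarray, (y, p, P, Nk))
    n, N = y.size, int(Nk.sum())
    m = n if m is None else m
    z = y / (N * p)                                              # z_k = y_k / (N p_k)
    Y_hat = np.sum(P * z)                                        # eq. (5.1)
    lam = np.sqrt((np.sum(Nk**2) - N) / (N**2 - np.sum(Nk**2)))  # eq. (5.3)
    Y_tilde = np.empty(B)
    for b in range(B):
        idx = rng.choice(n, size=m, replace=True, p=P)           # step (1)
        zbar_star = z[idx].mean()
        Y_tilde[b] = Y_hat + lam * np.sqrt(m) * (zbar_star - Y_hat)   # eq. (5.4)
    return Y_hat, Y_tilde.var(ddof=0)                            # point estimate, var_* of (5.6)
```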
5.2. General Results. A general linear unbiased estimator of the population total Y (in the scalar case of p = 1), based on a sample s of size n with associated probability of selection p(s), is given by

    Ŷ = Σ_{i∈s} d_i(s) y_i,                                         (5.7)

where the known weights d_i(s) can depend on s and on the unit i (i ∈ s). Rao (1979) proved that a nonnegative unbiased quadratic estimator of the variance of Ŷ is necessarily of the form

    var(Ŷ) = - Σ Σ_{i<j∈s} d_ij(s) w_i w_j (z_i - z_j)²,            (5.8)

where z_i = y_i/w_i and the w_i are known constants such that Var(Ŷ) = 0 when y_i ∝ w_i. Here the known coefficients d_ij(s) satisfy the unbiasedness condition

    E d_ij(s) = Cov[d_i(s), d_j(s)],                                (5.9)

where d_i(s) = 0 if i ∉ s and d_ij(s) = 0 if s does not contain both i and j. As an example, the well-known Horvitz-Thompson estimator Ŷ_HT satisfies (5.7) with d_i(s) = 1/π_i, i ∈ s, and w_i = π_i, with

    - d_ij(s) w_i w_j = (π_i π_j - π_ij)/π_ij,                      (5.10)

where π_i = Σ_{s∋i} p(s) and π_ij = π_ji = Σ_{s∋i,j} p(s) are the first and second order inclusion probabilities.

Let b_ij(s) = - d_ij(s)/2 and assume that b_ij(s) > 0 for all i < j ∈ s and all s. This is true for many well-known schemes, including the Rao-Hartley-Cochran (RHC) method. Our bootstrap sampling is as follows: (1) Consider all the n(n - 1) pairs (i, j), i ≠ j, and select m pairs (i*, j*) with replacement with probabilities λ_ij (= λ_ji), to be specified. Calculate

    Ỹ = Ŷ + m^{-1} Σ_{(i*,j*)∈s*} k_{i*j*} (z_{i*} - z_{j*}),       (5.11)

where the constants k_ij = k_ji are to be specified and s* denotes the set of m bootstrap pairs. (2) Implement steps (ii) and (iii) of Section 2.2 using θ̃ = g(Ỹ) obtained from (5.11). In the linear case with p = 1, θ = Y, letting c_i = Σ_{j(≠i)} k_ij λ_ij and c'_i = Σ_{j(≠i)} k_ji λ_ji, we get

    E_*(Ỹ) = Ŷ + Σ_{i∈s} z_i (c_i - c'_i) = Ŷ,                      (5.12)

since k_ij λ_ij = k_ji λ_ji implies c_i = c'_i. Similarly,

    var_*(Ỹ) = m^{-1} E_*{k_{i*j*}(z_{i*} - z_{j*})}² = Σ Σ_{i≠j∈s} k_ij² λ_ij (z_i - z_j)² / m.   (5.13)

We now choose k_ij and λ_ij such that (5.13) is identical to var(Ŷ) given by (5.8), i.e. to var(Ŷ) = Σ Σ_{i≠j∈s} b_ij(s) w_i w_j (z_i - z_j)². Thus

    k_ij² λ_ij = m b_ij(s) w_i w_j.                                 (5.14)

The choice of k_ij and λ_ij satisfying (5.14) is not unique. One simple choice is λ_ij = 1/[n(n - 1)], i.e. equal probabilities for each of the n(n - 1) pairs i ≠ j. Another choice is

    λ_ij = π_ij/d_s,                                                (5.15)

where d_s = Σ Σ_{i≠j} π_ij. For the Horvitz-Thompson estimator, using (5.10) in (5.14) gives

    k_ij² λ_ij = m (π_i π_j - π_ij)/(2 π_ij).                       (5.16)

If m = n(n - 1) and λ_ij = 1/[n(n - 1)], then (5.16) reduces to

    k_ij² = n²(n - 1)² (π_i π_j - π_ij)/(2 π_ij),                   (5.17)

and (5.11) becomes

    Ỹ_HT = Ŷ_HT + Σ_{(i*,j*)∈s*} [(π_{i*} π_{j*} - π_{i*j*})/(2 π_{i*j*})]^{1/2} (z_{i*} - z_{j*}).   (5.18)

It follows from (5.13) that var_*(Ỹ_HT) reduces to the well-known Yates-Grundy variance estimator:

    var(Ŷ_HT) = Σ Σ_{i<j∈s} ((π_i π_j - π_ij)/π_ij) (z_i - z_j)².   (5.19)

The properties of var_*(θ̃) in the nonlinear case and the bootstrap confidence intervals for θ are being investigated.
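A sketch of the pairwise bootstrap for the Horvitz-Thompson estimator follows, using m = n(n - 1) pairs with equal probabilities λ_ij = 1/[n(n - 1)] and the pair coefficient of (5.18); all names are illustrative assumptions. In the linear case the bootstrap variance of the replicates approximates the Yates-Grundy estimator (5.19).

```python
import numpy as np

def ht_pairwise_bootstrap(y, pi, pij, B=1000, rng=None):
    """Sketch of the pairwise bootstrap (5.11)/(5.18) for the HT estimator.

    y   : y-values of the n sampled units
    pi  : first-order inclusion probabilities pi_i
    pij : (n, n) matrix of second-order inclusion probabilities pi_ij
    """
    rng = np.random.default_rng(rng)
    y, pi, pij = np.asarray(y, float), np.asarray(pi, float), np.asarray(pij, float)
    n = y.size
    z = y / pi                                       # z_i = y_i / w_i with w_i = pi_i
    Y_ht = z.sum()                                   # Horvitz-Thompson estimator
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    coef = {(i, j): np.sqrt((pi[i] * pi[j] - pij[i, j]) / (2.0 * pij[i, j]))
            for (i, j) in pairs}                     # [(pi_i pi_j - pi_ij)/(2 pi_ij)]^(1/2)
    m = n * (n - 1)
    Y_tilde = np.empty(B)
    for b in range(B):
        draws = rng.integers(0, m, size=m)           # m pairs with equal probabilities
        Y_tilde[b] = Y_ht + sum(coef[pairs[d]] * (z[pairs[d][0]] - z[pairs[d][1]])
                                for d in draws)      # eq. (5.18)
    return Y_ht, Y_tilde.var(ddof=0)                 # var_* approximates (5.19)
```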
Remark. In the case of the RHC method, we now have two different bootstrap methods: (1) the method of Section 5.1, based on selecting a bootstrap sample with replacement from {y_k, p_k}_1^n with probabilities P_k, and (2) the present method of selecting a bootstrap sample of pairs with replacement from the n(n - 1) pairs (i, j) ∈ s, i ≠ j, with probabilities λ_ij. We are investigating their relative merits, but intuitively method (1) is more appealing.

6. Two-stage Cluster Sampling Without Replacement

Suppose that the population consists of N clusters with M_t elements (subunits) in the t-th cluster (t = 1, ..., N). The population size M_0 (= Σ M_t) is unknown in many applications. A simple random sample of n clusters is selected without replacement, and m_i elements are chosen, again by simple random sampling without replacement, from the M_i elements in the i-th cluster if the latter is selected. The customary unbiased estimator of the population total Y is

    Ŷ = (N/n) Σ_{i=1}^{n} M_i ȳ_i = (N/n) Σ_{i=1}^{n} Ŷ_i,          (6.1)

where ȳ_i is the sample mean for the i-th sample cluster and Ŷ_i = M_i ȳ_i. The corresponding estimator of θ is written as θ̂ = g(Ȳ̂), where

    Ȳ̂ = (n M̄_0)^{-1} Σ_{i=1}^{n} M_i ȳ_i                            (6.2)

and M̄_0 = M_0/N. For instance, if θ = Y/M_0 where M_0 is unknown, then θ̂ = Ŷ/X̂, where X̂ = (N/n) Σ M_i x̄_i with x_ij = 1 for all elements j in any cluster i. An unbiased estimator of the variance of Ŷ (Cochran, 1977, p. 303) is given by

    var(Ŷ) = [N²(1 - f_1)/(n(n - 1))] Σ_{i=1}^{n} (M_i ȳ_i - Ŷ/N)²
             + (N/n) Σ_{i=1}^{n} M_i²(1 - f_{2i}) [m_i(m_i - 1)]^{-1} Σ_{j=1}^{m_i} (y_ij - ȳ_i)²,   (6.3)

where y_ij is the y-value for the j-th sample element in the i-th sample cluster, f_1 = n/N and f_{2i} = m_i/M_i.

We employ two-stage bootstrap sampling to obtain {ỹ_ij} from {y_ij} as follows: (1) Select a simple random sample of n clusters with replacement from the n sample clusters, and then draw a simple random sample of m_i elements with replacement from the m_i elements in the i-th sample cluster if the latter is chosen. (Independent bootstrap subsampling is used for the same cluster chosen more than once.) We use the following notation: y**_ij is the y-value of the j-th bootstrap element in the i-th bootstrap cluster; m*_i is the m_i-value of the i-th bootstrap cluster (similarly M*_i, ȳ*_i and Ŷ*_i). Calculate

    ỹ_ij = Ȳ̂ + λ_1 (Ŷ*_i/M̄_0 - Ȳ̂) + λ_{2i} (M*_i/M̄_0)(y**_ij - ȳ*_i),   (6.4)

    Ỹ̄ = n^{-1} Σ_{i=1}^{n} m*_i^{-1} Σ_{j=1}^{m*_i} ỹ_ij
       = Ȳ̂ + (λ_1/n) Σ_{i=1}^{n} (Ŷ*_i/M̄_0 - Ȳ̂) + n^{-1} Σ_{i=1}^{n} λ_{2i} (Ŷ**_i - Ŷ*_i)/M̄_0,     (6.5)

where Ŷ**_i = M*_i ȳ**_i, ȳ**_i = Σ_j y**_ij/m*_i, and

    λ_1² = n(1 - f_1)/(n - 1);   λ_{2i}² = f_1 (1 - f_{2i}) m*_i/(m*_i - 1).   (6.6)

(2) Implement steps (ii) and (iii) of Section 2.2 with θ̃ = g(Ỹ̄).

Let E_{2*} and var_{2*} respectively denote the conditional bootstrap expectation and variance for a given bootstrap sample of clusters. Similarly, let E_{1*} and var_{1*} denote the bootstrap expectation and variance with respect to the selection of sample clusters. Then

    E_*(Ỹ̄) = E_{1*} E_{2*}(Ỹ̄) = Ȳ̂ + λ_1 (Ȳ̂ - Ȳ̂) = Ȳ̂.               (6.7)

Similarly, var_{1*} E_{2*}(Ỹ̄) = λ_1² var_{1*}(n^{-1} Σ_i Ŷ*_i/M̄_0) reproduces the between-cluster component of var(Ȳ̂) (6.8), and E_{1*} var_{2*}(Ỹ̄), with λ_{2i}² = f_1(1 - f_{2i}) m_i/(m_i - 1), reproduces the within-cluster component (6.9). Hence, combining (6.8) and (6.9), we get

    var_*(Ỹ̄) = var_{1*} E_{2*}(Ỹ̄) + E_{1*} var_{2*}(Ỹ̄) = var(Ȳ̂).

Hence, the bootstrap variance estimator var_*(Ỹ̄) reduces to var(Ȳ̂) in the linear case. Thus the variability between bootstrap estimates correctly accounts for the between-cluster and within-cluster components of the variance, without the necessity of estimating them separately as in the case of BRR or the jackknife.
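A sketch of the two-stage rescaling (6.4)-(6.6) in the linear case follows; for simplicity M̄_0 is passed as a known quantity, each m_i is assumed to be at least 2, and the function name and data layout are illustrative assumptions.

```python
import numpy as np

def two_stage_bootstrap(clusters, M, N, M0bar, B=1000, rng=None):
    """Sketch of the two-stage rescaling bootstrap, eqs. (6.4)-(6.6), linear case.

    clusters : list of n arrays; the i-th array holds the m_i sampled element values
    M        : cluster sizes M_i for the n sample clusters
    N        : number of clusters in the population
    M0bar    : M_0 / N, treated here as known for simplicity
    """
    rng = np.random.default_rng(rng)
    M = np.asarray(M, float)
    n = len(clusters)
    m = np.array([len(c) for c in clusters], float)      # requires m_i >= 2
    ybar = np.array([np.mean(c) for c in clusters])
    Yhat_i = M * ybar                                    # cluster totals M_i * ybar_i
    Ybar_hat = Yhat_i.sum() / (n * M0bar)                # eq. (6.2)
    f1, f2 = n / N, m / M
    lam1 = np.sqrt(n * (1.0 - f1) / (n - 1.0))           # eq. (6.6)
    Ytilde = np.empty(B)
    for b in range(B):
        total = 0.0
        picks = rng.integers(0, n, size=n)               # stage 1: clusters with replacement
        for i in picks:
            mi = int(m[i])
            lam2 = np.sqrt(f1 * (1.0 - f2[i]) * mi / (mi - 1.0))
            ystar2 = np.asarray(clusters[i])[rng.integers(0, mi, size=mi)]  # stage 2
            # mean over j of eq. (6.4):
            total += (Ybar_hat
                      + lam1 * (Yhat_i[i] / M0bar - Ybar_hat)
                      + lam2 * (M[i] / M0bar) * (ystar2.mean() - ybar[i]))
        Ytilde[b] = total / n                            # eq. (6.5)
    return Ybar_hat, Ytilde.var(ddof=0)
```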
Acknowledgement. This research was supported by a grant from the Natural Sciences and Engineering Research Council of Canada. C.F.J. Wu is partially supported by NSF grant No. MCS-83-00140. Our thanks are due to David Binder for useful discussions on the bootstrap for unequal probability sampling without replacement.

References

Abramovitch, L. and Singh, K. (1984). Edgeworth corrected pivotal statistics and the bootstrap. Preprint.

Bickel, P.J. and Freedman, D.A. (1981). Some asymptotic theory for the bootstrap. Ann. Statist., 9, 1196-1217.

Bickel, P.J. and Freedman, D.A. (1984). Asymptotic normality and the bootstrap in stratified sampling. Ann. Statist., 12, 470-482.

Chao, M.T. and Lo, S.H. (1983). A bootstrap method for finite population. Preprint.

Cochran, W.G. (1977). Sampling Techniques (3rd ed.). Wiley, New York.

Efron, B. (1981). Nonparametric standard errors and confidence intervals. Canadian J. Statist., 9, 139-172.

Efron, B. (1982). The Jackknife, the Bootstrap and Other Resampling Plans. SIAM, Philadelphia.

Gross, S. (1980). Median estimation in sample surveys. Proc. Amer. Statist. Assoc., Section on Survey Research Methods, 181-184.

Hinkley, D. and Wei, B. (1984). Improvements of jackknife confidence limit methods. Biometrika, 71, 331-339.

Krewski, D. and Rao, J.N.K. (1981). Inference from stratified samples: properties of the linearization, jackknife and balanced repeated replication methods. Ann. Statist., 9, 1010-1019.

McCarthy, P.J. (1969). Pseudoreplication: half samples. Rev. Int. Statist. Inst., 37, 239-264.

Rao, J.N.K. (1979). On deriving mean square errors and their non-negative unbiased estimators in finite population sampling. J. Indian Statist. Assoc., 17, 125-136.

Rao, J.N.K., Hartley, H.O. and Cochran, W.G. (1962). On a simple procedure of unequal probability sampling without replacement. J. R. Statist. Soc. B, 24, 482-491.

Rao, J.N.K. and Wu, C.F.J. (1983). Inference from stratified samples: second order analysis of three methods for nonlinear statistics. Technical Report No. 7, Laboratory for Research in Statistics and Probability, Carleton University, Ottawa.

Rao, J.N.K. and Wu, C.F.J. (1983). Bootstrap inference with stratified samples. Technical Report No. 19, Laboratory for Research in Statistics and Probability, Carleton University, Ottawa.