Exercises on the bootstrap

Applied statistics
1. Let $X_1, \ldots, X_n$ be distinct observations (no ties). Show that there are $\binom{2n-1}{n}$ distinct bootstrap samples. Note that this number is asymptotic to $(n\pi)^{-1/2}\, 2^{2n-1}$, so it increases exponentially in $n$.
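For small $n$ the count can be checked by brute force before attempting the proof. The sketch below assumes that a "distinct bootstrap sample" means a multiset of size $n$ drawn from the $n$ observations (order ignored); the function name `count_bootstrap_samples` is only illustrative.

```python
from itertools import combinations_with_replacement
from math import comb, pi, sqrt

def count_bootstrap_samples(n: int) -> int:
    """Count bootstrap samples of n distinct values, treated as multisets, by enumeration."""
    return sum(1 for _ in combinations_with_replacement(range(n), n))

for n in range(1, 9):
    enumerated = count_bootstrap_samples(n)
    closed_form = comb(2 * n - 1, n)               # binomial coefficient from the exercise
    approx = 2 ** (2 * n - 1) / sqrt(n * pi)       # asymptotic expression from the exercise
    print(n, enumerated, closed_form, round(approx / closed_form, 4))
```

The last column is the ratio of the asymptotic expression to the exact count; it should approach 1 as $n$ grows.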
2. Let $\theta = h(F)$ be the median of $F$ and $\lambda_n(F)$ the bias of the sample median $\hat\theta_n$, i.e.
$$\lambda_n(F) = E(\hat\theta_n) - \theta.$$
Suppose that $n = 3$ and the observed values of $(X_1, X_2, X_3)$ are $(b, c, d)$. Compute $\lambda_n(\hat F_n)$. See page 8 of the Bootstrap slides.
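A hand computation can be checked numerically by enumerating all $3^3 = 27$ equally likely ordered resamples. The sketch below does this with exact rational arithmetic; the values `(1, 2, 10)` are hypothetical stand-ins for $(b, c, d)$, and `bootstrap_median_bias` is an illustrative name.

```python
from fractions import Fraction
from itertools import product
from statistics import median

def bootstrap_median_bias(values):
    """Exact bias of the sample median under the empirical distribution of `values`."""
    n = len(values)
    resamples = list(product(values, repeat=n))    # all n**n equally likely ordered resamples
    expected = sum(Fraction(median(r)) for r in resamples) / len(resamples)
    return expected - Fraction(median(values))     # median of F_hat is the sample median

# Hypothetical numbers standing in for (b, c, d).
print(bootstrap_median_bias((1, 2, 10)))
```

Trying a few different triples shows how the result depends on the spacing of the three observations, which is what the symbolic answer should reproduce.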
3. Let $X_1, \ldots, X_n$ be a sample from the uniform distribution on $[0, \theta]$. In this exercise we will see an example where the nonparametric bootstrap fails. This was the assignment of group 10.
(a) Show that the maximum likelihood estimator for $\theta$ is given by $\hat\theta_n = X_{(n)}$ (the largest order statistic). Derive the sampling distribution of $\hat\theta_n$.
(b) Sample 25 observations from a uniformly distributed random variable on $(0, \theta)$. Take $\theta = 2$. Implement the nonparametric bootstrap by drawing $B = 5000$ bootstrap samples. Compute $\hat\theta^*_{n,i}$, $i = 1, \ldots, B$, and make a histogram of these values.
(c) Let $T_n = n(\theta - \hat\theta_n)$ and $T_n^* = n(\hat\theta_n - \hat\theta_n^*)$. Show that for $t \ge 0$
$$P_{\hat F_n}(T_n^* \le t) \ge P_{\hat F_n}(T_n^* \le 0) = 1 - (1 - 1/n)^n,$$
the first inequality being trivial.
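(d) Show that
$$\limsup_{n \to \infty} \sup_t \left| P_F(T_n \le t) - P_{\hat F_n}(T_n^* \le t) \right| \ge 1 - e^{-1}.$$
Here $\hat F_n$ denotes the empirical distribution function.
Hint:
$$\sup_t \left| P_F(T_n \le t) - P_{\hat F_n}(T_n^* \le t) \right| \ge \left| P_F(T_n \le 0) - P_{\hat F_n}(T_n^* \le 0) \right| = P_{\hat F_n}(T_n^* \le 0).$$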
(e) The parametric bootstrap generates bootstrap samples $X_1^*, \ldots, X_n^*$ by drawing from a uniform distribution on $[0, \hat\theta_n]$. Denote the corresponding distribution by $F_{\hat\theta_n}$. Argue that
$$\sup_t \left| P_F(T_n \le t) - P_{F_{\hat\theta_n}}(T_n^* \le t) \right| \xrightarrow{p} 0, \qquad n \to \infty,$$
and verify this by drawing $B = 5000$ bootstrap simulations.
Hint: Show that $T_n$ converges in distribution to an exponential distribution with mean $\theta$.
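One possible way to set up the simulations for parts (b) to (e) is sketched below, using only the Python standard library. It is an illustration under stated assumptions, not a prescribed solution: the seed is arbitrary, the helper names `cdf_tn` and `sup_distance` are made up for this sketch, and the exact CDF of $T_n$ uses the true $\theta$, which is available only because the data are simulated. A histogram of the bootstrap maxima (part (b)) can be drawn from the same replicates with any plotting library.

```python
import random

random.seed(0)                      # arbitrary seed, for reproducibility only
n, theta, B = 25, 2.0, 5000

x = [random.uniform(0.0, theta) for _ in range(n)]
theta_hat = max(x)                  # MLE: largest order statistic

# Nonparametric bootstrap: resample the data with replacement, keep the maximum.
t_star_np = [n * (theta_hat - max(random.choices(x, k=n))) for _ in range(B)]

# Parametric bootstrap: draw from Uniform(0, theta_hat), keep the maximum.
t_star_par = [
    n * (theta_hat - max(random.uniform(0.0, theta_hat) for _ in range(n)))
    for _ in range(B)
]

def cdf_tn(t, n, theta):
    """Exact CDF of T_n = n(theta - X_(n)) for a Uniform(0, theta) sample."""
    t = min(max(t, 0.0), n * theta)
    return 1.0 - (1.0 - t / (n * theta)) ** n

def sup_distance(t_star, n, theta):
    """Sup distance between the empirical CDF of t_star and the exact CDF of T_n."""
    ts = sorted(t_star)
    m = len(ts)
    dist = 0.0
    for i, t in enumerate(ts):
        f = cdf_tn(t, n, theta)
        # compare with the ECDF at t and just below t
        dist = max(dist, abs((i + 1) / m - f), abs(f - i / m))
    return dist

atom = sum(t == 0.0 for t in t_star_np) / B   # share of replicates with T* = 0
print("share of T* equal to 0     :", round(atom, 3))
print("1 - (1 - 1/n)^n            :", round(1 - (1 - 1 / n) ** n, 3))
print("sup distance, nonparametric:", round(sup_distance(t_star_np, n, theta), 3))
print("sup distance, parametric   :", round(sup_distance(t_star_par, n, theta), 3))
```

The first two printed numbers should be close to each other, reflecting the point mass of the nonparametric $T_n^*$ at zero. The nonparametric sup distance cannot fall below that point mass, whereas the parametric one should be small up to Monte Carlo error and the error in $\hat\theta_n$.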
4. Suppose $X_1, \ldots, X_n$ is a random sample from an exponential distribution with parameter $\theta$, so its density is given by $f(x) = \theta e^{-\theta x}$ for $x \ge 0$.
(a) Show that the maximum likelihood estimator is given by $\hat\theta_n = 1/\bar X_n$.
(b) Show that $\sqrt{n}(\hat\theta_n - \theta) \rightsquigarrow N(0, \theta^2)$. See for example chapter one in Wasserman. Conclude with the Delta-method that $\sqrt{n}(\log\hat\theta_n - \log\theta) \rightsquigarrow N(0, 1)$.
(c) Show that an asymptotic CI for $\theta$ is given by
$$\left[ \hat\theta_n e^{-z_{\alpha/2}/\sqrt{n}},\ \hat\theta_n e^{z_{\alpha/2}/\sqrt{n}} \right],$$
with $z_\alpha$ denoting $\alpha$-critical values of the $N(0, 1)$ distribution.
(d) Assume that we do not know that the underlying distribution is exponential. Construct confidence intervals for $\theta$ via the nonparametric bootstrap. Compare in a simulation study the coverage and length of the asymptotic CI, the bootstrap pivotal confidence interval, and the bootstrap normal confidence interval.
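A possible skeleton for the simulation study in (d) is sketched below with the standard library only. Several choices are assumptions made for illustration: the true value $\theta = 1.5$, the sizes `n`, `B` and `n_sim`, the seed, the crude empirical quantile rule, and all function names. The pivotal interval is taken as $(2\hat\theta_n - q^*_{1-\alpha/2},\ 2\hat\theta_n - q^*_{\alpha/2})$ and the normal interval as $\hat\theta_n \pm z_{\alpha/2}\,\widehat{se}_{boot}$, which should be checked against the definitions used in the course.

```python
import random
from math import exp, sqrt
from statistics import NormalDist, fmean, stdev

def asymptotic_ci(x, z):
    """Interval from part (c): theta_hat * exp(-z / sqrt(n)) to theta_hat * exp(z / sqrt(n))."""
    n, theta_hat = len(x), 1.0 / fmean(x)
    return theta_hat * exp(-z / sqrt(n)), theta_hat * exp(z / sqrt(n))

def bootstrap_cis(x, z, alpha, B, rng):
    """Nonparametric bootstrap pivotal and normal intervals for theta = 1 / E[X]."""
    n, theta_hat = len(x), 1.0 / fmean(x)
    reps = sorted(1.0 / fmean(rng.choices(x, k=n)) for _ in range(B))
    lo_q = reps[int((alpha / 2) * B)]                   # crude empirical quantiles
    hi_q = reps[min(B - 1, int((1 - alpha / 2) * B))]
    pivotal = (2 * theta_hat - hi_q, 2 * theta_hat - lo_q)
    se = stdev(reps)                                    # bootstrap standard error
    normal = (theta_hat - z * se, theta_hat + z * se)
    return pivotal, normal

def simulate(theta=1.5, n=50, B=300, n_sim=200, alpha=0.05, seed=1):
    """Return {method: (coverage, mean length)} over n_sim simulated data sets."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)             # upper alpha/2 critical value
    hits = {"asymptotic": 0, "pivotal": 0, "normal": 0}
    lengths = {k: 0.0 for k in hits}
    for _ in range(n_sim):
        x = [rng.expovariate(theta) for _ in range(n)]  # expovariate takes the rate theta
        pivotal, normal = bootstrap_cis(x, z, alpha, B, rng)
        intervals = {"asymptotic": asymptotic_ci(x, z), "pivotal": pivotal, "normal": normal}
        for name, (lo, hi) in intervals.items():
            hits[name] += lo <= theta <= hi
            lengths[name] += hi - lo
    return {k: (hits[k] / n_sim, lengths[k] / n_sim) for k in hits}

results = simulate()
for name, (coverage, length) in results.items():
    print(f"{name:10s} coverage={coverage:.3f} mean length={length:.3f}")
```

With only 200 simulated data sets the coverage estimates are noisy (standard error of roughly 0.015 to 0.02 near the nominal level), so larger `n_sim` and `B` are advisable for the actual comparison.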