Homework 6 - Solutions
Intermediate Statistics - 36705

Problem 1. [20 points: 5+10+5]

(a) First note that
\[
\theta = P(X_i = 0) = e^{-\lambda}.
\]
In HW5, Problem 3, we saw that the MLE for $\lambda$ is given by the sample mean $\bar{X}$. So by equivariance, the MLE $\hat{\theta}$ for $\theta$ is $e^{-\bar{X}}$.

(b) By the Central Limit Theorem,
\[
\frac{\sqrt{n}(\bar{X} - \lambda)}{\sqrt{\lambda}} \rightsquigarrow N(0, 1).
\]
So by the Delta Method with $g(t) = e^{-t}$, the limiting distribution of $\hat{\theta} = e^{-\bar{X}}$ is given by
\[
\frac{\sqrt{n}(e^{-\bar{X}} - e^{-\lambda})}{e^{-\lambda}\sqrt{\lambda}} \rightsquigarrow N(0, 1).
\]

(c) By the Weak Law of Large Numbers, $\bar{X} \xrightarrow{P} \lambda$. Since the function $e^{-t}$ is continuous, by the Continuous Mapping Theorem $\hat{\theta} = e^{-\bar{X}} \xrightarrow{P} e^{-\lambda}$, i.e. $\hat{\theta}$ is consistent.

Note: we also have $\frac{\sqrt{n}(\bar{X} - \lambda)}{\sqrt{\bar{X}}} \rightsquigarrow N(0, 1)$. How would you formally show it? (Hint: Slutsky's Theorem.)

Problem 2. [30 points: 5+5+10+10]

(a) MOME: since $E[X_1] = \theta/2$, the method of moments estimator is $\hat{\theta}_1 = 2\bar{X}$.

MLE: the likelihood function is $L(\theta) = \frac{1}{\theta^n} I_{[X_{(n)}, \infty)}(\theta)$. This function is equal to zero on $(-\infty, X_{(n)})$, then at $X_{(n)}$ it jumps to $1/X_{(n)}^n > 0$ and decreases asymptotically to zero as $\theta \to \infty$. This means that the supremum is attained at $\theta = X_{(n)}$, i.e. the MLE for $\theta$ is $\hat{\theta}_2 = X_{(n)}$.

(b) By the Weak Law of Large Numbers we know that $\bar{X} \xrightarrow{P} \theta/2$, so by the Continuous Mapping Theorem $\hat{\theta}_1$ is consistent. To show consistency of $\hat{\theta}_2$, let $\varepsilon > 0$; then, using results from previous HWs and Test 1,
\[
P(|\theta - X_{(n)}| > \varepsilon) = P(\theta - \varepsilon > X_{(n)}) = \left(1 - \frac{\varepsilon}{\theta}\right)^n \to 0
\]
as $n \to \infty$, which proves the claim.

(c) MOME. We have that $\operatorname{Var}(\hat{\theta}_1) = \frac{\theta^2}{3n}$. Thus, by the Central Limit Theorem,
\[
\frac{\hat{\theta}_1 - \theta}{\operatorname{sd}(\hat{\theta}_1)} = \frac{\sqrt{n}(2\bar{X} - \theta)}{\theta/\sqrt{3}} \rightsquigarrow N(0, 1).
\]

MLE. To find the limiting distribution of $\hat{\theta}_2 = X_{(n)}$:
\[
P\left(\frac{n(\theta - X_{(n)})}{\theta} \le c\right) = P(\theta - X_{(n)} \le \theta c / n) = 1 - \left(1 - \frac{c}{n}\right)^n \to 1 - e^{-c},
\]
which implies that
\[
\frac{n(\theta - X_{(n)})}{\theta} \rightsquigarrow \operatorname{Exp}(1).
\]

(d) MOME. For every $\varepsilon > 0$ we can find $K$ such that
\[
P(\sqrt{n}\,|\hat{\theta}_1 - \theta| > K) = P(|\hat{\theta}_1 - \theta| > K/\sqrt{n}) \le \frac{n \operatorname{Var}(\hat{\theta}_1)}{K^2} = \frac{\theta^2}{3K^2} < \varepsilon,
\]
in particular for every $K > \frac{\theta}{\sqrt{3\varepsilon}}$. Thus, $\hat{\theta}_1 - \theta = O_P(1/\sqrt{n})$.

MLE. Recall that if $W_i \sim \operatorname{Uniform}(0, 1)$, then $W_{(n)} \sim \operatorname{Beta}(n, 1)$, such that $E[W_{(n)}] = \frac{n}{n+1}$ and $\operatorname{Var}(W_{(n)}) = \frac{n}{(n+1)^2(n+2)}$. Thus $E[\hat{\theta}_2] = \frac{n\theta}{n+1}$, $\operatorname{Var}(\hat{\theta}_2) = \frac{n\theta^2}{(n+1)^2(n+2)}$ and $\operatorname{Bias}(\hat{\theta}_2) = -\frac{\theta}{n+1}$. Therefore, for every $\varepsilon > 0$ we can find $K$ such that
\[
P(n|\hat{\theta}_2 - \theta| > K) \le \frac{n^2}{K^2} E[(\hat{\theta}_2 - \theta)^2] = \frac{n^2}{K^2}\left(\frac{n\theta^2}{(n+1)^2(n+2)} + \frac{\theta^2}{(n+1)^2}\right) \le \frac{n^2}{K^2}\left(\frac{\theta^2}{n^2} + \frac{\theta^2}{n^2}\right) = \frac{2\theta^2}{K^2} < \varepsilon,
\]
in particular for every $K > \theta\sqrt{2/\varepsilon}$. This proves that $\hat{\theta}_2 - \theta = O_P(1/n)$. In terms of speed of convergence, $\hat{\theta}_2$ is preferable.
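Remark (not part of the graded solution): the two limiting distributions in (c) are easy to check by simulation. Below is a minimal Python sketch; the values of $\theta$, $n$, and the number of replications are arbitrary illustration choices, and numpy is assumed to be available.

    import numpy as np

    rng = np.random.default_rng(0)
    theta, n, reps = 2.0, 500, 20000            # arbitrary illustration values
    X = rng.uniform(0, theta, size=(reps, n))   # reps samples of size n from Uniform(0, theta)

    # MOME statistic sqrt(n)(2*Xbar - theta)/(theta/sqrt(3)): should be close to N(0, 1)
    mome = np.sqrt(n) * (2 * X.mean(axis=1) - theta) / (theta / np.sqrt(3))
    # MLE statistic n(theta - X_(n))/theta: should be close to Exp(1)
    mle = n * (theta - X.max(axis=1)) / theta

    print(mome.mean(), mome.var())   # expect roughly 0 and 1
    print(mle.mean(), mle.var())     # expect roughly 1 and 1 (mean and variance of Exp(1))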
Problem 3. [10 points]

For the $N(\mu, 1)$ distribution we know the MLE for its mean is given by $\bar{X}$. We also have that in this problem $\operatorname{sd}(\bar{X}) = \frac{1}{\sqrt{n}}$. Thus, the Wald statistic for this problem is given by
\[
W = \sqrt{n}(\bar{X} - \mu_0),
\]
and under the null hypothesis $H_0 : \mu = \mu_0$ we have $W \sim N(0, 1)$ (exact distribution). The alternative hypothesis is $H_1 : \mu \ne \mu_0$, which means that we are doing a two-sided test. Thus, in the case of equal tails, the rejection rule of a size $\alpha$ Wald test has the form
\[
W < -z_{\alpha/2} \text{ or } W > z_{\alpha/2} \iff |W| > z_{\alpha/2} \iff |\sqrt{n}(\bar{X} - \mu_0)| > z_{\alpha/2},
\]
where $z_\beta = \Phi^{-1}(1 - \beta)$ and $\Phi$ is the CDF of $Z \sim N(0, 1)$. Notice that the absolute value comes from symmetry about zero.

The p-value for the observed $w = \sqrt{n}(\bar{x} - \mu_0)$ is
\[
P_{\mu_0}(|W| > |w|) = 2\Phi(-|w|),
\]
where $\Phi$ is the CDF of the standard normal distribution. If $X_i \sim N(\mu, 1)$ then we have that $W \sim N(\sqrt{n}(\mu - \mu_0), 1)$, where $\mu$ is the true value. Thus, the CDF of the p-value is given by
\[
P_\mu(2\Phi(-|W|) \le \alpha) = P_\mu(\Phi(-|W|) \le \alpha/2) = P_\mu(-|W| \le -z_{\alpha/2}) = P_\mu(W \le -z_{\alpha/2}) + P_\mu(W \ge z_{\alpha/2}).
\]
Writing $W = Z + \sqrt{n}(\mu - \mu_0)$, where $Z \sim N(0, 1)$, the above equals
\[
P_\mu(Z \le -z_{\alpha/2} - \sqrt{n}(\mu - \mu_0)) + P_\mu(Z \ge z_{\alpha/2} - \sqrt{n}(\mu - \mu_0)) = \Phi(-z_{\alpha/2} - \sqrt{n}(\mu - \mu_0)) + \Phi(-z_{\alpha/2} + \sqrt{n}(\mu - \mu_0)).
\]
In the particular case $\mu = \mu_0$ the previous expression equals $2\Phi(-z_{\alpha/2}) = 2 \cdot \alpha/2 = \alpha$, which proves that the p-value has a Uniform(0, 1) distribution under the null hypothesis.

Problem 4. [40 points: 10+10+10+10]

(a) We compute the MLE when both $\mu, \sigma^2$ are unknown for a given sample $X_1, \dots, X_n$:
\[
L(\mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(X_i - \mu)^2}{2\sigma^2}\right].
\]
Taking the log and finding the partial derivatives:
\[
\frac{\partial}{\partial \mu} \log L(\mu, \sigma^2) = \frac{1}{\sigma^2} \sum_{i=1}^n (X_i - \mu),
\qquad
\frac{\partial}{\partial \sigma^2} \log L(\mu, \sigma^2) = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4} \sum_{i=1}^n (X_i - \mu)^2.
\]
Setting both partial derivatives equal to 0 and solving the system of equations, we obtain the estimators
\[
\hat{\mu} = \bar{X}, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2.
\]
This is exactly the same as Example 4 in Lecture 7. See also Lecture 9, Example 9, to see that the second-order conditions are satisfied.

Since the MLE $\hat{\sigma}$ is a consistent estimator of $\sigma$, by Slutsky's Theorem
\[
\frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}} \rightsquigarrow N(0, 1).
\]
Thus, the test statistic of the Wald test for $H_0 : \mu = \mu_0$ versus $H_1 : \mu \ne \mu_0$ is
\[
W = \frac{\bar{X} - \mu_0}{\hat{\sigma}/\sqrt{n}}.
\]
Hence, under $H_0$ we have $W \approx N(0, 1)$. The alternative hypothesis is $H_1 : \mu \ne \mu_0$, which means that we are doing a two-sided test. Thus, in the case of equal tails, the rejection rule of a size $\alpha$ Wald test has the form
\[
W < -z_{\alpha/2} \text{ or } W > z_{\alpha/2} \iff |W| > z_{\alpha/2}.
\]

(b) Likelihood Ratio Test for $H_0 : \mu = \mu_0$ vs. $H_1 : \mu \ne \mu_0$. We reject $H_0$ if
\[
\lambda(X^n) = \frac{\sup_{\theta \in \Theta_0} L(\theta)}{\sup_{\theta \in \Theta} L(\theta)} = \frac{L(\hat{\theta}_0)}{L(\hat{\theta})} < k,
\]
where $\Theta_0 = \{\mu_0\} \times (0, +\infty)$, and $k$ is some constant that will make $P(\text{Type I error}) \le \alpha$. We have
\[
L(\mu_0; \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\left\{ -\frac{1}{2\sigma^2} \sum_{i=1}^n (X_i - \mu_0)^2 \right\},
\]
\[
\ell(\mu_0; \sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^n (X_i - \mu_0)^2.
\]
Moreover, letting $\xi = \sigma^2$,
\[
\frac{\partial \ell}{\partial \xi} = -\frac{n}{2\xi} + \frac{\sum_{i=1}^n (X_i - \mu_0)^2}{2\xi^2} = 0,
\]
solved by $\hat{\sigma}_0^2 = \xi^* = \frac{1}{n}\sum_{i=1}^n (X_i - \mu_0)^2$. The second-order condition is satisfied:
\[
\left.\frac{\partial^2 \ell}{\partial \xi^2}\right|_{\xi = \xi^*} = \frac{n}{2\xi^{*2}} - \frac{n}{\xi^{*2}} = -\frac{n}{2\xi^{*2}} < 0.
\]
Thus
\[
L(\hat{\theta}_0) = L(\mu_0, \hat{\sigma}_0^2) = \left( 2\pi \frac{\sum_{i=1}^n (X_i - \mu_0)^2}{n} \right)^{-n/2} \exp\left\{-\frac{n}{2}\right\}.
\]
The MLE estimators for $\mu$ and $\sigma^2$ are $\bar{X}$ and $\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2$, respectively. Thus
\[
L(\hat{\theta}) = \left( 2\pi \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n} \right)^{-n/2} \exp\left\{-\frac{n}{2}\right\}.
\]
Therefore, we will reject $H_0$ if
\[
\lambda(X^n) = \frac{L(\hat{\theta}_0)}{L(\hat{\theta})} = \left( \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sum_{i=1}^n (X_i - \mu_0)^2} \right)^{n/2} = \left( \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sum_{i=1}^n (X_i - \bar{X})^2 + n(\bar{X} - \mu_0)^2} \right)^{n/2} = \left( 1 + \frac{n(\bar{X} - \mu_0)^2}{\sum_{i=1}^n (X_i - \bar{X})^2} \right)^{-n/2} < k,
\]
i.e. we reject $H_0$ if
\[
\frac{n(\bar{X} - \mu_0)^2}{\sum_{i=1}^n (X_i - \bar{X})^2}
\]
is sufficiently large. Equivalently, we reject $H_0$ when $T = \frac{\bar{X} - \mu_0}{\sqrt{S^2/n}}$ is very large or very small, where $S^2 = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1}$. Since $T \sim t_{n-1}$ under $H_0$, for a fixed $\alpha$ we reject $H_0$ if $|T| > t_{n-1, \alpha/2}$. However, under the null hypothesis, for large $n$, $T \approx N(0, 1)$, so we can also use $z_{\alpha/2}$ as the critical value; compare with the results in (a).
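Remark (not part of the graded solution): the agreement between the approximate Wald test in (a) and the exact t-test in (b) can be seen empirically. The following minimal Python sketch estimates the size of both tests under $H_0$; the values of $\mu_0$, $n$, $\alpha$, and the number of replications are arbitrary, and numpy/scipy are assumed available.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    mu0, n, alpha, reps = 0.0, 30, 0.05, 20000   # arbitrary illustration values
    X = rng.normal(mu0, 1.0, size=(reps, n))     # data generated under H0

    xbar = X.mean(axis=1)
    sigma2_hat = X.var(axis=1)        # MLE of sigma^2 (divides by n), as in (a)
    S2 = X.var(axis=1, ddof=1)        # unbiased variance (divides by n-1), as in (b)

    W = (xbar - mu0) / np.sqrt(sigma2_hat / n)   # Wald statistic from (a)
    T = (xbar - mu0) / np.sqrt(S2 / n)           # t statistic from (b)

    z_crit = stats.norm.ppf(1 - alpha / 2)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
    print(np.mean(np.abs(W) > z_crit))   # empirical size: somewhat above alpha for small n
    print(np.mean(np.abs(T) > t_crit))   # empirical size: close to alpha (exact test)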
(c) We have that
\[
\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma^2} \sim \chi^2_{n-1},
\]
so that for the MLE
\[
\hat{\sigma}^2 = \frac{\sigma^2}{n} \cdot \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma^2} \sim \frac{\sigma^2}{n}\, \chi^2_{n-1}.
\]
Thus $\operatorname{sd}(\hat{\sigma}^2) = \frac{\sigma^2}{n}\sqrt{2(n-1)}$, which we can estimate by
\[
\widehat{\operatorname{sd}}(\hat{\sigma}^2) = \frac{\hat{\sigma}^2}{n}\sqrt{2(n-1)}.
\]
So by the asymptotic normality of the MLE and Slutsky's Theorem, we have
\[
\frac{\hat{\sigma}^2 - \sigma^2}{\frac{\hat{\sigma}^2}{n}\sqrt{2(n-1)}} \approx \frac{\hat{\sigma}^2 - \sigma^2}{\hat{\sigma}^2\sqrt{2/n}} \rightsquigarrow N(0, 1).
\]
So the test statistic is
\[
W = \frac{\hat{\sigma}^2 - \sigma_0^2}{\hat{\sigma}^2\sqrt{2/n}},
\]
and the rejection region of the level $\alpha$ Wald test is defined by $|W| > z_{\alpha/2}$, for reasons similar to those explained in (a).

(d) Likelihood Ratio Test for $H_0 : \sigma = \sigma_0$ vs. $H_1 : \sigma \ne \sigma_0$. We reject $H_0$ if
\[
\lambda(X^n) = \frac{\sup_{\theta \in \Theta_0} L(\theta)}{\sup_{\theta \in \Theta} L(\theta)} = \frac{L(\hat{\theta}_0)}{L(\hat{\theta})} < k,
\]
where $\Theta_0 = \mathbb{R} \times \{\sigma_0\}$ and $k$ is a constant. For every fixed value of $\sigma^2$, we have that the MLE for $\mu$ is $\bar{X}$. Thus
\[
L(\hat{\theta}_0) = (2\pi\sigma_0^2)^{-n/2} \exp\left\{ -\frac{1}{2} \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma_0^2} \right\}.
\]
The MLE estimators for $\mu$ and $\sigma^2$ are $\bar{X}$ and $\frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2$, respectively. Thus
\[
L(\hat{\theta}) = \left( 2\pi \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n} \right)^{-n/2} \exp\left\{-\frac{n}{2}\right\}.
\]
Thus
\[
\lambda(X^n) = \frac{(2\pi\sigma_0^2)^{-n/2} \exp\left\{-\frac{1}{2}\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma_0^2}\right\}}{\left(2\pi \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n}\right)^{-n/2} \exp\left\{-\frac{n}{2}\right\}}
= \left( \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n\sigma_0^2} \right)^{n/2} \exp\left\{ \frac{n}{2} - \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{2\sigma_0^2} \right\}.
\]
We can see that $\lambda(X^n)$ is a function of the form $(g(y))^{n/2}$, where $g(y) = y e^{1-y}$ and
\[
y = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n\sigma_0^2}.
\]
The function $g$ is positive with a unique maximum at $y = 1$ (see the figure below). Since $g$ is unimodal, we will reject $H_0$ when $g(y)$ is small, i.e. if $y < k_1$ or $y > k_2$. We can then use the statistic
\[
T = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{\sigma_0^2} = ny \sim \chi^2_{n-1}
\]
under the null hypothesis $H_0$. In particular, we will reject $H_0$ if $T < a$ or $T > b$, for some particular $a, b$ such that $P_{\sigma_0}(\{T < a\} \cup \{T > b\}) = \alpha$. The exact choice would be $\{a, b\} = n\, g^{-1}(k_\alpha^{2/n})$, where $k_\alpha$ is such that $P_{\sigma_0}(\lambda(X^n) < k_\alpha) = \alpha$ and $g^{-1}$ is the generalized inverse of $g$. However, we might simply consider the equal-tailed quantiles of the $\chi^2_{n-1}$ distribution (i.e. $a = \chi^2_{n-1;1-\alpha/2}$, $b = \chi^2_{n-1;\alpha/2}$, where $\chi^2_{n-1;\beta}$ denotes the upper $\beta$ quantile), or choose $a, b$ delimiting the shortest interval with coverage $1 - \alpha$ of the $\chi^2_{n-1}$ distribution (see the figure below). Those $a, b$ satisfy the following system of equations:
\[
0 < a < b, \qquad f(a) = f(b), \qquad \int_a^b f(t)\,dt = 1 - \alpha,
\]
where $f$ is the p.d.f. of the $\chi^2_{n-1}$ distribution. These values can be found numerically (e.g. by using R).

Alternatively, we could use the asymptotic result $\kappa = -2\log\lambda(X^n) \approx \chi^2_1$ (1 degree of freedom because under $H_1$ there are 2 free parameters, while under $H_0$ we have 1 free parameter) and reject $H_0$ if $\kappa > \chi^2_{1,\alpha}$. Yet, in this exercise we prefer the previous non-asymptotic solutions.

[Figure: left panel, the function $g(y) = y\exp(1-y)$ plotted against $y$; right panel, the $\chi^2_{n-1}$ density $f(t)$ with the shortest interval of coverage $1-\alpha$, titled "Chi-square, LR-Test on sigma".]
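Remark (not part of the graded solution): the solution above notes that the shortest-interval endpoints $a, b$ can be found numerically, e.g. in R. An equivalent Python sketch is given below (illustrative only; it assumes scipy is available and feeds the equal-tailed quantiles to a generic root finder as the starting point).

    from scipy import stats, optimize

    def shortest_interval(df, alpha):
        """Solve f(a) = f(b) and F(b) - F(a) = 1 - alpha for the chi^2_df density f."""
        dist = stats.chi2(df)
        def equations(v):
            a, b = v
            return [dist.pdf(a) - dist.pdf(b), dist.cdf(b) - dist.cdf(a) - (1 - alpha)]
        # equal-tailed quantiles as the initial guess
        guess = [dist.ppf(alpha / 2), dist.ppf(1 - alpha / 2)]
        return optimize.fsolve(equations, guess)

    a, b = shortest_interval(df=29, alpha=0.05)   # e.g. n = 30 observations
    print(a, b)

Since the $\chi^2_{n-1}$ density is right-skewed, the resulting $a, b$ are both shifted left relative to the equal-tailed quantiles.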