EdPsych/Psych/Soc 589
C.J. Anderson
Homework 5: Answer Key
1. The negative binomial is fit by changing only “dist=Poisson” to “dist=NegBin”
(regardless of whether you use Method I or II). I’ll show Method II here.
                      Poisson                                 Negative Binomial
Parameter    Estimate   SE       Wald      p-value     Est.       se       Wald      p
α            −7.8180    0.0216   130,556   < .0001     −7.8129    0.1200   4239.77   < .0001
1/ϕ = D      0.0000     .                              0.3189     0.0936

Fit Statistics    Poisson     Negative Binomial
df                22          21
G²                669.4458    24.1502
X²                658.4846    21.6263
AIC               812.6197    244.2357

Note: 95% CI for dispersion parameter is (0.1355, 0.5023)
The parameter estimates for α for the two models are very similar in value (i.e.,
−7.8180 versus −7.8129); however, their standard errors differ considerably (i.e.,
0.0216 for the Poisson and 0.1200 for the Negative Binomial). This is consistent with
the need for the Negative Binomial due to overdispersion; that is, the standard errors
from the Poisson are too small.
There is evidence in support of the Negative Binomial model being the better one:
• The estimated standard errors for the Poisson and Negative Binomial are very
different, and the one for the Negative Binomial is much larger. This is consistent
with overdispersion in the data, which is a problem for the Poisson.
• The 95% CI for the dispersion parameter (0.14, 0.50) does not include 0 (the value
at which the Negative Binomial reduces to the Poisson), which suggests there is
overdispersion in the data.
• G² and X² indicate an acceptable fit of the Negative Binomial model to the data
(i.e., comparing them to a χ² with ν = 21 would yield a large p-value). However, this
is not the case for the Poisson.
• The various information criteria (only AIC reported above) are all smaller for
the NB than the Poisson. The smaller the value, the better the model.
• It is reasonable to expect that the crowds across teams are heterogeneous, perhaps
because the teams are in different cities with different SES, crowding, etc.
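The first three points can be checked by hand from the table. The following is a minimal sketch that only re-uses the reported numbers; it assumes the interval for the dispersion parameter is a Wald-type interval (estimate ± 1.96 × SE), which reproduces the reported limits to about three decimals. The variable names are illustrative.

```python
# Values copied from the table above; nothing is re-estimated here.
se_poisson, se_negbin = 0.0216, 0.1200   # standard errors of alpha
disp, se_disp = 0.3189, 0.0936           # dispersion estimate (1/phi = D) and its SE

# How many times larger is the Negative Binomial standard error?
se_ratio = se_negbin / se_poisson

# Assumed Wald-type interval for the dispersion parameter.
ci_disp = (disp - 1.96 * se_disp, disp + 1.96 * se_disp)

# G^2 per degree of freedom: values near 1 are what a well-fitting model gives.
g2_per_df = {"Poisson": 669.4458 / 22, "NegBin": 24.1502 / 21}

print("SE ratio:", round(se_ratio, 2))
print("95% CI for dispersion:", tuple(round(v, 4) for v in ci_disp))
print("G^2/df:", {k: round(v, 2) for k, v in g2_per_df.items()})
```

The interval excludes 0, the Negative Binomial standard error is more than five times the Poisson one, and G²/df is far above 1 only for the Poisson.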
Before starting to analyze the data for the next three problems (i.e., 3.13, 3.14, and
the zero-inflated model), do a little bit of exploratory data analysis:
1. Compute the mean and variance of number of satellites. Compare.
The mean is less than the variance (i.e., 2.92 < (3.15)² = 9.92), which suggests
overdispersion.
2. Plot a histogram of the number of satellites. Comment.
Below is a graph of the distribution of satellites. Notice that there are a lot of
crabs with 0 satellites. Other than this end of the distribution, the Poisson may be OK.
A lot of 0s could explain why we have overdispersion (model fitting will help us decide
for sure). These data might be best fit using a zero-inflated Poisson.
Figure 1: The distribution of the counts.
3. Look at the relationship between number of satellites (or log of the number) by
weight. Comment.
Also, look at the number of satellites versus weight with a smooth curve (actually a
cubic regression) in Figure 2.
Figure 2: Initial look at the data: counts versus explanatory variable with a cubic
regression curve, and log(count) versus explanatory variable with a linear regression
curve overlaid.
It appears that there is an outlier in terms of weight (i.e., weight > 5). I deleted it
and recomputed the mean and variance, but this doesn’t change the results much, so I left
it in for the homework answers.
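The mean-variance comparison in item 1 can be written out as a variance-to-mean ratio, which equals 1 for a Poisson distribution. This short sketch uses only the two summary statistics reported above (mean 2.92, standard deviation 3.15). It is a marginal check that ignores weight, so it is suggestive rather than conclusive.

```python
# Summary statistics reported in item 1 above.
mean = 2.92
sd = 3.15

variance = sd ** 2
ratio = variance / mean   # 1 for a Poisson; larger values suggest overdispersion

print("variance:", round(variance, 2))
print("variance / mean:", round(ratio, 2))
```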
And now to do the problems. . .
Problem 3.13 on page 94 of Agresti (2007). The data with SAS code to create a SAS data
set is on the course web-site. Note that I re-scaled weight to kg.
1. The prediction equation is
$\hat{\mu}_i = \exp(-0.4284 + 0.5893\,\mathrm{weight}_i)$
2. The estimated mean for a female weighing 2.44 kg is
$\hat{\mu} = \exp(-0.4284 + 0.5893(2.44)) = \exp(1.0095) = 2.7442$
3. For a one kg increase in weight, the (mean) number of satellites is
exp(0.5893) = 1.80 times as large (i.e., 80% larger).
Although a 95% confidence interval for β is given in the SAS output, it comes from
$\hat{\beta} \pm 1.96\,\widehat{se}$
$0.5893 \pm 1.96(0.0650)$
$0.5893 \pm 0.1274 \longrightarrow (0.4619, 0.7167)$
The 95% confidence interval for the multiplicative effect (i.e., exp(β)) is found by
taking exp of the end-points of the interval for β:
(exp(0.4619), exp(0.7167)) −→ (1.59, 2.05)
4. A Wald test: Ho : β = 0 versus Ha : β ≠ 0.
$X^2 = \left(\dfrac{0.5893}{0.0650}\right)^2 = 82.15.$
Comparing 82.15 to a chi-square distribution with df = 1 yields a very small p-value;
therefore, reject Ho and conclude the data support the hypothesis that the number of
satellites is related to the weight of the female crab.
5. A likelihood ratio test:
$-2(35.9898 - 71.9524) = 71.93,$
which has a very small p-value (compare 71.93 to a chi-square with df = 1). The
conclusion is the same as for the Wald test in item 4 (part (d) of the problem).
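The arithmetic in items 1 through 5 can be reproduced in a few lines. This is a minimal sketch using only the estimates quoted above; the function names are illustrative, and the chi-square tail probability for df = 1 is obtained from the standard normal via `math.erfc`. Note that the rounded estimate and standard error give a Wald statistic of about 82.2; the 82.15 in item 4 presumably comes from unrounded values in the SAS output.

```python
import math

# Poisson regression estimates quoted above.
b0, b1, se_b1 = -0.4284, 0.5893, 0.0650

def mu_hat(weight_kg):
    """Fitted mean number of satellites for a given weight in kg."""
    return math.exp(b0 + b1 * weight_kg)

def chi2_sf_df1(x):
    """P(chi-square with 1 df > x), using chi-square(1) = Z^2."""
    return math.erfc(math.sqrt(x / 2.0))

pred = mu_hat(2.44)                              # item 2
rate_ratio = math.exp(b1)                        # item 3
ci_beta = (b1 - 1.96 * se_b1, b1 + 1.96 * se_b1)
ci_ratio = tuple(math.exp(v) for v in ci_beta)
wald = (b1 / se_b1) ** 2                         # item 4, from rounded inputs
lr = -2 * (35.9898 - 71.9524)                    # item 5

print("fitted mean at 2.44 kg:", round(pred, 4))
print("multiplicative effect:", round(rate_ratio, 2))
print("95% CI for beta:", tuple(round(v, 4) for v in ci_beta))
print("95% CI for exp(beta):", tuple(round(v, 2) for v in ci_ratio))
print("Wald:", round(wald, 2), "p =", chi2_sf_df1(wald))
print("LR:", round(lr, 2), "p =", chi2_sf_df1(lr))
```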
Problem 3.14 on page 94 of Agresti (2007). Fitting a negative binomial model. . .
1. The prediction equation is
$\hat{\mu} = \exp(-0.8647 + 0.7603\,\mathrm{weight})$.
The dispersion parameter is
1/ϕ = D (in Agresti notation) = 1.0740.
The estimated standard error of the dispersion parameter is 0.1935.
There is evidence that the Negative Binomial gives a better fit than the Poisson:
• The 95% confidence interval for 1/ϕ = D is (0.6948, 1.4533). The value 0 is not
in this interval, which suggests we need the dispersion parameter.
• All of the global fit statistics are much better for the Negative Binomial than
the Poisson:
                                   Poisson                 Negative Binomial
Criterion                   DF     Value      Value/DF     Value      Value/DF
Deviance                    171    560.8664   3.2799       196.1603   1.1471
Pearson Chi-Square          171    535.8957   3.1339       147.9588   0.8653
AIC (smaller is better)            920.1641                754.6437
AICC (smaller is better)           920.2347                754.7857
BIC (smaller is better)            926.4707                764.1036
Since the Poisson is a special case of the Negative Binomial, we could do a likelihood
ratio test (i.e., LR = 560.8664 − 196.1603 = 364.71, df = 1, p is tiny). Also, the
information criteria are all smaller for the Negative Binomial, which indicates
that it is better than the Poisson model.
• Graphics indicate that the Negative Binomial out-performs the Poisson (I didn’t
expect graphs, but if you did them, great!). The Negative Binomial includes
more points within the 95% confidence bands and fits the distribution of counts
better than the Poisson (however, there is room for improvement; the ZIP does the
best).
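The Value/DF column and the gaps in the information criteria can be recomputed directly from the fit table. This sketch copies the reported values into a dictionary (the key names are illustrative) and does no model fitting.

```python
# Fit statistics copied from the table above.
fit = {
    "Poisson": {"Deviance": 560.8664, "Pearson": 535.8957,
                "AIC": 920.1641, "AICC": 920.2347, "BIC": 926.4707},
    "NegBin":  {"Deviance": 196.1603, "Pearson": 147.9588,
                "AIC": 754.6437, "AICC": 754.7857, "BIC": 764.1036},
}
df = 171  # degrees of freedom reported for both models

# Value/DF: values near 1 indicate no gross lack of fit.
value_per_df = {
    model: {stat: stats[stat] / df for stat in ("Deviance", "Pearson")}
    for model, stats in fit.items()
}

# Positive differences mean the Negative Binomial has the smaller criterion.
ic_gap = {ic: fit["Poisson"][ic] - fit["NegBin"][ic] for ic in ("AIC", "AICC", "BIC")}

for model, stats in value_per_df.items():
    print(model, {k: round(v, 4) for k, v in stats.items()})
print("Poisson minus NegBin:", {k: round(v, 2) for k, v in ic_gap.items()})
```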
Figure 3: The models were fit to data and then grouped to “see” how well the models are
fitting the data.
Figure 4: To see how well the various models are doing in terms of fitting the distribution
of number of satellites. Neither the Poisson nor the Negative Binomial is doing that well;
however, the ZIP does pretty well.
2. A 95% confidence interval for β with the Negative Binomial is
$\hat{\beta} \pm 1.96\,\widehat{se} = 0.7603 \pm 1.96(0.1769)$
$= 0.7603 \pm 0.3467 \longrightarrow (0.4136, 1.1070)$
The interval from the Poisson regression was (0.4619, 0.7167), which has
half-length equal to 0.1274. The interval from the Negative Binomial is wider than the
one from the Poisson because the greater estimated variance under the Negative Binomial
(i.e., $\hat{\mu}_i + 1.0740\,\hat{\mu}_i^2$) results in a greater estimated standard
error for β (see page 82 of the text).
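The two intervals and the variance function can be compared numerically. This is a minimal sketch using the estimates quoted in Problems 3.13 and 3.14; the helper names and the mean values fed to the variance function are illustrative choices, not fitted values from the data.

```python
# Slope estimates and standard errors quoted above.
b1_nb, se_nb = 0.7603, 0.1769     # Negative Binomial
b1_po, se_po = 0.5893, 0.0650     # Poisson
D = 1.0740                        # estimated dispersion (1/phi in Agresti notation)

def wald_ci(est, se, z=1.96):
    """Wald interval: estimate +/- z * standard error."""
    return (est - z * se, est + z * se)

def nb_variance(mu, d=D):
    """Negative Binomial variance function mu + d * mu^2 (Poisson when d = 0)."""
    return mu + d * mu ** 2

ci_nb = wald_ci(b1_nb, se_nb)
ci_po = wald_ci(b1_po, se_po)
width_ratio = se_nb / se_po       # ratio of the two half-lengths

print("NB CI:", tuple(round(v, 4) for v in ci_nb))
print("Poisson CI:", tuple(round(v, 4) for v in ci_po))
print("half-length ratio:", round(width_ratio, 2))
for mu in (1.0, 3.0, 5.0):        # illustrative means only
    print("mu =", mu, "Poisson var =", mu, "NB var =", round(nb_variance(mu), 3))
```

The gap between the two variances grows with the mean, which is why the Poisson standard errors are too optimistic for these data.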
Fit a zero inflated Poisson regression using weight as a predictor of the mean and width as
a predictor in a logit model for the mixing probability.
I fit several ZIP models, but the one that seemed to be the best, in terms of the fit of
the model to the data and the significance of the parameter estimates, is the one with
weight as a predictor in the Poisson regression and width as a predictor in a logit model
for the mixing probability.
The results are
Criteria For Assessing Goodness Of Fit

Criterion                   DF     Value        Value/DF
Deviance                           725.7859
Scaled Deviance                    725.7859
Pearson Chi-Square          169    229.9698     1.3608
Scaled Pearson X2           169    229.9698     1.3608
Log Likelihood                     167.1414
Full Log Likelihood                -362.8930
AIC (smaller is better)            733.7859
AICC (smaller is better)           734.0240
BIC (smaller is better)            746.3991

Algorithm converged.
Analysis Of Maximum Likelihood Parameter Estimates

                            Standard   Wald 95% Confidence   Wald
Parameter   DF   Estimate   Error      Limits                Chi-Square   Pr > ChiSq
Intercept   1    0.9901     0.2092     0.5800    1.4002      22.39        <.0001
weight      1    0.1945     0.0761     0.0454    0.3436      6.54         0.0106
Scale       0    1.0000     0.0000     1.0000    1.0000

NOTE: The scale parameter was held fixed.
Analysis Of Maximum Likelihood Zero Inflation Parameter Estimates

                            Standard   Wald 95% Confidence   Wald
Parameter   DF   Estimate   Error      Limits                Chi-Square   Pr > ChiSq
Intercept   1    12.3902    2.6937     7.1106    17.6698     21.16        <.0001
width       1    -0.5005    0.1044     -0.7051   -0.2959     22.98        <.0001
So the estimated model for the probability is
$\hat{\pi}_i = \dfrac{\exp(12.3902 - 0.5005\,\mathrm{width}_i)}{1 + \exp(12.3902 - 0.5005\,\mathrm{width}_i)}.$
For a one-unit increase in width, the odds of being in the “zero class” are multiplied
by exp(−0.5005) = 0.61. In other words, the wider the crab, the less likely it is to be
in the zero class.
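The odds interpretation can be verified by evaluating the logit model at two widths one unit apart. This is a minimal sketch; the widths 24 and 25 are hypothetical values chosen only for illustration, and the function names are not from the SAS output.

```python
import math

# Zero-inflation (logit) estimates from the table above.
a0, a1 = 12.3902, -0.5005

def zero_prob(width):
    """Estimated probability of being in the zero class for a given width."""
    eta = a0 + a1 * width
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    return p / (1.0 - p)

# Hypothetical widths, one unit apart (illustration only).
p_narrow, p_wide = zero_prob(24.0), zero_prob(25.0)
odds_ratio = odds(p_wide) / odds(p_narrow)

print("P(zero class) at width 24:", round(p_narrow, 3))
print("P(zero class) at width 25:", round(p_wide, 3))
print("odds ratio:", round(odds_ratio, 2), "exp(a1):", round(math.exp(a1), 2))
```

The odds ratio is the same for any pair of widths one unit apart, although the change in probability itself depends on where on the curve the widths fall.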
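The estimated probability of a count:
$\hat{\mu}_i = \exp(0.9901 + 0.1945\,\mathrm{weight}_i)$
$$P(Y_i = y) = \begin{cases} \hat{\pi}_i + (1 - \hat{\pi}_i)\exp(-\hat{\mu}_i) & \text{for } y = 0 \\ (1 - \hat{\pi}_i)\,\dfrac{\exp(-\hat{\mu}_i)\,\hat{\mu}_i^{\,y}}{y!} & \text{for } y > 0 \end{cases}$$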
Since exp(0.1945) = 1.21, the expected number of satellites in the Poisson part of the
model is 1.21 times the expected number for a crab weighing one unit less.
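To see how the two parts of the ZIP model combine, the fitted probabilities can be evaluated for a single crab. This is a minimal sketch using the estimates above; the weight of 2.44 kg is the value from Problem 3.13, while the width of 26 is a hypothetical value chosen only for illustration. The function names are illustrative.

```python
import math

B0, B1 = 0.9901, 0.1945       # Poisson part: log mean as a function of weight
A0, A1 = 12.3902, -0.5005     # logit part: zero-class probability as a function of width

def poisson_mean(weight):
    return math.exp(B0 + B1 * weight)

def zero_prob(width):
    return 1.0 / (1.0 + math.exp(-(A0 + A1 * width)))

def zip_pmf(y, weight, width):
    """P(Y = y) under the fitted zero-inflated Poisson model."""
    mu, pi = poisson_mean(weight), zero_prob(width)
    poisson = math.exp(-mu) * mu ** y / math.factorial(y)
    return (pi if y == 0 else 0.0) + (1.0 - pi) * poisson

weight, width = 2.44, 26.0    # width is a hypothetical value
probs = [zip_pmf(y, weight, width) for y in range(60)]   # tail beyond 59 is negligible
total = sum(probs)
zip_mean = sum(y * p for y, p in enumerate(probs))

print("P(Y = 0):", round(probs[0], 3))
print("sum of probabilities:", round(total, 6))
print("overall mean:", round(zip_mean, 3))
print("(1 - pi) * mu:", round((1 - zero_prob(width)) * poisson_mean(weight), 3))
```

The overall mean equals (1 − π̂)μ̂, so weight acts multiplicatively only on the Poisson part, while width shifts the share of structural zeros.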
Figure 5: Observed and fitted from ZIP with logit model.