Analysis of Variance STAT E-150 Statistical Methods

In Analysis of Variance, we are testing for the equality of the means of
several levels of a variable. The technique is to compare the variation
between the levels and the variation within each level. (The levels of
the variable are also referred to as groups, or treatments.)
If the variation due to the level (variation between levels) is
significantly larger than the variation within each level, then we can
conclude that the means of the levels are not all equal.
We will test the hypothesis
H0: μ1 = μ2 = ∙∙∙ = μk
vs.
Ha: the means are not all equal
using the ratio
F = (variability between groups) / (variability within the groups) = MSGroups / MSError
When the numerator is large compared to the denominator, we will
reject the null hypothesis.
∙ The numerator of F measures the variation between groups; this is
called the Mean Square for groups:
MSGroups = SSGroups/df = SSGroups/(k-1)
∙ The denominator of F measures the variation within groups; this is
called the Error Mean Square:
MSError = SSError/df = SSError/(n - k)
∙ The test statistic is F = MSGroups/MSError. We will reject H0 when F is large.
∙ MSGroups has k - 1 degrees of freedom, where k = the number of
groups.
∙ MSError has n - k degrees of freedom, where n is the total sample size.
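As a rough illustration of the quantities above, the following Python sketch computes SSGroups, SSError, the two mean squares, and F for a small data set. The data values and the function name one_way_anova are hypothetical and are not taken from the course materials.

```python
def one_way_anova(groups):
    """Return (ss_groups, ss_error, df_groups, df_error, f_ratio)."""
    k = len(groups)                                  # number of groups
    n = sum(len(g) for g in groups)                  # total sample size
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation between groups: group means around the grand mean
    ss_groups = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation within groups: observations around their own group mean
    ss_error = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)

    df_groups, df_error = k - 1, n - k
    ms_groups = ss_groups / df_groups
    ms_error = ss_error / df_error
    return ss_groups, ss_error, df_groups, df_error, ms_groups / ms_error


# Hypothetical data: three groups of four observations each
data = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [5.0, 6.0, 7.0, 8.0]]
ss_groups, ss_error, df_groups, df_error, f_ratio = one_way_anova(data)
print(ss_groups, ss_error, df_groups, df_error, round(f_ratio, 3))
```

A large F here would mean that the group means are spread out more than the scatter inside each group would explain.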
The ANOVA table:
Source   df    SS         MS         F                  p
Model    k-1   SSGroups   MSGroups   MSGroups/MSError
Error    n-k   SSError    MSError
Total    n-1   SSTotal
If the null hypothesis is true, the groups have a common mean, μ.
In general, each group mean μk may differ from the grand mean, μ, by
some value. This difference is called the group effect, and we denote this
value for the kth group by αk. Under the null hypothesis, every αk is 0.
One-Way Analysis of Variance Model
The ANOVA model for a quantitative response variable and a single
categorical explanatory variable with K values is
Response = Grand Mean + Group Effect + Error Term
   Y     =     μ      +      αk      +     ε
The Grand Mean (μ) is the part of the model that is common to all
observations. The Group Effect is the variability between groups.
The residual, or error, is the variability within groups.
Since μk = μ + αk, we can write this model as Y = μk + ε,
where the errors ε ~ N(0, σε) are independent.
That is, the errors are approximately normally distributed with a mean of
0 and a common standard deviation, and are independent.
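To make the decomposition concrete, here is a minimal Python sketch that splits each observation into an estimated grand mean, an estimated group effect, and a residual. The data and the group labels A, B, and C are hypothetical. The groups are the same size, so the overall average of the observations equals the average of the group means.

```python
# Hypothetical data: group label -> observed responses
data = {"A": [4.0, 6.0, 5.0], "B": [8.0, 9.0, 10.0], "C": [1.0, 2.0, 3.0]}

all_values = [y for ys in data.values() for y in ys]
grand_mean = sum(all_values) / len(all_values)           # estimate of mu

rows = []
for label, ys in data.items():
    group_mean = sum(ys) / len(ys)                        # estimate of mu_k
    effect = group_mean - grand_mean                      # estimate of alpha_k
    for y in ys:
        residual = y - group_mean                         # estimate of epsilon
        rows.append((label, y, grand_mean, effect, residual))

for label, y, mu, alpha, eps in rows:
    # Each observation is rebuilt exactly from the three pieces
    print(f"{label}: {y:5.1f} = {mu:.3f} + {alpha:+.3f} + {eps:+.3f}")
```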
The assumptions for a One-Way ANOVA are:
1. Independence Assumption
The groups must be independent of each other, and the subjects
within each group must be randomly assigned.
Think about how the data was collected:
Were the data collected randomly or generated from a randomized
experiment?
Were the treatments randomly assigned to experimental groups?
2. Equal Variance Assumption
The variances of the treatment groups are equal.
Look at side-by-side boxplots of the data to see if the spreads are
similar; also check that the spreads don't change systematically
with the centers and that the data is not skewed in each group.
If the spreads do change with the centers, or the data is skewed,
a transformation of the data may be appropriate.
Also plot the residuals against the predicted values to see if larger
predicted values lead to larger residuals; this may also suggest that
a reexpression should be considered.
3. Normal Population Assumption
The values for each treatment group are normally distributed.
Again, check side-by-side boxplots of the data for indications of
skewness and outliers.
Example:
A study reported in 1994 compared different psychological therapies for
teenaged girls with anorexia. Each girl’s weight was measured before
and after a period of treatment designed to aid weight gain. One group
received a cognitive-behavioral treatment, a second group
received family therapy, and the third group was a control group which
received no therapy. The subjects in this study were randomly assigned
to these groups.
The weight change was calculated as weight at the end of the study
minus weight at the beginning of the study; the weight change was
positive if the subject gained weight and negative if she lost weight.
What does this data indicate about the relative success of the three
treatments?
Note that in this analysis, the explanatory variable (type of therapy) is
categorical and the response variable (weight change) is quantitative.
The hypotheses are:
H0: μ1 = μ2 = μ3
Ha: the means are not all equal
Note that the alternative hypothesis is not Ha: μ1 ≠ μ2 ≠ μ3; it claims
only that at least one mean differs from the others.
Some of the data is shown below. For SPSS analysis, the data should
be entered with the group in one column and the data in a second
column:
      Group   WeightGain          Group   WeightGain
 1    1        -0.5          37   2        11.7
 2    1        -9.3          38   2         6.1
 3    1        -5.4          39   2         1.1
 4    1        12.3          40   2        -4
 5    1        -2            41   2        20.9
 6    1       -10.2          42   2        -9.1
 7    1       -12.2          43   2         2.1
 8    1        11.6          44   2        -1.4
 9    1        -7.1          45   2         1.4
10    1         6.2          46   2        -0.3
11    1        -0.2          47   2        -3.7
12    1        -9.2          48   2        -0.8
13    1         8.3          49   2         2.4
14    1         3.3          50   2        12.6
15    1        11.3          51   2         1.9
16    1         0            52   2         3.9
17    1        -1            53   2         0.1
18    1       -10.6          54   2        15.4
19    1        -4.6          55   2        -0.7
20    1        -6.7          56   3        11.4
21    1         2.8          57   3        11
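The two-column layout (group in one column, value in the other) is sometimes called long format. As a small sketch of how such rows can be split back into groups outside SPSS, the Python below uses a handful of rows copied from the listing above. It is not the full data set, so the printed means are for these rows only.

```python
from collections import defaultdict

# A few (group, weight gain) rows copied from the listing above; not the full data set
rows = [(1, -0.5), (1, -9.3), (1, -5.4),
        (2, 11.7), (2, 6.1), (2, 1.1),
        (3, 11.4), (3, 11.0)]

by_group = defaultdict(list)
for group, gain in rows:
    by_group[group].append(gain)

for group in sorted(by_group):
    gains = by_group[group]
    mean = sum(gains) / len(gains)
    print(f"Group {group}: n = {len(gains)}, mean of these rows = {mean:.2f}")
```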
First we will see if the equal variance condition is met, by comparing
side-by-side boxplots of the data:
The boxplots do not show a great deal of difference in the spread of the
data, but are not conclusive.
We can compare the largest standard deviation and the smallest
standard deviation; if this ratio is less than or equal to 2, then we can
assume that the variances are similar.
In this case Smax = 7.99 and Smin = 7.16. The ratio is 7.99/7.16 = 1.116,
which is less than 2, and so we can assume that the equal variance
condition is met.
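The same check can be sketched in a few lines of Python. The standard deviations are the unrounded values reported in the SPSS Descriptives output for this example, and the cutoff of 2 is the rule of thumb stated above. The variable names are illustrative only.

```python
# Group standard deviations as reported in the SPSS Descriptives output
std_devs = {1: 7.9887, 2: 7.3085, 3: 7.1574}

s_max = max(std_devs.values())
s_min = min(std_devs.values())
ratio = s_max / s_min

# Rule of thumb from the text: a ratio of at most 2 suggests similar variances
similar = ratio <= 2
print(f"Smax/Smin = {ratio:.3f}; variances look similar: {similar}")
```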
We can also use Levene's test:
Test of Homogeneity of Variances
WeightGain
Levene Statistic   df1   df2   Sig.
.314               2     69    .731
This test for homogeneity of variances tests the null hypothesis that the
population variances are equal:
H0: σ1² = σ2² = σ3²
Ha: the variances are not all equal
Since the p-value is large (.731), we cannot reject this null
hypothesis, and we have no evidence that the data violates the equal
variance assumption.
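For readers curious about what a Levene-type statistic measures, here is a minimal sketch, assuming the mean-centered form of the test: take the absolute deviation of each observation from its own group mean, then compute a one-way ANOVA F ratio on those deviations. Other versions center on the median instead (library routines such as scipy.stats.levene expose this as a center option). The function name and the data are hypothetical, and no p-value is computed.

```python
def levene_statistic(groups):
    """Mean-centered Levene statistic with its degrees of freedom (df1, df2)."""
    # Step 1: absolute deviation of each observation from its own group mean
    z = []
    for g in groups:
        m = sum(g) / len(g)
        z.append([abs(y - m) for y in g])

    # Step 2: one-way ANOVA F ratio computed on the absolute deviations
    k = len(z)
    n = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n
    means = [sum(g) / len(g) for g in z]
    between = sum(len(g) * (m - grand) ** 2 for g, m in zip(z, means))
    within = sum((v - m) ** 2 for g, m in zip(z, means) for v in g)
    return (between / (k - 1)) / (within / (n - k)), k - 1, n - k


# Hypothetical data, chosen only to exercise the function
groups = [[1.0, 3.0, 5.0], [2.0, 6.0, 10.0], [4.0, 5.0, 6.0]]
w, df1, df2 = levene_statistic(groups)
print(f"Levene statistic = {w:.3f} on ({df1}, {df2}) degrees of freedom")
```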
We can check the Normality condition with Normal Probability Plots of
the three groups:
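A Normal probability plot pairs each sorted observation with the Normal quantile expected at its rank. The sketch below computes those pairs with Python's standard library. It uses ten of the Group 1 values listed earlier purely for illustration, and the plotting positions (i - 0.5)/n are an assumed convention; statistical packages may use a slightly different one.

```python
from statistics import NormalDist

# Ten of the Group 1 weight gains listed earlier, used only for illustration
sample = [-9.3, -5.4, -2.0, -0.5, 0.0, 2.8, 3.3, 6.2, 8.3, 12.3]

ordered = sorted(sample)
n = len(ordered)

# Plotting positions (i - 0.5)/n are one common convention; others exist
probs = [(i - 0.5) / n for i in range(1, n + 1)]
theoretical = [NormalDist().inv_cdf(p) for p in probs]

# Points (theoretical quantile, observed value); a roughly straight pattern
# is consistent with Normality
for q, y in zip(theoretical, ordered):
    print(f"{q:7.3f}  {y:6.1f}")
```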
We can also use the table shown below to assess Normality, using a
hypothesis test where the null hypothesis is that the distribution is
normal. The p-values for groups 1 and 3 are larger than .05, so this null
hypothesis is not rejected for these groups. For group 2, both p-values
(.001 and .008) are smaller than .05, so Normality is in doubt for this
group.
Tests of Normality
                         Kolmogorov-Smirnov(a)         Shapiro-Wilk
             Treatment   Statistic   df   Sig.     Statistic   df   Sig.
WeightGain   1           .094        26   .200*    .952        26   .257
             2           .223        29   .001     .896        29   .008
             3           .129        17   .200*    .954        17   .516
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.
For the moment, we will assume that the conditions are met.
The SPSS output includes the following ANOVA table:
ANOVA
Gain
                 Sum of Squares   df   Mean Square   F       Sig.
Between Groups   614.644          2    307.322       5.422   .006
Within Groups    3910.742         69   56.677
Total            4525.386         71
You can see that F = 5.422 and the p-value is .006.
Since p is small, we reject the null hypothesis that the means are all
equal. This data provides evidence of a difference in the mean weight
gain for the three groups.
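The entries of the ANOVA table can be checked by hand. The Python sketch below recomputes the mean squares and F from the reported sums of squares and degrees of freedom. The p-value line relies on a closed form that applies only when the numerator has 2 degrees of freedom, as it does here with three groups; in general one would use an F distribution routine from a statistics library instead.

```python
# Values copied from the SPSS ANOVA table
ss_between, df_between = 614.644, 2
ss_within, df_within = 3910.742, 69

ms_between = ss_between / df_between      # Mean Square = Sum of Squares / df
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

# Upper-tail area of the F distribution. This closed form holds only when the
# numerator has 2 degrees of freedom (three groups); it is not a general formula.
p_value = (1 + 2 * f_ratio / df_within) ** (-df_within / 2)

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_ratio:.3f}, p = {p_value:.4f}")
```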
But where is this difference?
Descriptives
Gain
                                                  95% Confidence Interval for Mean
        N    Mean    Std. Deviation   Std. Error   Lower Bound   Upper Bound   Minimum   Maximum
1       26   -.450   7.9887           1.5667       -3.677        2.777         -12.2     15.9
2       29   3.007   7.3085           1.3572       .227          5.787         -9.1      20.9
3       17   7.265   7.1574           1.7359       3.585         10.945        -5.3      21.5
Total   72   2.764   7.9836           .9409        .888          4.640         -12.2     21.5
Which group had the greatest mean weight gain? Group 3
Which group had the lowest mean weight gain? Group 1
Is either of these values significantly different from the other group
means? Are all three groups different in terms of weight gain?
We can answer these questions using a post-hoc test, Tukey's Honestly
Significant Difference test, which compares all pairs of group means.
Here is one result of this test:
Multiple Comparisons
Gain
Tukey HSD
                                                                    95% Confidence Interval
(I) Group   (J) Group   Mean Difference (I-J)   Std. Error   Sig.   Lower Bound   Upper Bound
1           2           -3.4569                 2.0333       .212   -8.327        1.413
            3           -7.7147*                2.3482       .005   -13.339       -2.090
2           1           3.4569                  2.0333       .212   -1.413        8.327
            3           -4.2578                 2.2996       .161   -9.766        1.251
3           1           7.7147*                 2.3482       .005   2.090         13.339
            2           4.2578                  2.2996       .161   -1.251        9.766
*. The mean difference is significant at the 0.05 level.
The first line shows the comparison between Group 1 and Group 2.
The mean difference is -3.4569, but it is not significant since p = .212.
The next line shows that the difference between Group 1 and Group 3
is significant; not only is p = .005, but SPSS shows an asterisk beside
the mean difference of -7.7147 to indicate that the difference is
significant.
What conclusions can you draw about the difference between
Group 2 and Group 3? Since p = .161, the difference between Group 2
and Group 3 is not significant.
Are the means different for all three groups?
The three means are not all different; the only significant difference is
between the means of Group 1 and Group 3.
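The Mean Difference and Std. Error columns of the Multiple Comparisons table can be reproduced from the earlier output. The sketch below assumes the usual standard error for a difference of two group means, the square root of MSError(1/ni + 1/nj). The Sig. values and interval endpoints also depend on the studentized range distribution, which this sketch does not compute.

```python
from itertools import combinations
from math import sqrt

# Group sizes and means from the Descriptives output; MSError from the ANOVA table
n = {1: 26, 2: 29, 3: 17}
mean = {1: -0.450, 2: 3.007, 3: 7.265}
ms_error = 56.677

results = {}
for i, j in combinations(sorted(n), 2):
    diff = mean[i] - mean[j]                         # Mean Difference (I-J)
    se = sqrt(ms_error * (1 / n[i] + 1 / n[j]))      # Std. Error of the difference
    results[(i, j)] = (diff, se)
    print(f"Group {i} - Group {j}: difference = {diff:.3f}, std. error = {se:.4f}")
```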
Hypothesis Tests and Confidence Intervals
A pair of means can be considered significantly different at a .05 level of
significance if and only if zero is not contained in a 95% confidence
interval for their difference.
We can use Fisher's Least Significant Difference to determine where any
differences lie by identifying any confidence intervals which do not
contain 0.
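A sketch of this check in Python follows, using the mean differences and standard errors reported in the Multiple Comparisons table. The critical value 1.995 is an assumption: it is an approximate table value of t for a 95% interval with 69 error degrees of freedom, not something taken from the SPSS output. These unadjusted intervals are narrower than the Tukey intervals.

```python
# Mean differences and standard errors as reported in the Multiple Comparisons table
comparisons = {(1, 2): (-3.4569, 2.0333),
               (1, 3): (-7.7147, 2.3482),
               (2, 3): (-4.2578, 2.2996)}

# ASSUMPTION: 1.995 is an approximate t critical value for a 95% interval
# with 69 error degrees of freedom, taken from a t table
t_star = 1.995

significant = {}
for pair, (diff, se) in comparisons.items():
    lower, upper = diff - t_star * se, diff + t_star * se
    significant[pair] = not (lower <= 0 <= upper)    # zero outside the interval
    print(f"Groups {pair}: ({lower:.3f}, {upper:.3f}) significant: {significant[pair]}")
```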
Are the means different for all three groups?
Are the results the same as when we used Tukey's HSD?
The only confidence interval that does not contain 0 is the CI for the
difference in the means of Group 1 and Group 3. This indicates that the
means for these two groups are different.
How else can we follow up on this analysis? Since the groups are
independent, we can do our own pairwise t-tests for the difference of the
means.
Two-sample t-tests
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0 (or > 0 or < 0)
Assumptions:
Independent random samples
Approximately Normal distributions for both samples
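As a hedged illustration, the t statistic for comparing Groups 1 and 2 can be computed from the summary statistics in the Descriptives output. The sketch shows both the unpooled (Welch) and the pooled standard error, since the text does not say which form is used. Turning t into a p-value requires the t distribution, which is left to a table or to statistical software.

```python
from math import sqrt

# Summary statistics for Groups 1 and 2 from the Descriptives output
n1, mean1, sd1 = 26, -0.450, 7.9887
n2, mean2, sd2 = 29, 3.007, 7.3085

diff = mean1 - mean2

# Unpooled (Welch) standard error: does not assume equal variances
se_unpooled = sqrt(sd1**2 / n1 + sd2**2 / n2)
t_unpooled = diff / se_unpooled

# Pooled standard error: assumes equal variances, with n1 + n2 - 2 df
pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
se_pooled = sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = diff / se_pooled

print(f"difference = {diff:.3f}")
print(f"unpooled t = {t_unpooled:.3f}; pooled t = {t_pooled:.3f} with {n1 + n2 - 2} df")
```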
Here are the results for the test of
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0
Is there a significant difference between the means of these groups?
What is your statistical conclusion? Be sure to state the p-value.
p = .100
Since p > .05, the null hypothesis is not rejected.
What is your conclusion in context?
The data does not indicate that there is a significant difference
between the mean weight gain with cognitive-behavioral treatment and
family therapy.
What else can be concluded?
If the data was gathered in a well-designed experiment in which subjects
were randomly assigned to treatment groups, then we can conclude
causality.
In an observational study in which random samples are taken from the
populations, the results can be extended to the associated populations.
SPSS Instructions for ANOVA
To create side-by-side boxplots of the data:
Assume that your file has the groups in one column and the values of the
variable in a second column.
Choose > Graphs > Chart Builder
Choose Boxplot and drag the first boxplot (Simple) to the preview area.
Drag the column with the groups to the x-axis, and the column with the
values of the response variable to the y-axis.
Click OK.
To create Normal Probability Plots of the data:
Choose > Analyze > Descriptive Statistics > Explore
In the Explore dialog box, choose the Dependent List variable and the
Factor List variable.
Click on Plots, select Normality plots with tests, and click Continue.
Click OK.
To perform a One-Way Analysis of Variance
Choose > Analyze > Compare Means > One-Way ANOVA
Choose the Dependent List variable and the Factor List variable.
Click on Options, and under Statistics, choose Descriptive and
Homogeneity of Variance Test. Click on Continue and then OK.
To perform Tukey's Honestly Significant Difference test
Choose > Analyze > Compare Means > One-Way ANOVA
(The variables may still be selected, so you may not have to enter the
Dependent List variable and the Factor List variable.)
Click on Post-Hoc, and select Tukey. Note that you can also select LSD
to choose Fisher's Least Significant Difference test.