ANOVA: PART I
Quick check for clarity

Variable 1: Sex (Male vs Female)
Variable 2: Class (Freshman vs Sophomore vs Junior vs Senior)

How many levels in Variable 1? Variable 2?

Keep in mind:
 • ‘Variable’ refers to what is being measured
 • ‘Level’ refers to a group within the variable (the number of levels is how many groups it has)
Last week(s)


Since we’ve returned from break we’ve started analyzing data by comparing groups
More specifically, we’ve compared groups using one-sample, independent-samples, and paired-samples t-tests
 • Also introduced the concepts of ‘degrees of freedom’ and ‘95% confidence intervals’

Let’s take a moment to summarize when to use the
different statistical tests we know…
When to use what…
# of IV (format)       | # of DV (format) | Examining…        | Test/Notes
1 (continuous)         | 1 (continuous)   | Association       |
1 (continuous)         | 1 (continuous)   | Prediction        |
Multiple               | 1 (continuous)   | Prediction        |

# of IV (format)       | # of DV (format) | Examining…        | Test/Notes
1 (grouping, 2 levels) | 1 (continuous)   | Group differences | When one group is a ‘known’ population
1 (grouping, 2 levels) | 1 (continuous)   | Group differences | When both groups are independent
1 (grouping, 2 levels) | 1 (continuous)   | Group differences | When both groups are dependent (related)
Different statistical tests…

All tests are based on calculating a test statistic
 • Such as a t-score, Pearson’s r, etc…
Using the test statistic, the sample size, and number of groups (degrees of freedom) we estimate a p-value
While all of these tests are useful, they do have limits
 • Can’t have more than 1 independent variable
    • Except MLR
 • Can’t have more than 1 dependent variable
 • The dependent variable must be continuous
Where to now?

Moving forward, we’ll eliminate these restrictions:
 • ANOVAs compare groups, and can be used with:
    • Multiple IVs
    • IVs with any number of levels
    • e.g., we can compare 5 variables with 3 levels each
 • MANOVAs can be used with multiple DVs
 • Chi-Square and Logistic Regression can make use of categorical DVs (not continuous)
    • e.g., can predict heart attack vs no heart attack
Tonight’s topic

Tonight we’ll start discussing ANOVA

Like t-tests:
 • ANOVAs are a family of statistical tests used to compare groups
 • ANalysis Of Variance
 • There are (basically) 3 types of ANOVAs

Unlike t-tests, ANOVAs can be used to compare two or more groups (levels)
 • More ‘flexibility’ and options than t-tests
Side-by-Side Comparison

                              | t-test                                | ANOVA
Can analyze group differences | Yes                                   | Yes
How many levels per variable? | Only 2                                | 2 or more
Test statistic used           | t score                               | F score/F ratio
P-value calculated using…     | t score, sample size, and number of   | F score, sample size, and number of
                              | groups (degrees of freedom)           | groups (degrees of freedom)
Types of ANOVA’s

1) One-Way ANOVA (basic, univariate)
 • Can compare one IV with any number of levels
 • e.g., compare mean GRE scores of ISU, IWU, and UI students

2) Factorial ANOVA
 • Can do 1) above, plus…
 • Can use multiple IVs (compare GRE by school and sex)

3) Repeated Measures ANOVA
 • Can compare several groups (2 or more) in related subjects (paired groups, longitudinal data, etc…)
Back to the same dataset

I’m re-using the fitness test and academics dataset.
 • Dataset has information about FITNESSGRAM fitness tests and ISAT academic test scores in a group of adolescents

Again, I’m interested to know if academic success is related to health/fitness
 • We’ve seen how we can compare two groups using a t-test
 • But, if my question becomes more complicated, I’ll need to use ANOVA
Example

Is academic success related to physical fitness?

The ISAT test categorizes students into 3 groups:
 • Exceeding Standard (very good)
 • Meeting Standard (good enough)
 • Below Standard (not as good)

If academic success is related to fitness, I should be able to compare the fitness test results between these three groups
 • Do kids exceeding the standard have the highest ‘fitness’?
Example


3 Groups: Exceeds vs Meets vs Below Standard
I could use multiple t-tests to compare PACER laps between the three groups, right?
 • I’d need three:
    • t-test 1: Exceeds vs Meets
    • t-test 2: Exceeds vs Below
    • t-test 3: Meets vs Below

However, this violates a big statistical ‘law’. This approach is frowned upon for one big reason…
Family-Wise Error Rate

Using several t-tests instead of 1 ANOVA is not acceptable due to the Family-wise error rate
 • Also known as Experiment-wise error rate
Mathematically it can be complicated to explain, but let’s think of it like this:
 • If I set alpha at 0.05, that means I’m willing to accept a 5% risk of Type I error (random sampling error)
 • So, what happens if I complete 100 statistical tests on the same sample of people?
    • If I ran each of my t-tests at an alpha of 0.05 and there were no real differences, odds are that I’d make a Type I error about 5 times out of 100
Even more simplistic explanation

Imagine I develop a pregnancy test and it is 95% accurate
 • Then, I have 100 women take the test.
 • I expect 95 tests will be correct – 5 tests will not

The theory is that it works the same way with random sampling error/Type I error.
 • If I’m 95% confident (alpha = 0.05) that I did not make a Type I error on 1 statistical test…
 • For every 100 tests, I can expect 5 to have Type I error
Family-wise Error

You can actually calculate this for yourself if you want to
 • 1 – Desired Confidence^Number of Tests = Chance of Type I error
 • Remember, our ‘desired confidence’ is 95%, or 0.95

If we did 1 t-test, then:
 • 1 – 0.95^1 = 0.05 (notice, this is our normal chance of error)
 • 3 t-tests = 1 – 0.95^3 = 0.14, 14% chance of error
 • 13 t-tests = 1 – 0.95^13 = 0.49, 49% chance of error
The ‘goal’ of the ANOVA is to make multiple statistical comparisons but minimize risk of Family-wise error
 • By providing only one p-value
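
If you'd like to check the family-wise error arithmetic yourself, here is a minimal Python sketch of the 1 – 0.95^(number of tests) formula. The function name is just an illustration, and the formula treats the tests as independent of each other, which is a simplifying assumption.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one Type I error across n_tests tests.

    Assumes every test is run at the same alpha and that the tests
    are independent of one another (a simplification).
    """
    desired_confidence = 1.0 - alpha          # 0.95 when alpha = 0.05
    return 1.0 - desired_confidence ** n_tests


# Reproduce the three cases worked out above.
for n in (1, 3, 13):
    print(f"{n:>2} t-tests: {familywise_error_rate(n):.2f}")
```

Changing the number of tests shows how quickly the risk climbs as comparisons pile up.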
Back to the example


Instead of using 3 different t-tests (and 3 p-values), we use 1 ANOVA and create 1 p-value
For this example:
 • 1 IV: Academic Success, 3 levels: Exceeds, Meets, Below
 • 1 DV: PACER Laps (continuous variable)

H0: There is no difference in aerobic fitness between the three groups of academic success
HA: There is a difference in aerobic fitness between the three groups of academic success
Coding the IV

Here is how I coded my IV, academic success:
Degrees of Freedom

Recall ‘degrees of freedom’ is based on your number of groups and your number of subjects
 • For t-tests, we always have 2 levels so the df is always easy to calculate
    • # of Subjects – 2
We always want the biggest df possible (just like we want a large sample size) because it means more precise estimates and more power to detect real differences
df in ANOVA’s

For ANOVAs, we can have more than two groups, so pay close attention to your df – you will now have two
 • Degrees of Freedom 1 = # Groups – 1
 • Degrees of Freedom 2 = # Subjects – # Groups

Df 1 is the ‘Between Groups’ df
 • It refers to making comparisons between our groups (i.e., comparing Exceeds vs Meets vs Below)

Df 2 is the ‘Within Groups’ df
 • It refers to making comparisons between our subjects (i.e., the total subjects ‘within’ all the groups)
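
The two df formulas are simple enough to sketch in a few lines of Python. The function name is hypothetical, and the numbers plugged in (245 subjects, 3 groups) are the ones from this lecture's PACER example.

```python
def anova_degrees_of_freedom(n_subjects, n_groups):
    """Return (between-groups df, within-groups df) for a one-way ANOVA."""
    df_between = n_groups - 1            # Degrees of Freedom 1
    df_within = n_subjects - n_groups    # Degrees of Freedom 2
    return df_between, df_within


# Exceeds vs Meets vs Below, 245 students in total.
df1, df2 = anova_degrees_of_freedom(n_subjects=245, n_groups=3)
print(f"Between Groups df = {df1}, Within Groups df = {df2}")
```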
Output from One-Way ANOVA



Here is your ANOVA output:
N = 245
The sum of squares and mean square (ignore them) are used to calculate the F-ratio
Note df:
 • ‘Between Groups’ = 2 (3 groups – 1)
 • ‘Within Groups’ = 242 (245 subjects – 3 groups)
Output from One-Way ANOVA




Here is your ANOVA output:
N = 245
We use df and the F-ratio to calculate the p-value
P = 0.006, which is less than 0.05, so we can say the test was statistically significant. Reject the null:
 • HA: There is a difference in aerobic fitness between the three groups of academic success
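
SPSS does the p-value calculation for you, but as a rough illustration of how the df and F-ratio turn into a p-value, here is a minimal Python sketch. It relies on a shortcut that only works when the Between Groups df is exactly 2 (as in this example), where the upper tail of the F distribution has a simple closed form. It is not a general-purpose F-test and not how SPSS computes it internally. The function name is hypothetical, and the F-ratio of 5.17 is the value reported for this example.

```python
def p_value_when_df1_is_2(f_ratio, df_within):
    """Upper-tail p-value for an F(2, df_within) statistic.

    Valid ONLY when the between-groups df equals 2; other df values
    need the full F distribution (e.g., from a statistics package).
    """
    return (df_within / (df_within + 2.0 * f_ratio)) ** (df_within / 2.0)


p = p_value_when_df1_is_2(f_ratio=5.17, df_within=242)
print(f"p = {p:.3f}")
print("Reject the null" if p < 0.05 else "Fail to reject the null")
```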
Output from One-Way ANOVA
N = 245

P = 0.006, reject the null:
 • HA: There is a difference in aerobic fitness between the three groups of academic success

Do you have any other questions…? You should…
Notice, the ANOVA just says there is ‘a difference’
We have no idea what groups are different…
Post-Hoc Tests

Our ANOVA indicates that at least one of our three groups is different from another one - but which one?
 • Exceeds vs Meets
 • Exceeds vs Below
 • Meets vs Below

We have to do a follow-up test, a Post-Hoc test, to determine where the significant difference(s) are
 • Post hoc just means ‘after this’
 • ‘Mini’-tests used to find differences between groups AFTER a larger statistical test (like ANOVA)
WARNING with ANOVA’s

Please recognize:
 • ANOVAs only provide you with half of the information

If your ANOVA is statistically significant – you HAVE TO continue to complete post-hoc tests
 • Run more tests to find the specific group differences
If your ANOVA is not statistically significant – you can STOP
 • None of the post hoc tests would be statistically significant (because the ANOVA just said they weren’t)
Post-Hoc tests

A large group of statistical tests that function like t-tests
 • They compare ONLY two groups, but they do it multiple times
 • In SPSS, aka ‘Pair-wise Comparisons’

They are designed to avoid the family-wise error rate problem because they all ‘adjust’ the p-value based on the number of comparisons you make
 • i.e., they shrink your alpha level based on number of tests
 • As post-hoc tests and ANOVAs are strongly linked (you always run them together), SPSS accommodates this
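
To make ‘shrinking your alpha’ concrete, here is a minimal Python sketch using the Bonferroni adjustment, which is the simplest one to show by hand. Note this is Bonferroni, not Tukey: Tukey's adjustment is based on a different distribution (the studentized range) and is best left to the software. The function name is just for illustration.

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha after a Bonferroni adjustment."""
    return alpha / n_comparisons


# Three pair-wise comparisons: Exceeds vs Meets, Exceeds vs Below, Meets vs Below.
adjusted = bonferroni_alpha(alpha=0.05, n_comparisons=3)
print(f"Each comparison is judged against alpha = {adjusted:.4f}")
```

The more comparisons you make, the smaller the per-comparison alpha becomes, which is what keeps the family-wise error rate near 0.05.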
Post-Hoc tests



Several types of post-hoc tests you could use:
 • LSD
 • Dunnett
 • Sidak
 • SNK
 • Scheffe
 • Bonferroni
 • Duncan
 • And more…
They are pretty much all the same (for us)
The only one I want you to use in this class is Tukey
 • Perhaps the most commonly used post-hoc
 • Ignore every other post hoc test, unless told otherwise
Post-Hoc tests

Let’s re-run our ANOVA, this time selecting a post-hoc test
 • If you don’t tell it to, SPSS will not automatically run it
 • Select Tukey, NOT Tukey’s-b
More options

‘Options’ can provide you with descriptive statistics
Descriptive Stats


The sample sizes, means, SD, and 95% CI for our
three groups (dependent variable PACER Laps)
individually and in total
Notice, this 95% CI is not for mean differences, but
just the group mean
Output from One-Way ANOVA


This is the same output for the ANOVA we saw before, I just wanted to remind you of the p-value and decision
P = 0.006, reject the null:
 • HA: There is a difference in aerobic fitness between the three groups of academic success

Now, the post-hoc tests will tell us which groups are different
Post-Hoc: Tukey’s test, Multiple Comparisons

Now we have mean differences, p-values for each comparison, and 95% CIs for the mean differences
 • Which groups are significantly different?
 • Remember, we are making 3 comparisons – but there are 6 test results? (SPSS lists each pair twice, once in each direction)
Post-Hoc: Tukey’s test, Multiple Comparisons


The ‘Exceeds’ group is significantly higher than the ‘Meets’ and ‘Below’ groups (p = 0.034 and 0.008)
The ‘Meets’ group is NOT significantly different from the ‘Below’ group (p = 0.405)
Results in text


Results of the one-way ANOVA indicated that PACER Laps were significantly different between Science Score groups (F(2, 242) = 5.17, p = 0.006). Tukey post-hoc comparisons revealed that the ‘Exceeds’ group completed significantly more PACER laps than the ‘Meets’ group (p = 0.034) and the ‘Below’ group (p = 0.008). However, the ‘Meets’ group was not significantly different from the ‘Below’ group (p = 0.405).
Questions on One-Way ANOVA?
If you wanted, you could also include the mean
differences or means with 95% CI’s, but usually this
is reported in a table since it can get complicated
A few more notes on ANOVA

SPSS also provides you with another output called ‘Homogeneous Subsets’
 • This feature is supposed to make it easy to see which groups are significantly different (or rather - which groups are the same, or homogeneous):
A few more notes on ANOVA

SPSS also provides you with another output called ‘Homogeneous Subsets’
 • The problem with this feature is that it uses a slightly different method to calculate the p-values
 • It will sometimes give you different results! Ignore this!
In our example, this output actually conflicts with what we found from the Tukey pairwise comparisons!
A few more notes on ANOVA

Statistical assumptions for the ANOVA are the same as those for the t-test!
 • 1) Normally distributed data
 • 2) Sample is representative of the population
 • 3) Homogeneity of variance

Unlike the t-test, we will not be using Levene’s test of Homogeneity – please ignore this as well
A few more notes on ANOVA

Our example compared 1 variable with 3 levels:
 • Exceeds, Meets, and Below
 • We had 3 post-hoc comparisons
    • Exceeds vs Meets; Exceeds vs Below; and Meets vs Below
Keep in mind what happens if you change the variable to have more levels:
 • For example, NHANES (a national health database) codes race as a 5-level variable:
    • Black, White, Mexican American, Other-Hispanic, Other
 • Assume we wanted to compare average blood pressure between these groups using a one-way ANOVA…
Multiple Comparisons Grow Quickly

Post-hoc tests would include several pair-wise comparisons:
 • Black, White, Mexican American, Other-Hispanic, Other
    • Black v White
    • Black v MexAm
    • Black v Oth-Hisp
    • Black v Other
    • White v MexAm
    • White v Oth-Hisp
    • White v Other
    • MexAm v Oth-Hisp
    • MexAm v Other
    • Oth-Hisp v Other
This would be 10 comparisons
Be mindful of how you organize your groups and variables, ANOVAs can quickly get out of hand
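
To see how fast the comparisons pile up, here is a minimal Python sketch that lists every pair-wise comparison for the five NHANES race groups and counts the pairs for a few other numbers of levels. The variable names are just for illustration.

```python
from itertools import combinations
from math import comb

groups = ["Black", "White", "Mexican American", "Other-Hispanic", "Other"]

# Every unordered pair of groups is one post-hoc comparison.
pairs = list(combinations(groups, 2))
for first, second in pairs:
    print(f"{first} v {second}")
print(f"Total comparisons: {len(pairs)}")

# The count is k * (k - 1) / 2 for a variable with k levels.
for k in (3, 5, 8):
    print(f"{k} levels -> {comb(k, 2)} comparisons")
```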
Upcoming…

In-class activity

Homework:
 • Cronk complete 6.5
 • Holcomb Exercises 49, 50, and 53 (on 95% CI’s)

More ANOVA next week
 • Factorial ANOVA!