Running head: TUTORIAL IN STATISTICS: SAMPLE SIZE DETERMINATION
Tutorial in Statistics: Sample Size Determination for ANOVA and MANOVA
Amy Williams
University of Calgary
Tutorial in Statistics: Sample Size Determination for ANOVA and MANOVA
Introduction
ANOVA and MANOVA are two forms of statistical analysis that are at the forefront of
statistical research today. ANOVA, which stands for analysis of variance, is an approach to
analyzing data for two or more groups that involves breaking down the dependent variable into
between-group and within-group variances (Kerlinger & Lee, 2000). MANOVA, or multivariate
analysis of variance, is similar to ANOVA; this approach, however, involves the analysis of two
or more groups on two or more dependent variables, as well as the examination of the
correlations that exist between these variables (Stevens, 2009). In order for researchers to carry
out either ANOVA or MANOVA effectively, certain factors must be taken into consideration.
One of these factors is the determination of an appropriate sample size, or the number of
participants needed to take part in a particular study. This tutorial sheds light on the importance
of sample size in research, highlights important theoretical information about this topic, and
gives an overview of how appropriate sample size determination is carried out for both ANOVA
and MANOVA.
The Importance of Sample Size Determination
Determining sample size is an important aspect of research for several reasons. If a
sample size is too small or too large, this can have economic implications for researchers in
terms of wasted or overused resources, respectively (Lenth, 2001). Conducting a study can be
expensive, so knowing ahead of time how many participants are needed allows researchers to
anticipate potential costs (Kerlinger & Lee, 2000). There are also ethical concerns related to
sample size. According to Lenth (2001),
An undersized experiment exposes the subjects to potentially harmful treatments without
advancing knowledge. In an oversized experiment, an unnecessary number of subjects
are exposed to a potentially harmful treatment, or are denied a potentially beneficial one
(p. 187).
For these reasons, researchers and statisticians are advised to invest time in planning statistical
studies and ensure that the number of subjects involved will provide them with statistically
significant results. Utts and Heckard (2006) encourage researchers to ask two important
questions to help guide research design planning: how precise will my results be if my study
contains a particular number of participants? And how large does my sample need to be in order
to obtain statistically significant results?
Important Theoretical Information
Despite the importance of sample size determination, the literature related to this topic is
not extensive (Lenth, 2001). Introductory statistics textbooks generally provide comprehensive
overviews of this topic; the authors of these books tend to emphasize the use of larger sample
sizes over smaller sample sizes. Kerlinger and Lee (2000) give an important reason for this
preference: “the smaller the sample the larger the error, and the larger the sample the smaller the
error” (p. 175). Utts and Heckard (2006) point to the greater accuracy and reduced
uncertainty that larger sample sizes bring to research design and statistical studies, but
they also caution students and researchers that, in very large samples, even small
effects can produce statistically significant results. Sample size determination is therefore not just a matter
of simply choosing to use a large sample because more benefits are associated with this choice;
careful consideration of the overall statistical study is imperative.
How to Determine Sample Size
There are several steps involved in determining the sample size for a statistical study.
Lenth (2001) sheds light on the power approach, which involves five important steps: first, the
researcher must identify both the null and the alternative hypothesis; second, he or she must
decide on a significance level; third, the researcher must also decide on an effect size; fourth, he
or she must gather missing values from related studies or published literature; and fifth, he or she
must then decide on the power value for the study. Kerlinger and Lee (2000) focus on three
similar aspects of sample size determination, beginning with a calculation of the actual or
estimated population standard deviation value (this information can come from previous studies)
and identifying the amount of error that will be tolerated, and then estimating the probability of
making a Type I error. In addition, Kerlinger and Lee (2000) offer a formula for sample size
determination which is given in Table 1.
Lenth (2001) stresses the importance of power analysis in estimating an appropriate
sample size, stating that this method is “one of the most popular approaches to sample size
determination” (p. 187). Power, after all, is an integral component of statistical analysis because
it is what analysis is based upon: the ability to detect statistical differences when they in fact
exist. Stevens (2009) refers to power as “the probability of making a correct decision” (p. 162).
Table 1
Sample Size Estimation Equation
$$n = \frac{Z^2 \sigma^2}{d^2}$$
Z = standard score corresponding to the specified probability of risk
σ = the standard deviation of the population
d = specified deviation
Note. From Foundations of Behavioral Research (4th ed.) by F.N. Kerlinger and H.B. Lee
(2000). Belmont, CA: Cengage Learning, p. 297. Copyright 1992 by Fred N. Kerlinger.
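The equation in Table 1 can be sketched in a few lines of Python. This is a minimal illustration, not Kerlinger and Lee's own procedure: the function names are hypothetical, and the sketch assumes that Z is the two-tailed critical value of the standard normal distribution for the chosen level of risk.

```python
from statistics import NormalDist


def z_for_risk(alpha: float) -> float:
    """Two-tailed standard-normal critical value for risk level alpha.

    Assumption: this is how Z in Table 1 is read; the source only calls it
    the standard score corresponding to the specified probability of risk.
    """
    return NormalDist().inv_cdf(1 - alpha / 2)


def sample_size(z: float, sigma: float, d: float) -> float:
    """Table 1 equation, n = Z^2 * sigma^2 / d^2, returned unrounded."""
    return (z ** 2) * (sigma ** 2) / (d ** 2)


# Show the standard score for two common risk levels.
for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: Z = {z_for_risk(alpha):.3f}")
```

Because d is squared in the denominator, halving the tolerated deviation quadruples the required sample size, which is one reason the choice of d deserves as much care as the choice of risk level.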
Sample Size Determination Example: ANOVA
Description of the Study
A group of educational researchers in Kuwait wishes to conduct an experiment involving
fifth grade students that attend English-speaking private schools. This study would require
students – both boys and girls – to take part in a two-week (one 90-minute period a day) Social
Studies instructional session in which they would be randomly placed in one of three groups:
Group 1: No Study Plan (Control Group)
Group 2: Implementation of Social Studies “Study Plan”
Group 3: Memorization
The purpose of this study is to determine if having students take part in a Social Studies
“Study Plan” prior to writing a curriculum-based assessment would enhance their performance
on this assessment compared to those students who do not take part in a study plan or those who
are simply encouraged to memorize notes that they have taken during the two-week period.
A study plan involves a contract that students sign in class and then take home to have
their parents sign; it involves students making a commitment to review their notes using different
strategies several times a week before a scheduled test. Strategies include, but are not limited to,
comparing and contrasting ideas that students have learned using a Venn diagram, summarizing
main ideas, or illustrating an important concept. The study plan is meant to elicit higher-level
thinking and make students aware of their own learning styles in an attempt to discourage them
from simply memorizing their notes.
How many students should be placed in each group in order for the group of researchers
to detect a statistically significant result? To answer this question, the group of researchers will
follow the aforementioned steps outlined in Lenth’s (2001) power approach and calculate the
sample size using Kerlinger and Lee’s (2000) sample estimation formula (the researchers could
also use one of the online sample calculators listed in the Resource section of this tutorial to
calculate the sample size).
Power Analysis
First, the group of researchers will propose a hypothesis. For this particular study, they
predict that the mean score for both Group 1 and Group 3 will be lower than that for Group 2.
The null hypothesis, of course, states that a difference between the groups will not be present.
Next, the researchers will choose a significance level. In the social sciences, a significance level
of .05 is usually chosen for statistical studies, which means that the researcher has less than a 5%
chance of making a Type I error, or choosing to reject the null hypothesis when it is in fact true
(Stevens, 2009). The researchers will also choose an effect size. According to Stevens (2009),
the effect size refers to “how much of a difference the treatments make, or the extent to which
the groups differ in the population on the dependent variable(s)” (p. 163). Generally, small or
medium effect sizes are chosen in research pertaining to the social sciences (Stevens, 2009). A
medium effect size is 0.5. Next, the population standard deviation is either estimated or derived
from past research; for this particular study, the researchers estimate a standard deviation of 0.75.
Finally, they decide on the power value. Lenth (2001) considers a power of .80 common in
statistical studies, and the researchers of this particular study have chosen it as their target.
$$n = \frac{Z^2 \sigma^2}{d^2}$$
According to the above formula, Z represents the standard score associated with the level
of significance, or risk (Kerlinger & Lee, 2000). For a two-tailed significance level of .05, the t-distribution probability chart in the
appendix of Kerlinger and Lee’s book entitled Foundations of Behavioral Research gives Z = 1.96.
The aforementioned standard deviation of the population (σ) is 0.75 and the specified deviation
(d) which identifies the precision the researcher hopes to obtain is set at 0.3.
$$n = \frac{(1.96)^2 (0.75)^2}{(0.3)^2} = \frac{3.842 \times 0.5625}{0.09} \approx \frac{2.16}{0.09} \approx 24$$
The number of participants needed for each group for this particular ANOVA is 24 (or 72
participants in total). This formula is an effective method of obtaining the required sample size
for a study involving random sampling; a variation of this formula exists for studies in which the
population size from which the sample will be taken is known (Kerlinger & Lee, 2000).
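The arithmetic of this example can be checked with a short script. It is only a sketch that reuses the values stated in the text (Z = 1.96, σ = 0.75, d = 0.3, three groups); the variable names are illustrative. It also shows one assumption-dependent detail: the tutorial rounds the result to the nearest whole number, whereas rounding up, a common conservative convention, would give one more participant per group.

```python
import math

z = 1.96       # standard score for the .05 significance level (from the text)
sigma = 0.75   # estimated population standard deviation (from the text)
d = 0.3        # specified deviation, i.e. the desired precision (from the text)
groups = 3     # control, study plan, memorization

# n = Z^2 * sigma^2 / d^2, kept unrounded first
n_exact = (z ** 2) * (sigma ** 2) / (d ** 2)

n_nearest = round(n_exact)        # rounding used in the tutorial
n_rounded_up = math.ceil(n_exact) # conservative alternative (an assumption)

print(f"n per group, unrounded: {n_exact:.2f}")
print(f"nearest whole number: {n_nearest} per group, {n_nearest * groups} in total")
print(f"rounded up: {n_rounded_up} per group, {n_rounded_up * groups} in total")
```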
(Lauter, 1978)
Sample Size Determination Example: MANOVA
Stevens (2009) presents a condensed table that indicates the number of subjects needed
per group in a MANOVA depending on the desired effect size (see Table 2). More
comprehensive tables for three, four, five, and six-group MANOVA, however, have been
adapted from those Lauter (1978) created and are included at the back of Stevens’ book entitled
Applied Multivariate Statistics for the Social Sciences; these tables are invaluable to both
students and researchers as they minimize the estimation and calculations that formulas such as
the one mentioned above require.
In order to effectively use these statistical tables, a researcher must make three decisions:
the number of groups required for the study, the effect size, and the significance level (either
0.05 or 0.01). Once these decisions have been made, sample size for MANOVA can be
determined.
Table 2
Sample Size Estimation for MANOVA
                          Groups
Effect Size     3          4          5          6
Very Large      12-16      14-18      15-19      16-21
Large           25-32      28-36      31-40      33-44
Medium          42-54      48-62      54-70      58-76
Small           92-120     105-140    120-155    130-170
Note. From Applied multivariate statistics for the social sciences (5th ed.) by J.P. Stevens. New York,
NY: Taylor & Francis Group. Copyright 2009.
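Reading Table 2 amounts to a lookup on effect size and number of groups, which can be sketched as follows. The values are transcribed from Table 2 as reproduced in this tutorial; the dictionary layout and the function names are hypothetical and are not part of Stevens' (2009) text.

```python
# Per-group sample size ranges transcribed from Table 2.
# Keys: effect size label, then number of groups; values: (low, high).
TABLE_2 = {
    "very large": {3: (12, 16), 4: (14, 18), 5: (15, 19), 6: (16, 21)},
    "large": {3: (25, 32), 4: (28, 36), 5: (31, 40), 6: (33, 44)},
    "medium": {3: (42, 54), 4: (48, 62), 5: (54, 70), 6: (58, 76)},
    "small": {3: (92, 120), 4: (105, 140), 5: (120, 155), 6: (130, 170)},
}


def per_group_range(effect_size: str, groups: int) -> tuple[int, int]:
    """Return the (low, high) number of subjects per group from Table 2."""
    return TABLE_2[effect_size.lower()][groups]


def total_range(effect_size: str, groups: int) -> tuple[int, int]:
    """Scale the per-group range up to the whole study."""
    low, high = per_group_range(effect_size, groups)
    return low * groups, high * groups


low, high = per_group_range("Medium", 4)
print(f"Medium effect, 4 groups: {low}-{high} subjects per group")
print(f"Total participants: {total_range('Medium', 4)}")
```

The condensed table gives only a range for each cell; the more detailed appendix tables that Stevens adapted from Lauter (1978) are needed to pick a single value within that range.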
Description of the Experiment
A researcher wants to determine the impact that both marital status and living in housing
provided by the school (independent variables) have on overseas teachers’ ratings of job
satisfaction and length of employment in Kuwait (dependent variables). The researcher randomly
obtains participants for this study from the various international schools in Kuwait and has
them fill out a survey that includes questions about job satisfaction (for example, “overall, how
would you rate your experience at your particular school”) and length of employment. Upon
receiving completed surveys, he plans on dividing the participants up into four groups based on
the independent variables:
Group 1: Married and living in school accommodations
Group 2: Single and living in school accommodations
Group 3: Married and living in own accommodations
Group 4: Single and living in own accommodations.
He has chosen a power of .80 and has decided upon a medium effect size (basing both decisions
on what is common in most social science studies; he could have also researched past studies on
a similar topic and chosen an effect size this way). According to the statistical table in the
appendix of Stevens’ (2009) book, this researcher would require 50 participants per group (200
participants in total) in order for his study to produce statistically significant results.
If the researcher decides to change the study so that there are now four dependent
variables (‘job satisfaction,’ ‘length of employment,’ ‘overall school rating,’ and ‘work
environment rating’), and he now anticipates a large effect size, the number of participants
required per group would decrease to 37. Sample size determination for MANOVA is therefore
dependent upon the number of dependent variables as well as the effect size and the chosen
power value. This method of sample size determination is referred to as a priori estimation,
which is a method that relies extensively on power values (Stevens, 2009).
Summary and Conclusions
Determining the sample size for a statistical study is an important aspect of a quality
research design; it is also a difficult process (Lenth, 2001). Several methods for determining
sample sizes for ANOVA, MANOVA, and other analyses exist, including the power approach,
the random sampling formula, and a priori estimation. Each of these methods requires the
researcher’s knowledge of the effect size and power and is used before data for a particular
statistical study is collected. When used with careful consideration and planning, these methods
are effective tools.
The post hoc estimation of power is also an option for determining sample size; this
method, however, is used after a study has actually been carried out and involves the researcher
interpreting the results and identifying the effect sample size and effect size have on the power
(Stevens, 2009). Not all researchers and statisticians favour this method. Lenth (2001) cautions
against retrospective planning such as post hoc power estimation, stating that the goal of this method involves
“collect[ing] enough additional data to obtain statistical significance, while ignoring scientific
meaning” (p. 191).
Regardless of which sample size estimation method a researcher chooses, he or she must
acknowledge that sample size, effect size, and power are dependent on one another and that in
order to estimate or determine one, information about the other two is needed. A wealth of
information, statistical software, and web resources are available to help researchers determine
the appropriate sample size needed for their statistical studies so that this important task – as
daunting as it may seem – is not overlooked or neglected altogether.
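The interdependence of sample size, effect size, and power can be illustrated with a deliberately simplified case. The sketch below is not an ANOVA or MANOVA calculation and does not come from the sources cited in this tutorial: it assumes a two-group comparison of means, a two-tailed test, a standardized effect size, and the standard normal approximation. All function names are hypothetical. Its only purpose is to show that fixing any two of the three quantities determines the third.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size per group, given effect size and power (normal approximation)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


def approx_power(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Power, given effect size and per-group sample size (far tail ignored)."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(effect_size * math.sqrt(n / 2) - z_alpha)


def detectable_effect(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest standardized effect detectable with the given n and power."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) * math.sqrt(2 / n)


for effect in (0.2, 0.5, 0.8):
    n = n_per_group(effect)
    print(f"effect size {effect}: n per group = {n}, power = {approx_power(effect, n):.3f}")
```

Under these assumptions, smaller effects require much larger samples to reach the same power, which mirrors the pattern in Table 2. For an actual ANOVA or MANOVA design, the tables and calculators referred to in this tutorial remain the appropriate tools.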
Online Resources
Java Applets for Power and Sample Size:
http://www.stat.uiowa.edu/~rlenth/Power/
IBM SPSS Sample Power Program:
http://www-01.ibm.com/software/analytics/spss/products/statistics/samplepower/
National Statistical Service Sample Size Calculator:
http://www.nss.gov.au/nss/home.nsf/pages/Sample+Size+Calculator+Description?OpenDocument
References
Kerlinger, F. N., & Lee, H. B. (2000). Foundations of Behavioral Research (4th ed.). Belmont,
CA: Cengage Learning.
Lauter, J. (1978). Sample size requirements for the T² test of MANOVA (tables for one-way
classification). Biometrical Journal, 20, 389-406.
Lenth, R. V. (2001, August). Some practical guidelines for effective sample size determination.
The American Statistician, 55, 187-193.
Stevens, J. P. (2009). Applied multivariate statistics for the social sciences (5th ed.). New York,
NY: Taylor & Francis Group.
Utts, J. M., & Heckard, R. F. (2006). Statistical Ideas and Methods. United States: Thomson
Brooks/Cole.