DESCRIPTIVE DATA ANALYSIS, SAMPLE SIZE
Dr. Yan Liu
Department of Biomedical, Industrial and Human Factors Engineering
Wright State University
5/3/2010
Types of Statistics
Descriptive statistics
Comprises the statistical methods dealing with the collection, tabulation, and summarization of the collected data, so as to present meaningful information about the data
Inferential statistics
Consists of the methods involved in the analysis and interpretation of data that enable the researcher to develop meaningful inferences about the population the data represent
These two areas are interrelated
While descriptive statistics organizes the collected data in a systematic manner, inferential statistics analyzes the data and enables one to produce significant inferences about it
Measures of Central Tendency
Indicate the central point or the greatest frequency concerning a set of
data
Mean
The statistical mean of a set of data is its average
Population mean vs. sample mean
Population mean, µ, is the expected value E(x), such that if an infinite number of
measurements are made, the average of the infinite measurements is the result; this
represents the true value of a measurement
The sample mean, x̄, is the average value of a sample, which is a finite series of measurements, and is an estimate of the population mean
Median
The median of a set of data, x̃, is the value which, when the data are arranged in ascending or descending order, satisfies the following conditions: 1) if the number of observations is odd, the median is the middle value; and 2) if the number of observations is even, the median is the average of the two middle values
The same as the 50th percentile of a set of data
Measures of Central Tendency (Cont’d)
Mode
The mode of a set of data is the specific value that occurs with the greatest
frequency
May be more than one or none
Measures of Variation
Indicate the variability inherent in a set of data
Variance and standard deviation
Characterize the amount of spread in the distribution of the data
Population vs. sample variance and deviation
The population variance, σ2, and standard deviation, σ, indicate the deviation among
individual measurements from the population mean, for the entire population
Sample variance, s2, and standard deviation, s, indicate how much each individual
data value deviates from the sample mean
s² = (1/(n−1)) · Σ_i (x_i − x̄)²
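These summary measures can be computed directly with Python's standard statistics module; a minimal sketch on a small hypothetical sample:

```python
import statistics

# Hypothetical sample of eight measurements
data = [3, 5, 5, 6, 7, 5, 4, 9]

x_bar = statistics.mean(data)        # sample mean, x-bar
med = statistics.median(data)        # 50th percentile
mode = statistics.mode(data)         # most frequent value
s2 = statistics.variance(data)       # sample variance, divides by n-1
sigma2 = statistics.pvariance(data)  # population variance, divides by n

print(x_bar, med, mode, s2, sigma2)
```

statistics.stdev and statistics.pstdev give the corresponding standard deviations s and σ.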
Symmetry and Skewness
Symmetric Distribution
The histogram of variable is symmetric with respect to a vertical axis passing
through its mean
For a symmetrically distributed population or sample, the mean, median and
mode have the same value
Skewed Distribution
It is positively skewed if a greater proportion of the data are less than or equal to
the mean; this indicates that the mean is larger than the median
It is negatively skewed if a greater proportion of the data are greater than or
equal to the mean; this indicates that the mean is less than the median
Positively skewed histogram
(skewed to the right)
Negatively skewed histogram
(skewed to the left)
Correlation Analysis
Purpose
Measures how strongly two attributes correlate with each other
Correlation Coefficient
Correlation analysis for numerical variables
Indicates the strength and direction of a linear relationship between two numeric
random variables
Pearson’s product moment coefficient
r_A,B = Σ_i (a_i − Ā)(b_i − B̄) / (n·σ_A·σ_B) = (Σ_i a_i·b_i − n·Ā·B̄) / (n·σ_A·σ_B),  i = 1..n
Curvilinear Relationship
If the relationship between two variables is curvilinear, the Pearson product-moment correlation coefficient will not indicate the existence of a relationship
To check whether the relationship is linear, the easiest way is to construct a scatterplot, which gives a visual indication of the shape of the relationship
[Scatterplots: perfect positive linear relationship; perfect negative linear relationship; curvilinear relationship (quadratic with an intermediate minimum); no relationship]
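A small self-contained sketch of the product-moment formula on hypothetical data, including the curvilinear caveat above (a perfect quadratic relationship yields r = 0):

```python
import math

def pearson_r(a, b):
    """Pearson product-moment correlation: cov(a, b) / (sigma_a * sigma_b)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sigma_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / n)
    sigma_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / n)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n
    return cov / (sigma_a * sigma_b)

x = [1, 2, 3, 4, 5]
print(pearson_r(x, [2, 4, 6, 8, 10]))   # perfect positive linear: r ≈ 1
print(pearson_r(x, [10, 8, 6, 4, 2]))   # perfect negative linear: r ≈ -1

# Curvilinear caveat: perfect quadratic relationship, yet r = 0
q = [-2, -1, 0, 1, 2]
print(pearson_r(q, [v ** 2 for v in q]))
```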
Correlation Analysis (Cont’d)
Chi-Square (χ2) Test
Correlation analysis for categorical variables
Contingency table of A (values A1…Ap) and B (values B1…Bq):

        A1     A2     ...    Ap
B1     A1B1   A2B1    ...   ApB1
B2     A1B2   A2B2    ...   ApB2
...     ...    ...    ...    ...
Bq     A1Bq   A2Bq    ...   ApBq

Cell (Ai, Bj) represents the joint event that A = Ai and B = Bj

χ² = Σ_i Σ_j (o_ij − e_ij)² / e_ij,  i = 1..p, j = 1..q

χ²: test statistic of the hypothesis that A and B are independent; χ² ~ χ²((p−1)(q−1))

e_ij = count(A = A_i) × count(B = B_j) / n

n: the total number of cases
count(A = Ai): the observed frequency of the event A = Ai
count(B = Bj): the observed frequency of the event B = Bj
Gender and Preferred Reading Example

Observed frequencies:
                      Gender
Preferred_Reading    male    female    Row Margin
fiction               250      200        450
non-fiction            50     1000       1050
Column Margin         300     1200      n=1500

Expected frequencies:
e11 = count(male) × count(fiction) / n = 300 × 450 / 1500 = 90
e12 = count(male) × count(non-fiction) / n = 300 × 1050 / 1500 = 210
e21 = count(female) × count(fiction) / n = 1200 × 450 / 1500 = 360
e22 = count(female) × count(non-fiction) / n = 1200 × 1050 / 1500 = 840
Observed (expected) frequencies:
                      Gender
Preferred_Reading    male         female       Row Margin
fiction              250 (90)     200 (360)       450
non-fiction           50 (210)   1000 (840)      1050
Column Margin        300          1200          n=1500

χ² = Σ_i Σ_j (o_ij − e_ij)² / e_ij
   = (250−90)²/90 + (50−210)²/210 + (200−360)²/360 + (1000−840)²/840
   = 284.44 + 121.90 + 71.11 + 30.48 = 507.93

dof = (2−1) × (2−1) = 1
χ²_0.005(1) = 7.879, so P < 0.005

Conclusion: Gender and Preferred_Reading are strongly correlated! In particular, males prefer fiction to non-fiction readings, whereas females prefer non-fiction to fiction readings.
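The worked example can be reproduced in a few lines; a sketch that derives the expected counts from the margins and accumulates the χ² statistic:

```python
# Observed counts: rows = Preferred_Reading, columns = Gender
observed = [[250, 200],    # fiction:     male, female
            [50, 1000]]    # non-fiction: male, female

n = sum(sum(row) for row in observed)                        # 1500
row_m = [sum(row) for row in observed]                       # [450, 1050]
col_m = [sum(row[j] for row in observed) for j in range(2)]  # [300, 1200]

chi2 = 0.0
for i in range(2):
    for j in range(2):
        e = row_m[i] * col_m[j] / n    # expected frequency e_ij
        chi2 += (observed[i][j] - e) ** 2 / e

dof = (2 - 1) * (2 - 1)
print(round(chi2, 2), dof)   # far above the 0.005 critical value of 7.879
```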
Effect Sizes
Problem with p-value
A p-value is the probability of a unique result under a specific, conditional null hypothesis, and is affected by the sample size and measurement scale
What are Effect Sizes
A family of indices that measure the strength of the relationship between two variables, independent of the sample size and the measurement scale
Highly Recommended by APA
In scientific experiments, it is often useful to know not only whether an experiment has a statistically significant effect, but also the size of any observed effects
Effect size measures are the common currency of meta-analysis studies that summarize the findings from a specific area of research
Standardized Difference Between Means
Cohen’s d
Defined as mean difference divided by the pooled standard deviation
d = (X̄1 − X̄2) / σ_p

σ_p is the pooled population standard deviation of X1 and X2:

σ_p = √[(n1·σ1² + n2·σ2²) / (n1 + n2)];  when n1 = n2, σ_p = √[(σ1² + σ2²) / 2]
Interpretation of Cohen’s d
Small effect size: ~[0.0, 0.5)
Medium effect size: ~[0.5, 0.8)
Large effect size: ~[0.8, +∞)
Standardized Difference Between Means (Cont’d)
Hedges’ g
Virtually the same as Cohen’s d in large sample sizes
g = (X̄1 − X̄2) / S_p

S_p is the pooled sample standard deviation of X1 and X2:

S_p = √[((n1−1)·S1² + (n2−1)·S2²) / (n1 + n2 − 2)];  when n1 = n2, S_p = √[(S1² + S2²) / 2]

S1 = σ1·√(n1/(n1−1)), S2 = σ2·√(n2/(n2−1))

Some software (e.g., Effect Size Generator) calculates g by adjusting the overall effect size based on the sample sizes, as follows:

g_adjusted = (X̄1 − X̄2) / S_p × (1 − 3 / (4(n1 + n2) − 9))
Visual Interface Example (I)

Scores on interfaces A1 and A2:
A1: 6 5 5 7 4 3 5 4    X̄1 = 4.875
A2: 8 6 9 6 6 5 5 7    X̄2 = 6.5

σ1 = √[Σ_i (X_1i − X̄1)² / n] = 1.166    S1 = √[Σ_i (X_1i − X̄1)² / (n−1)] = 1.246
σ2 = √[Σ_i (X_2i − X̄2)² / n] = 1.323    S2 = √[Σ_i (X_2i − X̄2)² / (n−1)] = 1.414

Cohen's d: d = (4.875 − 6.5) / √[(1.166² + 1.323²) / 2] = −1.303
Hedges' g: g = (4.875 − 6.5) / √[(1.246² + 1.414²) / 2] = −1.219
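The d and g values above can be checked in a few lines; a sketch using the slide's data (equal group sizes, so the pooled SD reduces to the root of the mean of the two variances):

```python
import math

a1 = [6, 5, 5, 7, 4, 3, 5, 4]   # interface A1, mean 4.875
a2 = [8, 6, 9, 6, 6, 5, 5, 7]   # interface A2, mean 6.5

def mean(x):
    return sum(x) / len(x)

def var(x, ddof):
    m = mean(x)
    return sum((v - m) ** 2 for v in x) / (len(x) - ddof)

# Cohen's d uses population variances (ddof=0);
# Hedges' g uses sample variances (ddof=1)
d = (mean(a1) - mean(a2)) / math.sqrt((var(a1, 0) + var(a2, 0)) / 2)
g = (mean(a1) - mean(a2)) / math.sqrt((var(a1, 1) + var(a2, 1)) / 2)

print(round(d, 3), round(g, 3))   # -1.303 -1.219
```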
Standardized Difference between One Mean
and a Population Mean
Cohen’s d
d = (X̄ − µ) / σ_X
σ_X is the population standard deviation of X
Hedges’ g
g = (X̄ − µ) / S_X
S_X is the sample standard deviation of X
Standardized Difference between Paired
Population Means
Cohen’s d
d = D̄ / σ_D
σ_D is the population standard deviation of the paired difference between two variables, D
Hedges’ g
g = D̄ / S_D
S_D is the sample standard deviation of the paired difference between two variables, D
Visual Interface Example

Participant (S):   1   2   3   4   5   6   7   8
A1:                6   5   5   7   4   3   5   4
A2:                8   6   9   6   6   5   5   7
D12 = A1 − A2:    −2  −1  −4   1  −2  −2   0  −3

D̄12 = −1.625, |D̄12| = 1.625

σ_D = √[Σ_i (D_i − D̄)² / n] = 1.495
S_D = √[Σ_i (D_i − D̄)² / (n−1)] = 1.598

Cohen’s d: d = 1.625 / 1.495 = 1.087
Hedges’ g: g = 1.625 / 1.598 = 1.017
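A sketch for the paired case, reusing the interface scores; D̄ is negative here (A1 scores are lower), so the magnitude is reported:

```python
import math

a1 = [6, 5, 5, 7, 4, 3, 5, 4]
a2 = [8, 6, 9, 6, 6, 5, 5, 7]
diffs = [x - y for x, y in zip(a1, a2)]   # paired differences D12

n = len(diffs)
d_bar = sum(diffs) / n                                            # -1.625
sigma_d = math.sqrt(sum((v - d_bar) ** 2 for v in diffs) / n)       # 1.495
s_d = math.sqrt(sum((v - d_bar) ** 2 for v in diffs) / (n - 1))     # 1.598

d = abs(d_bar) / sigma_d   # Cohen's d
g = abs(d_bar) / s_d       # Hedges' g
print(round(d, 3), round(g, 3))   # 1.087 1.017
```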
Effect Sizes of Correlation
Pearson Product-Moment Correlation Coefficient
Correlation between numeric variables
Point Biserial Correlation Coefficient (rpb)
Used when one variable, say X1, is continuous but the other variable, say X2, is
dichotomous
Assuming that X2 has two values, 0 and 1, the data set can be divided into two
groups, group 1 which receives the value "1" on X2 and group 2 which receives
the value "0" on X2. Then rpb is calculated as follows
r_pb = (M1 − M0) / S_X × √[n1·n0 / ((n1 + n0)(n1 + n0 − 1))]

where M1 is the mean of X1 for all data points in group 1 of X2, M0 is the mean of X1 for all data points in group 2 of X2, n1 is the number of data points in group 1, n0 is the number of data points in group 2, and S_X is the sample standard deviation of X1
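A sketch of r_pb on hypothetical data; the data set and 0/1 coding below are illustrative, not from the slides:

```python
import math

def point_biserial(x1, x2):
    """r_pb between continuous x1 and dichotomous x2 (coded 0/1)."""
    g1 = [v for v, grp in zip(x1, x2) if grp == 1]
    g0 = [v for v, grp in zip(x1, x2) if grp == 0]
    n1, n0 = len(g1), len(g0)
    n = n1 + n0
    m1, m0 = sum(g1) / n1, sum(g0) / n0
    mean_x = sum(x1) / n
    # sample standard deviation of x1 (divides by n-1)
    s_x = math.sqrt(sum((v - mean_x) ** 2 for v in x1) / (n - 1))
    return (m1 - m0) / s_x * math.sqrt(n1 * n0 / (n * (n - 1)))

# Hypothetical data: four observations in each group
x1 = [6, 5, 5, 7, 4, 3, 5, 4]
x2 = [1, 1, 1, 1, 0, 0, 0, 0]
print(round(point_biserial(x1, x2), 3))   # 0.75
```

Since r_pb is algebraically a Pearson correlation, the same value is obtained by applying the product-moment formula to x1 and the 0/1-coded x2.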
Effect Sizes for ANOVA
Effect Sizes for ANOVA
Measure the degree of association between an effect (i.e., a main effect, an
interaction) and the dependent variable
Can be thought of as the correlation between an effect and the dependent
variable
If the value of the measure of association is squared, it can be interpreted as the
proportion of variance in the dependent variable that is attributable to each effect
Commonly used measures of effect size in ANOVA
Eta squared, η2
Partial Eta squared, ηp2
Omega squared, ω2
Intraclass correlation, ρI
η2 and ηp2 are estimates of degree of association for the sample, while ω2 and ρI are
estimates of the degree of association in the population
Eta Squared
Eta Squared, η²
The proportion of the total variance that is attributed to an effect
η² = SS_Effect / SS_T
Statistical issue
The effect size of an effect is dependent upon the number and magnitude of other effects

Effect            Sum of Squares    η²
Drive                   24          24/610 = 3.93%
Reward                 112          18.36%
Reward * Drive         144          23.61%
Error                  330          54.10%
SST                    610
Within-Subject Design

Effect                                 Sum of Squares    η²
Interface                                  10.56         10.56/35.43 = 29.8%
Participant                                15.94
Participant * Interface (error term)        8.94
Total                                      35.43
Partial Eta Squared
Partial Eta Squared, ηp²
The proportion of the (effect + error) variance that is attributable to the effect
ηp² = SS_Effect / (SS_Effect + SS_Err)
Statistical issue
The partial Eta squared values of all effects do not sum to the amount of variance of the dependent variable accounted for by all the independent variables

Effect            Sum of Squares    ηp²
Drive                   24          24/(24+330) = 6.78%
Reward                 112          25.34%
Reward * Drive         144          30.38%
Error                  330
SST                    610
Within-Subject Design

Effect                                 Sum of Squares    ηp²
Interface                                  10.56         10.56/(10.56+8.94) = 54.2%
Participant                                15.94
Participant * Interface (error term)        8.94
Total                                      35.43
Omega Squared
Omega Squared, ω²
An estimate of the dependent variance accounted for by the independent variable in the population for a fixed effects model
ω² for between-subjects, fixed effects is
ω² = (SS_Effect − (df_Effect)(MS_Err)) / (SS_T + MS_Err)
ω² is always smaller than either η² or ηp²

Effect            Sum of Squares    df    Mean Squares    ω²
Drive                   24           1        24          (24 − 1×18.33)/(610+18.33) = 0.90%
Reward                 112           2        56          12.0%
Reward * Drive         144           2        72          17.08%
Error                  330          18        18.33
SST                    610
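The three sample- and population-based measures can be computed together from the ANOVA table; a sketch using the Drive × Reward sums of squares above:

```python
# Sums of squares and degrees of freedom from the Drive x Reward ANOVA
ss = {"Drive": 24, "Reward": 112, "Reward*Drive": 144}
df = {"Drive": 1, "Reward": 2, "Reward*Drive": 2}
ss_err, df_err = 330, 18

ss_t = sum(ss.values()) + ss_err   # total sum of squares = 610
ms_err = ss_err / df_err           # mean square error ~ 18.33

for effect in ss:
    eta2 = ss[effect] / ss_t                                       # eta squared
    p_eta2 = ss[effect] / (ss[effect] + ss_err)                    # partial eta squared
    omega2 = (ss[effect] - df[effect] * ms_err) / (ss_t + ms_err)  # omega squared
    print(f"{effect}: eta2={eta2:.4f} partial={p_eta2:.4f} omega2={omega2:.4f}")
```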
Within-Subject Design

Effect                                 Sum of Squares    df    Mean Squares    ω²
Interface                                  10.56          1       10.56        (10.56 − 1×1.28)/(35.43+1.28) = 25.3%
Participant                                15.94          7        2.28
Participant * Interface (error term)        8.94          7        1.28
Total                                      35.43         15
Intraclass Correlation
Intraclass Correlation, ρI
An estimate of the dependent variance accounted for by the independent
variable in the population for a random effects model
ρ_I = (MS_Effect − MS_Err) / (MS_Effect + (df_Effect)(MS_Err))
Sample Size
Importance of Sample Size
When the sample size is too small, it can be difficult to detect differences
between experiment conditions (a low statistical power)
When the sample size is too large, we waste resources (both time and money)
Steps in Determining Sample Size
Determine the design of the experiment
Single-factor between-subject design, single-factor within-subject design, factorial
between-subject design, factorial within-subject design, mixed design, etc.
Decide the null and alternative hypotheses and statistical test to be used
Decide the amount of difference to be detected
Significance level, α
Power of the test, π
The probability of rejecting the null hypothesis when it is actually false
Usually between 0.50 and 0.90
When you don’t have a good idea of the effect you are seeking, choose π at
least 0.80
Combine the above information to find sample size
Compare Two Population Means: Independent Samples (I)
H0: µ1 = µ2
H1: µ1 < µ2
µ1 = 0, µ2 = 50, σ² = 60, n = 24
Pr(reject H0 | H0 is false) = π = 0.8
[Figure: sampling distributions of the two sample means, with the critical region marked]
Compare Two Population Means: Independent Samples (II)
µ1 = 0, µ2 = 50, σ² = 60, n = 48
Pr(reject H0 | H0 is false) = π = 0.98
[Figure: sampling distributions of the two sample means, with the critical region marked]
Compare Two Population Means with
Independent Samples
For large samples (n>30), the sample size per group n needs to satisfy
n ≥ 2(z_{1−α/2} + z_π)² · σ²_error / ∆²   (two-sided test)
n ≥ 2(z_{1−α} + z_π)² · σ²_error / ∆²     (one-sided test)

σ²_error: the within-group variance
∆: the smallest difference between the two groups you wish to detect
z_{1−α/2}: the percentile of the normal distribution used as the critical value in a two-sided test of size α (1.96 for α = 0.05)
z_{1−α}: the percentile of the normal distribution used as the critical value in a one-sided test of size α (1.645 for α = 0.05)
z_π: the π×100-th percentile of the normal distribution (0.84 for the 80th percentile)
Compare Two Population Means with
Independent Samples (Cont’d)
For small samples (n≤30), the sample size per group n needs to satisfy
n ≥ 2(t_{1−α/2, n−1} + t_{π, n−1})² · σ²_error / ∆²   (two-sided test)
n ≥ 2(t_{1−α, n−1} + t_{π, n−1})² · σ²_error / ∆²     (one-sided test)
Since the particular t distribution depends on the sample size, the equation must be
solved iteratively (trial-and-error)
The sample size increases with σerror and decreases with ∆
It is hypothesized that 40 year old men who drink more than three cups of coffee per
day will score more highly on the Cornell Medical Index (CMI) than men who do not
drink coffee. The CMI ranges from 0 to 195, and previous research has shown that
scores on the CMI increase by about 3.5 points for every decade of life. It is decided
that an increase, caused by drinking coffee, which was equivalent to about 10 years of
age would be enough to warrant concern.
1. ∆ = 3.5; suppose σ_error = 7, α = 0.05, π = 0.8, one-sided test → n ≥ 50
2. ∆ = 3.5; suppose σ_error = 4, α = 0.05, π = 0.8, one-sided test → n ≥ 19
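Under the large-sample formula, the first case can be reproduced with the standard library's NormalDist (no SciPy needed); a sketch, with the caveat that small resulting n calls for the iterative t-based formula, which gives somewhat larger values:

```python
import math
from statistics import NormalDist

def n_per_group(delta, sigma, alpha=0.05, power=0.80, two_sided=False):
    """Sample size per group for comparing two independent means
    (large-sample z approximation)."""
    q = 1 - alpha / 2 if two_sided else 1 - alpha
    z_crit = NormalDist().inv_cdf(q)       # z_{1-alpha} or z_{1-alpha/2}
    z_power = NormalDist().inv_cdf(power)  # z_pi
    return math.ceil(2 * (z_crit + z_power) ** 2 * sigma ** 2 / delta ** 2)

# Coffee/CMI example: detect delta = 3.5 with sigma_error = 7, one-sided
print(n_per_group(3.5, 7))   # 50
```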
Compare Two Population Means with
Independent Samples (Cont’d)
Estimate the Within-Group Standard Deviation
Often comes from previous similar studies
Sometimes it is necessary to run a pilot study to get some idea of the inherent variability
Conservative estimates (estimates that lead to a slightly larger sample size) are
preferable to underestimates
Rules of Thumb
For 80% power, need 393 samples for each group when Cohen’s d = 0.2, 64
samples when d=0.5, and 26 samples when d=0.8
Compare Two Population Means with
Paired Samples
The formula for the total number of pairs is the same as for the number of
independent samples except that the factor of 2 is dropped, i.e.
n ≥ (z_{1−α/2} + z_π)² · σ²_error / ∆²                 (two-sided test, n > 30)
n ≥ (z_{1−α} + z_π)² · σ²_error / ∆²                   (one-sided test, n > 30)
n ≥ (t_{1−α/2, n−1} + t_{π, n−1})² · σ²_error / ∆²     (two-sided test, n ≤ 30)
n ≥ (t_{1−α, n−1} + t_{π, n−1})² · σ²_error / ∆²       (one-sided test, n ≤ 30)
Rules of Thumb
For 80% power, need 196 samples for each group when Cohen’s d = 0.2, 32
samples when d=0.5, and 13 samples when d=0.8
Sample Size for ANOVA Fixed Effects Model
Effect Size Omega Squared, ω2
The proportion of the total variance in the dependent variable that can be
explained by the effect (main effect, or interaction effect)
Table of Sample Size
A table of sample size needed to achieve a power of 0.60, 0.80, and 0.90 in a test
at α=0.05 for a single factor can be downloaded on the course website
Drive Reward Example
ω²_drive = 0.009, a = 2, power = 0.80: n > 390 for each level of variable drive (n > 130 for each combination of drive and reward levels)
ω²_reward = 0.12, a = 3, power = 0.80: n ≥ 25 for each level of variable reward (n ≥ 13 for each combination of drive and reward levels)
Interface Example: ω² = 0.253, a = 2, power = 0.80: n < 24 for each group