Document 292996

Hypothesis testing?
Hypothesis testing introduction
+
Hypothesis testing with one measurement variable
+
Hypothesis testing with one categorical variable (2 options)

Stages of data analysis: Data Preparation, Exploration, Practical Significance, Inference (Statistical Significance) by Hypothesis Testing and/or Confidence Intervals, Written Report.
Inference is at the heart of hypothesis testing (in the narrow sense).
It does not replace exploration and searching for practical significance.
4 steps of hypothesis testing
see readings pg. 16
1. Set up a null hypothesis H0
2. a. Calculate the p-value using a hypothesis test
   b. Calculate the confidence interval if possible
3. Make a decision about rejecting the H0
4. Write a nice final statement that summarizes your decision & contributes to the final report
Statistical significance: Can we generalize findings from the sample data to the population at large? If the p-value is low enough then yes.
Making inferences from a sample to a population
Narrow sense (in data analysis): a way of testing a hypothesis using inferences from a sample to a population, with a calculation of statistical significance as the key.

Wider sense: a way of framing the goal of the research as a statement to be tested, e.g. On average men are taller than women.
Hypothesis testing is but one part of statistical analysis.

What is a….?
Null hypothesis. It is a ‘fake’ hypothesis set up for testing (e.g. There is no difference between males and females).
If we reject the null hypothesis then it looks like we have some evidence for the original (or alternative) hypothesis (On average males are taller than females.)

What is a….?
P-value. It is the probability of getting a sample statistic (sample mean or proportion, etc.) at least as extreme as the one we calculated, assuming that the sample comes from a population in which the null hypothesis is true.
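One way to make the p-value definition concrete is to simulate it. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with a known standard deviation, and the numbers (null mean 100, sd 15, n = 25) and the function name `simulated_p_value` are hypothetical, not taken from the course data.

```python
import random
import statistics


def simulated_p_value(observed_mean, null_mean, sd, n, trials=2000, seed=0):
    """Estimate a two-tail p-value by drawing many samples from a
    population in which the null hypothesis is true (assumed normal)."""
    rng = random.Random(seed)
    observed_gap = abs(observed_mean - null_mean)
    extreme = 0
    for _ in range(trials):
        sample = [rng.gauss(null_mean, sd) for _ in range(n)]
        # Count samples whose mean is at least as far from the null mean
        # as the sample mean we actually observed.
        if abs(statistics.fmean(sample) - null_mean) >= observed_gap:
            extreme += 1
    return extreme / trials


# Hypothetical numbers: a sample mean close to the null value,
# and one far out in the 'weird zone'.
p_close = simulated_p_value(104, null_mean=100, sd=15, n=25)
p_far = simulated_p_value(120, null_mean=100, sd=15, n=25)
print(f"p (close to null) = {p_close:.3f}")
print(f"p (far from null) = {p_far:.3f}")
```

A sample mean near the null value gives a p-value well above 0.05, while one far from it gives a p-value of essentially zero. In practice the hypothesis test computes this probability from a formula rather than by simulation.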


Making the decision: In this course it will be easy.

If the p-value is < 0.05 then reject the null hypothesis (i.e. you have enough evidence that the null hypothesis is not true).

If the p-value is very low then either your sample statistic is in the weird zone of the distribution of sample means (i.e. it is very unlikely) or the null hypothesis is not true (so we reject it).
Don’t forget to explore and look at practical significance…
What is a….?
Final statement. This depends on the research question and the null hypothesis that you are testing. Just writing ‘I reject the null hypothesis’ is often not enough….

Hypothesis testing
one measurement variable
(single mean)
What does a hypothesis test of a single mean look like?

Example: Do 500ml water bottles really have 500ml of water in them as claimed?

A method was needed in order to test whether the reported, estimated, or previously calculated value of the mean or proportion was true without opening and measuring the volume of every bottle of water.

Wei Zhou, an HIM student, conducted a mini research study in which he used hypothesis testing to find this out.
Open up water wei zhou on www.stataras.com and the in-class exercise on page 6 of the exercise booklet.
We will replicate the analysis of his data set by following the 4 step process (readings p. 16), but first we will explore the data and see what it tells us ‘practically’ (parts A and B).
What the data says: exploration and practical significance:
Based on a sample of 30 bottles

volume
N (Valid)             30
N (Missing)           0
Mean                  525.20
Std. Error of Mean    .416
Median                525.00
Mode                  523(a)
Std. Deviation        2.280
Skewness              .428
Kurtosis              -.561
Range                 8

[Histogram of volume: x-axis volume (522 to 530), y-axis Frequency; Mean = 525.2, Std. Dev. = 2.28, N = 30]
[Plot of volume: axis from 522 to 530]
What do you see?
Not only is the sample mean > 500ml; every single bottle had over 500ml of water in it.
Does that mean we don’t need a hypothesis test?
The sample is not normally distributed, but that is of little consequence here.

The hypothesis test
Practical significance: If I find that the mean volume is ‘practically larger’ than 500ml then I can answer the research question as ‘yes’, water bottles have more than 500ml in them, but only for the sample bottles.
Statistical significance:
• If I reject the null hypothesis (H0), then my findings of practical difference in the sample can be generalized to the population of all water bottles.
• If I reject the null hypothesis then I have a ‘statistically significant’ result to report.
• If I fail to reject the null hypothesis (H0), then any findings of practical difference may be due to (random) sampling error (and chance); there is nothing ‘statistically significant’ to report.

Dark room moment
Step 1:
H0: μ = 500ml - I want to test whether μ = 500ml.
HA: μ < 500ml or μ > 500ml - you can go either way with this depending on whether you trust the company.

Step 2a: output of SPSS test
Test Value = 500ml
          t calculated   df   Sig. (2-tailed)   Mean Difference   95% C.I. of the Difference
                                                                  Lower     Upper
volume    60.528         29   .000              25.200            24.35     26.05
Step 2: estimating p-value by hand
Set level of significance to α = 0.05,
d.f. = n – 1 = 29
look up t critical (two tail) = 2.045
$$t_{calculated} = \frac{525.2 - 500}{2.28 / \sqrt{30}} = 60.54$$
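The by-hand calculation above can be checked with a few lines of Python. This is a minimal sketch using only the summary figures quoted on the slides (mean 525.2, standard deviation 2.28, n = 30); the critical value 2.045 is the table value given above, and the variable names are my own.

```python
import math

n = 30
sample_mean = 525.2      # ml, from the exploration output
sample_sd = 2.28         # ml
claimed_mean = 500.0     # ml, the value under H0
t_critical = 2.045       # two-tail table value for alpha = 0.05, d.f. = 29

# Standard error of the mean, then the t statistic.
standard_error = sample_sd / math.sqrt(n)
t_calculated = (sample_mean - claimed_mean) / standard_error

# Reject H0 when the statistic falls in either 'weird zone'.
reject_h0 = abs(t_calculated) > t_critical

print(f"standard error = {standard_error:.3f}")
print(f"t calculated   = {t_calculated:.2f}")
print(f"reject H0      = {reject_h0}")
```

SPSS reports 60.528 rather than 60.54 because it works from the unrounded standard deviation.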
Step 2 cont’d: p-value estimate by hand
All we can say is that the p-value < 0.05 (two tail) since all we know is that we are in the weird zone.
[Figure: t distribution centred on 0, with a Fail to Reject H0 zone between -2.045 and +2.045 and a Reject H0 zone in each tail (Area = 0.025 each); t = 60.54 lies far into the upper Reject H0 zone; P-value < 0.05 (sig = 0.000)]
Each ‘weird zone’, titled Reject H0 here, is equal to 0.025 (or 2.5%) since we are using the two tail method. Together they make 0.05.

Step 2b. The 95% Confidence interval using the explore function
volume
                                   Statistic   Std. Error
Mean                               525.20      .416
95% C.I. for Mean   Lower Bound    524.35
                    Upper Bound    526.05
5% Trimmed Mean                    525.11
Median                             525.00
Variance                           5.200
Std. Deviation                     2.280
Minimum                            522
Maximum                            530
Range                              8
Interquartile Range                4
Skewness                           .428        .427
Kurtosis                           -.561       .833

Why not just use the confidence interval?

Step 3. make a decision
We know that we are in the ‘weird zone’ and thus can reject H0 with 95% confidence.
Our sample mean (525.2ml) > 500ml; this fits with HA: μ > 500ml.
Now we can argue (with 95% confidence) that the actual mean μ is > 500ml.
Step 4. thoughtful concluding statement

The bottled water company’s claim is not accurate. I reject the claim that μ = 500ml (p < 0.001; SPSS shows sig = .000) and can say with more than 95% confidence that they are giving the buyer between 524.35ml and 526.05ml in the average bottle.
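The interval quoted in the final statement can be reproduced from the same summary figures. This is a sketch, assuming the usual formula mean ± t critical × standard error, with the table value 2.045 for d.f. = 29; the variable names are illustrative.

```python
import math

n = 30
sample_mean = 525.2   # ml
sample_sd = 2.28      # ml
t_critical = 2.045    # two-tail table value for alpha = 0.05, d.f. = 29

# Margin of error = critical value times the standard error of the mean.
margin = t_critical * sample_sd / math.sqrt(n)
lower = sample_mean - margin
upper = sample_mean + margin

print(f"95% C.I. for the mean: {lower:.2f} ml to {upper:.2f} ml")
```

The claimed 500ml lies well outside this interval, which agrees with the decision to reject H0.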
Hypothesis testing
one categorical variable
(2 options)
Searching for weirdness in proportions.
Is the rate of RSI in Canada 7%?
What does a hypothesis test of a
single proportion look like?

Example: The RSI industry reports that the rate of repetitive strain injury in Canada is 7%. We wish to test the veracity of that claim.

We will test whether the reported, estimated, or previously calculated value of the proportion is true without contacting or examining every single Canadian.

Luckily the Canadian Community Health Survey asked the very same question: Did you sustain an RSI in the past 12 months?
Again, we start with exploration and ‘practical’ analysis.
Open the ‘repstrain’ data set on www.stataras.com and page 7 in the exercise booklet.
Exploration and practical significance
n = 2000
p = 10.7%

Rep. strain injury - past 12 mo.
                  Frequency   Percent
Valid    YES      214         10.7
         NO       1777        88.9
         Total    1991        99.6
Missing           9           .4
Total             2000        100.0

[Bar chart: Frequency (0 to 2,000) of YES and NO for Rep. strain injury - past 12 mo.]
Concerns? Only 9 missing – that is very low (i.e. no concerns)
note: I removed a few of the columns from the frequency table output.

Choice of method for hypothesis testing
Option 1: test of one proportion - π.
• good for 2 categories only – similar in approach to test of single mean.
Option 2: χ2 (chi-square) goodness of fit test
• good for two or more categories
Step 1 – the null hypothesis
Option 1: test of single proportion
H0: πyes = 0.07
HA: πyes > 0.07 (or πyes < 0.07)
Option 2: chi-square goodness of fit
H0: the proportions are as expected:
RSI?    Expected
Yes     7%
No      93%
Step 2a: hypothesis test on SPSS option 1
Be careful in entering the test proportion. H0: πyes = 0.07 but I needed to enter the test proportion as 0.93. Let trial and error be your guide.
Open up the pain data set on www.stataras.com

Step 2: output option 1
Binomial Test
                                 Category   N      Observed Prop.   Test Prop.   Asymp. Sig. (1-tailed)
Rep. strain injury past 12 mo.   NO         1777   .892516          .93          .000(a)
                                 YES        214    .107484
                                 Total      1991   1.000000
I know that entering 0.93 is correct because in the column ‘test proportion’ the value is in the row ‘no’ not ‘yes’ – which is what I would have expected. This is one of those weird things about SPSS.

Step 2: hypothesis test option 2 – chi square
Again, you will need to be careful in setting up the hypothesis test. In this case indicate all of the proportions you wish to test.
Step 2: option 2
Expected values are based on the hypothesized proportions times the Total.
e.g. Expected Yes = 0.07*1991 = 139.4
You can calculate the chi square value by using the formula on pg 16 of your handouts – try it.
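As a check on the hand calculation, here is a small sketch of the goodness of fit statistic for the RSI data. It assumes the handout formula is the standard Pearson one, the sum of (observed − expected)² / expected over the categories; the handout itself is not reproduced here. The critical value 3.841 is the usual chi-square table value for α = 0.05 with 1 degree of freedom.

```python
observed = {"YES": 214, "NO": 1777}         # valid responses from the frequency table
hypothesized = {"YES": 0.07, "NO": 0.93}    # proportions under H0

total = sum(observed.values())              # 1991

# Expected count = hypothesized proportion times the total.
expected = {cat: hypothesized[cat] * total for cat in observed}

# Pearson chi-square: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (observed[cat] - expected[cat]) ** 2 / expected[cat] for cat in observed
)

df = len(observed) - 1
chi_critical = 3.841   # table value, alpha = 0.05, d.f. = 1
reject_h0 = chi_square > chi_critical

print(f"expected YES = {expected['YES']:.1f}, expected NO = {expected['NO']:.1f}")
print(f"chi-square = {chi_square:.2f} on {df} d.f., reject H0: {reject_h0}")
```

The statistic is far above the critical value, so option 2 leads to the same decision as option 1.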
Step 3 visualizing option 1
All we can say is that the p-value < 0.05 (two tail) since all we know is that we are in the weird zone.
[Figure: normal curve centred on πtest = 0.07; the Fail to Reject H0 zone between -1.96 and +1.96 holds 95% of the area under the curve; each Reject H0 tail has Area = 0.025; z calculated (about 6.5) lies far into the upper Reject H0 zone]
Each ‘weird zone’, titled Reject H0 here, is equal to 0.025 (or 2.5%) since we are using the two tail method. Together they make 0.05.

C.I. for proportion
$$\pi_{yes} = p \pm z\sqrt{\frac{p(1-p)}{n}}$$
$$\pi_{yes} = .107 \pm 1.96(.0069)$$
$$\pi_{yes} = .107 \pm .014$$
$$0.093 \le \pi_{yes} \le 0.121$$
9.3% ≤ πyes ≤ 12.1% which is nowhere near the 7% claim.
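The z statistic and the confidence interval for the proportion can be sketched as follows. The assumptions here are mine: the rounded sample proportion p = 0.107 from the slides, n = 1991 valid responses, and the normal approximation (standard error under H0 for the test, standard error from the sample for the interval).

```python
import math

p_sample = 0.107     # observed proportion of YES (rounded, as on the slides)
n = 1991             # valid responses (assumption: missing cases excluded)
pi_null = 0.07       # proportion claimed under H0
z_critical = 1.96    # two-tail value for 95% confidence

# Test statistic: distance from the claimed proportion in standard errors,
# with the standard error computed under H0.
se_null = math.sqrt(pi_null * (1 - pi_null) / n)
z_calculated = (p_sample - pi_null) / se_null

# Confidence interval: standard error computed from the sample proportion.
se_sample = math.sqrt(p_sample * (1 - p_sample) / n)
margin = z_critical * se_sample
lower = p_sample - margin
upper = p_sample + margin

print(f"z calculated = {z_calculated:.2f}")
print(f"95% C.I.: {lower:.3f} to {upper:.3f}")
```

The z value is several times larger than 1.96, and the interval excludes 0.07, so both routes point to rejecting H0.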
Step 3: option 1 & 2
Since the p-value is < 0.05, I can reject the null hypothesis with 95% confidence.
I can be 95% confident that πyes ≠ 0.07.
Step 4: final statement

I am 95% confident that πyes ≠ 0.07.
The observed proportion of RSI in the sample of Canadians is pyes = 10.7%. Based on this sample I am 95% confident that the rate of RSI is higher than 7%.
The 95% C.I. is 9.3% ≤ πyes ≤ 12.1%
Hypothesis testing
one categorical variable
(>2 categories)
Searching for weirdness in proportions.

χ2 goodness of fit test
Example 2 - testing when expectation is uniform distribution, i.e. all categories with equal proportion
Research question: Are there an equal number of smarties of each colour produced?
One pack of smarties was randomly chosen for purchase.
Open up the ‘smarties’ data set on www.stataras.com and start with an exploration, following page 8 in the exercise booklet.

Smarties exploration (practical significance)
colour of smartie
          Frequency   Percent
blue      8           14.5
green     7           12.7
orange    10          18.2
purple    8           14.5
red       13          23.6
yellow    9           16.4
Total     55          100.0
[Bar chart: Frequency (0.0 to 12.5) by colour of smartie: blue, green, orange, purple, red, yellow]
It seems that there are more red smarties than other colours – is 23.6% vs 18.2% a large enough difference practically speaking?

Step 1 of the hypothesis test
H0: We expect that there will be an equal number of smarties of each colour.
This cannot be solved with a straightforward binomial test. We have to use option 2.
Equality of distribution of smarties
step 2: the test
colour of smartie
          Observed N   Expected N   Residual
blue      8            9.2          -1.2
green     7            9.2          -2.2
orange    10           9.2          .8
purple    8            9.2          -1.2
red       13           9.2          3.8
yellow    9            9.2          -.2
Total     55

colour of smartie
Chi-Square(a)             2.491
df                        5
Asymp. Sig. (P-value)     .778
Both of these tables are produced with the chi-sq function.

Smarties distribution
step 3 and 4 – decision and statement
Decision: Since the p-value is > 0.05 (it is not even close) we don’t have enough evidence to reject the null hypothesis.
Final statement: Even though in the sample we saw what looked like a difference, we do not have enough evidence to generalize that claim to all smarties boxes and thus cannot question the claim that there are an equal number of each colour smarties in the smartie population.
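The chi-square value in the smarties output can be recomputed from the observed counts. This sketch assumes the standard Pearson formula with equal expected counts; the critical value 11.070 is the usual chi-square table value for α = 0.05 with 5 degrees of freedom, and the variable names are illustrative.

```python
observed = {
    "blue": 8, "green": 7, "orange": 10,
    "purple": 8, "red": 13, "yellow": 9,
}

total = sum(observed.values())            # 55 smarties in the pack
expected_each = total / len(observed)     # equal share per colour under H0

# Pearson chi-square: sum of (observed - expected)^2 / expected.
chi_square = sum(
    (count - expected_each) ** 2 / expected_each for count in observed.values()
)

df = len(observed) - 1
chi_critical = 11.070   # table value, alpha = 0.05, d.f. = 5
reject_h0 = chi_square > chi_critical

print(f"expected per colour = {expected_each:.1f}")
print(f"chi-square = {chi_square:.3f} on {df} d.f., reject H0: {reject_h0}")
```

The statistic sits well below the critical value, which matches the large p-value (.778) and the decision not to reject H0.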
More practice with hypothesis testing