STAT 101
Dr. Kari Lock Morgan
Essential Synthesis
SECTION 4.4, 4.5, ES A, ES B
• Connecting bootstrap and randomization (4.4)
• Connecting intervals and tests (4.5)
• Review (Ch 1-4)
Statistics: Unlocking the Power of Data
Lock5
Exam Details
• Wednesday, 2/26
• Closed to everything except one double-sided
page of notes prepared by you (no sharing) and a
non-cell phone calculator
• Best ways to prepare:
• #1: WORK LOTS OF PROBLEMS!
• Make a good page of notes
• Read sections you are still confused about
• Come to office hours and clarify confusion
• Covers chapters 1-4 (except 2.6) and anything
covered in lecture
Practice Problems
• Practice exam online (under resources)
• Solutions to odd essential synthesis and review
problems online (under resources)
• Solutions to all odd problems in the book on
reserve at Perkins
Office Hours and Help
• Monday 4–6pm: Stephanie Sun, Old Chem 211A
• Tuesday 3:30–5pm (extra): Prof Morgan, Old Chem 216
• Tuesday 5–7pm: Wenjing Shi (new TA), Old Chem 211A
• Tuesday 7–9pm: Mao Hu, Old Chem 211A
• REVIEW SESSION: 5–6pm Tuesday (if we can get a
room… I’ll keep you posted)
Review from Last Class
You will all do a hypothesis test for Project 1. If
all of you are doing tests for which the nulls are
true, about how many of you will get
statistically significant results using α = 0.05?
(there are 110 students in the class)
a) 110
b) 105
c) 6
d) 0
0.05 × 110 = 5.5, so about 6: answer (c)
Multiple Testing
When multiple hypothesis tests are
conducted, the chance that at least one test
incorrectly rejects a true null hypothesis
increases with the number of tests.
If the null hypotheses are all true, a proportion of
about α of the tests will yield statistically significant
results just by random chance.
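The arithmetic behind this can be checked with a short simulation. This is a minimal sketch, not part of the original slides: it assumes the tests are independent and that, when a null is true, the p-value is uniformly distributed between 0 and 1. The function names are made up for illustration.

```python
import random

ALPHA = 0.05
N_TESTS = 110  # class size from the review question


def expected_false_positives(n_tests, alpha):
    """Expected number of significant results when every null is true."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests, alpha):
    """P(at least one Type I error), assuming the tests are independent."""
    return 1 - (1 - alpha) ** n_tests


def simulate_significant_count(n_tests, alpha, rng):
    """Count tests with p-value < alpha when each null is true.

    Assumption: under a true null the p-value is Uniform(0, 1).
    """
    return sum(rng.random() < alpha for _ in range(n_tests))


rng = random.Random(0)  # fixed seed so the run is repeatable
counts = [simulate_significant_count(N_TESTS, ALPHA, rng) for _ in range(2000)]
average_count = sum(counts) / len(counts)

print(expected_false_positives(N_TESTS, ALPHA))          # theoretical expectation
print(average_count)                                     # simulated average
print(prob_at_least_one_false_positive(N_TESTS, ALPHA))  # chance of at least one
```

The simulated average should land close to 0.05 × 110, and the chance that at least one of the 110 tests is significant is very close to 1.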
Multiple Comparisons
• Consider a topic that is being
investigated by research teams all over
the world
• Using α = 0.05, 5% of teams are going
to find something significant, even if the
null hypothesis is true
Multiple Comparisons
• Consider a research team/company
doing many hypothesis tests
• Using α = 0.05, 5% of tests are going
to be significant, even if the null
hypotheses are all true
Multiple Comparisons
• This is a serious problem
• The most important thing is to be aware of this
issue, and not to trust claims that are obviously
one of many tests (unless they specifically
mention an adjustment for multiple testing)
• There are ways to account for this (e.g.
Bonferroni’s Correction), but these are beyond
the scope of this class
Publication Bias
• Publication bias refers to the fact that
usually only the significant results get
published
• The one study that turns out significant gets
published, and no one knows about all the
insignificant results
• This, combined with the problem of multiple
comparisons, can yield very misleading results
Jelly Beans Cause Acne!
http://xkcd.com/882/
Connections
• Today we’ll make connections between…
o Chapter 1: Data collection (random sampling?
random assignment?)
o Chapter 2: Which statistic is appropriate, based on
the variable(s)?
o Chapter 3: Bootstrapping and confidence intervals
o Chapter 4: Randomization distributions and
hypothesis tests
Randomization Distribution
For a randomization distribution, each
simulated sample should…
• be consistent with the null hypothesis
• use the data in the observed sample
• reflect the way the data were collected
Randomized Experiments
• In randomized experiments the “randomness”
is the random allocation to treatment groups
• If the null hypothesis is true, the response
values would be the same, regardless of
treatment group assignment
• To simulate what would happen just by random
chance, if H0 were true:
o reallocate cases to treatment groups,
keeping the response values the same
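The reallocation step can be sketched in a few lines. This is an illustrative sketch only: the data are hypothetical, the function name is made up, and the statistic (difference in means, upper tail) is just one possible choice.

```python
import random


def mean(values):
    return sum(values) / len(values)


def reallocation_p_value(treatment, control, n_sims, rng):
    """Upper-tail p-value for mean(treatment) - mean(control) under H0."""
    observed = mean(treatment) - mean(control)
    pooled = list(treatment) + list(control)
    n_treatment = len(treatment)
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Reallocate cases to groups; the response values stay the same.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_treatment]) - mean(pooled[n_treatment:])
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_sims


# Hypothetical response values, for illustration only.
treatment = [12, 15, 14, 16, 13, 17]
control = [9, 10, 8, 11, 10, 9]
print(reallocation_p_value(treatment, control, 2000, random.Random(1)))
```

Each shuffle is one simulated randomization sample: consistent with H0, built from the observed data, and generated the same way the experiment assigned groups.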
Observational Studies
• In observational studies, the “randomness” is
random sampling from the population
• To simulate what would happen, just by random
chance, if H0 were true:
o Simulate resampling from a population in which H0 is true
• How do we simulate resampling from a population
when we only have sample data?
o Bootstrap!
• How can we generate randomization samples for
observational studies?
o Make H0 true, then bootstrap!
Body Temperatures
• μ = average human body temperature
H0: μ = 98.6
Ha: μ ≠ 98.6
• x̄ = 98.26
• We can make the null true just by adding
98.6 – 98.26 = 0.34 to each value, to make
the mean be 98.6
• Bootstrapping from this revised sample lets
us simulate samples, assuming H0 is true!
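A minimal sketch of "shift, then bootstrap" for a single mean follows. The temperature values are hypothetical (chosen only so that their mean is 98.26), so the result will not match the class data. The two-tailed rule used here, counting simulated means at least as far from the null value as the observed mean, is one common convention and an assumption of this sketch.

```python
import random


def mean(values):
    return sum(values) / len(values)


def shifted_bootstrap_p_value(sample, null_mean, n_sims, rng):
    """Two-tailed p-value for H0: mu = null_mean via shift-then-bootstrap."""
    observed = mean(sample)
    shift = null_mean - observed
    shifted = [x + shift for x in sample]  # this sample has mean null_mean
    n = len(sample)
    observed_distance = abs(observed - null_mean)
    extreme = 0
    for _ in range(n_sims):
        resample = [rng.choice(shifted) for _ in range(n)]  # with replacement
        if abs(mean(resample) - null_mean) >= observed_distance:
            extreme += 1
    return extreme / n_sims


# Hypothetical body temperatures with mean 98.26, for illustration only.
temps = [97.8, 98.0, 98.2, 98.3, 98.4, 98.6, 98.1, 98.7, 97.9, 98.6]
print(shifted_bootstrap_p_value(temps, 98.6, 5000, random.Random(3)))
```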
Body Temperatures
• In StatKey, when we enter the null hypothesis,
this shifting is automatically done for us
StatKey p-value = 0.002
Exercise and Gender
• H0: μm = μf, Ha: μm > μf
• How might we make the null true?
• One way (of many): add 3 to every female’s value
• Bootstrap from this modified sample
• In StatKey, the default randomization
method is “reallocate groups”, but “Shift
Groups” is also an option, and will do this
Exercise and Gender
p-value = 0.095
Exercise and Gender
The p-value is 0.095. Using α = 0.05, we
conclude….
a) Males exercise more than females, on
average
b) Males do not exercise more than
females, on average
c) Nothing
Answer: (c). Do not reject the null… we can’t
conclude anything.
Blood Pressure and Heart Rate
• H0: ρ = 0, Ha: ρ < 0
• Two variables have correlation 0 if they are
not associated. We can “break the
association” by randomly
permuting/scrambling/shuffling one of the
variables
• Each time we do this, we get a sample we
might observe just by random chance, if there
really is no correlation
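Shuffling one variable can be sketched as below. This is an illustration with hypothetical data and made-up function names; it uses a lower-tail p-value to match Ha: ρ < 0.

```python
import math
import random


def correlation(xs, ys):
    """Sample correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value_lower(xs, ys, n_sims, rng):
    """Lower-tail p-value for H0: rho = 0 vs Ha: rho < 0."""
    observed = correlation(xs, ys)
    shuffled = list(ys)
    count = 0
    for _ in range(n_sims):
        rng.shuffle(shuffled)  # break any association between xs and ys
        if correlation(xs, shuffled) <= observed:
            count += 1
    return count / n_sims


# Hypothetical blood pressure and heart rate values, for illustration only.
blood_pressure = [110, 125, 118, 140, 132, 105, 150, 128]
heart_rate = [88, 75, 80, 72, 78, 90, 70, 84]
print(correlation(blood_pressure, heart_rate))
print(permutation_p_value_lower(blood_pressure, heart_rate, 2000, random.Random(4)))
```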
Blood Pressure and Heart Rate
p-value = 0.219
Even if blood pressure and
heart rate are not correlated,
we would see correlations this
extreme about 22% of the
time, just by random chance.
Randomization Distribution
• Paul the Octopus or ESP (single proportion):
o Flip a coin or roll a die
• Cocaine Addiction (randomized experiment):
o Rerandomize cases to treatment groups, keeping
response values fixed
• Body Temperature (single mean):
o Shift to make H0 true, then bootstrap
• Exercise and Gender (observational study):
o Shift to make H0 true, then bootstrap
• Blood Pressure and Heart Rate (correlation):
o Randomly permute/scramble/shuffle one variable
Body Temperature
• We created a bootstrap distribution for average
body temperature by resampling with
replacement from the original sample (x̄ =
98.26):
Body Temperature
• We also created a randomization distribution to see
if average body temperature differs from 98.6°F by
adding 0.34 to every value to make the null true, and
then resampling with replacement from this
modified sample:
Body Temperature
• These two distributions are identical (up to
random variation from simulation to
simulation) except for the center
• The bootstrap distribution is centered around
the sample statistic, 98.26, while the
randomization distribution is centered around
the null hypothesized value, 98.6
• The randomization distribution is equivalent
to the bootstrap distribution, but shifted over
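The "same shape, shifted over" claim can be seen directly in a small sketch. The data are hypothetical (chosen so that the mean is 98.26), and reusing the same random seed for both runs is a device of this sketch that removes the simulation-to-simulation variation.

```python
import random


def mean(values):
    return sum(values) / len(values)


def resampled_means(sample, n_sims, rng):
    """Means of n_sims resamples drawn with replacement from sample."""
    n = len(sample)
    return [mean([rng.choice(sample) for _ in range(n)]) for _ in range(n_sims)]


# Hypothetical body temperatures with mean 98.26, for illustration only.
temps = [97.8, 98.0, 98.2, 98.3, 98.4, 98.6, 98.1, 98.7, 97.9, 98.6]
null_mean = 98.6
shift = null_mean - mean(temps)  # about 0.34

# Same seed for both, so both runs pick the same cases in the same order.
bootstrap_dist = resampled_means(temps, 1000, random.Random(1))
randomization_dist = resampled_means([x + shift for x in temps], 1000, random.Random(1))

print(mean(bootstrap_dist))      # near the sample statistic
print(mean(randomization_dist))  # near the null value
```

With matching seeds, every randomization statistic equals the corresponding bootstrap statistic plus the shift, so only the center differs.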
Bootstrap and Randomization Distributions
Bootstrap Distribution
• Our best guess at the distribution of sample
statistics
• Centered around the observed sample statistic
• Simulate sampling from the population by
resampling from the original sample
Randomization Distribution
• Our best guess at the distribution of sample
statistics, if H0 were true
• Centered around the null hypothesized value
• Simulate samples assuming H0 were true
• Big difference: a randomization distribution assumes H0
is true, while a bootstrap distribution does not
Which Distribution?
• Let μ be the average amount of sleep college students get
per night. Data was collected on a sample of students, and
for this sample x̄ = 6.7 hours.
• A bootstrap distribution is generated to create a
confidence interval for μ, and a randomization distribution
is generated to see if the data provide evidence that μ > 7.
• Which distribution below is the bootstrap distribution?
Answer: (a) is centered around the sample statistic, 6.7
Which Distribution?
• Intro stat students are surveyed, and we find that 152 out
of 218 are female. Let p be the proportion of intro stat
students at that university who are female.
• A bootstrap distribution is generated for a confidence
interval for p, and a randomization distribution is
generated to see if the data provide evidence that p > 1/2.
• Which distribution is the randomization distribution?
Answer: (a) is centered around the null value, 1/2
Intervals and Tests
• A confidence interval represents the range of
plausible values for the population parameter
• If the null hypothesized value IS NOT within
the CI, it is not a plausible value and should be
rejected
• If the null hypothesized value IS within the CI,
it is a plausible value and should not be
rejected
Intervals and Tests
If a 95% CI contains the parameter in H0,
then a two-tailed test should not reject H0
at a 5% significance level.
If a 95% CI misses the parameter in H0,
then a two-tailed test should reject H0
at a 5% significance level.
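This rule is simple enough to write as a tiny function. The sketch below is illustrative (the function name is made up) and uses intervals quoted elsewhere in these slides; it applies only to a two-tailed test at α = 0.05 paired with a 95% CI.

```python
def two_tailed_decision(ci_lower, ci_upper, null_value):
    """Decision for a two-tailed test at alpha = 0.05, given a 95% CI."""
    if ci_lower <= null_value <= ci_upper:
        return "do not reject H0"  # the null value is plausible
    return "reject H0"             # the null value is not plausible


print(two_tailed_decision(98.05, 98.47, 98.6))  # body temperature -> reject H0
print(two_tailed_decision(0.487, 0.573, 0.5))   # 2010 survey -> do not reject H0
print(two_tailed_decision(0.533, 0.607, 0.5))   # 1997 survey -> reject H0
```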
Body Temperatures
• Using bootstrapping, we found a 95%
confidence interval for the mean body
temperature to be (98.05, 98.47)
• This does not contain 98.6, so at α = 0.05 we
would reject H0 for the hypotheses
H0: μ = 98.6
Ha: μ ≠ 98.6
Both Father and Mother
“Does a child need both a father and a mother to
grow up happily?”
• Let p be the proportion of adults aged 18-29 in
2010 who say yes. A 95% CI for p is (0.487, 0.573).
• Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we
a) Reject H0
b) Do not reject H0
c) Reject Ha
d) Do not reject Ha
Answer: (b). 0.5 is within the
CI, so is a plausible
value for p.
http://www.pewsocialtrends.org/2011/03/09/formillennials-parenthood-trumps-marriage/#fn-7199-1
Both Father and Mother
“Does a child need both a father and a mother to
grow up happily?”
• Let p be the proportion of adults aged 18-29 in
1997 who say yes. A 95% CI for p is (0.533, 0.607).
• Testing H0: p = 0.5 vs Ha: p ≠ 0.5 with α = 0.05, we
a) Reject H0
b) Do not reject H0
c) Reject Ha
d) Do not reject Ha
Answer: (a). 0.5 is not within
the CI, so is not a
plausible value for p.
http://www.pewsocialtrends.org/2011/03/09/formillennials-parenthood-trumps-marriage/#fn-7199-1
Intervals and Tests
• Confidence intervals are most useful when you
want to estimate population parameters
• Hypothesis tests and p-values are most useful
when you want to test hypotheses about
population parameters
• Confidence intervals give you a range of
plausible values; p-values quantify the strength
of evidence against the null hypothesis
Interval, Test, or Neither?
Is the following question best assessed using a
confidence interval, a hypothesis test, or is
statistical inference not relevant?
On average, how much more do adults who played
sports in high school exercise than adults who did
not play sports in high school?
a) Confidence interval
b) Hypothesis test
c) Statistical inference not relevant
Interval, Test, or Neither?
Is the following question best assessed
using a confidence interval, a hypothesis
test, or is statistical inference not relevant?
Do a majority of adults riding a bicycle wear
a helmet?
a) Confidence interval
b) Hypothesis test
c) Statistical inference not relevant
Interval, Test, or Neither?
Is the following question best assessed using a
confidence interval, a hypothesis test, or is
statistical inference not relevant?
On average, were the players on the 2014
Canadian Olympic hockey team older than the
players on the 2014 US Olympic hockey team?
a) Confidence interval
b) Hypothesis test
c) Statistical inference not relevant
Summary
• Using α = 0.05, 5% of all hypothesis tests will lead to
rejecting the null, even if all the null hypotheses are true
• Randomization samples should be generated
o Consistent with the null hypothesis
o Using the observed data
o Reflecting the way the data were collected
• If a null hypothesized value lies inside a 95% CI, a
two-tailed test using α = 0.05 would not reject H0
• If a null hypothesized value lies outside a 95% CI, a
two-tailed test using α = 0.05 would reject H0
The Big Picture
[Diagram: Population, Sampling, Sample, Descriptive statistics, Statistical Inference]
Cases and Variables
We obtain information about cases or units.
A variable is any characteristic that is
recorded for each case.
• Generally each case makes up a row in a
dataset, and each variable makes up a column
• Variables are either categorical or quantitative
Sampling
• Sampling bias occurs when the method of
selecting a sample causes the sample to differ
from the population in some relevant way.
• If sampling bias exists, we cannot generalize
from the sample to the population
• To avoid sampling bias, select a random
sample
Sampling
[Diagram: Population and Sample]
GOAL: Select a sample that is similar to the population,
only smaller
Observational Studies
• A third variable that is associated with both
the explanatory variable and the response
variable is called a confounding variable
• There are almost always confounding
variables in observational studies
• Observational studies can almost never
be used to establish causation
Randomized Experiments
• In a randomized experiment the explanatory
variable for each unit is determined randomly,
before the response variable is measured
• Because the explanatory variable is randomly
assigned, it is not associated with any other
variables.
• Confounding variables are eliminated!!!
• Randomized experiments make it possible to
infer causation!
Randomized Experiments
[Diagram: RANDOMIZED EXPERIMENT, showing Confounding Variable, Explanatory Variable, and Response Variable]
Chapter 1: Data Collection
Was the sample randomly selected?
o Yes: Possible to generalize to the population
o No: Should not generalize to the population
Was the explanatory variable randomly assigned?
o Yes: Possible to make conclusions about causality
o No: Cannot make conclusions about causality
Chapter 2: Descriptive Statistics
• In order to make sense of data, we need ways
to summarize and visualize it
• Summarizing and visualizing variables and
relationships between two variables is often
known as descriptive statistics (also known as
exploratory data analysis)
• Type of summary statistics and visualization
methods depend on the type of variable(s) being
analyzed (categorical or quantitative)
Variable(s) | Visualization | Summary Statistics
Categorical | bar chart, pie chart | frequency table, relative frequency table, proportion
Quantitative | dotplot, histogram, boxplot | mean, median, max, min, standard deviation, z-score, range, IQR, five number summary
Categorical vs Categorical | side-by-side bar chart, segmented bar chart | two-way table, difference in proportions
Quantitative vs Categorical | side-by-side boxplots | statistics by group, difference in means
Quantitative vs Quantitative | scatterplot | correlation
Descriptive Statistics
Think of a topic or question you would like to
use data to help you answer.
• What would the cases be?
• What would the variables be?
(Limit to one or two variables)
Descriptive Statistics
How would you visualize and summarize the
variable or relationship between variables?
a) bar chart/pie chart, proportions, frequency
table/relative frequency table
b) dotplot/histogram/boxplot, mean/median,
sd/range/IQR, five number summary
c) side-by-side or segmented bar charts, difference in
proportions, two-way table
d) side-by-side boxplot, difference in means
e) scatterplot, correlation
Statistic vs Parameter
• A sample statistic is a number computed
from sample data.
• A population parameter is a number that
describes some aspect of a population
• Statistical inference is the process of
drawing conclusions about the entire
population based on information in a sample
Sampling Distribution
• A sampling distribution is the distribution of
statistics computed for different samples of the
same size taken from the same population
• The spread of the sampling distribution helps us
to assess the uncertainty in the sample statistic
• In real life, we rarely get to see the sampling
distribution – we usually only have one sample
Bootstrap
• A bootstrap sample is a random sample taken
with replacement from the original sample, of the
same size as the original sample
• A bootstrap statistic is the statistic computed on
the bootstrap sample
• A bootstrap distribution is the distribution of
many bootstrap statistics
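These three definitions map directly onto code. The sketch below is a minimal illustration with hypothetical data; the function name is made up, and the statistic is passed in as a function so the same code works for a mean, a median, and so on.

```python
import random


def mean(values):
    return sum(values) / len(values)


def bootstrap_distribution(sample, statistic, n_boot, rng):
    """Return n_boot bootstrap statistics computed from sample."""
    n = len(sample)
    stats = []
    for _ in range(n_boot):
        # Bootstrap sample: same size as the original, drawn with replacement.
        boot_sample = rng.choices(sample, k=n)
        # Bootstrap statistic: the statistic computed on the bootstrap sample.
        stats.append(statistic(boot_sample))
    return stats  # bootstrap distribution: many bootstrap statistics


# Hypothetical data, for illustration only.
sample = [1, 2, 3, 4, 5]
boot_means = bootstrap_distribution(sample, mean, 500, random.Random(0))
print(len(boot_means), min(boot_means), max(boot_means))
```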
[Diagram: Original Sample (Sample Statistic) → many Bootstrap Samples → a Bootstrap Statistic from each → Bootstrap Distribution]
Confidence Interval
• A confidence interval for a parameter is
an interval computed from sample data by
a method that will capture the parameter
for a specified proportion of all samples
• A 95% confidence interval will contain
the true parameter for 95% of all samples
Confidence Intervals
• The parameter is fixed
• The statistic is random
(depends on the sample)
• The interval is random
(depends on the statistic)
• 95% of 95% confidence
intervals will capture the truth
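The last point can be checked by simulation when the truth is known. This is a sketch under stated assumptions: the population is a hypothetical normal distribution with a mean we pick ourselves, and each interval is statistic ± 2 × SE with the SE estimated by bootstrapping. With small samples the capture rate tends to come out a little under 95%.

```python
import random
import statistics


def two_se_interval(sample, n_boot, rng):
    """Interval statistic +/- 2 * SE, with SE from a bootstrap distribution."""
    n = len(sample)
    boot_means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]
    se = statistics.stdev(boot_means)
    center = statistics.fmean(sample)
    return center - 2 * se, center + 2 * se


def coverage_rate(true_mean, true_sd, n, n_samples, n_boot, rng):
    """Fraction of intervals, over many samples, that capture true_mean."""
    hits = 0
    for _ in range(n_samples):
        # The parameter is fixed; the sample (and so the interval) is random.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        lower, upper = two_se_interval(sample, n_boot, rng)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / n_samples


# Hypothetical population: normal with mean 98.3 and standard deviation 0.7.
rate = coverage_rate(98.3, 0.7, 30, 300, 200, random.Random(2))
print(rate)
```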
Margin of Error
• One common form for a confidence interval is
statistic ± margin of error
• The margin of error is determined by the
uncertainty in the sample statistic…
o which depends on how much the statistic
varies from sample to sample…
o which is measured by the standard error
Standard Error
• The standard error (SE) is the standard
deviation of the sample statistic
• The SE can be estimated by the standard
deviation of the bootstrap distribution
• For symmetric, bell-shaped distributions, a
95% confidence interval is
statistic ± 2 × SE
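A minimal sketch of this calculation for a sample mean is below. The data are hypothetical and the function name is made up; the interval is only appropriate when the bootstrap distribution is roughly symmetric and bell-shaped.

```python
import random
import statistics


def bootstrap_two_se_interval(sample, n_boot, rng):
    """Return (lower, upper, se) for a 95% CI of the form mean +/- 2 * SE."""
    n = len(sample)
    boot_means = [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]
    se = statistics.stdev(boot_means)  # SE: sd of the bootstrap distribution
    stat = statistics.fmean(sample)    # observed sample statistic
    return stat - 2 * se, stat + 2 * se, se


# Hypothetical data, for illustration only.
sample = [float(i) for i in range(20)]
lower, upper, se = bootstrap_two_se_interval(sample, 2000, random.Random(0))
print(lower, upper, se)
```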
Confidence Intervals
[Diagram: Sample → many Bootstrap Samples → calculate statistic for each bootstrap sample → Bootstrap Distribution → Standard Error (SE): standard deviation of bootstrap distribution → Margin of Error (ME) (95% CI: ME = 2×SE) → statistic ± ME]
Percentile Method
• If the bootstrap distribution is approximately
symmetric, a P% confidence interval can be
obtained by taking the middle P% of the
bootstrap distribution
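Taking the middle P% amounts to sorting the bootstrap statistics and chopping an equal share off each tail. The sketch below uses one simple indexing convention, which is an assumption here; software such as StatKey may pick the cutoffs slightly differently.

```python
def percentile_interval(boot_stats, level):
    """Middle `level` fraction (e.g. 0.95) of a list of bootstrap statistics."""
    ordered = sorted(boot_stats)
    n = len(ordered)
    tail = (1 - level) / 2    # share to cut from each end
    n_cut = round(tail * n)   # number of statistics cut from each tail
    return ordered[n_cut], ordered[n - n_cut - 1]


# Hypothetical bootstrap statistics 0, 1, ..., 999 in reverse order.
boot_stats = list(range(999, -1, -1))
print(percentile_interval(boot_stats, 0.95))  # (25, 974)
```

With 1000 bootstrap statistics and P = 95, 25 values are cut from each tail, leaving the middle 950.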
[Figure: Bootstrap Distribution (best guess at sampling distribution), with the middle P% between the Lower Bound and Upper Bound around the Observed Statistic; horizontal axis: Statistic]
Hypothesis Testing
• How unusual would it be to get results as
extreme (or more extreme) than those
observed, if the null hypothesis is true?
• If it would be very unusual, then the null
hypothesis is probably not true!
• If it would not be very unusual, then there is
not evidence against the null hypothesis
p-value
• The p-value is the probability of getting a
statistic as extreme (or more extreme) as that
observed, just by random chance, if the null
hypothesis is true
• The p-value measures evidence against the
null hypothesis
Randomization Distribution
• A randomization distribution is the
distribution of sample statistics we would
observe, just by random chance, if the null
hypothesis were true
• The p-value is calculated by finding the
proportion of statistics in the randomization
distribution that fall beyond the observed
statistic
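Once the randomization statistics are in hand, the p-value is just a proportion. The sketch below is illustrative: the function name is made up, and the two-tailed rule (double the smaller tail) is one common convention that is assumed here.

```python
def p_value(randomization_stats, observed, tail):
    """Proportion of randomization statistics beyond the observed statistic.

    tail is "right", "left", or "two".
    """
    n = len(randomization_stats)
    upper = sum(s >= observed for s in randomization_stats) / n
    lower = sum(s <= observed for s in randomization_stats) / n
    if tail == "right":
        return upper
    if tail == "left":
        return lower
    if tail == "two":
        return min(1.0, 2 * min(upper, lower))
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Hypothetical randomization statistics: the integers -50 through 49.
stats = list(range(-50, 50))
print(p_value(stats, 45, "right"))  # 5 of 100 are at or above 45
print(p_value(stats, 45, "two"))
```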
Hypothesis Testing
[Figure: randomization distribution with the Observed Statistic and the p-value marked]
Statistical Conclusions
[Figure: strength of evidence against H0, alongside the formal decision of the hypothesis test based on α = 0.05: statistically significant or not statistically significant]
Formal Decisions
For a given significance level, α,
p-value < α ⇒ Reject Ho
p-value > α ⇒ Do not Reject Ho
“If the p-value is low, the ho must go”
Errors
Truth | Decision: Reject H0 | Decision: Do not reject H0
H0 true | TYPE I ERROR (if H0 true, probability = α) | ✓
H0 false | ✓ | TYPE II ERROR
More on Significance
• Statistical significance is closely connected to
sample size
o Larger n: easier to get a significant result
o Smaller n: easier to make a Type II error
• Statistical significance and practical
significance are not always the same
• Problem of multiple testing: even if all null
hypotheses are true, α of all tests will find
significant results
Connecting Intervals and Tests
• A confidence interval represents the range of
plausible values for the population parameter
• If the null hypothesized value IS NOT within
the CI, it is not a plausible value and should be
rejected
• If the null hypothesized value IS within the CI,
it is a plausible value and should not be
rejected
You’ve now learned how to
successfully collect and analyze
data to answer a question!
Let’s put it all together…
Tongue Curling
• What proportion of people can roll
their tongue?
Can you roll your tongue? (a) Yes (b) No
• Visualize and summarize the data. What is your
point estimate?
• Give and interpret a confidence interval.
• Tongue rolling has been said to be a dominant
trait, in which case theoretically 75% of all people
should be able to roll their tongues. Do our data
provide evidence otherwise?
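The last question is a test of H0: p = 0.75 vs Ha: p ≠ 0.75, and a randomization distribution for it can be sketched as below. The counts are hypothetical stand-ins (80 "yes" out of 100), not the class results, and the function names are made up. The two-tailed rule used here counts simulated results at least as far from the expected count as the observed one.

```python
import random


def simulate_null_counts(n, p0, n_sims, rng):
    """Simulate the number of 'yes' answers among n cases when p = p0."""
    return [sum(rng.random() < p0 for _ in range(n)) for _ in range(n_sims)]


def two_tailed_p_value(null_counts, observed_count, n, p0):
    """Proportion of simulated counts at least as far from n * p0 as observed."""
    expected = n * p0
    distance = abs(observed_count - expected)
    extreme = sum(abs(count - expected) >= distance for count in null_counts)
    return extreme / len(null_counts)


N = 100            # hypothetical number of people surveyed
OBSERVED_YES = 80  # hypothetical number who can roll their tongue
P0 = 0.75          # null value from the dominant-trait theory

null_counts = simulate_null_counts(N, P0, 2000, random.Random(5))
p = two_tailed_p_value(null_counts, OBSERVED_YES, N, P0)
print(OBSERVED_YES / N)  # point estimate
print(p)                 # two-tailed p-value
```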
To Do
• Read Sections 4.4, 4.5, ES A, ES B
• Study for Exam 1!