Download Report

Lec.1
STAT2912
Lecture 1
• Introduction
• Methodology of statistical tests
Assumptions: sample X1, X2, ..., Xn.
Hypotheses: H
vs
HA
Test Statistics: T = T (X1, X2, ..., Xn).
p-value: p = PH (T ≥ t).
Conclusion: by ”interpreting” the p-value.
Introduction and Methodology
1
Lec.1
STAT2912
Introduction
You often come across data in various sizes and
formats. Sometimes you have to answer a “yes” or
“no” question for certain natures of a data set.
Data are the raw material for statisticians
Example 1.1.(Beer contents) A brand of beer
claims its beer content is 375 (in millilitres) on the
label. A sample of 40 bottles of the beer gave a
sample average of 373.9 and a standard deviation of
2.5.
As a consumer one may like to ask:
is there evidence that the mean content of the
beer is less than the 375 mL claimed on the
label?
Introduction and Methodology
2
Lec.1
STAT2912
Example 1.2.(Height comparison) A fourth grade
class has 10 girls and 13 boys. The children’s heights
are recorded on their 10th birthday as follows:
Boys: 135.3 137.0 136.0 139.7 136.5 139.2 138.8 139.6 140.0 142.7
135.5 134.9 139.5;
Girls: 140.3 134.8 138.6 135.1 140.0 136.2 138.7 135.5 134.9 140.0
An interesting question is:
is there evidence that girls are taller than boys
on their 10th birthday?
Introduction and Methodology
3
Lec.1
STAT2912
Example 1.3.
(Body temprature) The
dataset beav1$temp contains measurements of body
temperature (in degree Celsius) over a certain time
period.
> beav1$temp
[1] 36.33 36.34 36.35 36.42 36.55 36.69 36.71 36.75 36.81 36.88 36.89 36.91
[13] 36.85 36.89 36.89 36.67 36.50 36.74 36.77 36.76 36.78 36.82 36.89 36.99
[25] 36.92 36.99 36.89 36.94 36.92 36.97 36.91 36.79 36.77 36.69 36.62 36.54
[37] 36.55 36.67 36.69 36.62 36.64 36.59 36.65 36.75 36.80 36.81 36.87 36.87
[49] 36.89 36.94 36.98 36.95 37.00 37.07 37.05 37.00 36.95 37.00 36.94 36.88
[61] 36.93 36.98 36.97 36.85 36.92 36.99 37.01 37.10 37.09 37.02 36.96 36.84
[73] 36.87 36.85 36.85 36.87 36.89 36.86 36.91 37.53 37.23 37.20 37.25 37.20
[85] 37.21 37.24 37.10 37.20 37.18 36.93 36.83 36.93 36.83 36.80 36.75 36.71
[97] 36.73 36.75 36.72 36.76 36.70 36.82 36.88 36.94 36.79 36.78 36.80 36.82
[109] 36.84 36.86 36.88 36.93 36.97 37.15
We may ask:
is there evidence that the observation is not
from a normal population?
Introduction and Methodology
4
Lec.1
STAT2912
The aim of this course
This course will attempt to answer these typical
yes or no questions. Explicitly we will introduce
the methods of applied statistics, both parametric
and non-parametric, for the analysis of data and the
techniques of statistical inference for one, two and
several samples from populations, for simple regression
and for categorical data.
Some of this has been touched on before in the
ﬁrst year Statistics course. Here we will give a more
systematic treatment. Computer intensive methods
and simple data analysis are also discussed in some
cases.
Introduction and Methodology
5
Lec.1
STAT2912
Some basic concepts
A population is a collection of all possible results
that are the target of our interest, described by a
random variable, say X.
A sample is a subset selected from a population,
say X1, X2, ..., Xn.
We require that each element in a dataset is a
”representative” of the population. Theoretically,
this is equivalent to saying that X1, X2, ..., Xn, X are
required to be independent and identically distributed
(iid) random variables.
A dataset x1, x2, ..., xn is a observed value of
the sample X1, X2..., Xn. Sometimes we also say
x1, x2, ..., xn to be a sample or sample value without
confusion.
Introduction and Methodology
6
Lec.1
STAT2912
Methodology of statistical tests
Five steps of a general test
Step 1: Model and assumptions.
The model represents our belief about the
probability distribution F describing the population.
In general, we write
X ∼ Fθ (x),
where Fθ (x) is a speciﬁc distribution function only
depending on the unknown parameter θ.
For example, we can write X ∼ N (µ, σ 2) or
X ∼ B(n, p), meaning that we assume normal or
binomial model, respectively. In this case, µ, σ 2, p, θ,
etc are called parameters.
The corresponding
statistical analysis is therefore called parametric.
A statistical analysis is called nonparametric if our
assumptions on the population distribution F do not
specify a particular class of probability distributions.
Introduction and Methodology
7
Lec.1
STAT2912
Assumptions on the model
Parametric hypothesis testing:
X1, X2, ..., Xn are independent random variables
with
Xi ∼ Fθ (x), i = 1, . . . , n.
Nonparametric hypothesis testing:
X1, X2, ..., Xn are iid random variables with
distribution function F .
Note: In general, we also require the F is not
heavy (long)-tailed, that is, we require certain moment
conditions. When the F is heavy tailed, the boxplot
(stem-and-leaf) plot of the dataset should be skewed.
For skewed data, we can sometimes transform them
into symmetric. This will be discussed later.
Introduction and Methodology
8
Lec.1
STAT2912
Step 2: Hypotheses on (the parameter in) the
population.
The statement against which we are searching for
evidence is called the null hypothesis, and is denoted
by H0 (or simply H). The null hypothesis is often a
”no diﬀerence” statement.
The statement we will consider if H0 is false is
called the alternative hypothesis, and is denoted by
H1 (or HA).
For example, if X ∈ Fθ (x), we may have
H0 :
θ = θ0; against
H1 :
θ > θ0
(right-sided alternative), or
θ < θ0
(left-sided alternative), or
θ ̸= θ0
(two-sided alternative),
depending on the context, or more generally
H0 : θ ∈ Θ0
VS H1 : θ ∈ Θ − Θ0
where Θ is a parameter space.
Introduction and Methodology
9
Lec.1
STAT2912
Examples (cont.)
In Example 1.1, if we assume that claim amount
is X ∼ N (µ, σ 2), then the null and alternative
hypotheses are as follows:
H0 : µ = 375,
H1 : µ < 375.
In Example 1.2, if we assume that the height
of girls X ∼ N (µ1, σ12), and the height of boys
Y ∼ N (µ2, σ22), then the null and alternative
hypotheses are:
H0 : µ1 − µ2 = 0,
H1 : µ1 − µ2 > 0.
In Example 1.3, the null hypothesis is
H0: the measurement X has a normal distribution.
Introduction and Methodology
10
Lec.1
STAT2912
Step 3: Find a test statistic, T
A function T := T (X1, ..., Xn), of the sample
X1, X2, ..., Xn, is called a test statistic. To be a
useful statistic for testing the null hypothesis H0, T is
chosen in such a way that:
A. the distribution of T is completely determined when
H0 is true; i.e., when θ ∈ Θ0;
B. the particular observed values of T , called rejection
region, can be taken as evidence of poor agreement
with the assumption that H0 is true.
Introduction and Methodology
11
Lec.1
STAT2912
Step 4: Calculate the p-value (observed signiﬁcance
level).
For given data x1, x2, ..., xn, we can calculate
tobs = T (x1, ..., xn), the observed value of the test
statistic T .
The p-value, or observed signiﬁcance level, is
deﬁned formally as the smallest α level, such that
its corresponding rejection region R contains tobs.
Example: Assume large values of T support the
alternative. That is, R has the form [r, ∞), where r
can be calculated for a speciﬁc α. Then the p-value is
p = P(T ≥ tobs|H0),
where the probability is
assumption that H0 is true.
calculated
under
the
Depending on the context, the p-value can be
calculated as p = P(T ≤ tobs|H0) (for left-sided tests),
or as p = P(|T | ≥ tobs|H0) (for two-sided tests).
Introduction and Methodology
12
Lec.1
STAT2912
Step 5: Conclusion
The p-value is a probability, so it is a number
between 0 and 1. What does it mean?
We can assess the signiﬁcance of the evidence
provided by the data against the null hypothesis by
”interpreting” the p-value.
Although such rules are arbitrary, in this course we
shall use the following convention:
p-value
p-value
p-value
p-value
p-value
> 0.10
∈ [0.05, 0.10)
∈ [0.025, 0.05)
∈ [0.01, 0.025)
< 0.01
Introduction and Methodology
Data are consistent with H0
Borderline evidence against H0
Reasonably strong evidence against H0
Strong evidence against H0
Very strong evidence against H0.
13
Lec.1
STAT2912
Remarks
Recall that the p-value is equal to the probability
(under the null hypothesis) that the test statistic T is
as extreme, or even more extreme, as tobs. It quantiﬁes
the amount of evidence against H0 in the following
sense:
• If the p-value is small, then either H0 is true and
the poor agreement is due to an unlikely event, or
H0 is false. Therefore,
The smaller the p-value, the stronger the
evidence against H0 in favour of H1.
• A large p-value does not mean that there is evidence
that H0 is true, but only that the test detects no
inconsistency between the predictions of the null
hypothesis and the results of the experiment.
Introduction and Methodology
14
Lec.1
STAT2912
Review of some important results
1
n
∑n
Denote the sample mean by X =
i=1 Xi , and
the sample variance
∑n
1
2
S = n−1 i=1(Xi − X)2;
√
S = S 2 is called the sample standard deviation.
The following results will be used in this course
without proofs. If X1, . . . , Xn are iid N (µ, σ 2), then
X −µ
√ ∼ N (0, 1)
σ/ n
∑n
2
(X
−
µ)
i
2
i=1
∼
χ
n
σ2
(n − 1)S 2
2
∼
χ
n−1
σ2
X −µ
√ ∼ tn−1
S/ n
X and S 2 are independent random variables
Introduction and Methodology
15
Lec.1
STAT2912
Some useful R commands
pnorm(x, m, s) = P(X ≤ x), where X ∼ N (m, s2).
pt(x, d) = P(X ≤ x), where X ∼ td.
pchisq(x, d) = P(X ≤ x), where X ∼ χ2d.
qnorm(p, m, s) is a number x
P(X ≤ x) = p, where X ∼ N (m, s2).
such
that
Examples:
pnorm(1.96) gives Φ(1.96) (= 0.9750021).
qnorm(0.95) gives Φ−1(0.95) (= 1.644854).
Similarly for the t and χ2 variables:
pt(2,3) gives P(t3 ≤ 2) = 0.930337,
pchisq(2,3) gives P(χ3 ≤ 2) = 0.4275933,
qchisq(0.1,5) gives 1.610308, that is,
P(χ25 ≤ 1.610308) = 0.1 .
Introduction and Methodology
16
Lec.2
STAT2912
Lecture 2
Tests for the population mean
• Normal population with unknown variance
• One sample and paired sample
• One sample t-test and paired sample t-test
Tests for mean
17
Lec.2
STAT2912
Tests for mean: normal population
One sample:
Suppose we have a sample
(X1, X2, ..., Xn) of the size n drawn from a normal
population with unknown variance. We want to make
some statement about the population mean µ.
The t-test for one sample
• Assumptions: Xj are iid random variables with
Xj ∼ N (µ, σ 2),
where σ 2 is unknown.
• Hypothesis:
H : µ = µ0
vs
HA : µ > µ0, or
µ < µ0, or
µ ̸= µ0.
Tests for mean
18
Lec.2
STAT2912
• Test statistic (Student’s t-statistic):
)
√ (
Tn = n X − µ0 /S,
where
1
X = n
∑n
2
1
i=1 Xi and S = n−1
∑n
2
i=1 (Xi − X) .
Note that Tn ∼ tn−1 under H : µ = µ0.
• p-value: Let (x1, x2, ..., xn) be a dataset, and t be
the observed value of Tn, i.e.
√
x − µ0)/s,
t = n(¯
where
1
x
¯= n
∑n
2
1 ∑n (x − x
x
and
s
=
¯ )2 .
i
i=1
i=1 i
n−1
(
)
p = P tn−1 ≥ t for HA : µ > µ0;
(
)
p = P tn−1 ≤ t for HA : µ < µ0;
(
)
p = 2 P tn−1 ≥ |t| for HA : µ ̸= µ0.
• Decision: by ”interpreting” the p-value.
Tests for mean
19
Lec.2
STAT2912
Remarks and the idea behind the t-test:
• We have that Tn ∼ tn−1 under H0 : µ = µ0.
¯ is an ideal estimator of µ, which means that X
¯
• X
is near µ0 if H : µ = µ0 is true. Therefore it is
reasonable to believe√ that we should reject H in
favour of HA if t = n(¯
x − µ0)/s (observed value
of Tn) is large (negative large or both depending on
HA). On the other hand, if t is large, then p-value
is small. This coincides with the ”interpretation” to
p-value.
(
)
(
)
• Note
( that P t)n−1 ≥ t( = P tn−1
) ≤ −t for t ≥ 0.
P |tn−1| ≥ |t| = 2P tn−1 ≥ |t| .
(
)
in R, pt(t, n-1) gives P tn−1 ≤ t .
Tests for mean
20
Lec.2
STAT2912
Example 2.1. Beer contents in a six pack of VB
(in millilitres) are:
374.8,
375.0,
375.3,
374.8,
374.4,
374.9.
Question: Is the mean content of the beer is less
than the 375 mL claimed on the label?
Solution: We consider the sample X1, X2, ..., X6 are
from a normal population, that is,
Xj ∼ N (µ, σ 2),
Tests for mean
where σ 2 is unknown.
21
Lec.2
STAT2912
Hypothesis: H0 : µ = 375
vs
H1 : µ < 375.
For the dataset
374.8, 375.0, 375.3, 374.8, 374.4, 374.9,
[∑
]
(
)
2
1
x
¯ = 374.87, s2 = n−1
−n x
¯
= 0.087
and
observed value of the test statistic Tn (=
√ the
¯ − 375)/S with n = 6) is
n(X
n
2
j=1 xj
t=
√
6(¯
x − 375)/s = −1.1094.
p-value: p = P (t5 ≤ t) = 0.1589 (using table or
pt(t,5) in R).
Conclusion: Data is consistent with the claim on
the label.
Tests for mean
22
Lec.2
STAT2912
In R
t.test(x, alternative=”???”,mu = µ0)
is used to construct the one-sample t-test, where x is a
dataset, ”???” can be ”less”, ”greater” or ”two.sided”
depending on the context.
Example 2.1. (cont.) For the dataset,
374.8, 375.0, 375.3, 374.8, 374.4, 374.9,
and the hypothesis H0 : µ = 375
we get the result in R as follows:
vs
H1 : µ < 375,
> x< -c(374.8, 375.0, 375.3, 374.8, 374.4, 374.9)
> t.test(x, alternative="less", mu=375)
One-sample t-Test
data: x
t = -1.1094, df = 5, p-value = 0.1589
alternative hypothesis: true mean is less than 375
95 percent confidence interval:
NA 375.1088
sample estimates:
mean of x
374.8667
Tests for mean
23
Lec.2
STAT2912
Remarks:
• For small sample size n (n ≤ 20), the t-test is
sensitive to the assumption that the sample is from
a normal population. It is recommended that all
data should be checked for shape by looking at a
boxplot and a normal qq-plot of the data (using R
commands boxplot and qqnorm).
• For small sample size n (n ≤ 20), it is required that
the boxplot of the data should be as symmetric as
possible and the points in qq-plot should be nearly
around a straightline. Note that the boxplot and the
qq-plot may be misguided if n is too small (n ≤ 10).
• If the sample size is large (n ≥ 20), as long as the
data do not come from a long-tail distribution, the
t-test should be ﬁne for use. This will be discussed
in the next Lecture.
Tests for mean
24
Lec.2
STAT2912
Test for mean: normal population
The paired sample: Suppose we have a sample
of paired observations (i.e., before/after data or twin
data). We want to make some statement about the
mean diﬀerence.
A typical example is as follows.
Example 2.2. (From Chapter 11.3 of Rice) Blood
samples from 11 individuals before and after they
smoked a cigarette are used to measure aggregation of
blood platelets.
Before: 25 25 27 44 30 67 53 53 52 60 28;
After: 27 29 37 36 46 82 57 80 61 59 43.
Question: is the aggregation aﬀected by smoking?
Tests for mean
25
Lec.2
STAT2912
Analysis: Let Xj and Yj be the aggregation of
blood platelets before and after smoking by same
individual. In Example 2.2, we get a sample of paired
observations, (x1, y1), ..., (xn, yn). Note that Xj and
Yj are dependent. However the question is only
concerned with the diﬀerence of the paired sample,
dj = Xj − Yj ,
j = 1, 2, ..., n.
Because (Xj , Yj ) come from diﬀerent individuals, we
may assume that dj are independent random variables
with
dj ∼ N (µ, σ 2),
where σ 2 is unknown,
and answer the question by testing the following
hypothesis:
H:µ=0
vs
HA : µ ̸= 0.
This reduces the paired-sample problem to a single
sample (of diﬀerence) problem testing for µ = 0.
Tests for mean
26
Lec.2
STAT2912
In general, let (x1, y1), ..., (xn, yn) be a sample of
paired observations. We wish to test
H : µx = µy
vs
HA : µx > µy , or
µx < µy , or
µx ̸= µy .
This is same as observing the diﬀerence
dj = Xj − Yj , j = 1, , ..., n
and testing (write µd = µx − µy )
H : µd = 0
vs
HA : µd > 0, or
µd < 0, or
µd ̸= 0.
Tests for mean
27
Lec.2
STAT2912
The paired sample t-test
• Assumptions: The dj (= Xj − Yj ) are a random
sample from a normal population with unknown
variance, that is, dj ∼ N (µ, σ 2), where σ 2 is
unknown.
• √
Test statistic (Student’s t-statistic):
¯ d ∼ tn−1 under H, where d¯ =
nd/s
and s2d =
1
n−1
∑n
i=1 (di
¯ 2.
− d)
Tn =
1
n
∑n
i=1
di
• p-value:
(
)
p = P tn−1 ≥ t0
(
)
p = P tn−1 ≤ t0
(
)
p = 2 P tn−1 ≥ |t0|
for HA : µd > 0;
for HA : µd < 0;
for HA : µd ̸= 0,
√ ¯
where t0 is the observed value of Tn = nd/sd.
Tests for mean
28
Lec.2
STAT2912
in R
The function in R
t.test(x, y, alternative=”??”, mu=0, paired=T)
is used to the paired t-test, where x and y are
before and after observations of the dataset (xi, yi)
respectively, ”???” can be ”less”, ”greater” or
”two.sided” depending on the context.
Or, you may use
t.test(x-y, alternative=”??”, mu=0)
Tests for mean
29
Lec.2
STAT2912
Example 2.2 (cont.) Blood samples from 11
individuals before and after they smoked a cigarette
are used to measure aggregation of blood platelets.
Before: 25 25 27 44 30 67 53 53 52 60 28;
After: 27 29 37 36 46 82 57 80 61 59 43.
Question: is the aggregation aﬀected by smoking?
Solution: We consider the aggregation diﬀerence
dj ∼ N (µ, σ 2), where σ 2 is unknown. Testing the
hypothesis:
H : µ = 0 vs
Tests for mean
H : µ ̸= 0.
30
Lec.2
STAT2912
in R,
>
>
x<-c(25, 25, 27, 44, 30, 67, 53, 53, 52, 60, 28)
y<-c(27, 29, 37, 36, 46, 82, 57, 80, 61, 59, 43)
>
t.test(x,y, alternative="two.sided", mu=0, paired=T)
Paired t-Test
data: x and y t = -2.9065, df = 10, p-value = 0.0157 alternative
hypothesis: true mean of differences is not equal to 0 95 percent
confidence interval:
-14.93577 -1.97332
sample estimates:
mean of x - y
-8.454545
The p-value = 0.0157. Hence there is strong
evidence against H0, that is there is strong evidence
that the aggregation is aﬀected by smoking.
Tests for mean
31
Lec.2
STAT2912
Remarks:
• It is recommended that the data diﬀerence should
be checked for shape by looking at a boxplot and
a normal qq-plot of the data (using R commands
boxplot and qqnorm).
• For small sample size n (n ≤ 20), it is required
that the boxplot of the data diﬀerence should be
as symmetric as possible and the points in qq-plot
should be nearly around a straightline. We have
to assume the data diﬀerence is from a normal
distribution if the sample size n is too small (n ≤
10).
• If the sample size is large (n ≥ 20), as long as
the data diﬀerence do not come from a long-tail
distribution, the t-test should be ﬁne for use.
Tests for mean
32
Lec.3
STAT2912
Lecture 3
Tests for population mean
• large sample tests
If sample size n (n ≥ 20 in general) is large enough,
it is not necessary to assume that the sample is from a
normal population.
Tests for mean
33
Lec.3
STAT2912
Tests for mean
Suppose we want to test a population mean µ
based on a random sample X1, X2, ..., Xn, where the
sample size n is considered to be large enough (n ≥ 20
in general).
Let (x1, x2, ..., xn) be a observed value.
If the sample does not come from a heavy tail
distribution (too many outliers etc in a boxplot), we
may use the following procedures to test hypothesis
about the population mean µ.
Tests for mean
34
Lec.3
STAT2912
Common large sample tests
• Hypothesis:
H : µ = µ0
vs
HA : µ > µ0, or
µ < µ0, or
µ ̸= µ0.
• Test statistic
)
√ (
Student’s t-statistic: Tn = n X − µ0 /S,
∑
1 ∑n (X − X)2 .
where X = n1 ni=1 Xi and S 2 = n−1
i
i=1
or z-statistic (given population variance σ 2):
Zn =
where X =
Tests for mean
1
n
∑n
i=1
√
n(X − µ0)/σ,
Xi.
35
Lec.3
STAT2912
• p-value: Let t be the observed value of Tn or Zn,
that is,
t =
t =
√
√
n(¯
x − µ0)/s,
n(¯
x − µ0)/σ,
or
with given σ.
If the sample size n is large enough (n ≥ 20 in
general), the p-value is approximately given by
p ≈ 1 − Φ(t)
for HA : µ > µ0;
p ≈ Φ(t)
for HA : µ < µ0;
(
)
p ≈ 2 1 − Φ(|t|) for HA : µ ̸= µ0.
Tests for mean
36
Lec.3
STAT2912
Remarks
• The paired sample may be discussed in a similar
manner.
• The choice of the test statistics is based on the fact
¯ is a point estimate of µ. It means that X
¯
that X
is close to µ0 if H : µ = µ0 is true. If, in reality,
¯ − µ0| is more likely to be large.
µ ̸= µ0, |X
• If the sample is from a normal population with
given variance σ 2, the test procedure for µ (with
statistic Zn) is still applicable for small sample
size (n√ ≤ 20). Note that, under H : µ = µ0,
Zn = n(X − µ0)/σ ∼ N (0, 1) for each n ≥ 1 in
this case.
• Calculation of p-value is based on the central limit
theorem.
• Large sample test for µ is equivalent to t-test if the
sample size is large enough (n ≥ 20 in general).
Tests for mean
37
Lec.3
STAT2912
Example 3.1. (From MSWA, Chapter 10) A vice
president in charge of sales for a large corporation
claims that salespeople are averaging no more than 15
sales contacts each week. As a check on his claim, 24
salespeople are selected at random, and the number
of the contacts made by each is recorded for a single
randomly selected week as follows:
18, 17, 18, 13, 15, 16, 21, 14, 24, 12, 19, 18,
17, 16, 15, 14, 17, 18, 19, 20, 13, 14, 12, 15
Does the evidence contradict the vice president’s
claim?
Solution: We are interested in the hypothesis that
the vice president’s claim is incorrect. Thus, we are
interested in testing
H : µ = 15
vs
HA : µ > 15,
where µ is the mean number of sales contacts each
week.
Tests for mean
38
Lec.3
STAT2912
The data is chosen in random, the sample size is
large n ≥ 20, the following boxplot shows the data
is not too skew. These
imply
(
) that we can use
√ facts
the test statistic Tn = n X − µ0 /S, and obtain the
p-value by
p = 1 − Φ(t),
where t =
√
24(¯
x − 15)/s.
in R,
> x<-c(18,17,18,13,15,16,21,14,24,12,19,18,
17,16,15,14,17,18,19,20,13,14,12,15)
> barx<-mean(x)
> s<-sqrt(var(x))
> t<-sqrt(24)*(barx - 15)/s
> p<-1-pnorm(t)
> p
[1] 0.007954637
The p-value is less than 0.01. Hence there is very
strong evidence to indicate that the vice president’s
claim is incorrect, that is, the average number of sales
contacts per week exceeds 15.
Tests for mean
39
x
24
22
20
18
14
12
16
24
22
20
18
16
14
12
Tests for mean
-2
•
•
0
1
Quantiles of Standard Normal
-1
• •
• ••
•••
••
•••
••• •
• •
•
•
2
•
Lec.3
STAT2912
40
Lec.3
STAT2912
One-sample t-Test
in R,
>
x<-c(18,17,18,13,15,16,21,14,24,12,19,18,
17,16,15,14,17,18,19,20,13,14,12,15)
> t.test(x, alternative="greater",mu=15)
One-sample t-Test
data: x
t = 2.411, df = 23, p-value = 0.0121
alternative hypothesis: true mean is greater than 15
95 percent confidence interval:
15.42167
NA
sample estimates:
mean of x
16.45833
The p-value is 0.0121. This gives nearly same
conclusion as in large sample test.
Tests for mean
41
Lec.3
STAT2912
Theoretical explanation
The central limit theorem:
If X1, X2, ..., Xn are independent and identically
distributed random variables with mean µ and ﬁnite
variance σ 2, then, for n large enough (n ≥ 20, in
general),
√
√
¯ − µ)
n(X
∼ N (0, 1),
σ
(approximately);
¯ − µ)
n(X
∼ N (0, 1) or tn−1,
S
Tests for mean
(approximately).
42
Lec.3
STAT2912
Notes:
• Let (x1, x2, ..., xn) be a observed value. If the
observations come from a random design, the
central limit theorem holds true in general.
• If the boxplot of the data is too skew,
transformations (discuss later) should be done prior
to the use of central limit theorem.
• If n is large enough (n ≥ 20), P (tn ≤ x) ≈ Φ(x)
for each x. See the following density plots.
This gives the explanation of that large sample test
for µ is equivalent to t-test for µ if the sample size
is large enough.
Tests for mean
43
Tests for mean
44
dnorm(x) & dt(x, df=5)
0.1
0.0
0.2
0.3
0.4
-4
0.1
0.0
-2
x
0
2
4
dnorm(x) & dt(x, df=20)
0.2
0.3
0.4
-4
-2
x
0
2
4
Lec.3
STAT2912
Lec.3
STAT2912
Some corollaries of the central limit theorem
• If Y ∼ B(n, θ), 0 < θ < 1, then for large n (n ≥ 20)
(Y − nθ)
√
∼ N (0, 1),
nθ(1 − θ)
(approximately);
• If X1, X2, ..., Xn are i.i.d. random variables from a
Poisson population X, i.e., for some λ > 0,
λk e−λ
P (X = k) =
,
k!
k = 0, 1, 2, ...
then for large n (n ≥ 20)
√
¯ − λ)
n(X
√
∼ N (0, 1),
λ
(approximately);
Note that EX = λ and var(X) = λ.
Tests for mean
45
Lec.3
STAT2912
• If X1, X2, ..., Xn are i.i.d. random variables from a
exponential population X, i.e., for some β > 0, X
has density function:
f (x) =
1 −x/β
e
, if x ≥ 0;
β
0, otherwise,
then for large n (n ≥ 20)
√
¯ − β)
n(X
∼ N (0, 1),
β
(approximately).
Note that EX = β and var(X) = β 2.
Tests for mean
46
Lec.3
STAT2912
Example 3.2. (From MSWA, Chapter 10) A
manufacturer of automatic washers oﬀers a particular
model in one of three colors : A, B or C. Of the
ﬁrst 100 washers sold, 40 were of color A. Would you
conclude that customers have a preference for color A?
Solution: Let θ denote the probability that a
customer prefers color A, and Y denote the number of
customers who prefer color A in the ﬁrst 100 customers.
Then Y ∼ B(100, θ).
If customers have no preference for color A, then
θ = 1/3. Otherwise θ > 1/3. Hence we have to test
the following hypothesis:
H : θ = 1/3 vs
Tests for mean
HA : θ > 1/3.
47
Lec.3
STAT2912
By the central limit theorem, we can use the test
statistic:
3(Y − 100/3)
√
Tn = √
=
.
200
100(1/3)(1 − 1/3)
Y − 100/3
The observed value of this test statistic is given by
√
3(40 − 100/3)
√
= 20/ 200 = 1.414.
t=
200
The p-value
p ≃ 1 − Φ(t) = 0.078,
approximately.
Conclusion: There is a borderline evidence against
the null hypothesis H, that is, there is borderline
evidence that customers prefer color A.
Tests for mean
48
Lect.4
STAT2912
Lecture 4
Fixed level tests for mean
• Signiﬁcance level. Decision rule.
• Critical-values and ﬁxed-level tests.
• Rejection region
Critical values and fixed level tests
49
Lect.4
STAT2912
Test for mean (Summary)
Suppose we want to test the mean µ of a population
with data x = (x1, x2, ..., xn).
• Hypothesis: H : µ = µ0 vs HA : µ > µ0 or ...
• Test statistic: T = T (X1, X2, ..., Xn)
(t-test and z-test)
• p-value: p = PH (T ≥ t) or ..., where
t = T (x1, x2, ..., xn).
• Decision: ”interpreting” the p-value
p-value
p-value
p-value
p-value
p-value
> 0.10
∈ (0.05, 0.10)
∈ (0.025, 0.05)
∈ (0.01, 0.025)
< 0.01
Critical values and fixed level tests
Data consistent with H
Borderline evidence against H
Reasonably strong evidence against H
Strong evidence against H
Very strong evidence against H .
50
Lect.4
STAT2912
significance level
In this course, to test the hypothesis, our decision
rule is to compare the p-value to ﬁxed value 0.05, 0.10,
etc.
In practice, based on their requirements, one may
make their own decision rule. They may give a
preassigned level, say α, and then decide simply to
accept H or reject H (accept HA) according to p > α
or p ≤ α.
The α is usually called the significance level of
the test, which is the boundary between accept and
reject the null hypothesis.
Critical values and fixed level tests
51
Lect.4
STAT2912
Critical value and rejection region
For the hypothesis
H : µ = µ0 vs HA : µ > µ0, or ...
given a signiﬁcance level α, we may ﬁnd a cα such that
PH (T ≥ cα) = α,
or ...
This cα is called the critical value at the signiﬁcance
level α. The critical value depends on both the
distribution of T under H and α.
The DECISION RULE at level α can be deﬁned
in terms of the observed value t of the test statistic T ,
with the following convention:
if t > cα, we reject H,
if t < cα, we accept H,
Critical values and fixed level tests
or ...;
or ...
52
Lect.4
STAT2912
Summary (ﬁxed level α tests)
• Hypothesis: H : µ = µ0 vs HA : µ > µ0, or ...
• Test statistic: T = T (X1, X2, ..., Xn).
• Critical value: cα, which is given by
PH (T ≥ cα) = α,
or ...
• Decision:
reject H if t > cα, or ...;
accept H if t < cα, or ...
where t is a observed value of the test statistic T .
Critical values and fixed level tests
53
Lect.4
STAT2912
Test for mean (fixed level α)
Suppose we have a sample of size n drawn from
some population. Given a signiﬁcance level α, we want
to make some statement about the mean.
Assumption 1: The (X1, X2, ..., Xn) is a random
sample from normal population with unknown variance
σ 2, i.e.,
Xj ∼ N (µ, σ 2),
where σ 2 is unknown.
• Hypothesis:
H : µ = µ0
vs
HA : µ > µ0, or
µ < µ0, or
µ ̸= µ0.
Critical values and fixed level tests
54
Lect.4
STAT2912
• Test statistic:
)
√ (
Tn = n X − µ0 /S,
where
1
X = n
∑n
2
1
i=1 Xi and S = n−1
∑n
2
i=1 (Xi − X) .
• Decision: Let t be a observed value of Tn. We
reject H and accept HA
if t > tα
(for HA : µ > µ0);
if t < −tα
(for HA : µ < µ0);
if |t| ≥ tα/2
(for HA : µ ̸= µ0),
where tβ (β = α or α/2) is the value given by
P (tn−1 ≥ tβ ) = β.
Critical values and fixed level tests
55
Lect.4
STAT2912
Table 3 is used to ﬁnd the tα.
With α = 0.1,
tα(3) = 1.638,
tα(5) = 1.476, ...
With α = 0.05,
tα(3) = 2.353,
tα(5) = 2.015, ...
.......
in R, qt(1 − β, n − 1) gives tβ (n − 1).
Critical values and fixed level tests
56
Lect.4
STAT2912
Computing Critical Values in R
Student’s t-Distribution with ’n’ degrees of freedom
> qt(p, n)
# = v,
P (tn < v) = p.
Standard Normal Distribution
> qnorm(p)
# = v,
P (Z < v) = p.
χ2-Distribution with ’n’ degrees of freedom
> qchisq(p, n)
# = v,
P (χ2n < v) = p.
F -Distribution with ’n1’ and ’n2’ degrees of freedom
> qf (p, n1, n2)
# = v,
Critical values and fixed level tests
P (Fn1,n2 < v) = p.
57
Lect.4
STAT2912
Example
Let’s re-visit
Example 2.2. (From Chapter 11.3 of Rice) Blood
samples from 11 individuals before and after they
smoked a cigarette are used to measure aggregation of
blood platelets.
Before: 25 25 27 44 30 67 53 53 52 60 28;
After: 27 29 37 36 46 82 57 80 61 59 43.
Question: is the aggregation aﬀected by smoking
at the 0.05 signiﬁcance level?
Solution: This is a paired sample. We may
reduce the problem to one sample by considering the
aggregation diﬀerence dj (yj − xj ):
2, 4, 10, -8,
Critical values and fixed level tests
16, 15, 4, 27, 9, -1, 15
58
Lect.4
STAT2912
The sample size is not large enough (n ≤ 20).
It needs to assume that the dj is from a normal
population, i.e., dj ∼ N (µ, σ 2). We are interested in
the testing:
H: µ=0
vs
HA : µ ̸= 0.
In terms of the unknown population variance, the
test statistic is chosen to be
Tn =
√
n(d¯ − 0)/sd,
with
1
s2d = n−1
∑n
¯2
j=1 (dj − d) .
Note that d¯ = 8.45, s2d =
√ 93.07 and n = 11. The
observed value of Tn is t = 11(d¯ − 0)/sd = 2.90.
With given signiﬁcance level α = 0.05, the critical
value for HA : µ ̸= 0 is t0.025(10) = 2.228.
Because t = 2.90 > t0.025(10), we reject the null
hypothesis and accept the alternative hypothesis, that
is, the aggregation is aﬀected by smoking at the 0.05
signiﬁcance level.
Critical values and fixed level tests
59
Lect.4
STAT2912
Test for mean (fixed level α)
Suppose we have a sample of the size n drawn from
some population. Given a signiﬁcance level α, we want
to make some statement about the mean.
Assumption 2: The sample (X1, X2, ..., Xn) is not
from a population having a long tail, and sample size
n is large (n ≥ 20).
• Hypothesis:
H : µ = µ0
vs
HA : µ > µ0, or
µ < µ0, or
µ ̸= µ0.
Critical values and fixed level tests
60
Lect.4
STAT2912
• Test statistic
Student’s t-statistic:
)
√ (
Tn = n X − µ0 /S,
where
1
X = n
∑n
2
1
i=1 Xi and S = n−1
∑n
2
i=1 (Xi − X) .
or z-statistic (given population variance σ 2):
Zn =
√
n(X − µ0)/σ.
• Decision: Let t be a observed value of Tn or Zn,
that is,
t =
t =
√
√
n(¯
x − µ0)/s,
n(¯
x − µ0)/σ,
Critical values and fixed level tests
or
with given σ.
61
Lect.4
STAT2912
We reject H and accept HA if
if t > zα
(for HA : µ > µ0);
if t < −zα
(for HA : µ < µ0);
if |t| ≥ zα/2
(for HA : µ ̸= µ0),
where zβ (β = α or α/2) is the value which is
approximately given by
Φ(zβ ) = 1 − β.
Table 2 can be used to ﬁnd zα.
For instance,
z0.025 = 1.96,
z0.05 = 1.645,
z0.10 = 1.28.
Critical values and fixed level tests
62
Lect.4
STAT2912
Remarks
• The paired sample may be discussed in a similar
method.
• If the sample is from a normal population with given
variance σ 2, the testing procedure (with statistic
Zn) is still applicable for small sample size (n ≤ 20).
• The set {x : t ≥ tα}, ..., or {x : t ≥ zα} ,..., is
generally called rejection region. If an observation
x is in the rejection region then we say that we
reject the null hypothesis at the level α or that the
test is signiﬁcant at the level α.
• If the sample size is large enough (n ≥ 20), the
rejection regions of large sample test and t-test are
nearly same.
Critical values and fixed level tests
63
Lect.4
STAT2912
Example 4.1. Let X1, X2, ..., X25 be a sample
from a normal distribution with mean µ and variance
100. Find the rejection region for a test at level
α = 0.10 of the hypothesis:
H: µ=0
vs
µ>0
¯
based on a observed value of the sample mean X.
Solution: The sample is from a normal population
with given variance σ 2 = 100. It is known that we
should reject the null hypothesis at level α = 0.10 if
Critical values and fixed level tests
64
Lect.5
STAT2912
Lecture 5
Performances of statistical tests
Some tests may perform better than others.
• Type I and Type II errors.
• Power of a ﬁxed-level-test.
• A method to improve power: increasing the sample
size.
Power and sample size
65
Lect.5
STAT2912
Signiﬁcance tests with ﬁxed level α gives a rule for
making decision:
if the values of the test statistic fall in the
rejection region, we reject the null hypothesis
H, and accept the alternative hypothesis HA.
Note that for any ﬁxed rejection region, two types
of errors can be made in reaching a decision.
A type I error:
we reject H, but in fact H is true.
A type II error:
we accept H, but in fact H is false.
Power and sample size
66
Lect.5
STAT2912
Revisit the test for mean µ (ﬁxed level α)
• Hypothesis: H : µ = µ0 vs HA : µ > µ0, or ...
• Test statistic: T = T (X1, X2, ..., Xn).
• Decision: We reject H and accept HA if t ≥ cα (or
...), where t is a observed value of T .
A type I error: with some observation value t ≥ cα,
or ... we reject H : µ = µ0, but in fact H : µ = µ0 is
true.
A type II error: with some observation value t <
cα, or ... we accept H : µ = µ0, but in fact H : µ = µ0
is false (hence the true value of µ is some µ1 > µ0).
Note: The discussion is only for HA : µ > µ0, but
the idea still holds for other cases.
Power and sample size
67
Lect.5
STAT2912
We have that
the probability of a type I error
= PH (T ≥ cα) = α,
i.e., the signiﬁcance level of the test.
With the true value µ = µ1, the probability of a
type II error is given by
βα(µ1) = Pµ1 (T < cα),
µ1 ≥ µ0,
where Pµ1 (.) denotes that this probability is calculated
on the assumption that µ = µ1.
Power and sample size
68
Lect.5
STAT2912
1 − βα(µ1) is called Power of the test at the
signiﬁcance level α for µ1 ∈ HA.
Note that the power:
Power(µ1) := 1 − βα(µ1) = Pµ1 (T ≥ cα),
is a function of the parameter µ1 ∈ HA. It is the
probability of rejecting H with the true value µ = µ1.
Power and sample size
69
Lect.5
STAT2912
Remarks
Recall that we may have diﬀerent tests (teststatistics) for same Hypothesis H vs HA.
• With a given signiﬁcance level α, ideally we would
like to have a test such that the power Power(µ1)
is 1 for all µ1 ∈ HA. However it is NOT possible.
• A more practical method is to choose the test so
that the power is as large as possible for all, or
at least for some µ1 ∈ HA. This is sometimes
possible and the theory of optimal tests stresses this
question. This course will not consider this formal
theory, but will just choose the tests which seem to
produce high power.
Power and sample size
70
Lect.5
STAT2912
Power of the z-test
Example 5.1 Suppose that X1, X2, ..., Xn are
independent N (µ, 1), and we wish to test
H:µ=0
vs
HA : µ > 0,
at the level α = 0.05. Find the power of the z-test.
With given variance σ 2 = 1, the test-statistic is
given by
Power and sample size
71
Lect.5
STAT2912
The power of the z-test for µ ∈ HA is calculated
by
Power and sample size
72
Lect.5
STAT2912
With given n = 9,
Power(0.2) = 0.148,
Power(1) = 0.912,
Power(0.5) = 0.442,
...
With given n = 16,
Power(0.2) = 0.199,
Power(1) = 0.99,
Power(0.5) = 0.639, ,
...
With given n = 25,
Power(0.2) = 0.259,
Power(1) = 0.999,
Power(0.5) = 0.803, ,
...
.....
Power and sample size
73
20
40
60
80
100
0.0
0.2
0.4
0.6
0.8
1.0
••
•••
•••
•
•
••
•••
••
•
••
••
••
•
••
••
•
•
••
••
•
••
••
•
••
••
•
••
••
•
••
••
•
••
••
•
•
••
••
•
•
••
••
•
•
••
•••
•
•
••
•••
•••
•
•
•
•••
•••••
••••
••••••
•••••
•
•
•
•
••
••••
••••
•
•
•
••
•••
•••
•
•
•••
•••
•
•
••
••
••
•
•
••
••
•
•
••
••
•
••
••
•
••
••
•
•
••
•
••
••
•
••
•
•
•
•
•
•
•
•
•
•
0.8
0.6
0.4
0.2
0.8
0.6
0.4
0.2
Power and sample size
0
The Power as a function of mu, n=9
The Power as a function of n, mu=0.3
Lect.5
STAT2912
Power curve
74
Lect.5
STAT2912
Remarks
There are four ways to increase the power:
• Increase α. A 5% test of signiﬁcance will have a
greater chance of rejecting the alternative than 1%
test because of evidence required for rejection is
less.
• Consider a particular alternative that is farther away
from µ0. Values of µ that are in HA but lie close
to the hypothesized value µ0 are harder to detect
(lower power) than values of µ that are far from µ0.
• Increase the sample size. More data will provide
more information about x
¯ so we have a better
chance of distinguishing values of µ.
• Decrease σ. This has the same eﬀect as increasing
the sample size: more information about µ.
Power and sample size
75
Lect.5
STAT2912
Choice of sample size
Let us consider the following hypothesis:
H : µ = µ0
vs
HA : µ > µ0.
We may be interested in:
what is the smallest sample size such that the
test at the level α has power at least 100(1−β)%
when µ = µ1?
Power and sample size
76
Lect.5
STAT2912
Example 5.2.
Suppose X1, . . . , Xn
independent N (µ, 1), and we wish to test
H:µ=0
vs
are
H : µ > 0.
How large must n be to be at least 95% sure of
ﬁnding evidence at the 5% level when µ = 1?
Solution:
Power and sample size
77
Lect.5
STAT2912
Example 5.3. We plan to observe X from B(n, θ).
Consider the hypothesis:
H : θ = 1/2
vs
HA : θ = 0.6.
How large must n be to make it 90% certain that
evidence against H will be obtained at the 5% level?
Solution: We will use a normal approximation. We
have a result signiﬁcant at the 0.05 level if
X − n/2
√
≥ 1.645,
n/4
i.e.
√
X ≥ 1.645 n/4 + n/2.
The power for the test at the 5% level is
(
)
√
Power(0.6) = P0.6 X ≥ 1.645 n/4 + n/2
(
)
√
1.645 n/4 + n/2 − 0.6n
√
≈ 1−Φ
.
0.6 × 0.4 × n
Power and sample size
78
Lect.5
STAT2912
So to get Power(0.6) = 0.9 we need to equate this
probability with 0.9.
Note that Φ(−1.28) = 0.10. We need to solve
√
1.645 n/4 − n/10
√
−1.28 =
0.24n
giving
(
n=
)2
√
1.28 0.24 + 0.8225
= 210.1.
0.1
So the sample size must be more than 210.
Power and sample size
79
Lect.5
STAT2912
More exact results could now be obtained using R
function pbinom, using n = 210 as a starting point:
qbinom(.95,210,.5) # gives 117
> 1-pbinom(117,210,.6)
[1] 0.8840604
## power only 88%
> qbinom(.95,211,.5) # gives 117
> 1-pbinom(117,211,.6)
[1] 0.8990565
## almost there!
qbinom(.95,212,.5) gives 118
> 1-pbinom(118,212,.6)
[1] 0.8883305
##worse!
> qbinom(.95,213,.5) gives 118
> 1-pbinom(118,213,.6)
[1] 0.9028448
#OK
So we need n = 213 and critical region x ≥ 119.
Power and sample size
80
Lect.6
STAT2912
Lecture 6
Confidence intervals and tests
• Deﬁnition of conﬁdence interval
• Conﬁdence interval for the mean
• Relationships between testing procedures and
conﬁdence intervals
Confidence intervals and tests
81
Lect.6
STAT2912
Definition of confidence interval
A point estimate of a parameter θ is a ”guess” of
the unknown θ without the discussion of accuracy.
A method of giving accurate information about the
parameter θ is to ﬁnd an interval estimate or conﬁdence
interval for it.
Definition: Let θˆL and θˆR be two statistics. If
P (θˆL ≤ θ ≤ θˆR) = 1 − α,
then the random interval [θˆL, θˆR] is called a 100(1 −
α)% confidence interval for θ, and 100(1 − α)% is
called the confidence level of the interval.
Typically, α is chosen to be 0.01, 0.05, 0.10, etc,
and then we get 99%, 95%, 90% conﬁdence interval
accordingly.
Confidence intervals and tests
82
Lect.6
STAT2912
Remarks:
• There could have many conﬁdence intervals for θ.
The ideal one should be as narrow as possible.
• θˆL can be chosen to be −∞ and θˆR can be
chosen to be ∞. Then we get the so-called
one-sided conﬁdence interval (−∞, θˆR] and [θˆL, ∞)
accordingly.
In these cases, the θˆR and θˆL are also called 100(1−
α)% upper bound for θ, and 100(1 − α)% lower
bound for θ, respectively.
Confidence intervals and tests
83
Lect.6
STAT2912
Confidence intervals for the mean µ
Assumption 1: The (X1, X2, ..., Xn) is a random
sample from normal population, that is,
Xj ∼ N (µ, σ 2),
where σ 2 is unknown.
A 100(1 − α)% conﬁdence interval for the µ is
√
√
¯
¯
[X − tα/2,n−1S/ n, X + tα/2,n−1S/ n],
¯ and S are deﬁned as before, and tα/2,n−1
where X
(denoted by tα/2 below) is given by
(
)
P tn−1 ≥ tα/2 = α/2.
100(1 − α)% one-sided conﬁdence intervals for the µ
are given by
√
¯
[X − tαS/ n, ∞);
Confidence intervals and tests
√
¯
(−∞, X + tαS/ n].
84
Lect.6
STAT2912
In fact, by using the fact that
for any µ,
√
¯ −µ)/S ∼ tn−1
n(X
[
√
√ ]
¯
¯
P X − tα/2S/ n ≤ µ ≤ X + tα/2S/ n
√ ¯
]
[
n(X − µ)
≤ tα/2
= P − tα/2 ≤
S
= 1 − α.
This gives the result on the two-sided conﬁdence
interval for the µ by deﬁnition.
By analogous arguments, one-sided conﬁdence
intervals for the µ are derived.
Confidence intervals and tests
85
Lect.6
STAT2912
Relationships between testing procedures and
confidence intervals
Consider testing for the mean µ:
H : µ = µ0
vs
HA : µ ̸= µ0
at the level α.
Under the assumption 1, the result in Lecture 4
indicates that we would use a t-test based on the test
statistic
√ ¯
n(X − µ0)
,
Tn =
S
and would reject H if the observed value t of Tn fell
in the rejection region {|t| > tα/2}.
The complement of the rejection region is
sometimes called ”acceptance region” for the test.
Confidence intervals and tests
86
Lect.6
STAT2912
Hence for any of our two-sided tests at level α, the
”acceptance region” is given by
{|t| < tα/2}
That is, we do not reject H : µ = µ0 in favour of
HA : µ ̸= µ0 if
√
−tα/2 <
n(¯
x − µ0)
< tα/2.
s
Restated, the null hypothesis is not rejected (is
accepted) at level α if
√
√
x
¯ − tα/2s/ n < µ0 < x
¯ + tα/2s/ n.
√
√
Notice that x
¯ − tα/2s/ n and x
¯ + tα/2s/ n just
are the lower and upper endpoints, respectively, of a
100(1 − α)% two sided conﬁdence interval for µ.
Confidence intervals and tests
87
Lect.6
STAT2912
Thus, a duality exists between conﬁdence intervals
and tests.
To test the hypothesis (at level α):
H : µ = µ0
vs
HA : µ ̸= µ0.
• accept H : µ = µ0 (at level α), if the µ0 lies inside
a 100(1 − α)% conﬁdence interval for the µ
√
√
¯ + tα/2s/ n];
[¯
x − tα/2s/ n, x
• reject H : µ = µ0, if µ0 lies outside the interval.
Equivalently, a 100(1 − α)% conﬁdence interval for
µ can be interpreted as the set of all values of µ0 for
which H : µ = µ0 is ”acceptable” at level α.
Confidence intervals and tests
88
Lect.6
STAT2912
Confidence intervals for the mean µ
Assumption 2: The sample (X1, X2, ..., Xn) is
from a population without long tails (boxplot of data
is not too skew), and sample size n is large (n ≥ 20).
A 100(1 − α)% conﬁdence interval for the µ
(approximately)
Unknown population variance σ 2:
√
√
¯
¯
[X − zα/2S/ n, X + zα/2S/ n],
¯ and S are deﬁned as before, and zα/2 is given
where X
by
Φ(zα/2) = 1 − α/2.
Given population variance σ 2: as above expect that
the S is replaced by σ.
Confidence intervals and tests
89
Lect.6
STAT2912
Remarks
• The large sample conﬁdence intervals are obtained
by using the central limit theorem (see Lecture 3).
• By analogous arguments as in Assumption 1,
(a). Assume that Assumption 2 holds true and σ 2 is
unknown.
To test the hypothesis (at level α):
H : µ = µ0
vs
HA : µ ̸= µ0
∗ accept H : µ = µ0 , if the µ0 lies inside
√
√
[¯
x − zα/2s/ n, x
¯ + zα/2s/ n]
∗ reject H : µ = µ0 if µ0 lies outside the interval.
Confidence intervals and tests
90
Lect.6
STAT2912
(b). Assume that Assumption 2 holds true and σ 2 is
given.
To test the hypothesis (at level α):
H : µ = µ0
vs
HA : µ ̸= µ0
∗ accept H : µ = µ0, if the µ0 lies inside
√
√
[¯
x − zα/2σ/ n, x
¯ + zα/2σ/ n]
∗ reject H : µ = µ0 if µ0 lies outside the interval.
Confidence intervals and tests
91
Lect.6
STAT2912
One-sided confidence interval and duality
Under Assumption 1, to test the hypothesis (at
level α):
/
H : µ = µ0
vs
HA : µ > µ0
µ < µ0
• accept H : µ = µ0, if the µ0 lies inside
/
√
√
[¯
x − tαs/ n, ∞)
(−∞, x
¯ + tαs/ n];
• reject H : µ = µ0, if µ0 lies in
/
√
√
[¯
x + tαs/ n, ∞).
(−∞, x
¯ − tαs/ n]
Confidence intervals and tests
92
Lect.6
STAT2912
Under Assumption 2, to test the hypothesis (at
level α):
/
H : µ = µ0
vs
HA : µ > µ0
µ < µ0
• (unknown σ 2) accept H : µ = µ0, if the µ0 lies
inside
/
√
√
[¯
x − zαs/ n, ∞)
(−∞, x
¯ + zαs/ n];
reject H : µ = µ0 if µ0 lies in
/
√
√
(−∞, x
¯ − zαs/ n]
[¯
x + zαs/ n, ∞).
• (given σ 2) accept H : µ = µ0, if the µ0 lies inside
/
√
√
[¯
x − zασ/ n, ∞)
(−∞, x
¯ + zασ/ n];
reject H : µ = µ0 if µ0 lies in
/
√
√
(−∞, x
¯ − zασ/ n]
[¯
x + zασ/ n, ∞).
Confidence intervals and tests
93
Lect.6
STAT2912
Example 6.1. A brand of beer claims its beer
content is 375 (in millilitres) on the label. A sample of
40 bottles of the beer gave a sample average of 373.9
and a standard deviation of 2.5.
Questions:
(a). is there evidence that the mean content of the beer
is less than the 375 mL claimed on the label at level
α = 0.05?
(b). Construct a 95% one-sided conﬁdence interval that
is related to the hypothesis in (a) for the true
average beer content.
(c). Based on this interval, should the claim on the label
be rejected?
(d). To accept the claim on the label at level α = 0.05,
how many millilitres do we need for the sample
average with a standard deviation of 2.5?
Confidence intervals and tests
94
Lect.6
STAT2912
Solution:
(a). Since it is necessary to test whether the mean
content µ is less than the claim 375, the hypothesis
to be tested is
H : µ = 375
vs
µ < 375.
The population variance is unknown.
√ ¯We need to
use the student t-statistic: Tn = n(X − 375)/S.
The sample size is large (n = 40). The critical
value for the test at the level α = 0.05 is given by
−z0.05 = −1.645.
The observed value of Tn is
√
t = 40(373.9 − 375)/2.5 = −2.7828 < −z0.05.
So we should reject the null H at the level α = 0.05,
that is, there is evidence that the mean content
of the beer is less than the 375 mL at the level
α = 0.05.
Confidence intervals and tests
95
Lect.6
STAT2912
(b). With unknown population variance, the 95% onesided conﬁdence interval related to the hypothesis
in (a) for the true average beer content is
√
(−∞, 373.9 + z0.05s/ 40] = (−∞, 374.55].
(c). Based on this interval, we should reject the claim on
the label because 375 is in outside of the interval.
It is same conclusion as in question (a).
(d). To make the claim on the label acceptable at level
α = 0.05, the sample average should satisﬁes that
√
375 ≤ x
¯ + z0.05s/ 40.
With s = 2.5, the x
¯ should be at least
√
375 − 1.645 × 2.5/ 40 = 374.3498.
Confidence intervals and tests
96