Lecture #9: Statistical Interval of Single Sample
BMIR Lecture Series on Probability and Statistics

Ching-Han Hsu, Ph.D.
Department of Biomedical Engineering and Environmental Sciences
National Tsing Hua University

Outline
• Confidence Interval
• CI on Mean of a Normal Dist with Known Variance
• One-Sided Confidence Bounds
• Large Sample CI for Mean
• CI of Normal Distribution with Unknown Mean and Variance
• Derivation of t Distribution
• CI on Variance and STD of Normal Distribution
• Estimating a Proportion

Confidence Interval: Motivations
• Because of sampling variability, it is important to understand how good
the obtained estimate is.
• An interval estimate for a population parameter is called a confidence
interval (CI).
• The length of the interval conveys information about the precision of
the estimation.
• We cannot be certain that the interval contains the true but unknown
population parameter.
• We only have a stated level of confidence that the interval contains the
unknown population parameter.

Normal RV: $\mu$ Unknown and $\sigma^2$ Known
• Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a normal
distribution with unknown mean $\mu$ and known variance $\sigma^2$.
• The sample mean $\bar{X} = (X_1 + \cdots + X_n)/n$ is also normally
distributed, with mean $\mu$ and variance $\sigma^2/n$.
• We can standardize $\bar{X}$ by
$$ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \tag{1} $$
• The random variable $Z$ has a standard normal distribution, $Z \sim N(0, 1)$.

Confidence Interval
• A confidence interval estimate for $\mu$ is an interval of the form
$$ l \le \mu \le u $$
where $l$ and $u$ are computed from the sample data.
• Since the values of $l$ and $u$ are derived from the sample, $l$ and $u$
are the observed values of random variables $L$ and $U$.
• We want to determine $L$ and $U$ such that
$$ P(L \le \mu \le U) = 1 - \alpha, \tag{2} $$
where $0 \le \alpha \le 1$.
• The random interval $[L, U]$ will contain the true mean $\mu$ with
probability $1 - \alpha$.
• The end-points $l$ and $u$ are called the lower- and upper-confidence
limits, respectively.
• $1 - \alpha$ is called the confidence coefficient.

Normal RV: $\mu$ Unknown and $\sigma^2$ Known
• Since $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has a standard normal
distribution, we may write
$$ P\left( -z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2} \right) = 1 - \alpha \tag{3} $$
$$ P\left( \bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \right) = 1 - \alpha \tag{4} $$
• The corresponding lower- and upper-confidence limits $L$ and $U$ are
$\bar{X} - z_{\alpha/2}\,\sigma/\sqrt{n}$ and $\bar{X} + z_{\alpha/2}\,\sigma/\sqrt{n}$,
respectively.

Confidence Interval I
Definition
If $\bar{x}$ is the sample mean of a random sample of size $n$ from a normal
population with known variance $\sigma^2$, a $100(1-\alpha)\%$ CI on $\mu$ is
given by
$$ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \tag{5} $$
where $z_{\alpha/2}$ is the upper $100(\alpha/2)$ percentage point of the
standard normal distribution.

Confidence Interval II
Figure 1: $P(-z_{\alpha/2} \le Z \le z_{\alpha/2})$
Table 1: Some common critical values $z_{\alpha/2}$.
Confidence Level (%)    α       α/2      z_{α/2}
90                      0.10    0.05     1.645
95                      0.05    0.025    1.960
97                      0.03    0.015    2.170

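The critical values in Table 1 can be reproduced numerically. The following is a minimal sketch that assumes Python's standard `statistics.NormalDist` is available (Python 3.8 or later); the helper name `z_upper` is introduced here for illustration only and is not part of the lecture.

```python
from statistics import NormalDist


def z_upper(tail_area: float) -> float:
    """Return z such that P(Z > z) = tail_area for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - tail_area)


# Reproduce the rows of Table 1: confidence level, alpha, and z_(alpha/2).
for level in (0.90, 0.95, 0.97):
    alpha = 1.0 - level
    print(f"{level:.0%}  alpha={alpha:.2f}  z_(alpha/2)={z_upper(alpha / 2):.3f}")
```

Because $z_{\alpha/2}$ is defined through the upper tail, the inverse CDF is evaluated at $1 - \alpha/2$ rather than at $\alpha/2$.
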
Confidence Interval III
• Eq. (5), $\bar{x} - z_{\alpha/2}\,\sigma/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}\,\sigma/\sqrt{n}$,
is also called a two-sided CI.
• A CI is a random interval because both end-points, $L$ and $U$, are random
variables.
• If an infinite number of random samples are collected and a
$100(1-\alpha)\%$ CI for $\mu$ is computed from each sample, $100(1-\alpha)\%$
of these intervals will contain the true value of $\mu$.
• In practice, we obtain one random sample and calculate only one
confidence interval.
• Since the interval may or may not contain the true mean $\mu$, the
appropriate statement is that the observed interval $[l, u]$ brackets the
true mean $\mu$ with confidence $100(1-\alpha)\%$.

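The repeated-sampling interpretation can be illustrated with a small simulation. This is only a sketch: the values of `mu`, `sigma`, `n`, the number of repetitions, and the random seed are arbitrary assumptions chosen for illustration, and the function name `coverage` is hypothetical.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, alpha, reps, seed=0):
    """Fraction of simulated 100(1 - alpha)% z-intervals that contain mu."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sigma / n ** 0.5  # half-length of the interval in Eq. (5)
    hits = 0
    for _ in range(reps):
        # Draw one sample of size n and compute its sample mean.
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half <= mu <= xbar + half:
            hits += 1
    return hits / reps


frac = coverage(mu=64.5, sigma=1.0, n=10, alpha=0.05, reps=2000)
print(f"fraction of intervals containing mu: {frac:.3f}")
```

With a finite number of repetitions the printed fraction is only close to 0.95; it approaches $1 - \alpha$ as the number of simulated samples grows.
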
Confidence Interval IV
Figure 2: Repeated construction of CIs for $\mu$.

Precision of Estimation
• The length of the confidence interval on $\mu$ is $2 z_{\alpha/2}\,\sigma/\sqrt{n}$.
• The length of the 95% confidence interval is
$2(1.96\,\sigma/\sqrt{n}) = 3.92\,\sigma/\sqrt{n}$.
• The length of the 99% confidence interval is
$2(2.576\,\sigma/\sqrt{n}) = 5.15\,\sigma/\sqrt{n}$.
• For a fixed sample size $n$ and standard deviation $\sigma$, the higher the
confidence level, the longer the resulting CI.
• The length of a CI is also a measure of the precision of estimation.
• We can choose the sample size $n$ to give a CI of specified length or
precision.

Example Normal RV with $\mu$ Unknown and $\sigma^2$ Known I
Example
ASTM Standard E23 defines standard test methods for notched bar impact
testing of metallic materials. The Charpy V-notch (CVN) technique measures
impact energy and is often used to determine whether or not a material
experiences a ductile-to-brittle transition with decreasing temperature.
Ten measurements of impact energy (J) on specimens of A238 steel cut at
60 °C are as follows: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2,
and 64.3. Assume that impact energy is normally distributed with
$\sigma = 1$ J. We want to find a 95% CI for $\mu$, the mean impact energy.

Example Normal RV with $\mu$ Unknown and $\sigma^2$ Known II
The required quantities are $z_{\alpha/2} = z_{0.025} = 1.96$, $n = 10$,
$\sigma = 1$, and $\bar{x} = 64.46$. The resulting 95% CI is found from
Eq. (5) as follows:
$$ \bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \tag{6} $$
$$ 64.46 - 1.96\frac{1}{\sqrt{10}} \le \mu \le 64.46 + 1.96\frac{1}{\sqrt{10}} \tag{7} $$
$$ 63.84 \le \mu \le 65.08 \tag{8} $$

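The arithmetic of this example can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `z_interval` is an illustrative assumption, and the data are the ten impact-energy measurements quoted in the example.

```python
from math import sqrt
from statistics import NormalDist, fmean


def z_interval(data, sigma, conf=0.95):
    """Two-sided CI on the mean of a normal sample with known sigma (Eq. (5))."""
    n = len(data)
    xbar = fmean(data)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_(alpha/2)
    half = z * sigma / sqrt(n)
    return xbar - half, xbar + half


energy = [64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, 64.3]
lo, hi = z_interval(energy, sigma=1.0)
print(f"{lo:.2f} <= mu <= {hi:.2f}")
```

The script uses the unrounded critical value rather than 1.96, which does not change the limits at two decimal places.
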
Choice of Sample Size
Definition
If $\bar{x}$ is used as an estimate of $\mu$, we can be $100(1-\alpha)\%$
confident that the error $|\bar{x} - \mu|$ will not exceed a specified
amount $E$ when the sample size is
$$ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 \tag{9} $$
Figure 3: Error in estimating $\mu$ with $\bar{x}$.

Choice of Sample Size: Example
Example
Suppose that we want a 95% CI on $\mu$ with total length 1.0. Given that the
standard deviation is $\sigma = 1$, how many samples do we need?
The bound on the error is half the CI length: $E = 0.5 \times 1.0 = 0.5$.
$$ n = \left( \frac{z_{\alpha/2}\,\sigma}{E} \right)^2 = \left( \frac{1.96 \times 1.0}{0.5} \right)^2 = 15.37 $$
Since $n$ must be an integer, we round up: the required sample size is $n = 16$.

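Eq. (9), together with the rounding-up step, can be sketched as a small helper. The function name `sample_size` is an assumption introduced here for illustration; the inputs in the first call are those of the example.

```python
from math import ceil
from statistics import NormalDist


def sample_size(E, sigma, conf=0.95):
    """Smallest n for which the error bound z_(alpha/2) * sigma / sqrt(n) is at most E."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / E) ** 2)


print(sample_size(E=0.5, sigma=1.0))   # the example: CI length 1.0, so E = 0.5
print(sample_size(E=0.25, sigma=1.0))  # halving E roughly quadruples n
```

Rounding up with `ceil` guarantees that the achieved error bound does not exceed $E$; rounding to the nearest integer could fall short.
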
Choice of Sample Size
Notice the general relationship between sample size, desired length of the
confidence interval $2E$, confidence level $100(1-\alpha)\%$, and standard
deviation $\sigma$:
• As the desired length of the interval $2E$ decreases, the required sample
size $n$ increases for a fixed value of $\sigma$ and specified confidence.
• As $\sigma$ increases, the required sample size $n$ increases for a fixed
desired length $2E$ and specified confidence.
• As the level of confidence increases, the required sample size $n$
increases for fixed desired length $2E$ and standard deviation $\sigma$.

One-Sided Confidence Bounds
Definition
A $100(1-\alpha)\%$ upper-confidence bound for $\mu$ is
$$ \mu \le u = \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}} \tag{10} $$
A $100(1-\alpha)\%$ lower-confidence bound for $\mu$ is
$$ \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} = l \le \mu \tag{11} $$

Example: One-Sided Confidence Bounds
Example
Consider using the same impact-testing data to construct a lower, one-sided
95% confidence bound for the mean energy $\mu$.
Recall that $\bar{x} = 64.46$, $n = 10$, and $\sigma = 1$, and that
$z_{\alpha} = z_{0.05} = 1.645$. The resulting 95% lower-confidence bound is:
$$ \bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \le \mu \tag{12} $$
$$ 64.46 - 1.645\frac{1}{\sqrt{10}} \le \mu \tag{13} $$
$$ 63.94 \le \mu \tag{14} $$

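The one-sided bound differs from the two-sided interval only in using $z_{\alpha}$ in place of $z_{\alpha/2}$. A minimal sketch, with a hypothetical helper name `lower_bound` and the summary values from the example:

```python
from math import sqrt
from statistics import NormalDist


def lower_bound(xbar, sigma, n, conf=0.95):
    """100*conf% lower-confidence bound on the mean with known sigma (Eq. (11))."""
    z_alpha = NormalDist().inv_cdf(conf)  # z_alpha with alpha = 1 - conf
    return xbar - z_alpha * sigma / sqrt(n)


l = lower_bound(xbar=64.46, sigma=1.0, n=10)
print(f"{l:.2f} <= mu")
```

Because all of $\alpha$ is placed in one tail, this lower bound (63.94) lies above the lower limit of the two-sided 95% interval (63.84).
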
Procedure to Derive a CI
• Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$.
• Suppose that we have found a statistic $g(X_1, X_2, \ldots, X_n; \theta)$
with the following properties:
  1. $g(X_1, X_2, \ldots, X_n; \theta)$ depends on both the sample and $\theta$.
  2. The probability distribution of $g(X_1, X_2, \ldots, X_n; \theta)$ does
     not depend on $\theta$ or any other unknown parameter.
• We need to find constants $C_L$ and $C_U$ so that
$$ P[C_L \le g(X_1, X_2, \ldots, X_n; \theta) \le C_U] = 1 - \alpha $$
(Note that $C_L$ and $C_U$ do not depend on $\theta$.)
• Finally, we manipulate the inequality so that
$$ P[L(X_1, X_2, \ldots, X_n) \le \theta \le U(X_1, X_2, \ldots, X_n)] = 1 - \alpha $$
• This gives $L(X_1, X_2, \ldots, X_n)$ and $U(X_1, X_2, \ldots, X_n)$ as the
lower and upper confidence limits defining the $100(1-\alpha)\%$ CI for $\theta$.

Large Sample CI for Mean
Definition
When $n$ is large, the quantity
$$ \frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{15} $$
has an approximate standard normal distribution, $N(0, 1)$. Consequently,
$$ \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \tag{16} $$
is a large-sample confidence interval for $\mu$, with confidence level of
approximately $100(1-\alpha)\%$.
• Eq. (16) holds regardless of the shape of the population distribution.
• Generally $n$ should be at least 40. This is more than the $n \ge 30$
usually quoted for the central limit theorem, because replacing $\sigma$ by
$S$ introduces additional variability.

Example: Large Sample CI for Mean I
Example
An article in the 1993 volume of the Transactions of the American Fisheries
Society reports the results of a study to investigate the mercury
contamination in largemouth bass. A sample of fish was selected from 53
Florida lakes and mercury concentration in the muscle tissue was measured
(ppm). The mercury concentration values are
[Table: mercury concentration values]
We want to find a 95% CI on $\mu$.
The summary statistics are:
[Table: summary statistics]

Example: Large Sample CI for Mean II
The required quantities are $n = 53$, $\bar{x} = 0.5250$, $s = 0.3486$, and
$z_{0.025} = 1.96$. The approximate 95% CI on $\mu$ is:
$$ \bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \tag{17} $$
$$ 0.525 - 1.96\frac{0.3486}{\sqrt{53}} \le \mu \le 0.525 + 1.96\frac{0.3486}{\sqrt{53}} \tag{18} $$
$$ 0.4311 \le \mu \le 0.6189 \tag{19} $$

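Since only summary statistics are needed for the large-sample interval, the computation can be sketched directly from $n$, $\bar{x}$, and $s$. The function name `large_sample_interval` is hypothetical; the inputs are the summary values quoted in the example.

```python
from math import sqrt
from statistics import NormalDist


def large_sample_interval(xbar, s, n, conf=0.95):
    """Approximate CI on the mean for large n, with s replacing sigma (Eq. (16))."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half = z * s / sqrt(n)
    return xbar - half, xbar + half


lo, hi = large_sample_interval(xbar=0.5250, s=0.3486, n=53)
print(f"{lo:.4f} <= mu <= {hi:.4f}")
```

No normality assumption on the mercury data is used here; the approximation rests on the sample size being large.
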
Large Sample CI for a Parameter
Suppose that $\theta$ is a parameter of a probability distribution, and let
$\hat{\Theta}$ be an estimator of $\theta$. If $\hat{\Theta}$
  1. has an approximate normal distribution,
  2. is approximately unbiased for $\theta$,
  3. has a standard deviation $\sigma_{\hat{\Theta}}$ that can be estimated
     from sample data,
then the quantity $(\hat{\Theta} - \theta)/\sigma_{\hat{\Theta}}$ has an
approximate standard normal distribution.
Definition
A large-sample approximate CI for $\theta$ is given by
$$ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\Theta}} \le \theta \le \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\Theta}} \tag{20} $$
Maximum likelihood estimators usually satisfy the three conditions, so
Eq. (20) is often used when $\hat{\Theta}$ is the maximum likelihood
estimator of $\theta$.

Normal Random Sample with Unknown Mean and Variance
• Let $X_1, X_2, \ldots, X_n$ be observations of a random sample of size $n$
from the normal distribution $N(\mu, \sigma^2)$.
• Both the mean $\mu$ and the variance $\sigma^2$ are unknown.
• $\bar{X}$ and $S^2$ are the sample mean and sample variance, respectively.
• If $\sigma$ is known, we can construct a two-sided CI on $\mu$ by using
$$ Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} $$
where $Z$ is $N(0, 1)$.
• If $\sigma$ is unknown and $n$ is small, we can replace $\sigma$ by the
sample standard deviation $S$ and construct a new random variable
$$ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} $$

t Distribution I
Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a normal
distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. The
random variable
$$ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{21} $$
has a t distribution with $n - 1$ degrees of freedom.
The probability density function of a t distribution with $k$ degrees of
freedom is
$$ f(t) = \frac{\Gamma[(k+1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(t^2/k) + 1]^{(k+1)/2}} \tag{22} $$
The mean and variance of the t distribution are zero and $k/(k-2)$ (for
$k > 2$), respectively.

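As a sanity check on Eq. (22), the density can be integrated numerically to confirm that it has unit area and variance $k/(k-2)$. The sketch below uses a plain composite trapezoid rule; the choice $k = 5$, the integration range, and the step count are assumptions made for illustration, and the helper names are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(t, k):
    """Density of the t distribution with k degrees of freedom, Eq. (22)."""
    c = gamma((k + 1) / 2) / (sqrt(pi * k) * gamma(k / 2))
    return c / (1 + t * t / k) ** ((k + 1) / 2)


def integrate(f, a, b, steps):
    """Composite trapezoid rule on [a, b] with the given number of steps."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


k = 5
area = integrate(lambda t: t_pdf(t, k), -200.0, 200.0, 40_000)
var = integrate(lambda t: t * t * t_pdf(t, k), -200.0, 200.0, 40_000)
print(f"area = {area:.5f}, variance = {var:.5f}, k/(k-2) = {k / (k - 2):.5f}")
```

The wide integration range is needed because the t distribution has heavier tails than the normal; truncating at a few standard deviations would visibly underestimate the variance for small $k$.
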
t Distribution II
Figure 4: Probability density functions of several t distributions.

t Distribution III
Figure 5: Percentage points of the t distribution.

t Distribution IV
Figure 6: Percentage points $t_{\alpha,\nu}$ of the t distribution. $\nu$ is
the degrees of freedom.

t Distribution V
Figure 7: Percentage points $t_{\alpha,\nu}$ of the t distribution. $\nu$ is
the degrees of freedom.

Derivation of t Distribution I
• Let
$$ T = \frac{Z}{\sqrt{U/r}} $$
where $Z$ is standard normal, $N(0, 1)$, $U$ is $\chi^2(r)$, and $Z$ and $U$
are independent.
• Motivation:
$$ T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{S/\sigma} = \frac{Z}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{U/r}}, $$
where $U = (n-1)S^2/\sigma^2$ is $\chi^2(n-1)$ and $r = n - 1$.
• Since $Z$ and $U$ are independent, their joint density is
$$ g(z, u) = f(z) \cdot f(u) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2} \cdot \frac{1}{\Gamma(r/2)\,2^{r/2}} u^{r/2-1} e^{-u/2}, $$
$-\infty < z < \infty$, $0 < u < \infty$.

Derivation of t Distribution II
• The distribution function $F(t) = P(T \le t)$ of $T$ is given by
$$ F(t) = P(T \le t) = P\left( \frac{Z}{\sqrt{U/r}} \le t \right) = P\left( Z \le t\sqrt{U/r} \right) = \int_0^{\infty} \int_{-\infty}^{t\sqrt{u/r}} g(z, u)\,dz\,du. $$
• That is,
$$ F(t) = \frac{1}{\sqrt{\pi}\,\Gamma(r/2)} \int_0^{\infty} \left[ \int_{-\infty}^{t\sqrt{u/r}} \frac{e^{-z^2/2}}{2^{(r+1)/2}}\,dz \right] u^{r/2-1} e^{-u/2}\,du $$

Derivation of t Distribution III
• The pdf of $T$ is the derivative of $F(t)$,
$$ f(t) = F'(t) = \frac{1}{\sqrt{\pi}\,\Gamma(r/2)} \int_0^{\infty} \frac{e^{-(u/2)(t^2/r)}}{2^{(r+1)/2}} \sqrt{\frac{u}{r}}\; u^{r/2-1} e^{-u/2}\,du $$
$$ \phantom{f(t)} = \frac{1}{\sqrt{\pi r}\,\Gamma(r/2)} \int_0^{\infty} \frac{u^{(r+1)/2-1}}{2^{(r+1)/2}} e^{-(u/2)(1+t^2/r)}\,du $$

Derivation of t Distribution IV
• Let $y = (1 + t^2/r)\,u$, so that $\frac{du}{dy} = \frac{1}{1 + t^2/r}$. We find that
$$ f(t) = \frac{1}{\sqrt{\pi r}\,\Gamma(r/2)} \int_0^{\infty} \frac{u^{(r+1)/2-1}}{2^{(r+1)/2}} e^{-(u/2)(1+t^2/r)}\,du $$
$$ \phantom{f(t)} = \frac{1}{\sqrt{\pi r}\,\Gamma(r/2)} \int_0^{\infty} \frac{\left( \frac{y}{1+t^2/r} \right)^{(r+1)/2-1}}{2^{(r+1)/2}} e^{-y/2}\,\frac{1}{1+t^2/r}\,dy $$
$$ \phantom{f(t)} = \frac{\Gamma((r+1)/2)}{\sqrt{\pi r}\,\Gamma(r/2)} \cdot \frac{1}{(1+t^2/r)^{(r+1)/2-1+1}} \cdot \int_0^{\infty} \frac{y^{(r+1)/2-1}}{\Gamma((r+1)/2)\,2^{(r+1)/2}} e^{-y/2}\,dy $$

Derivation of t Distribution V
• Since the pdf of $\chi^2(r+1)$ is
$\frac{y^{(r+1)/2-1}}{\Gamma((r+1)/2)\,2^{(r+1)/2}} e^{-y/2}$, we have
$$ \int_0^{\infty} \frac{y^{(r+1)/2-1}}{\Gamma((r+1)/2)\,2^{(r+1)/2}} e^{-y/2}\,dy = 1 $$
• Therefore, the pdf of the t distribution is
$$ f(t) = \frac{\Gamma((r+1)/2)}{\sqrt{\pi r}\,\Gamma(r/2)} \cdot \frac{1}{(1 + t^2/r)^{(r+1)/2}}, \qquad -\infty < t < \infty. $$

CI on Mean with Unknown Variance
Theorem
If $\bar{x}$ and $s$ are the sample mean and standard deviation of a random
sample from a normal distribution with unknown variance $\sigma^2$, a
$100(1-\alpha)\%$ confidence interval on $\mu$ is given by
$$ \bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \tag{23} $$
where $t_{\alpha/2,n-1}$ is the upper $100(\alpha/2)$ percentage point of the
t distribution with $n - 1$ degrees of freedom.
A $100(1-\alpha)\%$ upper-confidence bound for $\mu$ is
$$ \mu \le \bar{x} + t_{\alpha,n-1}\frac{s}{\sqrt{n}} \tag{24} $$
A $100(1-\alpha)\%$ lower-confidence bound for $\mu$ is
$$ \bar{x} - t_{\alpha,n-1}\frac{s}{\sqrt{n}} \le \mu \tag{25} $$

Example: CI on Mean with Unknown Variance I
Example
A random sample of size $n = 22$ is given as follows:
19.8, 15.4, 11.4, 19.5, 15.4, 10.1, 18.5, 14.1, 8.8, 11.4, 14.9,
7.9, 17.6, 13.6, 7.5, 12.7, 16.7, 11.9, 15.4, 11.9, 15.8, 11.4
Find the 95% CI on $\mu$.
• The sample mean is $\bar{x} = 13.71$.
• The sample standard deviation is $s = 3.55$.
• The number of degrees of freedom is $n - 1 = 21$.
• $t_{0.025,21} = 2.08$.

Example: CI on Mean with Unknown Variance II
• The 95% CI on $\mu$ is:
$$ \bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \tag{26} $$
$$ 13.71 - 2.08\frac{3.55}{\sqrt{22}} \le \mu \le 13.71 + 2.08\frac{3.55}{\sqrt{22}} \tag{27} $$
$$ 13.71 - 1.57 \le \mu \le 13.71 + 1.57 \tag{28} $$
$$ 12.14 \le \mu \le 15.28 \tag{29} $$

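The summary statistics and the interval of this example can be recomputed from the raw data. This is a sketch under one stated assumption: the Python standard library has no t quantile function, so the critical value $t_{0.025,21} = 2.080$ is entered by hand from a t table rather than computed. The variable names are illustrative.

```python
from math import sqrt
from statistics import fmean, stdev

data = [19.8, 15.4, 11.4, 19.5, 15.4, 10.1, 18.5, 14.1, 8.8, 11.4, 14.9,
        7.9, 17.6, 13.6, 7.5, 12.7, 16.7, 11.9, 15.4, 11.9, 15.8, 11.4]

T_CRIT = 2.080  # t_(0.025, 21), taken from a t table (assumed, not computed)

n = len(data)
xbar = fmean(data)
s = stdev(data)              # sample standard deviation (divisor n - 1)
half = T_CRIT * s / sqrt(n)  # half-length of the interval in Eq. (23)
lo, hi = xbar - half, xbar + half
print(f"n = {n}, xbar = {xbar:.2f}, s = {s:.2f}")
print(f"{lo:.2f} <= mu <= {hi:.2f}")
```

Because the script carries unrounded values of $\bar{x}$ and $s$ through the calculation, its limits can differ from the hand calculation by about 0.01 in the last digit.
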
Chi-Square Distribution
Theorem
Let $X_1, X_2, \ldots, X_n$ be independent random variables, each with normal
distribution $N(\mu, \sigma^2)$, and let $S^2$ be the sample variance. Then
the random variable
$$ \chi^2 = \frac{(n-1)S^2}{\sigma^2} $$
has a chi-square distribution, $\chi^2(n-1)$, with $n - 1$ degrees of freedom.
• The probability density function of a $\chi^2$ random variable is
$$ f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{(k/2)-1} e^{-x/2}, \qquad x > 0 \tag{30} $$
where $k$ is the number of degrees of freedom.
• The mean and variance of the $\chi^2$ distribution are $k$ and $2k$,
respectively.

Chi-Square Distribution
Figure 8: Probability density functions of several $\chi^2$ distributions.

Chi-Square Distribution
Define $\chi^2_{\alpha,k}$ as the percentage point or value of the chi-square
random variable with $k$ degrees of freedom such that the probability that
$\chi^2$ exceeds this value is $\alpha$:
$$ P\left( \chi^2 > \chi^2_{\alpha,k} \right) = \int_{\chi^2_{\alpha,k}}^{\infty} f(u)\,du = \alpha \tag{31} $$
Figure 9: Percentage point of the $\chi^2$ distribution. (a) The percentage
point $\chi^2_{\alpha,k}$. (b) The upper percentage point
$\chi^2_{0.05,10} = 18.31$ and the lower percentage point
$\chi^2_{0.95,10} = 3.94$.

Construction of a CI on Variance
• The random variable
$$ \chi^2 = \frac{(n-1)S^2}{\sigma^2} $$
is chi-square with $n - 1$ degrees of freedom.
• We may write
$$ P\left( \chi^2_{1-\alpha/2,n-1} \le \chi^2 \le \chi^2_{\alpha/2,n-1} \right) = 1 - \alpha $$
so that
$$ P\left( \chi^2_{1-\alpha/2,n-1} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,n-1} \right) = 1 - \alpha $$
• We rearrange the above equation into
$$ P\left( \frac{(n-1)S^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2,n-1}} \right) = 1 - \alpha $$

CI on Variance
Theorem
If $s^2$ is the sample variance from a random sample of $n$ observations from
a normal distribution with unknown variance $\sigma^2$, then a
$100(1-\alpha)\%$ confidence interval on $\sigma^2$ is
$$ \frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}} \tag{32} $$
where $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$ are the upper and
lower $100(\alpha/2)$ percentage points of the chi-square distribution with
$n - 1$ degrees of freedom, respectively.

CI on Variance
Theorem
The $100(1-\alpha)\%$ lower and upper confidence bounds on $\sigma^2$ are
$$ \frac{(n-1)s^2}{\chi^2_{\alpha,n-1}} \le \sigma^2 \quad \text{and} \quad \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha,n-1}} \tag{33} $$
respectively.

Example CI on Variance
Example
An automatic filling machine is used to fill bottles with liquid detergent.
A random sample of 20 bottles results in a sample variance of fill volume of
$s^2 = 0.0153$ (fluid ounces)$^2$. If the variance of fill volume is too
large, an unacceptable proportion of bottles will be under- or overfilled.
We will assume that the fill volume is approximately normally distributed.
Find the 95% upper-confidence bound on $\sigma^2$.
A 95% upper-confidence bound is found from Eq. (33) as follows:
$$ \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha,n-1}} = \frac{(20-1)s^2}{\chi^2_{1-0.05,20-1}} = \frac{19 s^2}{\chi^2_{0.95,19}} $$
$$ \sigma^2 \le \frac{19 \times 0.0153}{10.117} = 0.0287\ \text{(fluid ounce)}^2 $$

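The upper bound in this example can be verified with a few lines of code. This sketch assumes the chi-square percentage point $\chi^2_{0.95,19} = 10.117$ is taken from a table, as in the example, because the Python standard library does not provide chi-square quantiles; the function name `variance_upper_bound` is hypothetical.

```python
from math import sqrt

CHI2_095_19 = 10.117  # lower 5% point of chi-square with 19 df (table value, assumed)


def variance_upper_bound(s2, n, chi2_lower):
    """Upper-confidence bound on sigma^2 from Eq. (33): (n - 1) s^2 / chi2_(1-alpha, n-1)."""
    return (n - 1) * s2 / chi2_lower


ub = variance_upper_bound(s2=0.0153, n=20, chi2_lower=CHI2_095_19)
print(f"sigma^2 <= {ub:.4f} (fluid ounce)^2")
print(f"sigma   <= {sqrt(ub):.2f} fluid ounce")
```

Taking the square root of the bound on $\sigma^2$ gives the corresponding bound on the standard deviation $\sigma$, since the square root is an increasing function.
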
Recall That: Bernoulli Distribution
Example
Let $X$ be a Bernoulli random variable. The probability mass function is
$$ f(x; p) = \begin{cases} p^x (1-p)^{1-x}, & x = 0, 1 \\ 0, & \text{elsewhere,} \end{cases} $$
where $p$ is the parameter to be estimated. The likelihood function of a
random sample of size $n$ is
$$ L(p) = p^{\sum_{i=1}^{n} x_i} (1-p)^{\,n - \sum_{i=1}^{n} x_i}. $$
We have shown that the MLE of $p$ is $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

Binomial Experiments
• A point estimator of the proportion $p$ in a binomial experiment is given
by the statistic
$$ \hat{P} = \frac{X}{n} $$
where $X$ represents the number of successes in $n$ trials.
• The sample proportion $\hat{p} = x/n$ will be used as the point estimate of
the parameter $p$.
• By the CLT, for $n$ sufficiently large, $\hat{P}$ is approximately normally
distributed with mean
$$ \mu_{\hat{P}} = E(\hat{P}) = E\left( \frac{X}{n} \right) = \frac{np}{n} = p $$
and variance
$$ \sigma^2_{\hat{P}} = \sigma^2_{X/n} = \frac{\sigma^2_X}{n^2} = \frac{npq}{n^2} = \frac{pq}{n}, $$
where $q = 1 - p$.

CI on Proportion Estimation
Statistical Interval
of Single Sample
Ching-Han Hsu,
Ph.D.
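• The random variable
\[
Z = \frac{\hat{P} - p}{\sqrt{pq/n}}
\]
has approximately a standard normal distribution.
• The corresponding 100(1 − α)% CI follows from
\[
P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha
\]
\[
P\!\left(-z_{\alpha/2} \le \frac{\hat{P} - p}{\sqrt{pq/n}} \le z_{\alpha/2}\right) = 1 - \alpha
\]
\[
P\!\left(\hat{P} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right) \approx 1 - \alpha
\]
where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.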
CI on Proportion Estimation
Theorem
If $\hat{p}$ is the proportion of successes in a random sample of
size n and $\hat{q} = 1 - \hat{p}$, an approximate 100(1 − α)%
confidence interval for the binomial parameter p is given by
\[
\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}
\tag{34}
\]
where $z_{\alpha/2}$ is the z-value having an area of α/2 to the
right.
• If n is small and p is close to 0 or 1, Eq. (34) should
not be used.
• Both $n\hat{p}$ and $n\hat{q}$ should be greater than or equal to 5.
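A minimal sketch of Eq. (34), including the rule-of-thumb check on the two counts, might look as follows. The function name `wald_interval`, its signature, and the sample counts are hypothetical; the critical value comes from Python's standard-library `statistics.NormalDist`, so it is the unrounded normal quantile rather than a table value.

```python
from math import sqrt
from statistics import NormalDist

def wald_interval(x, n, confidence=0.95):
    """Approximate large-sample CI for a binomial proportion, Eq. (34)."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    # n * p_hat equals x and n * q_hat equals n - x, so the rule of thumb
    # can be checked on the counts directly.
    if x < 5 or n - x < 5:
        raise ValueError("n*p_hat and n*q_hat should both be at least 5")
    p_hat = x / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical data: 60 successes in 200 trials.
lo, hi = wald_interval(60, 200)
print(f"{lo:.4f} <= p <= {hi:.4f}")
```

Raising an error when either count is below 5 is one possible design; a caller could instead fall back to a different interval.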
CI on Proportion Estimation: Accurate Formula
• We can also solve for p in the quadratic inequality:
\[
-z_{\alpha/2} \le \frac{\hat{P} - p}{\sqrt{pq/n}} \le z_{\alpha/2}
\]
• We obtain another form of the confidence interval for
p with limits
\[
\frac{\hat{p} + \dfrac{z^2_{\alpha/2}}{2n}}{1 + \dfrac{z^2_{\alpha/2}}{n}}
\pm
\frac{z_{\alpha/2}}{1 + \dfrac{z^2_{\alpha/2}}{n}}
\sqrt{\frac{\hat{p}\hat{q}}{n} + \frac{z^2_{\alpha/2}}{4n^2}}
\tag{35}
\]
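The step from the inequality to the limits in Eq. (35) is not shown on the slide. One way to fill it in, writing $z = z_{\alpha/2}$, $q = 1 - p$, and $\hat{q} = 1 - \hat{p}$, is to square the inequality and solve the resulting quadratic in p:

```latex
\begin{align*}
(\hat{p} - p)^2 &\le \frac{z^2\, p(1-p)}{n} \\
\left(1 + \frac{z^2}{n}\right)p^2 - \left(2\hat{p} + \frac{z^2}{n}\right)p + \hat{p}^2 &\le 0 \\
\text{discriminant: }\quad
\left(2\hat{p} + \frac{z^2}{n}\right)^2 - 4\left(1 + \frac{z^2}{n}\right)\hat{p}^2
  &= 4z^2\left(\frac{\hat{p}\hat{q}}{n} + \frac{z^2}{4n^2}\right) \\
\text{roots: }\quad
p &= \frac{\hat{p} + \dfrac{z^2}{2n} \pm z\sqrt{\dfrac{\hat{p}\hat{q}}{n} + \dfrac{z^2}{4n^2}}}{1 + \dfrac{z^2}{n}}
\end{align*}
```

The quadratic opens upward, so the inequality holds for p between the two roots, which are the limits given in Eq. (35).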
CI on Proportion Estimation: Example I
Example
In a random sample of n = 500 families owning cable
TV in a community, it is found that x = 340 subscribe to
the NTHU program. Find a 95% confidence interval for the
actual proportion of the families subscribing to the
program.
• The point estimate of p is $\hat{p} = \frac{340}{500} = 0.68$, and
$\hat{q} = 1 - \hat{p} = 0.32$.
• Both $n\hat{p} > 5$ and $n\hat{q} > 5$.
• $z_{0.025} = 1.96$.
CI on Proportion Estimation: Example II
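• The 95% confidence interval is
\[
\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}
\]
\[
0.68 - 1.96\sqrt{\frac{(0.68)(0.32)}{500}} \le p \le 0.68 + 1.96\sqrt{\frac{(0.68)(0.32)}{500}}
\]
\[
0.6391 \le p \le 0.7209
\]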
• If we use the accurate form, we obtain
\[
\frac{0.68 + \dfrac{1.96^2}{(2)(500)}}{1 + \dfrac{1.96^2}{500}}
\pm
\frac{1.96}{1 + \dfrac{1.96^2}{500}}
\sqrt{\frac{(0.68)(0.32)}{500} + \frac{1.96^2}{(4)(500)^2}}
= 0.6786 \pm 0.0408
\]
\[
0.6378 \le p \le 0.7194
\]
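Both intervals in this example can be reproduced with a few lines. The sketch below uses the rounded value z = 1.96 from the slide; the variable names are illustrative. Carrying full precision through the accurate form gives a lower limit near 0.6379, so it may differ from the slide in the last digit, since the slide subtracts the rounded values 0.6786 and 0.0408.

```python
from math import sqrt

# Data and critical value from the example.
n, x = 500, 340
z = 1.96
p_hat = x / n
q_hat = 1 - p_hat

# Large-sample interval, Eq. (34).
half_34 = z * sqrt(p_hat * q_hat / n)
wald_lo, wald_hi = p_hat - half_34, p_hat + half_34

# Accurate form, Eq. (35): shifted center and rescaled half-width.
denom = 1 + z**2 / n
center = (p_hat + z**2 / (2 * n)) / denom
half_35 = (z / denom) * sqrt(p_hat * q_hat / n + z**2 / (4 * n**2))
acc_lo, acc_hi = center - half_35, center + half_35

print(f"Eq. (34): {wald_lo:.4f} <= p <= {wald_hi:.4f}")
print(f"Eq. (35): {center:.4f} +/- {half_35:.4f}")
```

With n = 500 the two forms differ only in the third decimal place, which is consistent with Eq. (34) being adequate for large samples.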
CI on Proportion: Choice of Sample Size
Theorem
If $\hat{p}$ is used as the estimate of p, we can be 100(1 − α)%
confident that the error will be less than a specified
amount E when the sample size is approximately
\[
n = \frac{z^2_{\alpha/2}\,\hat{p}\hat{q}}{E^2}
\tag{36}
\]
Theorem
Since p(1 − p) attains its maximum value at p = 0.5,
we can substitute 0.5 for $\hat{p}$. Then we can be at least
100(1 − α)% confident that the error will be less than a
specified amount E when the sample size is
\[
n = \frac{z^2_{\alpha/2}}{4E^2}
\tag{37}
\]
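Both sample-size formulas follow from the half-width of the interval in Eq. (34). A short derivation, with $\hat{q} = 1 - \hat{p}$, is sketched here:

```latex
\begin{align*}
E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}
  \quad&\Longrightarrow\quad
  n = \frac{z^2_{\alpha/2}\,\hat{p}\hat{q}}{E^2} \\
p(1-p) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4}
  \quad&\Longrightarrow\quad
  \frac{z^2_{\alpha/2}\,\hat{p}\hat{q}}{E^2} \le \frac{z^2_{\alpha/2}}{4E^2}
\end{align*}
```

The second line shows why Eq. (37) never gives a smaller sample size than Eq. (36), whatever the value of $\hat{p}$.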
CI on Proportion: Choice of Sample Size
Example
How large a sample is required if we want to be 95%
confident that our estimate of p in the previous TV
problem is within 0.02 of the true value?
• First approach:
\[
n = \frac{z^2_{\alpha/2}\,\hat{p}\hat{q}}{E^2}
  = \frac{(1.96)^2 (0.68)(0.32)}{0.02^2}
  = 2089.8 \approx 2090
\]
• Second approach:
\[
n = \frac{z^2_{\alpha/2}}{4E^2}
  = \frac{(1.96)^2}{(4)(0.02)^2}
  = 2401
\]
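The two approaches can be sketched as small helper functions. The function names and signatures are hypothetical, the critical value is the unrounded normal quantile from the standard-library `statistics.NormalDist` rather than the table value 1.96, and the result is rounded up to the next integer so that the error bound is met. Evaluating the first formula numerically gives about 2089.8, which rounds up to 2090.

```python
from math import ceil
from statistics import NormalDist

def z_critical(confidence):
    """z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def sample_size_with_estimate(p_hat, error, confidence=0.95):
    """Eq. (36): uses a preliminary estimate p_hat of the proportion."""
    z = z_critical(confidence)
    return ceil(z**2 * p_hat * (1 - p_hat) / error**2)

def sample_size_worst_case(error, confidence=0.95):
    """Eq. (37): replaces p_hat * q_hat by its maximum value 1/4."""
    z = z_critical(confidence)
    return ceil(z**2 / (4 * error**2))

# Values from the TV example: p_hat = 0.68, E = 0.02, 95% confidence.
n_first = sample_size_with_estimate(0.68, 0.02)
n_second = sample_size_worst_case(0.02)
print(n_first, n_second)
```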