Statistical Interval of Single Sample
Lecture #9, BMIR Lecture Series on Probability and Statistics
Ching-Han Hsu, Ph.D.
Department of Biomedical Engineering and Environmental Sciences
National Tsing Hua University

• Confidence Interval
• CI on Mean of a Normal Dist with Known Variance
• One-Sided Confidence Bounds
• Large Sample CI for Mean
• CI of Normal Distribution with Unknown Mean and Variance
• Derivation of t Distribution
• CI on Variance and STD of Normal Distribution
• Estimating a Proportion

Confidence Interval: Motivations
• Because of sampling variability, it is important to understand how good the obtained estimate is.
• An interval estimate for a population parameter is called a confidence interval (CI).
• The length of the interval conveys information about the precision of estimation.
• We cannot be sure that the interval contains the true but unknown population parameter.
• We only have a stated level of confidence that the interval contains the unknown population parameter.

Normal RV: $\mu$ Unknown and $\sigma^2$ Known
• Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a normal distribution with unknown mean $\mu$ and known variance $\sigma^2$.
• The sample mean $\bar{X} = (X_1 + \cdots + X_n)/n$ is also normally distributed, with mean $\mu$ and variance $\sigma^2/n$.
• We can standardize $\bar{X}$ by
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \tag{1}$$
• The random variable $Z$ has a standard normal distribution, $Z \sim N(0, 1)$.
Confidence Interval
• A confidence interval estimate for $\mu$ is an interval of the form $l \le \mu \le u$, where $l$ and $u$ are computed from the sample data.
• Since $l$ and $u$ are derived from the sample, they are observed values of random variables $L$ and $U$.
• We want to determine $L$ and $U$ that satisfy the following condition:
$$P(L \le \mu \le U) = 1 - \alpha, \tag{2}$$
where $0 \le \alpha \le 1$.
• The random interval $[L, U]$ contains the true mean $\mu$ with probability $1 - \alpha$.
• The end-points $l$ and $u$ are called the lower- and upper-confidence limits, respectively.
• $1 - \alpha$ is called the confidence coefficient.

Normal RV: $\mu$ Unknown and $\sigma^2$ Known
• Since $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ has a standard normal distribution, we may write
$$P\left(-z_{\alpha/2} \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1 - \alpha \tag{3}$$
$$P\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \tag{4}$$
• The corresponding lower- and upper-confidence limits $L$ and $U$ are $\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ and $\bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, respectively.

Confidence Interval I
Definition
If $\bar{x}$ is the sample mean of a random sample of size $n$ from a normal population with known variance $\sigma^2$, a $100(1 - \alpha)\%$ CI on $\mu$ is given by
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \tag{5}$$
where $z_{\alpha/2}$ is the upper $100(\alpha/2)$ percentage point of the standard normal distribution.

Confidence Interval II
Figure 1: $P(-z_{\alpha/2} \le Z \le z_{\alpha/2})$.

Table 1: Some common critical values $z_{\alpha/2}$.
Confidence level (%)   90      95      97
$\alpha$               0.10    0.05    0.03
$\alpha/2$             0.05    0.025   0.015
$z_{\alpha/2}$         1.645   1.960   2.17

Confidence Interval III
• Eq. (5), $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, is also called a two-sided CI.
• A CI is a random interval because both end-points, $L$ and $U$, are random variables.
• If an infinite number of random samples are collected and a $100(1 - \alpha)\%$ CI for $\mu$ is computed from each sample, $100(1 - \alpha)\%$ of these intervals will contain the true value of $\mu$.
• In practice, we obtain one random sample and calculate only one confidence interval.
• Since the interval may or may not contain the true mean $\mu$, the appropriate statement is that the observed interval $[l, u]$ brackets the true mean $\mu$ with confidence $100(1 - \alpha)\%$.
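The repeated-sampling interpretation above can be checked by simulation. The following is a minimal sketch, not part of the original lecture: the function names `z_interval` and `coverage`, the seed, and the population values ($\mu = 64.5$, $\sigma = 1$, $n = 10$) are illustrative assumptions. It builds the interval of Eq. (5) for many simulated normal samples and reports the fraction of intervals that contain the true mean, which should be close to $1 - \alpha$.

```python
import random
from statistics import NormalDist, fmean


def z_interval(sample, sigma, alpha=0.05):
    """Two-sided 100(1 - alpha)% CI for the mean when sigma is known (Eq. 5)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point, about 1.96 for alpha = 0.05
    half_width = z * sigma / len(sample) ** 0.5
    x_bar = fmean(sample)
    return x_bar - half_width, x_bar + half_width


def coverage(mu, sigma, n, alpha=0.05, repetitions=4000, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lower, upper = z_interval(sample, sigma, alpha)
        hits += lower <= mu <= upper  # True counts as 1
    return hits / repetitions


if __name__ == "__main__":
    # Hypothetical population, chosen only for illustration.
    print(coverage(mu=64.5, sigma=1.0, n=10))
```

Any single interval either contains $\mu$ or does not; only the long-run fraction is tied to the confidence level, which is why the simulation counts hits over many repetitions.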
Confidence Interval IV
Figure 2: Repeated construction of CIs for $\mu$.

Precision of Estimation
• The length of the confidence interval is $2 z_{\alpha/2}\,\sigma/\sqrt{n}$.
• The length of the 95% confidence interval is $2(1.96\,\sigma/\sqrt{n}) = 3.92\,\sigma/\sqrt{n}$.
• The length of the 99% confidence interval is $2(2.58\,\sigma/\sqrt{n}) = 5.16\,\sigma/\sqrt{n}$.
• For a fixed sample size $n$ and standard deviation $\sigma$, the higher the confidence level, the longer the resulting CI.
• The length of a CI is also a measure of the precision of estimation.
• We can choose the sample size $n$ to give a CI of specified length or precision.

Example: Normal RV with $\mu$ Unknown and $\sigma^2$ Known I
Example
ASTM Standard E23 defines standard test methods for notched bar impact testing of metallic materials. The Charpy V-notch (CVN) technique measures impact energy and is often used to determine whether or not a material experiences a ductile-to-brittle transition with decreasing temperature. Ten measurements of impact energy (J) on specimens of A238 steel cut at 60°C are as follows: 64.1, 64.7, 64.5, 64.6, 64.5, 64.3, 64.6, 64.8, 64.2, and 64.3.
Assume that impact energy is normally distributed with $\sigma = 1$ J. We want to find a 95% CI for $\mu$, the mean impact energy.

Example: Normal RV with $\mu$ Unknown and $\sigma^2$ Known II
The required quantities are $z_{\alpha/2} = z_{0.025} = 1.96$, $n = 10$, $\sigma = 1$, and $\bar{x} = 64.46$. The resulting 95% CI is found from Eq. (5) as follows:
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \tag{6}$$
$$64.46 - 1.96\frac{1}{\sqrt{10}} \le \mu \le 64.46 + 1.96\frac{1}{\sqrt{10}} \tag{7}$$
$$63.84 \le \mu \le 65.08 \tag{8}$$

Choice of Sample Size
Definition
If $\bar{x}$ is used as an estimate of $\mu$, we can be $100(1 - \alpha)\%$ confident that the error $|\bar{x} - \mu|$ will not exceed a specified amount $E$ when the sample size is
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 \tag{9}$$
Figure 3: Error in estimating $\mu$ with $\bar{x}$.

Choice of Sample Size: Example
Example
Suppose that we want a 95% CI on $\mu$ with total length 1.0. Given that the standard deviation is 1, what sample size do we need?
$E = 0.5 \times (\text{CI length}) = 0.5$.
$$n = \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{1.96 \times 1.0}{0.5}\right)^2 = 15.37.$$
Rounding up, the required sample size is $n = 16$.

Choice of Sample Size
Notice the general relationship between the sample size, the desired length of the confidence interval $2E$, the confidence level $100(1 - \alpha)\%$, and the standard deviation $\sigma$:
• As the desired length of the interval $2E$ decreases, the required sample size $n$ increases for a fixed value of $\sigma$ and specified confidence.
• As $\sigma$ increases, the required sample size $n$ increases for a fixed desired length $2E$ and specified confidence.
• As the level of confidence increases, the required sample size $n$ increases for a fixed desired length $2E$ and standard deviation $\sigma$.

One-Sided Confidence Bounds
Definition
A $100(1 - \alpha)\%$ upper-confidence bound for $\mu$ is
$$\mu \le u = \bar{x} + z_{\alpha}\frac{\sigma}{\sqrt{n}} \tag{10}$$
A $100(1 - \alpha)\%$ lower-confidence bound for $\mu$ is
$$\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} = l \le \mu \tag{11}$$

Example: One-Sided Confidence Bounds
Example
Consider that the same impact-testing data are used to construct a lower, one-sided 95% confidence bound for the mean energy $\mu$.
Recall that $\bar{x} = 64.46$, $n = 10$, and $\sigma = 1$. With $z_{0.05} = 1.645$, the resulting 95% lower confidence bound is:
$$\bar{x} - z_{\alpha}\frac{\sigma}{\sqrt{n}} \le \mu \tag{12}$$
$$64.46 - 1.645\frac{1}{\sqrt{10}} \le \mu \tag{13}$$
$$63.94 \le \mu \tag{14}$$

Procedure to Derive a CI
• Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$.
• Suppose that we have found a statistic $g(X_1, X_2, \ldots, X_n; \theta)$ with the following properties:
  1. $g(X_1, X_2, \ldots, X_n; \theta)$ depends on both the sample and $\theta$.
  2. The probability distribution of $g(X_1, X_2, \ldots, X_n; \theta)$ does not depend on $\theta$ or any other unknown parameter.
• We need to find constants $C_L$ and $C_U$ so that
$$P[C_L \le g(X_1, X_2, \ldots, X_n; \theta) \le C_U] = 1 - \alpha$$
(Note that $C_L$ and $C_U$ do not depend on $\theta$.)
• Finally, we need to manipulate the inequality so that
$$P[L(X_1, X_2, \ldots, X_n) \le \theta \le U(X_1, X_2, \ldots, X_n)] = 1 - \alpha$$
• This gives $L(X_1, X_2, \ldots, X_n)$ and $U(X_1, X_2, \ldots, X_n)$ as the lower and upper confidence limits defining the $100(1 - \alpha)\%$ CI for $\theta$.

Large Sample CI for Mean
Definition
When $n$ is large, the quantity
$$\frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{15}$$
has an approximate standard normal distribution, $N(0, 1)$. Consequently,
$$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \tag{16}$$
is a large sample confidence interval for $\mu$, with confidence level of approximately $100(1 - \alpha)\%$.
• Eq. (16) holds regardless of the shape of the population distribution.
• Generally $n$ should be at least 40.
A larger sample size is recommended here than the $n \ge 30$ usually quoted for the central limit theorem, because replacing $\sigma$ by $S$ introduces additional variability.

Example: Large Sample CI for Mean I
Example
An article in the 1993 volume of the Transactions of the American Fisheries Society reports the results of a study to investigate the mercury contamination in largemouth bass. A sample of fish was selected from 53 Florida lakes and mercury concentration in the muscle tissue was measured (ppm).
[Table of mercury concentration values and summary statistics: not recoverable from the source.]
We want to find a 95% CI on $\mu$.

Example: Large Sample CI for Mean II
The required quantities are $n = 53$, $\bar{x} = 0.5250$, $s = 0.3486$, and $z_{0.025} = 1.96$. The approximate 95% CI on $\mu$ is:
$$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}} \tag{17}$$
$$0.525 - 1.96\frac{0.3486}{\sqrt{53}} \le \mu \le 0.525 + 1.96\frac{0.3486}{\sqrt{53}} \tag{18}$$
$$0.4311 \le \mu \le 0.6189 \tag{19}$$

Large Sample CI for a Parameter
Suppose that $\theta$ is a parameter of a probability distribution, and let $\hat{\Theta}$ be an estimator of $\theta$.
If $\hat{\Theta}$
  1. has an approximate normal distribution,
  2. is approximately unbiased for $\theta$, and
  3. has a standard deviation $\sigma_{\hat{\Theta}}$ that can be estimated from sample data,
then the quantity $(\hat{\Theta} - \theta)/\sigma_{\hat{\Theta}}$ has an approximate standard normal distribution.

Definition
A large-sample approximate CI for $\theta$ is given by
$$\hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\Theta}} \le \theta \le \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\Theta}} \tag{20}$$
Because maximum likelihood estimators usually satisfy the three conditions, Eq. (20) is often used when $\hat{\Theta}$ is the maximum likelihood estimator of $\theta$.

Normal Random Sample with Unknown Mean and Variance
• Let $X_1, X_2, \ldots, X_n$ be observations of a random sample of size $n$ from the normal distribution $N(\mu, \sigma^2)$.
• Both the mean $\mu$ and the variance $\sigma^2$ are unknown.
• $\bar{X}$ and $S^2$ are the sample mean and sample variance, respectively.
• If $\sigma$ is known, we can construct a two-sided CI on $\mu$ by using
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
where $Z$ is $N(0, 1)$.
• If $\sigma$ is unknown and $n$ is small, we can replace $\sigma$ by the sample standard deviation $S$ and construct a new random variable
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$

t Distribution I
Theorem
Let $X_1, X_2, \ldots, X_n$ be a random sample of size $n$ from a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. The random variable
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} \tag{21}$$
has a t distribution with $n - 1$ degrees of freedom.
The probability density function of a t distribution with $k$ degrees of freedom is
$$f(t) = \frac{\Gamma[(k + 1)/2]}{\sqrt{\pi k}\,\Gamma(k/2)} \cdot \frac{1}{[(t^2/k) + 1]^{(k+1)/2}} \tag{22}$$
The mean and variance of the t distribution are zero and $k/(k - 2)$ (for $k > 2$), respectively.

t Distribution II
Figure 4: Probability density functions of several t distributions.

t Distribution III
Figure 5: Percentage points of the t distribution.

t Distribution IV
Figure 6: Percentage points $t_{\alpha,\nu}$ of the t distribution; $\nu$ is the degrees of freedom.

t Distribution V
Figure 7: Percentage points $t_{\alpha,\nu}$ of the t distribution; $\nu$ is the degrees of freedom.

Derivation of t Distribution I
• Let
$$T = \frac{Z}{\sqrt{U/r}}$$
where $Z$ is standard normal, $N(0, 1)$, $U$ is $\chi^2(r)$, and $Z$ and $U$ are independent.
• Motivation:
$$T = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)/(\sigma/\sqrt{n})}{S/\sigma} = \frac{Z}{\sqrt{S^2/\sigma^2}} = \frac{Z}{\sqrt{U/r}},$$
with $U = (n - 1)S^2/\sigma^2$ and $r = n - 1$.
• Because $Z$ and $U$ are independent, their joint pdf is
$$g(z, u) = f(z) \cdot f(u) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} \cdot \frac{1}{\Gamma(r/2)\, 2^{r/2}}\, u^{r/2 - 1} e^{-u/2}, \quad -\infty < z < \infty,\; 0 < u < \infty.$$

Derivation of t Distribution II
• The distribution function $F(t) = P(T \le t)$ of $T$ is given by
$$F(t) = P(T \le t) = P\left(\frac{Z}{\sqrt{U/r}} \le t\right) = P\left(Z \le t\sqrt{U/r}\right) = \int_0^\infty \int_{-\infty}^{t\sqrt{u/r}} g(z, u)\, dz\, du.$$
• That is,
$$F(t) = \frac{1}{\sqrt{\pi}\,\Gamma(r/2)} \int_0^\infty \left[\int_{-\infty}^{t\sqrt{u/r}} \frac{e^{-z^2/2}}{2^{(r+1)/2}}\, dz\right] u^{r/2 - 1} e^{-u/2}\, du$$

Derivation of t Distribution III
• The pdf of $T$ is the derivative of $F(t)$:
$$f(t) = F'(t) = \frac{1}{\sqrt{\pi}\,\Gamma(r/2)} \int_0^\infty \frac{e^{-(u/2)(t^2/r)}}{2^{(r+1)/2}} \sqrt{\frac{u}{r}}\; u^{r/2 - 1} e^{-u/2}\, du$$
$$= \frac{1}{\sqrt{\pi r}\,\Gamma(r/2)} \int_0^\infty \frac{u^{(r+1)/2 - 1}}{2^{(r+1)/2}}\, e^{-(u/2)(1 + t^2/r)}\, du$$

Derivation of t Distribution IV
• Let $y = (1 + t^2/r)\,u$, so that $\frac{du}{dy} = \frac{1}{1 + t^2/r}$. We find that
$$f(t) = \frac{1}{\sqrt{\pi r}\,\Gamma(r/2)} \int_0^\infty \frac{u^{(r+1)/2 - 1}}{2^{(r+1)/2}}\, e^{-(u/2)(1 + t^2/r)}\, du$$
$$= \frac{1}{\sqrt{\pi r}\,\Gamma(r/2)} \int_0^\infty \frac{1}{2^{(r+1)/2}} \left(\frac{y}{1 + t^2/r}\right)^{(r+1)/2 - 1} e^{-y/2}\, \frac{1}{1 + t^2/r}\, dy$$
$$= \frac{\Gamma((r + 1)/2)}{\sqrt{\pi r}\,\Gamma(r/2)} \cdot \frac{1}{(1 + t^2/r)^{(r+1)/2 - 1 + 1}} \cdot \int_0^\infty \frac{y^{(r+1)/2 - 1}}{\Gamma((r + 1)/2)\, 2^{(r+1)/2}}\, e^{-y/2}\, dy$$

Derivation of t Distribution V
• Since the pdf of $\chi^2(r + 1)$ is $\frac{y^{(r+1)/2 - 1}}{\Gamma((r+1)/2)\, 2^{(r+1)/2}}\, e^{-y/2}$, we have
$$\int_0^\infty \frac{y^{(r+1)/2 - 1}}{\Gamma((r + 1)/2)\, 2^{(r+1)/2}}\, e^{-y/2}\, dy = 1$$
• Therefore, the pdf of the t distribution is
$$f(t) = \frac{\Gamma((r + 1)/2)}{\sqrt{\pi r}\,\Gamma(r/2)} \cdot \frac{1}{(1 + t^2/r)^{(r+1)/2}}, \quad -\infty < t < \infty.$$

CI on Mean with Unknown Variance
Theorem
If $\bar{x}$ and $s$ are the sample mean and standard deviation of a random sample from a normal distribution with unknown variance $\sigma^2$, a $100(1 - \alpha)\%$ confidence interval on $\mu$ is given by
$$\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \tag{23}$$
where $t_{\alpha/2,n-1}$ is the upper $100(\alpha/2)$ percentage point of the t distribution with $n - 1$ degrees of freedom.
A $100(1 - \alpha)\%$ upper-confidence bound for $\mu$ is
$$\mu \le \bar{x} + t_{\alpha,n-1}\frac{s}{\sqrt{n}} \tag{24}$$
A $100(1 - \alpha)\%$ lower-confidence bound for $\mu$ is
$$\bar{x} - t_{\alpha,n-1}\frac{s}{\sqrt{n}} \le \mu \tag{25}$$

Example: CI on Mean with Unknown Variance I
Example
A random sample of size $n = 22$ is as follows:
19.8  15.4  11.4  19.5  15.4  10.1  18.5  14.1   8.8  11.4  14.9
 7.9  17.6  13.6   7.5  12.7  16.7  11.9  15.4  11.9  15.8  11.4
Find the 95% CI on $\mu$.
• The sample mean is $\bar{x} = 13.71$.
• The sample standard deviation is $s = 3.55$.
• The degrees of freedom are $n - 1 = 21$.
• $t_{0.025,21} = 2.08$.

Example: CI on Mean with Unknown Variance II
• The 95% CI on $\mu$ is:
$$\bar{x} - t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,n-1}\frac{s}{\sqrt{n}} \tag{26}$$
$$13.71 - 2.08\frac{3.55}{\sqrt{22}} \le \mu \le 13.71 + 2.08\frac{3.55}{\sqrt{22}} \tag{27}$$
$$13.71 - 1.57 \le \mu \le 13.71 + 1.57 \tag{28}$$
$$12.14 \le \mu \le 15.28 \tag{29}$$

Chi-Square Distribution
Theorem
Let $X_1, X_2, \ldots, X_n$ be independent random variables, each with normal distribution $N(\mu, \sigma^2)$, and let $S^2$ be the sample variance. Then the random variable
$$\chi^2 = \frac{(n - 1)S^2}{\sigma^2}$$
has a chi-square distribution, $\chi^2(n - 1)$, with $n - 1$ degrees of freedom.
• The probability density function of a $\chi^2$ random variable is
$$f(x) = \frac{1}{2^{k/2}\,\Gamma(k/2)}\, x^{(k/2) - 1} e^{-x/2}, \quad x > 0 \tag{30}$$
where $k$ is the number of degrees of freedom.
• The mean and variance of the $\chi^2$ distribution are $k$ and $2k$, respectively.

Chi-Square Distribution
Figure 8: Probability density functions of several $\chi^2$ distributions.
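The chi-square theorem above can be illustrated numerically. The sketch below is not part of the original lecture: the function name `scaled_sample_variances`, the seed, and the population values are assumptions chosen for illustration. It simulates $(n - 1)S^2/\sigma^2$ for repeated normal samples and compares the empirical mean and variance with the stated values $k$ and $2k$, where $k = n - 1$.

```python
import random
from statistics import fmean, variance


def scaled_sample_variances(mu, sigma, n, repetitions=20000, seed=0):
    """Simulate (n - 1) * S^2 / sigma^2 for repeated normal samples of size n."""
    rng = random.Random(seed)
    values = []
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # (n - 1) * S^2 is the sum of squared deviations from the sample mean.
        sum_sq = sum((x - x_bar) ** 2 for x in sample)
        values.append(sum_sq / sigma ** 2)
    return values


if __name__ == "__main__":
    n = 10  # hypothetical sample size, so k = n - 1 = 9 degrees of freedom
    values = scaled_sample_variances(mu=0.0, sigma=2.0, n=n)
    print(fmean(values))     # should be close to k = 9
    print(variance(values))  # should be close to 2k = 18
```

The result does not depend on the chosen $\mu$ or $\sigma$, which is what makes $(n - 1)S^2/\sigma^2$ usable as a pivotal quantity for $\sigma^2$.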
Chi-Square Distribution
Define $\chi^2_{\alpha,k}$ as the percentage point or value of the chi-square random variable with $k$ degrees of freedom such that the probability that $\chi^2$ exceeds this value is $\alpha$:
$$P\left(\chi^2 > \chi^2_{\alpha,k}\right) = \int_{\chi^2_{\alpha,k}}^{\infty} f(u)\, du = \alpha \tag{31}$$
Figure 9: Percentage points of the $\chi^2$ distribution. (a) The percentage point $\chi^2_{\alpha,k}$. (b) The upper percentage point $\chi^2_{0.05,10} = 18.31$ and the lower percentage point $\chi^2_{0.95,10} = 3.94$.

Construction of CI on Variance
• The random variable
$$\chi^2 = \frac{(n - 1)S^2}{\sigma^2}$$
is chi-square with $n - 1$ degrees of freedom.
• We may write
$$P\left(\chi^2_{1-\alpha/2,n-1} \le \chi^2 \le \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha$$
so that
$$P\left(\chi^2_{1-\alpha/2,n-1} \le \frac{(n - 1)S^2}{\sigma^2} \le \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha$$
• We rearrange the above equation into
$$P\left(\frac{(n - 1)S^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n - 1)S^2}{\chi^2_{1-\alpha/2,n-1}}\right) = 1 - \alpha$$

CI on Variance
Theorem
If $s^2$ is the sample variance from a random sample of $n$ observations from a normal distribution with unknown variance $\sigma^2$, then a $100(1 - \alpha)\%$ confidence interval on $\sigma^2$ is
$$\frac{(n - 1)s^2}{\chi^2_{\alpha/2,n-1}} \le \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,n-1}} \tag{32}$$
where $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$ are the upper and lower $100(\alpha/2)$ percentage points of the chi-square distribution with $n - 1$ degrees of freedom, respectively.

CI on Variance
Theorem
The $100(1 - \alpha)\%$ lower and upper confidence bounds on $\sigma^2$ are
$$\frac{(n - 1)s^2}{\chi^2_{\alpha,n-1}} \le \sigma^2 \quad \text{and} \quad \sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha,n-1}} \tag{33}$$
respectively.

Example: CI on Variance
Example
An automatic filling machine is used to fill bottles with liquid detergent. A random sample of 20 bottles results in a sample variance of fill volume of $s^2 = 0.0153$ (fluid ounces)$^2$. If the variance of fill volume is too large, an unacceptable proportion of bottles will be under- or overfilled. We will assume that the fill volume is approximately normally distributed. Find the 95% upper confidence bound on $\sigma^2$.
A 95% upper confidence bound is found from Eq.
(33) as follows:
$$\sigma^2 \le \frac{(n - 1)s^2}{\chi^2_{1-\alpha,n-1}} = \frac{(20 - 1)s^2}{\chi^2_{1-0.05,20-1}} = \frac{19 s^2}{\chi^2_{0.95,19}}$$
$$\sigma^2 \le \frac{19 \times 0.0153}{10.117} = 0.0287 \text{ (fluid ounces)}^2$$

Recall That: Bernoulli Distribution
Example
Let $X$ be a Bernoulli random variable. The probability mass function is
$$f(x; p) = \begin{cases} p^x (1 - p)^{1 - x}, & x = 0, 1 \\ 0, & \text{elsewhere,} \end{cases}$$
where $p$ is the parameter to be estimated. The likelihood function of a random sample of size $n$ is
$$L(p) = p^{\sum_{i=1}^{n} x_i} (1 - p)^{n - \sum_{i=1}^{n} x_i}.$$
We have shown that the MLE of $p$ is $\hat{p} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

Binomial Experiments
• A point estimator of the proportion $p$ in a binomial experiment is given by the statistic
$$\hat{P} = \frac{X}{n}$$
where $X$ represents the number of successes in $n$ trials.
• The sample proportion $\hat{p} = x/n$ will be used as the point estimate of the parameter $p$.
• By the CLT, for $n$ sufficiently large, $\hat{P}$ is approximately normally distributed with mean
$$\mu_{\hat{P}} = E(\hat{P}) = E\left(\frac{X}{n}\right) = \frac{np}{n} = p$$
and variance
$$\sigma^2_{\hat{P}} = \sigma^2_{X/n} = \frac{\sigma^2_X}{n^2} = \frac{npq}{n^2} = \frac{pq}{n}, \quad q = 1 - p.$$

CI on Proportion Estimation
• The random variable
$$Z = \frac{\hat{P} - p}{\sqrt{pq/n}}$$
has approximately a standard normal distribution.
• The corresponding $100(1 - \alpha)\%$ CI follows from
$$P(-z_{\alpha/2} \le Z \le z_{\alpha/2}) = 1 - \alpha$$
$$P\left(-z_{\alpha/2} \le \frac{\hat{P} - p}{\sqrt{pq/n}} \le z_{\alpha/2}\right) = 1 - \alpha$$
$$P\left(\hat{P} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{P} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}\right) \approx 1 - \alpha$$
where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.

CI on Proportion Estimation
Theorem
If $\hat{p}$ is the proportion of successes in a random sample of size $n$ and $\hat{q} = 1 - \hat{p}$, an approximate $100(1 - \alpha)\%$ confidence interval for the binomial parameter $p$ is given by
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \tag{34}$$
where $z_{\alpha/2}$ is the z-value having an area of $\alpha/2$ to the right.
• If $n$ is small and $p$ is close to 0 or 1, Eq. (34) should not be used.
• Both $n\hat{p}$ and $n\hat{q}$ should be greater than or equal to 5.

CI on Proportion Estimation: Accurate Formula
• We can also solve for $p$ in the quadratic inequality:
\[ -z_{\alpha/2} \le \frac{\hat{P} - p}{\sqrt{pq/n}} \le z_{\alpha/2} \]
• We obtain another form of the confidence interval for $p$ with limits
\[ \frac{\hat{p} + \dfrac{z_{\alpha/2}^2}{2n}}{1 + \dfrac{z_{\alpha/2}^2}{n}} \pm \frac{z_{\alpha/2}}{1 + \dfrac{z_{\alpha/2}^2}{n}} \sqrt{\frac{\hat{p}\hat{q}}{n} + \frac{z_{\alpha/2}^2}{4n^2}} \tag{35} \]

CI on Proportion Estimation: Example I

Example
In a random sample of $n = 500$ families owning cable TV in a community, it is found that $x = 340$ subscribe to the NTHU program. Find a 95% confidence interval for the actual proportion of families subscribing to the program.

• The point estimate of $p$ is $\hat{p} = 340/500 = 0.68$, so $\hat{q} = 1 - \hat{p} = 0.32$.
• Both $n\hat{p} = 340 > 5$ and $n\hat{q} = 160 > 5$.
• $z_{0.025} = 1.96$.

CI on Proportion Estimation: Example II
• The 95% confidence interval is
\[ \hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \]
\[ 0.68 - 1.96\sqrt{\frac{(0.68)(0.32)}{500}} \le p \le 0.68 + 1.96\sqrt{\frac{(0.68)(0.32)}{500}} \]
\[ 0.6391 \le p \le 0.7209 \]
• If we use the accurate form, we obtain
\[ \frac{0.68 + \dfrac{1.96^2}{(2)(500)}}{1 + \dfrac{1.96^2}{500}} \pm \frac{1.96}{1 + \dfrac{1.96^2}{500}} \sqrt{\frac{(0.68)(0.32)}{500} + \frac{1.96^2}{(4)(500)^2}} = 0.6786 \pm 0.0408 \]
\[ 0.6378 \le p \le 0.7194 \]

CI on Proportion: Choice of Sample Size

Theorem
If $\hat{p}$ is used as the estimate of $p$, we can be $100(1-\alpha)\%$ confident that the error will be less than a specified amount $E$ when the sample size is approximately
\[ n = \frac{z_{\alpha/2}^2\, \hat{p}\hat{q}}{E^2} \tag{36} \]

Theorem
Since $p(1-p)$ attains its maximum value of $1/4$ at $p = 0.5$, we can substitute 0.5 for $\hat{p}$. Then we can be at least $100(1-\alpha)\%$ confident that the error will be less than a specified amount $E$ when the sample size is
\[ n = \frac{z_{\alpha/2}^2}{4E^2} \tag{37} \]

CI on Proportion: Choice of Sample Size

Example
How large a sample is required if we want to be 95% confident that our estimate of $p$ in the previous TV problem is within 0.02 of the true value?
• First approach:
\[ n = \frac{z_{\alpha/2}^2\, \hat{p}\hat{q}}{E^2} = \frac{(1.96)^2 (0.68)(0.32)}{0.02^2} = 2089.8 \approx 2090 \]
• Second approach:
\[ n = \frac{z_{\alpha/2}^2}{4E^2} = \frac{(1.96)^2}{(4)(0.02)^2} = 2401 \]
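
The proportion formulas in Eqs. (34) to (37) can be checked numerically. The following is a minimal Python sketch, not part of the original lecture: the function names (`z_upper`, `large_sample_ci`, `accurate_ci`, `sample_size`) are hypothetical, and $z_{\alpha/2}$ is taken from the standard library's `statistics.NormalDist` rather than rounded to 1.96, so results may differ from hand calculations in the last digit. Sample sizes are rounded up to the next integer, which is an assumption about how the fractional value should be handled.

```python
from math import ceil, sqrt
from statistics import NormalDist


def z_upper(alpha):
    """z_{alpha/2}: standard normal point with area alpha/2 to its right."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def large_sample_ci(x, n, alpha=0.05):
    """Approximate CI of Eq. (34): p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = x / n
    half = z_upper(alpha) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def accurate_ci(x, n, alpha=0.05):
    """CI of Eq. (35), obtained by solving the quadratic inequality in p."""
    p_hat = x / n
    z = z_upper(alpha)
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return center - half, center + half


def sample_size(error, alpha=0.05, p_hat=None):
    """Eq. (36) when p_hat is given, otherwise the conservative Eq. (37)."""
    z = z_upper(alpha)
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return ceil(z * z * pq / error ** 2)  # round up (assumed convention)


if __name__ == "__main__":
    # Cable TV example: n = 500 families, x = 340 subscribers.
    print("Eq. (34):", large_sample_ci(340, 500))
    print("Eq. (35):", accurate_ci(340, 500))
    print("Eq. (36):", sample_size(0.02, p_hat=0.68))
    print("Eq. (37):", sample_size(0.02))
```

Passing `p_hat=None` selects the worst case $p = 0.5$, which needs no prior estimate and therefore always gives the larger of the two sample sizes.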