DISTRIBUTION OF THE SAMPLE MEAN
Let $X_1, X_2, \ldots, X_n$ be a sample from a distribution/population with mean $\mu$
and standard deviation $\sigma$. The sample mean is
$$\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$
We know: $\bar{X}$ can take different values for different samples, so it has a
distribution of its own: the sampling distribution.
FACT 1. The mean and standard deviation for the distribution of $\bar{X}$
are given by:
$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
The mean of $\bar{X}$ is the same as the population mean $\mu$. So, $\bar{X}$ is
unbiased for $\mu$.
The standard deviation of $\bar{X}$ is $\sqrt{n}$ times smaller than the population
standard deviation.
Averages have smaller variability than single observations!
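Fact 1 can be checked by simulation. The sketch below is illustrative only: it assumes a normal population with mean 10 and standard deviation 2 (values picked for the example; Fact 1 itself does not require normality), and the helper name `simulate_sample_means` is made up here.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Return the means of `reps` independent samples of size n from N(mu, sigma)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


mu, sigma, n = 10.0, 2.0, 10  # assumed population values and sample size
means = simulate_sample_means(mu, sigma, n, reps=20_000)

print("mean of the sample means:", statistics.fmean(means))      # close to mu
print("st. dev. of the sample means:", statistics.stdev(means))  # close to sigma / sqrt(n)
print("sigma / sqrt(n):", sigma / math.sqrt(n))
```

The simulated values only approximate $\mu$ and $\sigma/\sqrt{n}$; the agreement improves as `reps` grows.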
Law of Large Numbers
Closer look at the standard deviation of $\bar{X}$: $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
As n = sample size increases, $\sigma_{\bar{X}} \to 0$; i.e. as n increases, the spread
of the sample mean decreases to zero.
What random variable has spread zero? A constant!
Conclusion (Law of Large Numbers): $\bar{X}$ comes arbitrarily close to $\mu$
for large enough n.
[Figure: density curves of a distribution with mean = 10, st. dev. = 2, and of its
sample mean for n = 10 (mean = 10, st. dev. = 2/√10 = 0.63) and for n = 100
(mean = 10, st. dev. = 2/√100 = 0.2).]
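The Law of Large Numbers can be watched in a small experiment. This is a sketch under assumptions: the population is taken to be N(10, 2), matching the figure, and the sample sizes are arbitrary choices. A single run is random, so the deviation need not shrink at every step, but it should be small for the largest n.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable
mu, sigma = 10.0, 2.0   # assumed population: N(10, 2)

deviations = {}
for n in (10, 100, 10_000, 100_000):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    deviations[n] = abs(x_bar - mu)  # distance between the sample mean and mu
    print(f"n = {n:>6}: sample mean = {x_bar:.4f}, |sample mean - mu| = {deviations[n]:.4f}")
```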
DISTRIBUTION OF THE SAMPLE MEAN – NORMAL DATA
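$X_1, X_2, \ldots, X_n$ is a sample from a Normal distribution, $N(\mu, \sigma)$.
From FACT 1, we know: $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
FACT 2: If $X_1, X_2, \ldots, X_n$ are from $N(\mu, \sigma)$, then $\bar{X}$ has a $N(\mu, \sigma/\sqrt{n})$
distribution.
NOTE: Since $\bar{X}$ is normally distributed, with $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$, we may
standardize it:
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{X} - \mu)}{\sigma}.$$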
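The standardization can be sketched in a few lines. The function name `standardize` and the numbers below are illustrative assumptions, not part of the notes; `NormalDist` comes from Python's standard `statistics` module. The two probabilities agree because standardizing only re-expresses the same event on the N(0, 1) scale.

```python
import math
from statistics import NormalDist


def standardize(x_bar, mu, sigma, n):
    """Z = (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))


mu, sigma, n, x_bar = 10.0, 2.0, 100, 10.3  # illustrative values only

z = standardize(x_bar, mu, sigma, n)
p_direct = NormalDist(mu, sigma / math.sqrt(n)).cdf(x_bar)  # P(X-bar <= 10.3) under N(mu, sigma/sqrt(n))
p_via_z = NormalDist().cdf(z)                               # P(Z <= z) under N(0, 1)
print(z, p_direct, p_via_z)
```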
SAMPLING DISTRIBUTION OF THE SAMPLE MEAN
EXAMPLE: Students at a university have a weight distribution that is
known to be N(150, 20). Let $X_1, X_2, \ldots, X_{16}$ represent the weights of 16
randomly selected students from this university. If $\bar{X}$ is the average weight
for this sample, find $P(\bar{X} > 160)$.
Solution: Since the sample came from a normal distribution, by Fact 2, the
sample mean has a normal distribution as well:
$\bar{X} \sim N(\mu, \sigma/\sqrt{n}) = N(150, 20/\sqrt{16}) = N(150, 5)$. Thus,
$$P(\bar{X} > 160) = P\left(\frac{\bar{X} - 150}{5} > \frac{160 - 150}{5}\right) = P(Z > 2) = 1 - P(Z \le 2) = 1 - 0.9772 = 0.0228.$$
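The table lookup can be cross-checked numerically. A minimal sketch using `NormalDist` from Python's standard `statistics` module; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 150, 20, 16
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))  # N(150, 5) by Fact 2

p = 1 - x_bar_dist.cdf(160)  # P(X-bar > 160)
print(round(p, 4))  # 0.0228
```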
EXAMPLE, CONTD.
An elevator at this university has a capacity of 1500 pounds.
What is the probability that 9 students who enter the elevator
will have a safe ride, i.e. their total weight is less than 1,500 lb?
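Solution: Again, by Fact 2, the sample mean has a normal
distribution:
$\bar{X} \sim N(\mu, \sigma/\sqrt{n}) = N(150, 20/\sqrt{9}) = N(150, 6.67)$. Also, since the total
weight equals $9\bar{X}$,
$P(\text{Total weight} < 1500) = P(\bar{X} < 1500/9) = P(\bar{X} < 166.67)$.
So,
$$P(\bar{X} < 166.67) = P\left(\frac{\bar{X} - 150}{6.67} < \frac{166.67 - 150}{6.67}\right) = P(Z < 2.5) = 0.9938.$$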
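The same answer can be reached three ways, which makes a useful sanity check. The sketch below assumes the nine weights are independent draws from N(150, 20); under that assumption the total is N(9·150, 20·√9) = N(1350, 60). The Monte Carlo part is only an estimate and will differ slightly from the exact value.

```python
import math
import random
from statistics import NormalDist

mu, sigma, n, capacity = 150, 20, 9, 1500

# Route 1: work with the sample mean, as in the solution above.
p_mean = NormalDist(mu, sigma / math.sqrt(n)).cdf(capacity / n)

# Route 2: work with the total weight, N(n * mu, sigma * sqrt(n)) for independent normal weights.
p_total = NormalDist(n * mu, sigma * math.sqrt(n)).cdf(capacity)

# Route 3: Monte Carlo estimate from simulated groups of 9 students.
rng = random.Random(2)  # fixed seed so the run is repeatable
reps = 50_000
safe = sum(
    sum(rng.gauss(mu, sigma) for _ in range(n)) < capacity
    for _ in range(reps)
)
p_sim = safe / reps

print(round(p_mean, 4), round(p_total, 4), round(p_sim, 4))
```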
DISTRIBUTION OF THE SAMPLE MEAN, CONTD.
EXAMPLE. Suppose X is the score on a test and X ~ N(500, 100). Let $X_1,
X_2, \ldots, X_{16}$ be a sample of scores for 16 individuals and $\bar{X}$ their average
score. Find $P(550 < \bar{X} \le 600)$.
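Solution: Since the data come from a normal distribution, by Fact 2, $\bar{X}$ has a
normal distribution with
$\mu_{\bar{X}} = \mu = 500$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 100/\sqrt{16} = 25$.
Thus,
$$P(550 < \bar{X} \le 600) = P\left(\frac{550 - 500}{25} < \frac{\bar{X} - 500}{25} \le \frac{600 - 500}{25}\right) = P(2 < Z \le 4) = P(Z \le 4) - P(Z \le 2)$$
$$\approx 1 - 0.9772 = 0.0228,$$
since $P(Z \le 4)$ is practically 1.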
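An interval probability is a difference of two cumulative probabilities, which is easy to verify numerically. A minimal sketch with `NormalDist` from Python's standard `statistics` module; it does not round $P(Z \le 4)$ up to 1, so the result comes out near 0.0227 rather than the table-based 0.0228.

```python
import math
from statistics import NormalDist

mu, sigma, n = 500, 100, 16
x_bar_dist = NormalDist(mu, sigma / math.sqrt(n))  # N(500, 25) by Fact 2

p = x_bar_dist.cdf(600) - x_bar_dist.cdf(550)  # P(550 < X-bar <= 600)
print(round(p, 4))
```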
The CENTRAL LIMIT THEOREM (CLT)
What if the data do not come from a normal distribution?
FACT 3. (CLT): If $X_1, X_2, \ldots, X_n$ are a random sample from any distribution with mean $\mu$
and standard deviation $\sigma$, then their sample mean $\bar{X}$ has an approximately
normal $N(\mu, \sigma/\sqrt{n})$ distribution, provided n is sufficiently large.
How large is sufficiently large? That depends on the distribution the data
come from: the more skewed it is, the larger n must be. In any case, n should be
at least 20 before we use this approximation.
Difference between Fact 2 and Fact 3: Fact 2 holds only for samples from a
Normal distribution and gives the exact distribution of $\bar{X}$.
Fact 3 holds for samples from any distribution, but gives only an approximate
distribution for $\bar{X}$.
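One rough way to see the CLT at work is to track how lopsided the distribution of the sample mean is as n grows. The sketch below is an illustration under assumptions: the population is taken to be exponential with mean 1 (strongly right-skewed), and `skewness` is a helper written for this example that returns a value near 0 for a symmetric shape such as the normal curve.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed standardized deviation (near 0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


rng = random.Random(3)  # fixed seed so the run is repeatable
reps = 10_000

skew_by_n = {}
for n in (1, 25, 100):
    # Assumed population: exponential with mean 1, which is strongly right-skewed.
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    skew_by_n[n] = skewness(means)
    print(f"n = {n:>3}: skewness of the sample means = {skew_by_n[n]:.2f}")
```

The skewness should shrink toward 0 as n increases, which is consistent with the sample mean looking more and more normal.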
The Central Limit Theorem contd.
Example. Suppose $X_1, X_2, \ldots, X_{25}$ are lifetimes of electronic components,
with $\mu = 700$ hours and $\sigma = 10$ hours. Find $P(\bar{X} \le 702)$, where $\bar{X}$ is the
sample mean of the lifetimes of 25 components.
Solution. Lifetime data are usually skewed to the right, so not normal (Why?).
Since n = 25 (reasonably large), we will use the CLT and the normal
approximation of the distribution of the sample mean:
$\bar{X}$ has approximately a $N(\mu, \sigma/\sqrt{n}) = N(700, 10/\sqrt{25}) = N(700, 2)$ distribution.
So,
$$P(\bar{X} \le 702) = P\left(\frac{\bar{X} - 700}{2} \le \frac{702 - 700}{2}\right) \approx P(Z \le 1) = 0.8413.$$
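To get a feel for how good the approximation is, the CLT answer can be compared with a simulation. The notes do not specify the lifetime distribution, so the model below is purely hypothetical: 690 hours plus an exponential amount with mean 10 hours, chosen only because it is right-skewed and has mean 700 and standard deviation 10.

```python
import random
import statistics
from statistics import NormalDist

mu, sigma, n = 700, 10, 25

# Normal approximation from the CLT: X-bar is approximately N(700, 2).
p_clt = NormalDist(mu, sigma / n ** 0.5).cdf(702)

# Hypothetical right-skewed lifetime model with the same mean and standard deviation:
# 690 hours plus an exponential amount with mean 10 hours.
rng = random.Random(4)  # fixed seed so the run is repeatable
reps = 20_000
hits = sum(
    statistics.fmean(690 + rng.expovariate(1 / 10) for _ in range(n)) <= 702
    for _ in range(reps)
)
p_sim = hits / reps  # simulated P(X-bar <= 702) under the hypothetical model

print(round(p_clt, 4), round(p_sim, 3))
```

Under this assumed model the simulated probability should land close to the CLT value of 0.8413; a more heavily skewed model or a smaller n would show a larger gap.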