Chapter 11. Sampling Distributions. Problems and definitions

Chapter 11. Sampling Distributions
STAT 145
Definitions:
1.
A parameter in a statistical problem is a number that describes a population, such as the population mean μ.
In statistical practice, the value of a parameter is not known because we cannot examine the entire population.
2.
A statistic is a number that can be computed from the sample data without making use of any unknown
parameters, such as the sample mean ¯x . In practice, we often use a statistic to estimate an unknown
parameter.
3.
Law of Large Numbers: Draw observations at random from any population with finite mean μ . As the
number of observations drawn increases, the mean ¯x of the observed values gets closer and closer to the
mean μ of the population.
(see Figure on next page)
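The law of large numbers can be watched in a small simulation. The sketch below is only an illustration: the population (rolls of a fair six-sided die, so μ = 3.5), the function name running_means, and the seed are assumptions, not part of the course material.

```python
import random

def running_means(draw, n_obs, seed=0):
    """Return the running sample mean after each of n_obs random draws."""
    rng = random.Random(seed)
    total = 0.0
    means = []
    for i in range(1, n_obs + 1):
        total += draw(rng)
        means.append(total / i)
    return means

# Hypothetical population: a fair six-sided die, so mu = 3.5.
means = running_means(lambda rng: rng.randint(1, 6), 100_000)

# The running mean settles near 3.5 as the number of observations grows.
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, round(means[n - 1], 3))
```

Early values of the running mean can wander noticeably; the later ones should stay close to 3.5.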
4.
The population distribution of a variable is the distribution of values of the variable among all the
individuals in the population.
5.
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible
samples of the same size from the same population.
6.
Mean and Standard Deviation of a Sample Mean:
Suppose that ¯x is the mean of an SRS of size n drawn from a large population with mean μ and standard
deviation σ. Then the sampling distribution of ¯x has mean μ and standard deviation σ/√n.
Because σ/√n is smaller than σ whenever n > 1, averages are less variable than individual observations.
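The σ/√n rule can be checked by simulation. This is a minimal sketch under stated assumptions: the Normal population with μ = 100 and σ = 15, the helper name sd_of_sample_means, and the number of repetitions are all chosen here for illustration.

```python
import math
import random
import statistics

# Hypothetical population for illustration only: Normal with mu = 100, sigma = 15.
MU, SIGMA = 100.0, 15.0

def sd_of_sample_means(n, reps=5_000, seed=1):
    """Draw `reps` samples of size n and return the SD of their sample means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(MU, SIGMA) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(means)

# Quadrupling n should roughly halve the SD of the sample mean.
results = {n: sd_of_sample_means(n) for n in (1, 4, 16, 64)}
for n, simulated in results.items():
    theory = SIGMA / math.sqrt(n)
    print(f"n={n:>2}  simulated SD={simulated:6.3f}  sigma/sqrt(n)={theory:6.3f}")
```

The simulated values should track σ/√n closely, with small differences due to simulation noise.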
7.
¯x is an unbiased estimator of μ.
A statistic is said to be unbiased if the mean of its sampling distribution is equal to the true value of the
parameter being estimated.
8.
Central Limit Theorem:
Draw an SRS of size n from any population with mean μ and finite standard deviation σ.
The Central Limit Theorem (CLT) says that when n is large the sampling distribution of the sample mean
¯x is approximately Normal:
¯x is approximately N(μ, σ/√n)
The Central Limit Theorem allows us to use Normal probability calculations to answer questions about sample
means from many observations.
Note: How large a sample size n is needed for ¯x to be close to Normal depends on the population
distribution. In fact, if the population distribution itself is exactly Normal, then the sampling distribution of
¯x is exactly Normal.
If the shape of the population distribution is far from Normal, more observations are required in order for
¯x to be close to Normal.
(see Figures on next page)
The law of large numbers in action: as we take more observations, the sample mean ¯x always
approaches the mean μ of the population. (Figure 11.1)
Illustration of CLT on Uniform(0,1) and Exponential(1) :
n is the sample size (n=2, then n=6, then n=10, and n=25).
As n increases, the shape of the sampling distribution of ¯x becomes more Normal.
Each histogram is based on 10000 samples.
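The illustration can be approximated without plotting by measuring how lopsided the sample means are. The sketch below uses sample skewness (0 for a symmetric shape such as the Normal curve) as a stand-in for looking at the histograms; the helper names, the seed, and the choice of skewness as a summary are assumptions made here, not part of the original figures.

```python
import random
import statistics

def sample_means(draw, n, rng, reps=10_000):
    """Means of `reps` independent samples of size n from the given population."""
    return [statistics.fmean([draw(rng) for _ in range(n)]) for _ in range(reps)]

def skewness(values):
    """Sample skewness: 0 for a symmetric shape such as the Normal curve."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

populations = {
    "Uniform(0,1)": lambda rng: rng.random(),
    "Exponential(1)": lambda rng: rng.expovariate(1.0),
}

rng = random.Random(2)  # fixed seed so the run is repeatable
skew_by_n = {}
for name, draw in populations.items():
    for n in (2, 6, 10, 25):
        skew_by_n[(name, n)] = skewness(sample_means(draw, n, rng))
        print(f"{name:<15} n={n:>2}  skewness = {skew_by_n[(name, n)]:+.3f}")
```

The Uniform(0,1) means are symmetric at every n, while the Exponential(1) means start out clearly right-skewed and move toward 0 as n grows. Skewness captures only one aspect of shape, so this complements the histograms rather than replacing them.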
Figure 11.3 (d) from the book
Problem 1.
State whether each boldface number below is a parameter or a statistic:
Your local newspaper contains a large number of advertisements for unfurnished one-bedroom apartments. You
choose 10 at random and calculate that their mean monthly rent is $540 and that the standard deviation of their
rents is $80.
Solution: $540 and $80 are statistics (they relate to our sample of 10 apartments).
Problem 2.
State whether each boldface number below is a parameter or a statistic:
Voter registration records show that 68% of all voters in Indianapolis are registered as Republicans. To test a
random-digit dialing device, you use the device to call 150 randomly chosen residential telephones in
Indianapolis. Of the registered voters contacted, 73% are registered Republicans.
Solution: 68% is a parameter (relates to the population of all registered voters in Indianapolis);
73% is a statistic (relates to the sample of registered voters contacted).
Problem 3.
In a survey of sleeping habits, 8400 national adults were selected randomly and contacted by telephone.
Respondents were asked: “Typically, how many times per week do you sleep less than 6 hours during the
night?” Those surveyed reported an average of 1.8 nights per week in which they got less than 6
hours of sleep. Which of the following is true with respect to this scenario?
a. 8400 is the size of the population being studied.
b. 1.8 is a parameter and represents an estimate of the unknown value of a statistic of interest.
c. 1.8 is a statistic and represents an estimate of the unknown value of a parameter of interest.
d. none of the above
Problem 4.
Suppose you're in a class of 35 students. The instructor takes a simple random sample of 7 students and
observes their heights. Imagine all of the different samples possible. Let X denote the tallest height in your
sample. The distribution of all values taken by X in all possible samples of 7 students selected from the 35
students in your class is
a. the probability that X is obtained.
b. the sampling distribution of X.
c. the standard deviation of values.
d. the parameter.
Problem 5.
Juan makes a measurement in a chemistry laboratory and records the result in his lab report. The standard
deviation of students' lab measurements is σ = 10 milligrams. Juan repeats the measurement 4 times and records
the mean ¯x of his 4 measurements.
a) What is the standard deviation of Juan's mean result? (That is, if Juan kept on making 4 measurements and
averaging them, what would be the standard deviation of all his ¯x's?)
b) How many times must Juan repeat the measurement to reduce the standard deviation of ¯x to 2 milligrams?
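One way to check the arithmetic for this problem is to apply the σ/√n rule from Definition 6 directly. The variable names below are chosen for illustration; part (b) solves σ/√n = 2 for n, which gives n = (σ/2)².

```python
import math

sigma = 10.0   # SD of a single measurement, in milligrams
n = 4          # number of measurements Juan averages

# Part (a): SD of the mean of n measurements is sigma / sqrt(n).
sd_mean = sigma / math.sqrt(n)

# Part (b): solve sigma / sqrt(n) = target for n, rounding up to a whole count.
target = 2.0
n_needed = math.ceil((sigma / target) ** 2)

print(f"(a) SD of the mean of {n} measurements: {sd_mean} mg")
print(f"(b) measurements needed for an SD of {target} mg: {n_needed}")
```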
Problem 6.
The number of accidents per week at a hazardous intersection varies with mean 2.2 and standard deviation 1.4.
This distribution takes only whole-number values, so it is certainly not Normal.
a) Let ¯x be the mean number of accidents per week at the intersection during a year (52 weeks).
What is the approximate distribution of ¯x according to the central limit theorem?
b) What is the approximate probability that ¯x is less than 2?
c) What is the approximate probability that ¯x is greater than 2?
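A possible check of this problem treats ¯x as approximately N(μ, σ/√n), as the central limit theorem suggests for n = 52. The sketch uses NormalDist from Python's standard statistics module; the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist

mu, sigma, n = 2.2, 1.4, 52

# Part (a): approximate sampling distribution of the yearly mean.
sd_mean = sigma / math.sqrt(n)
xbar_dist = NormalDist(mu, sd_mean)

# Parts (b) and (c): Normal probabilities on either side of 2.
p_less = xbar_dist.cdf(2)
p_greater = 1 - p_less

print(f"(a) x-bar is approximately N({mu}, {sd_mean:.4f})")
print(f"(b) P(x-bar < 2) is about {p_less:.4f}")
print(f"(c) P(x-bar > 2) is about {p_greater:.4f}")
```

Under the Normal approximation the two probabilities are complements, so they come out near 0.15 and 0.85.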
Problem 7.
Assume that the average adult weighs 140 pounds and that the standard deviation is 25 pounds. Five people
enter an elevator that has a capacity of 750 pounds.
What is the chance that their combined weight exceeds capacity?
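One way to approach this problem is to note that a combined weight above 750 pounds for 5 people is the same event as a mean weight above 150 pounds. The sketch below assumes the five weights are independent and that their mean is close to Normal; with only n = 5 that second assumption is reasonable mainly if individual weights are themselves roughly Normal, which the problem does not state.

```python
import math
from statistics import NormalDist

mu, sigma = 140.0, 25.0   # mean and SD of one adult's weight, in pounds
n = 5                     # people in the elevator
capacity = 750.0          # pounds

# Total weight > capacity is equivalent to mean weight > capacity / n.
threshold_mean = capacity / n
sd_mean = sigma / math.sqrt(n)

# Assumed: x-bar is approximately N(mu, sigma / sqrt(n)).
p_exceeds = 1 - NormalDist(mu, sd_mean).cdf(threshold_mean)

print(f"P(combined weight > {capacity:.0f} lb) is about {p_exceeds:.3f}")
```

Under these assumptions the chance comes out a little under 0.19.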
Problem 8.
Suppose you interview 10 randomly selected workers and ask how many miles they commute to work. You’ll
compute the sample mean commute distance. Now imagine repeating the survey many, many times, each time
recording a different sample mean commute distance. In the long run, a histogram of these sample means
represents
a. the bias, if any, that is present in the sampling method.
b. the true population average commute distance.
c. a simple random sample.
d. the sampling distribution of the sample mean.
Problem 9.
The law of large numbers states that as the number of observations drawn at random from a population with
finite mean μ increases, the mean ¯x of the observed values
a. gets larger and larger.
b. gets smaller and smaller.
c. tends to get closer and closer to the population mean μ.
d. fluctuates steadily between one standard deviation above and one standard deviation below the mean.