1) An estimator is a. an estimate. b. a formula that gives an efficient

1
Prof. Òscar Jordà
Due Date: Thursday, April 24
Problem Set 2
ECONOMETRICS
STUDENT’S NAME:
Multiple Choice Questions [20 pts]
Please provide your answers to this section below:
6.
7.
8.
9.
10.
1.
2.
3.
4.
5.
1)
An estimator is
a.
b.
c.
d.
an estimate.
a formula that gives an efficient guess of the true population value.
a random variable.
a nonrandom number.
Answer: c
2) An estimate is
a.
b.
c.
d.
efficient if it has the smallest variance possible.
a nonrandom number.
unbiased if its expected value equals the population value.
another word for estimator.
Answer: b
3) An estimator µˆ Y of the population value µY is more efficient when compared to
another estimator µ% Y , if
a.
b.
c.
d.
E( µˆ Y ) > E( µ% Y ).
it has a smaller variance.
its c.d.f. is flatter than that of the other estimator.
both estimators are unbiased, and var( µˆ Y ) < var( µ% Y ).
Answer: d
4) Among all unbiased estimators that are weighted averages of Y1 ,..., Yn , Y is
a.
b.
c.
d.
the only consistent estimator of µY .
the most efficient estimator of µY .
a number which, by definition, cannot have a variance.
the most unbiased estimator of µY .
2
Answer: b
5)
The p-value is defined as follows:
a. p = 0.05.
b. PrH 0 [| Y − µY ,0 |>| Y act − µY ,0 |].
c. Pr( z > 1.96) .
d. PrH 0 [| Y − µY ,0 |<| Y act − µY ,0 |]. .
Answer: b
6)
A large p-value implies
a.
b.
c.
d.
rejection of the null hypothesis.
a large t-statistic.
a large Y act .
that the observed value Y act is consistent with the null hypothesis.
Answer: d
7)
The power of the test
a. is the probability that the test actually incorrectly rejects the null
hypothesis when the null is true.
b. depends on whether you use Y or Y 2 for the t-statistic.
c. is one minus the size of the test.
d. is the probability that the test correctly rejects the null when the alternative
is true.
Answer: d
8)
The sample covariance can be calculated in any of the following ways, with the
exception of:
1 n
∑ ( X i − X )(Yi − Y ) .
n − 1 i =1
1 n
n
b.
X iYi −
XY .
∑
n − 1 i =1
n −1
1 n
c.
∑ ( X i − µ X )(Yi − µY ) .
n i =1
d. rXY sY sY , where rXY is the correlation coefficient.
a.
Answer: c
9)
When the sample size n is large, the 90% confidence interval for µY is
3
a.
b.
c.
d.
Y
Y
Y
Y
± 1.96SE (Y ) .
± 1.64SE (Y ) .
± 1.64σ Y .
± 1.96 .
Answer: b
10) The following statement about the sample correlation coefficient is true.
a. –1 ≤ rXY ≤ 1.
p
2
b. rXY
→ corr ( X i , Yi ) .
c. | rXY |< 1 .
d. rXY =
2
s XY
.
s X2 sY2
Answer: a
Problems [40 pts]
Instructions: The goal of the problem set is to understand what you are doing rather than just
getting the correct result. Please show your work clearly and neatly. Please write your answers in
the space provided.
1)
Adult males are taller, on average, than adult females. Visiting two recent American
Youth Soccer Organization (AYSO) under-12-year-old (U12) soccer matches on a Saturday, you
do not observe an obvious difference in the height of boys and girls of that age. You suggest to
your little sister that she collect data on height and gender of children in 4th to 6th grade as part of
her science project. The accompanying table shows her findings.
Height of Young Boys and Girls, Grades 4-6, in inches
Boys
(a)
YBoys
sBoys
nBoys
YGirls
sGirls
nGirls
57.8
3.9
55
58.4
4.2
57
Let your null hypothesis be that there is no difference in the height of females and males at this
age level. Specify the alternative hypothesis.
Answer:
(b)
Girls
H 0 : µ Boys − µGirls = 0 vs. H1 : µ Boys − µGirls ≠ 0
Find the difference in height and the standard error of the difference.
4
Answer: YBoys − YGirls = -0.6, SE( YBoys − YGirls ) =
(c)
3.92 4.22
= 0.77.
+
55
57
Generate a 95% confidence interval for the difference in height.
Answer: -0.6 ± 1.96 × 0.77 = (-2.11, 0.91).
(d)
Calculate the t-statistic for comparing the two means. Is the difference statistically significant at
the 1% level? Which critical value did you use? Why would this number be smaller if you had
assumed a one-sided alternative hypothesis? What is the intuition behind this?
Answer: t = -0.78, so | t | < 2.58, which is the critical value at the 1% level. Hence you cannot
reject the null hypothesis. The critical value for the one-sided hypothesis would have
been 2.33. Assuming a one-sided hypothesis implies that you have some information
about the problem at hand, and, as a result, can be more easily convinced than if you
had no prior expectation.
2)
The development office and the registrar have provided you with anonymous matches of
starting salaries and GPAs for 108 graduating economics majors. Your sample contains a variety
of jobs, from church pastor to stockbroker.
(a)
The average starting salary for the 108 students was $38,644.86 with a standard
deviation of $7,541.40. Construct a 95% confidence interval for the starting salary of all
economics majors at your university/college.
Answer: 38,644.86 ± 1.96 ×
7,541.40
= 38,644.86 ± 1,422.32 = (37,222.54, 40,067.18).
108
(b)
A similar sample for psychology majors indicates a significantly lower starting
salary. Given that these students had the same number of years of education, does this
indicate discrimination in the job market against psychology majors?
Answer: It suggests that the market values certain qualifications more highly than others.
Comparing means and identifying that one is significantly lower than others
does not indicate discrimination.
(c)
You wonder if it pays (no pun intended) to get good grades by calculating the
average salary for economics majors who graduated with a cumulative GPA of B+ or
better, and those who had a B or worse. The data is as shown in the accompanying table:
Cumulative GPA
Average Earnings
Y
B+ or better
B or worse
$39,915.25
$37,083.33
Standard
Deviation
sY
$8,330.21
$6,174.86
n
59
49
Conduct a t-test for the hypothesis that the two starting salaries are the same in the
population. Given that this data was collected in 1999, do you think that your results will
hold for other years, such as 2002?
5
Answer: Assuming unequal population variances, t =
(39,915.25 − 37, 083.33)
8,330.212 6,174.862
+
59
49
=
2.03. The critical value for a one-sided test is 1.64, for a two-sided test 1.96,
both at the 5% level. Hence you can reject the null hypothesis that the two
starting salaries are equal. Presumably you would have chosen as an alternative
that better students receive better starting salaries, so that this becomes your
new working hypothesis. 1999 was a boom year. If better students receive
better starting offers during a boom year, when the labor market for graduates
is tight, then it is very likely that they receive a better offer during a recession
year, assuming that they receive an offer at all.
3)
Assume that under the null hypothesis, Y has an expected value of 500 and a standard deviation
of 20. Under the alternative hypothesis, the expected value is 550. Sketch the probability density function
for the null and the alternative hypothesis in the same figure. Pick a critical value such that the p-value is
approximately 5%. Mark the areas, which show the size and the power of the test. What happens to the
power of the test if the alternative hypothesis moves closer to the null hypothesis, i.e., µY = 540, 530, 520,
etc.?
Answer: For a given size of the test, the power of the test is lower.
4)
Consider the following alternative estimator for the population mean:
.
1 1
7
1
7
1
7
Y = ( Y1 + Y2 + Y3 + Y4 + ... + Yn −1 + Yn )
n 4
4
4
4
4
4
Prove that Y is unbiased and consistent, but not efficient when compared to Y .
1 1
7
1
7
1
7
( E (Y1 ) + E (Y2 ) + E (Y3 ) + E (Y4 ) + ... + E (Yn −1 ) + E (Yn ))
n 4
4
4
4
4
4
1
1 7
n
 is unbiased.
= µY (2 + 2 + ... + + ) = µY = µY . Hence Y
n
4 4 n
) =
Answer: E (Y
6
var(Y ) = E (Y − µY ) 2 =
1 1
7
1
7
1
7
E[ ( Y1 + Y2 + Y3 + Y4 + ... + Yn −1 + Yn ) − µY ]2
n 4
4
4
4
4
4
1 1
7
1
7
2
= 2 E[ (Y1 − µY ) + (Y2 − µY ) + ... + (Yn −1 − µY ) + (Yn − µY )]
n
4
4
4
4
=
1 1
49
1
49
[ E (Y1 − µY ) 2 + E (Y2 − µY ) 2 + ... + E (Yn −1 − µY ) 2 + E (Yn − µY ) 2 ]
2
n 16
16
16
16
1 1 2 49 2
1 2 49 2
σ Y2 n 1 49
σ Y2
.
[ σ Y + σ Y + ... + σ Y + σ Y ] = 2 [ ( + )] = 1.5625
n 2 16
n
16
16
16
n 2 16 16
 ) → 0 as n → ∞, Y is consistent. Y has a larger variance than Y and
Since var(Y
=
is therefore not as efficient.
EViews Exercise [40 pts]
The data for this exercise is contained in the excel file “salary.XLS.” It contains 93 observations
on employees for a Chicago Bank during the period 1969-1971. In particular, the variable
“salary” refers to the starting salary, “education” refers to years of education, “experience” refers
to the number of months of previous work experience, “seniority” refers to the number of months
the person has been working at the current job with the bank, and “gender” takes the value of one
for males and 0 for females.
You will need to import the data into EViews. This can be easily accomplished by opening a new
workfile. Select the option corresponding to undated or irregular observations and a range that
goes from 1 to 93. Then from the “procs” button, select the import option, and the “excel” option
thereafter. The rest is pretty straightforward.
Answer the following questions:
1. Give summary statistics for all variables in one table. Then, subsample according to gender and
report the summary statistics with an additional two tables. Next, select all the variables (make
sure to undo the subsampling condition) and compute their correlations and include this table
with the previous tables in the same page. Comment on all of these results, making particular
emphasis on whether you see any evidence of discrimination (nothing formal yet, just hunches).
Beware of the experience and seniority levels of each group.
2. Do a formal test of the null hypothesis that females are discriminated against at a 5%
confidence level. Do your results hold at a 1% confidence level? Consider now a test of the null
hypothesis that males earn $250 more than females on average at a 5% confidence level. Make
sure to report these tests in one page and in one paragraph report on the economic significance of
your tests (Is there gender bias in pay scales?). Hints: make sure you consider whether a onetailed test or a two-tailed test is appropriate.
3. Load data on the exchange rate between the U.S. dollar and the Japanese yen into EViews.
Select monthly frequency and select a data source that begins at least in 1980. You will need to
plot these data and compute summary statistics. Regarding the plot, make sure that you label
important economic dates (such as oil crises, or other political events) that may help explain some
7
of the long swings in these data. Make sure to label the graph appropriately, including title, units
of
Answers to Problem Set #2 (Empirical)
1. Basic Statistics
SALARY
Mean
Median
Maximum
Minimum
Std. Dev.
Skewness
Kurtosis
Observations
EDUCATION EXPERIENCE SENIORITY
5420.323
5400.000
8100.000
3900.000
709.5872
0.589159
4.172759
12.50538
12.00000
16.00000
8.000000
2.282369
-0.486804
2.684228
100.9269
70.00000
381.0000
0.000000
90.94755
1.146060
3.628779
16.72043
15.00000
34.00000
1.000000
10.25476
0.191436
1.881037
93
93
93
93
IF GENDER=0 (Females)
SALARY
EDUCATION EXPERIENCE SENIORITY
Mean
Median
Maximum
Minimum
Std. Dev.
Skewness
Kurtosis
4979.1180
4950.0000
6300.0000
3900.0000
592.06040
0.2731300
2.4446870
11.617650
12.000000
16.000000
8.0000000
2.0451670
-0.3120040
3.2059770
82.491180
52.000000
244.00000
0.0000000
79.337440
0.7313630
2.1835760
15.529410
15.000000
33.000000
1.0000000
9.7615050
0.2388490
1.9262310
Observations
34.000000
34.000000
34.000000
34.000000
IF GENDER=1 (Males)
SALARY
EDUCATION EXPERIENCE SENIORITY
Mean
Median
Maximum
Minimum
Std. Dev.
Skewness
Kurtosis
5674.5760
5580.0000
8100.0000
4620.0000
647.58260
1.0326770
4.8896460
13.016950
12.000000
16.000000
8.0000000
2.2704360
-0.7699380
2.9289280
111.55080
78.500000
381.00000
6.0000000
96.046220
1.2088630
3.5638260
17.406780
16.000000
34.000000
1.0000000
10.548930
0.1429750
1.8353130
Observations
59.000000
59.000000
59.000000
59.000000
Correlations
SALARY
EDUCATION
EXPERIENCE
SENIORITY
SALARY
1.0000000
0.3837540
-0.1007780
0.1667530
EDUCATION EXPERIENCE SENIORITY
0.3837540
-0.1007780
0.1667530
1.0000000
-0.2981960
-0.1507460
-0.2981960
1.0000000
0.0001920
-0.1507460
0.0001920
1.0000000
There seems to be some evidence of discrimination. First, females earn lower salaries on average. However,
females also tend to have less education, seniority, and experience, compared with men in the sample. This
could explain why they receive lower salaries. We can see from the correlations, that education and seniority
are positively correlated with salary, so this could explain the big differences between salaries for males versus
females.
2. One test of whether females are discriminated against would be to ask whether their salaries differ
significantly from those of males. Therefore, the hypothesis test would be:
HO: µF ≥ µM (that the average salary for a female is greater than or equal to that of a male)
HA: µF < µM (that the average salary for a female is less than that of a male)
Hypothesis Testing for SALARY
Sample(adjusted): 1 68 IF GENDER=0
Included observations: 34 after adjusting endpoints
Test of Hypothesis: Mean = 5674.576
Sample Mean = 4979.118
Sample Std. Dev. = 592.0604
Method
t-statistic
Value
-6.849274
Probability
0.0000
Note that Eviews reports the p-value for a two-tailed test. For the one-tailed test, the critical t-statistic
approximately -1.667, so we reject the null hypothesis. This test reveals that females earn a lower salary than
males. You should conduct a one-tailed test to see if women receive a lower salary compared to men.
Alternatively, you could set up a two-tailed test to test whether the salaries differ. The t-statistic is the same as
the one reported above, and the p-value reveals that we reject the null hypothesis.
For the test of whether males make more than $250 above the average salary for females:
HO: µM ≤ µF + 250 (that the average salary of a male is less than $250 greater than that of a female)
HA: µM >µF + 250 (that the average salary of a male is more than $250 greater than that of a female)
Hypothesis Testing for SALARY
Sample(adjusted): 13 93 IF GENDER=1
Included observations: 59 after adjusting endpoints
Test of Hypothesis: Mean = 5229.118
Sample Mean = 5674.576
Sample Std. Dev. = 647.5826
Method
t-statistic
Value
5.283697
Therefore, males earn an average salary that is
as least $250 greater than that of females.
3. A few of the relevant events are labeled on
the graph below. Notice, that an increase in
the exchange rate implies an appreciation of
the U.S. dollar and depreciation of the
Japanese yen.
Probability
0.0000
Exchange Rate: Japan (Yen/$US)
280
240
Gulf War
200
Oil prices fall 62%
(1985-1986)
160
1990-91 U.S.
Recession
Asian Financial
Crisis
Current U.S.
Recession
1980-83 U.S.
120 Recessions (2)
80
1980
1985
1990
1995
2000