Sample Exams
Utrecht University School of Economics (USE)

On this page you find two sample exams to test your required knowledge in quantitative methods:

- Econometrics, general exam
- Econometrics, Stata

These sample exams give you an indication of the level you are expected to have in Econometrics before you start one of the Master's programmes offered by Utrecht University School of Economics:

- International Economics and Business
- Economics of Public Policy and Management
- Economics and Law
- Multidisciplinary Economics (Research Programme)

Please bear in mind that these tests are meant as self-assessment. Please do not send them in; we will not assess your test. If, after completing the exams, you feel that you need to brush up your knowledge of Econometrics before the start of your programme, we strongly recommend that you take the Utrecht University Summer School course Introductory Econometrics.

Sample exam 1: Econometrics, general exam

QUESTION 1

a. Consider the multiple regression model with two independent variables. We have a random sample of N observations. The regression equation is

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \varepsilon$

for which we assume that the error term $\varepsilon$ is independent of the explanatory variables $X_1$ and $X_2$. Under some circumstances, the researcher is not able to calculate the Ordinary Least Squares (OLS) estimator of the regression parameters. Please explain when the OLS estimator cannot be calculated.

b. After applying OLS, the estimated regression equation may be rewritten as

$Y = \hat\beta_0 + \hat\beta_1 X_1 + \hat\beta_2 X_2 + e$

in which $e$ is the residual. The researcher claims that the residual $e$ may be correlated with the explanatory variables $X_1$ and $X_2$ if the regression equation excludes relevant explanatory variables (that is, if there are omitted variables). Is the researcher right? Please explain your answer.

c. Which of the following outcome(s) can cause the t-value of the estimated parameters not to be t-distributed?
Do they lead to unbiased estimates of the regression parameters? Please motivate your answer briefly.

- Heteroskedasticity of the error term.
- A correlation coefficient of 0.80 between two explanatory variables of the regression equation.
- One of the explanatory variables is correlated with the error term of the regression equation.

d. Next, we assume that all of the classical assumptions of the regression model are valid. The regression equation becomes:

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_1 X_2 + \beta_3 X_2 + \varepsilon$

Please determine the effect of $X_1$ on $Y$.

e. We assume that the random variables $X_1$ and $X_2$ are independent. Please rewrite:

1 E( X 5 X | X x ) Cov(5 X , X X ) 1 2 1 1 1 2 1 2

Next, we assume that the random variables $X_1$ and $X_2$ are not independent. Please rewrite:

1 2 2 Var (2 X 3 X ) 1 2

QUESTION 2

The following information is available:

tothrs   : total hours of training
avgsal   : annual salary (in dollars)
lavgsal  : logarithm of avgsal
sales    : annual sales (in dollars)
lsales   : logarithm of sales
dy1      : dummy variable; 1 for 1987
dy2      : dummy variable; 1 for 1988
dy3      : dummy variable; 1 for 1989
large    : dummy variable; 1 if large firm

. sum tothrs avgsal lavgsal sales lsales dy1 dy2 dy3 large

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
      tothrs |       304    33.39474    52.42831          0        320
      avgsal |       304    18723.52    6967.911       4237      42583
     lavgsal |       304    9.773076    .3590829   8.351611   10.65921
       sales |       304     6413918     7899873     110000   4.90e+07
      lsales |       304    15.07133    1.126338   11.60824   17.70733
-------------+---------------------------------------------------------
         dy1 |       304    .3190789    .4668882          0          1
         dy2 |       304    .3322368     .471792          0          1
         dy3 |       304    .3486842    .4773396          0          1
       large |       304    .2138158    .4106743          0          1

A researcher regresses the variable tothrs on the logarithm of avgsal and the logarithm of sales, using a random sample of 304 observations, which were sampled in 1987, 1988, and 1989. See the regression output below.
. reg tothrs lavgsal lsales

      Source |       SS       df       MS              Number of obs =     304
-------------+------------------------------           F(  2,   301) =   10.14
       Model |  52584.9232     2  26292.4616           Prob > F      =  0.0001
    Residual |  780279.708   301  2592.29139           R-squared     =  0.0631
-------------+------------------------------           Adj R-squared =  0.0569
       Total |  832864.632   303  2748.72816           Root MSE      =  50.915

------------------------------------------------------------------------------
      tothrs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lavgsal |   17.08952   8.314689     2.06   0.041     .7272393     33.4518
      lsales |   -11.5002    2.65077    -4.34   0.000    -16.71659   -6.283811
       _cons |   39.70086   83.09774     0.48   0.633    -123.8252     203.227
------------------------------------------------------------------------------

a) Please give a precise economic interpretation of the estimated parameter on lavgsal.

Next, we add two dummy variables dy2 and dy3 to the equation. See the Stata output below.

. reg tothrs lavgsal lsales dy2 dy3

      Source |       SS       df       MS              Number of obs =     304
-------------+------------------------------           F(  4,   299) =    5.46
       Model |  56680.5896     4  14170.1474           Prob > F      =  0.0003
    Residual |  776184.042   299  2595.93325           R-squared     =  0.0681
-------------+------------------------------           Adj R-squared =  0.0556
       Total |  832864.632   303  2748.72816           Root MSE      =   50.95

------------------------------------------------------------------------------
      tothrs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lavgsal |   15.59959   8.405549     1.86   0.064    -.9419353    32.14112
      lsales |  -11.66302   2.655821    -4.39   0.000    -16.88949   -6.436551
         dy2 |   4.221359   7.271943     0.58   0.562    -10.08931    18.53203
         dy3 |   9.090815   7.254422     1.25   0.211    -5.185378    23.36701
       _cons |   52.14363   83.74939     0.62   0.534    -112.6693    216.9565
------------------------------------------------------------------------------

b) Please apply a statistical testing procedure to test whether year has a statistically significant effect on the total hours of training. Use a significance level of 0.05.

c) Please provide an economic interpretation of the parameter estimate on dy3 (for which you may ignore that the estimated parameter is statistically insignificant).

d) Is there any indication of heteroskedasticity? See the regression output below. Please apply a statistical testing procedure, for which you use a significance level of 0.05.

. predict uhat, resid
. gen uhat2 = uhat^2
. reg uhat lavgsal lsales dy2 dy3

      Source |       SS       df       MS              Number of obs =     304
-------------+------------------------------           F(  4,   299) =    0.00
       Model |  1.1642e-10     4  2.9104e-11           Prob > F      =  1.0000
    Residual |  776184.046   299  2595.93326           R-squared     =  0.0000
-------------+------------------------------           Adj R-squared = -0.0134
       Total |  776184.046   303  2561.66352           Root MSE      =   50.95

------------------------------------------------------------------------------
        uhat |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lavgsal |   1.92e-07   8.405549     0.00   1.000    -16.54153    16.54153
      lsales |  -6.72e-08   2.655821    -0.00   1.000    -5.226469    5.226469
         dy2 |  -2.04e-07   7.271943    -0.00   1.000    -14.31067    14.31067
         dy3 |  -1.75e-07   7.254422    -0.00   1.000    -14.27619    14.27619
       _cons |  -7.52e-07   83.74939    -0.00   1.000    -164.8129    164.8129
------------------------------------------------------------------------------
. reg uhat2 lavgsal lsales dy2 dy3

      Source |       SS       df       MS              Number of obs =     304
-------------+------------------------------           F(  4,   299) =    3.82
       Model |  1.0166e+09     4   254151538           R-squared     =  0.0486
    Residual |  1.9889e+10   299  66516778.3           Adj R-squared =  0.0359
-------------+------------------------------           Root MSE      =  8155.8
       Total |  2.0905e+10   303  68993804.9

------------------------------------------------------------------------------
       uhat2 |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lavgsal |   232.6349   1345.504     0.17   0.863    -2415.222    2880.492
      lsales |  -1631.846    425.126    -3.84   0.000    -2468.464   -795.2283
         dy2 |  -143.6411   1164.044    -0.12   0.902    -2434.397    2147.115
         dy3 |     534.56   1161.239     0.46   0.646    -1750.677    2819.797
       _cons |    24735.1   13406.04     1.85   0.066    -1647.041    51117.25
------------------------------------------------------------------------------

e) The dummy variable large is introduced, which equals 1 for a large firm. Is the regression equation of sub-question a) different between large firms and small firms? Use a significance level of 0.05.

. gen large_lavgsal = large*lavgsal
. gen large_lsales = large*lsales
. reg tothrs lavgsal lsales large large_lavgsal large_lsales

      Source |       SS       df       MS              Number of obs =     304
-------------+------------------------------           F(  5,   298) =    5.08
       Model |  65423.7882     5  13084.7576           Prob > F      =  0.0002
    Residual |  767440.843   298  2575.30484           R-squared     =  0.0786
-------------+------------------------------           Adj R-squared =  0.0631
       Total |  832864.632   303  2748.72816           Root MSE      =  50.747

------------------------------------------------------------------------------
      tothrs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lavgsal |   22.45324   8.665497     2.59   0.010     5.399921    39.50656
      lsales |  -9.971127   3.131076    -3.18   0.002    -16.13295   -3.809305
       large |   582.7495   273.2314     2.13   0.034     45.04194    1120.457
large_lavgsal|  -59.82844   30.77553    -1.94   0.053    -120.3933    .7364569
large_lsales |  -.0325676   9.070745    -0.00   0.997     -17.8834    17.81826
       _cons |  -34.48482   90.38981    -0.38   0.703     -212.368    143.3984
------------------------------------------------------------------------------

f) The next regression includes an interaction term between large and the logarithm of the average salary: large_lavgsal. Please give a precise economic interpretation of the effect of salary on hours of training.

. reg tothrs lavgsal lsales large large_lavgsal

      Source |       SS       df       MS              Number of obs =     304
-------------+------------------------------           F(  4,   299) =    6.37
       Model |   65423.755     4  16355.9388           Prob > F      =  0.0001
    Residual |  767440.877   299  2566.69189           R-squared     =  0.0786
-------------+------------------------------           Adj R-squared =  0.0662
       Total |  832864.632   303  2748.72816           Root MSE      =  50.663

------------------------------------------------------------------------------
      tothrs |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lavgsal |   22.45481   8.640065     2.60   0.010     5.451766    39.45785
      lsales |  -9.975008   2.933707    -3.40   0.001    -15.74834    -4.20168
       large |   582.6985   272.4064     2.14   0.033     46.62183    1118.775
large_lavgsal|   -59.8758   27.75878    -2.16   0.032    -114.5031   -5.248465
       _cons |  -34.44257    89.4705    -0.38   0.701    -210.5142    141.6291
------------------------------------------------------------------------------

g) Given the estimation results, is it necessary to include an intercept in the equation of sub-question f)?

QUESTION 3

A researcher wants to investigate the effect of the percentage of unemployment (unem) on the real wage (wage), using annual data.
She specifies the following equation:

(1) $wage_t = \beta_0 + \beta_1 wage_{t-1} + \beta_2 unem_t + \beta_3 unem_{t-1} + \varepsilon_t$

a) Using the parameters of equation (1), please give a careful economic interpretation of $\beta_2$.

b) Using the parameters of equation (1), please calculate the long-run effect of unemployment on the real wage.

c) When would it be necessary to include a time trend in equation (1)?

d) The researcher investigates first-order autocorrelation of the error term. Please formulate carefully an equation (in which you also explain the notation) that specifies a first-order autocorrelation process of the error term. You do not need to provide a testing procedure on autocorrelation.

e) Two Dickey-Fuller tests are performed: one for the real wage and another for the unemployment rate. The null hypothesis of the test is rejected both for the real wage and for the percentage of unemployment. What are the consequences for the OLS estimator of the regression parameters of equation (1)?

Formula card

1. OLS estimator

Simple regression:

(2.4) $\hat\beta_1 = \dfrac{\sum_{i=1}^{N} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{N} (X_i - \bar X)^2}$

(2.5) $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$

(slides) $Var(\hat\beta_1) = \dfrac{\sigma^2}{\sum_{i=1}^{N} (X_i - \bar X)^2}$

(slides) $Var(\hat\beta_0) = \sigma^2 \cdot \dfrac{\frac{1}{N}\sum_{i=1}^{N} X_i^2}{\sum_{i=1}^{N} (X_i - \bar X)^2}$

(2.35) $RSS = \sum_{i=1}^{N} e_i^2$

(slides) $\hat\sigma^2 = \dfrac{1}{N-2}\sum_{i=1}^{N} e_i^2 = \dfrac{RSS}{N-2}$

Multiple regression:

(slides) $Var(\hat\beta_j) = \dfrac{\sigma^2}{SST_j (1 - R_j^2)}$, where $SST_j = \sum_{i=1}^{N} (X_{ij} - \bar X_j)^2$

(slides) $\hat\sigma^2 = \dfrac{RSS}{N - (K+1)}$

(slides) $se(\hat\beta_j) = \dfrac{\hat\sigma}{\sqrt{TSS_j (1 - R_j^2)}}$, where $TSS_j = \sum_{i=1}^{N} (X_{ij} - \bar X_j)^2$

2. Summary statistics

Total sample variation of Y (Total Sum of Squares):

(2.12) $TSS = \sum_{i=1}^{N} (Y_i - \bar Y)^2$

R squared:

(2.14) $R^2 = 1 - \dfrac{RSS}{TSS}$

Adjusted R squared:

$\bar R^2 = 1 - \dfrac{\hat\sigma^2}{TSS/(N-1)} = 1 - (1 - R^2) \cdot \dfrac{N-1}{N-(K+1)}$

3. Test statistics

F-statistic for multiple linear restrictions:

$F = \dfrac{RSS_R - RSS}{RSS} \cdot \dfrac{N-(K+1)}{M} \sim F_{M,\,N-(K+1)}$

4. Time series

AR(1)-process:

(5.11) (12.3) $Y_t = \rho_1 Y_{t-1} + u_t, \quad t = 1, 2, \ldots$

Durbin-Watson (DW) statistic on autocorrelation:

(9.10) $d = \dfrac{\sum_{t=2}^{N} (e_t - e_{t-1})^2}{\sum_{t=1}^{N} e_t^2}$

$d \approx 2(1 - \hat\rho)$

Sample exam 2: Econometrics, Stata

Please note that the required data sets are not provided.
This sample exam is meant to give you a realistic impression of the exam and of the knowledge that we assume you already have.

QUESTION 1
data set: exam_apple.dta

Please give the Stata commands and mention the conclusions very briefly. In all sub-questions of this exam you need to assume a significance level of 0.05, unless stated otherwise.

a) Using Ordinary Least Squares, regress the quantity of ecolabeled apples on the price of regular apples, the price of ecolabeled apples, the logarithm of family income (lfaminc), and years of schooling. Show that the residual is uncorrelated with all of the explanatory variables.

b) Re-estimate the equation of sub-question a) for households that purchased ecolabeled apples.

c) Reconsider the equation of sub-question a). Test whether the price of regular apples and the price of ecolabeled apples have a joint effect on the quantity of apples. What do you conclude?

d) Test whether the coefficient on the price of regular apples is the negative of the coefficient on the price of ecolabeled apples. What do you conclude?

e) Re-specify and re-estimate the equation of sub-question a) in such a way that you are able to measure the constant elasticity of the quantity of apples with respect to income.

f) Reconsider the regression equation of sub-question a). Test for heteroskedasticity. What do you conclude?

g) Reconsider the regression equation of sub-question a). Create a dummy variable (dumh1) which is one for one-person households. Create an additional dummy variable (dumh2) which is one for two-person households. Is there any joint effect of one-person and two-person households on the dependent variable? What do you conclude?

h) Reconsider the regression equation of sub-question a). Is the effect of the price of regular apples different between males and females, keeping the effects of the price of ecolabeled apples, logarithm of family income, education, age, and gender constant? What do you conclude?
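The joint tests asked for in sub-questions c) and g) rest on the F-statistic for multiple linear restrictions from the formula card. As a minimal sketch of that arithmetic, the snippet below evaluates the formula in Python rather than Stata, purely to make the calculation explicit. The function name `f_statistic` is hypothetical, and because the apple data set is not provided, the illustrative inputs are the residual sums of squares reported in sample exam 1 for the tothrs regressions without and with dy2 and dy3.

```python
def f_statistic(rss_restricted, rss_unrestricted, n_obs, k_regressors, m_restrictions):
    """Formula card: F = ((RSS_R - RSS) / M) / (RSS / (N - (K + 1)))."""
    df_resid = n_obs - (k_regressors + 1)  # residual degrees of freedom of the unrestricted model
    numerator = (rss_restricted - rss_unrestricted) / m_restrictions
    denominator = rss_unrestricted / df_resid
    return numerator / denominator, (m_restrictions, df_resid)


# Illustrative inputs (assumption: taken from sample exam 1, Question 2):
# restricted model   = reg tothrs lavgsal lsales          (RSS = 780279.708)
# unrestricted model = reg tothrs lavgsal lsales dy2 dy3  (RSS = 776184.042)
f_value, dof = f_statistic(780279.708, 776184.042,
                           n_obs=304, k_regressors=4, m_restrictions=2)
print(f"F{dof} = {f_value:.3f}")
```

The resulting value would then be compared with the 5% critical value of the F-distribution with the printed degrees of freedom; in Stata itself the same restriction is usually tested directly after the unrestricted regression.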
QUESTION 2
data set: exam_housing.dta
(35 points; a: 8 points; b-d: 9 points each)

a) Regress the real housing investment on the housing price index and a time trend. Is there any indication of first-order autocorrelation?

b) How would you re-estimate equation a) given your outcome of equation a)?

c) Using the Dickey-Fuller test, please test for stationarity (i.e. no unit root) of the two variables real housing investment and the housing price, using the tables of critical values below. What do you conclude?

Table of critical values of DF-test with time trend

Signif. level      1%      2.5%     5%      10%
Critical value    -3.96   -3.66   -3.41   -3.12

Table of critical values of DF-test without time trend

Signif. level      1%      2.5%     5%      10%
Critical value    -3.43   -3.12   -2.86   -2.57

d) Let's assume there is no co-integration. How would you re-estimate the equation of sub-question a) given the outcome of the Dickey-Fuller test?
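Two mechanical steps sit behind sub-questions a) and c): the Durbin-Watson statistic from the formula card and the one-sided decision rule of the Dickey-Fuller test. The Python sketch below illustrates both under stated assumptions. The function names are hypothetical, the critical values are copied from the two tables above, and the test statistic and residuals are made-up numbers, not results from exam_housing.dta.

```python
# Critical values copied from the two tables in sub-question c).
DF_CRITICAL = {
    "trend": {"1%": -3.96, "2.5%": -3.66, "5%": -3.41, "10%": -3.12},
    "no_trend": {"1%": -3.43, "2.5%": -3.12, "5%": -2.86, "10%": -2.57},
}


def rejects_unit_root(df_stat, with_trend, level="5%"):
    """One-sided DF test: reject the unit-root null if the statistic is below the critical value."""
    table = DF_CRITICAL["trend" if with_trend else "no_trend"]
    return df_stat < table[level]


def durbin_watson(residuals):
    """Formula card (9.10): d = sum_{t=2}^N (e_t - e_{t-1})^2 / sum_{t=1}^N e_t^2."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical inputs for illustration only.
reject_5 = rejects_unit_root(-3.50, with_trend=True)             # compares -3.50 with -3.41
reject_1 = rejects_unit_root(-3.50, with_trend=True, level="1%")  # compares -3.50 with -3.96
d = durbin_watson([1.0, -1.0, 1.0, -1.0])                          # residuals that alternate in sign
print(reject_5, reject_1, d)
```

Residuals that alternate in sign push d above 2, which by the approximation d ≈ 2(1 − ρ̂) corresponds to negative first-order autocorrelation; values well below 2 point to positive autocorrelation.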
© Copyright 2024