HW#5 - Civil, Environmental and Architectural Engineering

University of Colorado
Department of Civil, Environmental and Architectural Engineering
CVEN 5454 Quantitative Methods
Homework #5
Due May 4th, 2015
Topics: Regression Chapters 11, 12; Helsel and Hirsch Chapters 10, 11, 12 and 15. Using R commands for these will make it easier!
Simple Linear Regression
1. Problem 11-77 (linear regression and transformation)
2. Problem 11-78 (simple algebra with expectations and normal equations)
3. Problem 11-86 (fit regression, perform ANOVA, model diagnostics)
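For orientation on problems 1 to 3, the following is a minimal sketch of a least-squares fit with its ANOVA sums of squares. It is written in Python with the standard library only, although the assignment suggests R. The function name fit_simple_ols and the data values are hypothetical and are not taken from the textbook problems.

```python
from statistics import mean

def fit_simple_ols(x, y):
    """Least-squares line y = b0 + b1*x with the ANOVA decomposition SST = SSR + SSE."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                      # slope from the normal equations
    b0 = ybar - b1 * xbar               # intercept
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)          # error sum of squares
    sst = sum((yi - ybar) ** 2 for yi in y)       # total sum of squares
    ssr = sst - sse                               # regression sum of squares
    mse = sse / (n - 2)                           # error mean square, n - 2 d.f.
    return {"b0": b0, "b1": b1, "ssr": ssr, "sse": sse, "sst": sst,
            "f": ssr / mse, "r2": ssr / sst, "residuals": residuals}

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
fit = fit_simple_ols(x, y)
print("intercept:", fit["b0"], "slope:", fit["b1"])
print("F statistic:", fit["f"], "R^2:", fit["r2"])
```

The residuals returned here are the starting point for the usual diagnostics (residuals versus fitted values, normality checks). The F statistic is compared against an F distribution with 1 and n - 2 degrees of freedom.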
Multiple Linear Regression
4. Problem 12-92 (Quadratic regression)
5. Problem 12-93 (Subset selection)
(a) Part (a) – ‘best model’ using PRESS
(b) Repeat (a) using stepwise regression and compare.
(c) Calculate variance inflation factors for the full model (i.e., model using all the
predictors) and for the best model from (a) above. Which model(s) would you
conclude has a multicollinearity problem?
(d) Perform ANOVA and model diagnostics (including computing Cook’s distance)
for the ‘best model’
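As a rough illustration of two quantities used in problem 5, the sketch below computes PRESS by explicitly refitting the model with each observation dropped, and a variance inflation factor as 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. It is plain Python rather than R, the helper names (solve, ols, press, vif) are hypothetical, and the data are invented so that the third predictor is nearly the sum of the first two.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ols(rows, y):
    """Coefficients [b0, b1, ...] from the normal equations (X'X) b = X'y."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    xtx = [[sum(xi[j] * xi[k] for xi in X) for k in range(p)] for j in range(p)]
    xty = [sum(xi[j] * yi for xi, yi in zip(X, y)) for j in range(p)]
    return solve(xtx, xty)

def predict(beta, row):
    return beta[0] + sum(b * v for b, v in zip(beta[1:], row))

def press(rows, y):
    """PRESS: sum of squared prediction errors, each from a fit that omits that point."""
    total = 0.0
    for i in range(len(y)):
        beta = ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - predict(beta, rows[i])) ** 2
    return total

def r_squared(rows, y):
    beta = ols(rows, y)
    ybar = sum(y) / len(y)
    sse = sum((yi - predict(beta, r)) ** 2 for r, yi in zip(rows, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return 1.0 - sse / sst

def vif(rows, j):
    """VIF for predictor j: 1 / (1 - R_j^2), regressing predictor j on the others."""
    target = [r[j] for r in rows]
    others = [[v for k, v in enumerate(r) if k != j] for r in rows]
    return 1.0 / (1.0 - r_squared(others, target))

# Hypothetical data: the third predictor is close to the sum of the first two
rows = [[1.0, 2.0, 3.1], [2.0, 1.0, 2.9], [3.0, 4.0, 7.2], [4.0, 3.0, 6.8],
        [5.0, 6.0, 11.1], [6.0, 5.0, 10.9], [7.0, 8.0, 15.2], [8.0, 7.0, 14.8]]
y = [5.2, 4.1, 11.3, 9.8, 17.1, 15.9, 23.4, 21.7]
subset = [[r[0], r[1]] for r in rows]      # candidate model without the third predictor

print("PRESS, full model:  ", press(rows, y))
print("PRESS, subset model:", press(subset, y))
print("VIFs, full model:   ", [vif(rows, j) for j in range(3)])
```

Among candidate subsets, the model with the smallest PRESS is the one preferred under this criterion. A common rule of thumb treats a VIF above about 10 as a sign of multicollinearity.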
6. Consider the data in Problem 12.4 of Helsel and Hirsch. For these data, perform the
following.
(a) Compute the correlation coefficient (Pearson’s r)
(b) Compute Kendall’s tau
(c) Compute Spearman’s rank correlation
Perform a significance test on each of them, report the p-values, and summarize.
(d) Fit a robust (nonparametric) line, i.e., the Kendall-Theil line, between Calcium
(dependent variable) and %Carbon-14 (independent variable), and a linear regression
line. Perform significance tests on the slopes from these two regressions and compare
them.
(e) Compute the Mutual Information (MI) between these two variables using kernel
density estimators, and assess its significance. Compare with the correlation coefficients above.
(f) For monsoon rainfall and NINO4, fit (i) a linear regression, (ii) a robust nonparametric line,
and (iii) the LOC – line of organic correlation. Also repeat (e).
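The nonparametric pieces of problem 6 can be sketched as follows, again in plain Python rather than R. Kendall's tau is computed here as tau-a (no tie correction, which is an assumption of this sketch). The Kendall-Theil line uses the median of all pairwise slopes, with intercept median(y) - slope * median(x). The LOC uses slope sign(r) * s_y / s_x and passes through the means. Function names and data are hypothetical, and no significance tests are included.

```python
from itertools import combinations
from statistics import mean, median, stdev

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / (n(n-1)/2), no tie correction."""
    s = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x2 - x1) * (y2 - y1)
        s += (prod > 0) - (prod < 0)       # +1 concordant, -1 discordant, 0 tied
    n = len(x)
    return s / (n * (n - 1) / 2)

def kendall_theil_line(x, y):
    """Return (intercept, slope): slope is the median of all pairwise slopes."""
    slopes = [(y2 - y1) / (x2 - x1)
              for (x1, y1), (x2, y2) in combinations(zip(x, y), 2) if x2 != x1]
    slope = median(slopes)
    return median(y) - slope * median(x), slope

def loc_line(x, y):
    """Return (intercept, slope) of the line of organic correlation."""
    xbar, ybar = mean(x), mean(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sign = 1.0 if sxy >= 0 else -1.0       # sign of the correlation coefficient
    slope = sign * stdev(y) / stdev(x)
    return ybar - slope * xbar, slope

# Hypothetical data with one outlier in y, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 100.0]
print("Kendall tau-a:      ", kendall_tau_a(x, y))
print("Kendall-Theil line: ", kendall_theil_line(x, y))
print("LOC line:           ", loc_line(x, y))
```

In this invented example the single outlier leaves the Kendall-Theil slope at the value set by the other four points, while the LOC slope, which depends on the standard deviations, is pulled strongly by it.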
Logistic Regression (regression of threshold exceedance variables)
7. Problem 15.1 (Helsel and Hirsch book) – Perform a drop-one cross-validation and
compute the skill score
8. Bonus: Subject the data in Problem 12-93 to a Bayesian regression. Show the posterior
distributions of the regression coefficients and the 95% credible intervals from the
posterior. Compare your findings with your results from Problem 5.
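For problem 7, a drop-one (leave-one-out) cross-validation of a logistic regression might look like the sketch below. It fits a one-predictor logistic model by Newton-Raphson, predicts each held-out point from a fit on the remaining points, and scores the predictions. The assignment does not say which skill score to use; the Brier skill score relative to a constant reference probability is assumed here. Function names and data are hypothetical, and the fitting routine assumes the classes are not perfectly separable.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit of P(y = 1) = sigmoid(b0 + b1*x); returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]
        # Gradient of the log-likelihood
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Entries of X'WX (negative Hessian)
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def loo_probabilities(x, y):
    """Predicted probability for each point from a model fitted without that point."""
    probs = []
    for i in range(len(x)):
        b0, b1 = fit_logistic(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        probs.append(sigmoid(b0 + b1 * x[i]))
    return probs

def brier_skill_score(y, probs, reference):
    """1 - BS / BS_ref, where BS_ref uses a constant reference probability."""
    bs = sum((p - yi) ** 2 for p, yi in zip(probs, y)) / len(y)
    bs_ref = sum((reference - yi) ** 2 for yi in y) / len(y)
    return 1.0 - bs / bs_ref

# Hypothetical exceedance data (1 = threshold exceeded), for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
cv_probs = loo_probabilities(x, y)
reference = sum(y) / len(y)                # observed exceedance frequency
print("drop-one probabilities:", [round(p, 3) for p in cv_probs])
print("Brier skill score:", brier_skill_score(y, cv_probs, reference))
```

A skill score above zero indicates that the cross-validated model predicts better than the constant reference probability, and a score at or below zero indicates no improvement.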