Least Angle Regression
Tim Hesterberg, Insightful Corp.
16 June 2006

This is joint work with Chris Fraley, with support from NIH SBIR Phase I 1 R43 GM074313-01

Overview
- Why is LARS important?
- Other packages
- GLARS package
- Issues
- Insightful Research

Why is LARS important?
- Variable Selection in Regression
  - Important
  - Many approaches: stagewise, boosting, LASSO, regularization, ...
- Least Angle Regression: Efron, Hastie, Johnstone, Tibshirani (2004), Annals of Statistics (with discussion)
  1. Lasso
  2. Forward stagewise
  3. Least Angle Regression (LAR)
  - Unifying explanation
  - Fast implementation
  - Fast way to choose tuning parameter

Ridge Regression
- Minimize $\sum_i (Y_i - \hat{Y}_i)^2 + \lambda \sum_j \hat{\beta}_j^2$
[Figure: ridge regression coefficient paths, beta versus theta, for predictors AGE, SEX, BMI, BP, S1 to S6]

LASSO
- Minimize $\sum_i (Y_i - \hat{Y}_i)^2 + \lambda \sum_j |\hat{\beta}_j|$
- Forces small coefficients → 0; gives simpler models.
- Smaller penalty on large coefficients: less effect on important terms
- Implementation is more complicated and slower
[Figure: coefficient paths versus sum(|beta|) for the same predictors; panels: LASSO, Ridge Regression]

Forward Stagewise Regression
(Forward Stagewise = Least Squares Boosting)
1. Initialize: standardize predictors, center $y$, $r = y$, $\beta_1 = \ldots = \beta_p = 0$
2. Repeat many times:
   - Find the predictor $x_j$ most correlated with $r$
   - $\delta = \epsilon \cdot \mathrm{sign}(r \cdot x_j)$, for a small step size $\epsilon$
   - $\hat{\beta}_j \leftarrow \hat{\beta}_j + \delta$
   - $r \leftarrow r - \delta x_j$

Forward Stagewise and LASSO
Similarity:
[Figure: two panels of standardized coefficient paths for AGE, SEX, BMI, BP, S1 to S6]
[Figure: Prostate Cancer Data, from Trevor Hastie, Stanford Statistics, March 2003. Panels: Lasso (Coefficients versus $t = \sum_j |\beta_j|$) and Forward Stagewise (Coefficients versus Iteration); predictors lcavol, svi, lweight, pgg45, lbph, gleason, age, lcp]

Are LASSO and infinitesimal forward stagewise identical?
- With orthogonal predictors, yes.
- Otherwise similar.
- Least Angle Regression provides explanation, and fast implementation.

Stepwise, Forward Stagewise, Least Angle
Stepwise regression:
- Pick predictor most correlated with y
- Bring predictor completely into model (full LS fit)
Forward stagewise:
- Pick predictor most correlated with y
- Increment coefficient for predictor
Least Angle Regression:
- Pick predictor most correlated with y
- Bring predictor into model only to the extent it is better than others
- Move in least-squares direction until another variable is as correlated

Least Angle Regression
[Figure: geometry of the paths in the plane of X1 and X2, with labeled points O, A, B, C, D, E]
- B = first step for least-angle regression
- E = point on stagewise path
- C = projection of y onto space spanned by X1 and X2.

LARS - other packages
- lars: Efron and Hastie (S-PLUS and R)
  - Linear regression
  - Methods: plot, print, predict, cv, coef
- glmpath: Park and Hastie (R)
  - GLM and Cox Proportional Hazards

S+GLARS
- S-PLUS and R, open source
- Incorporate lars, glmpath
  - Cleanup, consistent interface
  - Incorporate future work by others; provide framework
- Extensions
  - Numerically-accurate calculations
  - Factors, splines, polynomials, interactions, ...
  - Other models (robust regression, ...), other penalties
  - Missing data
  - Massive data sets
  - Diagnostics, tools for selecting tuning parameter
- User-friendly
  - Consistent interface
  - GUI
  - Documentation

Issues
- Money
  - NIH funding: require commercial potential
  - Insightful: indirect benefit
- Outside contributors
- Licensing; ability to ship with S-PLUS, I-Miner.
- Collaboration

Insightful Research Department
- Turn research into software for wide use
  - Higher standards than academic software (ease of use, robustness, testing)
- Variety: resampling, missing data, group sequential designs, simulation-based econometric software, functional data, stable distributions, proteomics, microarrays, frailty models, causal modeling
- External funding: SBIR grants (NIH, NSF, ...)
  - Somewhat easier funding
  - Commercial potential
  - Risk, research element
- We're hiring
- We're looking for good projects and collaborators
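
Sketch: forward stagewise in code
The forward stagewise steps from the Forward Stagewise Regression slide (initialize, then repeatedly nudge the coefficient of the predictor most correlated with the residual) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the lars or S+GLARS implementation: the function names, the step size `eps`, the step cap `n_steps`, and the stopping rule are all hypothetical choices made for the example.

```python
import math


def standardize(columns):
    """Center each predictor column and scale it to unit Euclidean norm.

    Assumes no column is constant (a constant column would divide by zero).
    """
    out = []
    for col in columns:
        mean = sum(col) / len(col)
        centered = [v - mean for v in col]
        norm = math.sqrt(sum(v * v for v in centered))
        out.append([v / norm for v in centered])
    return out


def forward_stagewise(columns, y, eps=0.01, n_steps=5000):
    """Incremental forward stagewise regression (illustrative sketch).

    columns: list of predictor columns, each a list of floats.
    y:       response values.
    eps:     small step size (assumed value, not from the slides).
    Returns coefficients on the standardized predictors.
    """
    x = standardize(columns)
    y_mean = sum(y) / len(y)
    r = [v - y_mean for v in y]      # residual starts as centered y
    beta = [0.0] * len(x)            # all coefficients start at zero

    for _ in range(n_steps):
        # Inner product of each standardized predictor with the residual.
        corr = [sum(a * b for a, b in zip(col, r)) for col in x]
        # Predictor most correlated (in absolute value) with the residual.
        j = max(range(len(x)), key=lambda k: abs(corr[k]))
        # Assumed stopping rule: quit once no predictor is worth a step.
        if abs(corr[j]) < eps:
            break
        delta = eps * math.copysign(1.0, corr[j])
        beta[j] += delta                                   # beta_j <- beta_j + delta
        r = [ri - delta * xi for ri, xi in zip(r, x[j])]   # r <- r - delta * x_j
    return beta
```

As a usage example, take two orthogonal predictors `[[1, -1, 1, -1], [1, 1, -1, -1]]` and response `[3, -1, 1, -3]`. Run to completion, the coefficients on the standardized predictors should come out close to 4 and 2. Stopped after 100 steps, only the first coefficient has moved, which mirrors how variables enter the coefficient paths one at a time.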