Outline The issue Splines Penalized Comparison An extension Software Comments Smoothing with penalized splines A brief introduction and an illustrative application Antonio Gasparrini Department of Social and Environmental Health Research London School of Hygiene and Tropical Medicine (LSHTM) Centre for Statistical Methodology – LSHTM 29 May 2015 Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Relationships between x and y −5 0 y 5 10 15 Scatterplot of x and y 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Non-linearity −5 0 y 5 10 15 Scatterplot of x and y 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Regression models A relationship between a predictor x and an outcome y is usually estimated through regression models, controlling for potential confounders In the simple linear case: yi = α + f (xi ) + P X γp zip (1) p=1 A number of alternative options are available for representing f (x), describing the relationship as a smooth shape Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Smoothing methods Parametric Polynomials, fractional polynomials, regression splines In between Penalized splines Non-parametric Lowess, kernel, smoothing splines Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Splines: basis representation A spline is a numeric function composed by piecewise-connected polynomial functions The advantage of using regression splines is that f (x) can be represented in a basis form: f (xi ; β) = d X βj bj (xi ) = xT β (2) j=1 where bj (x) are a series of d (known) basis transformations of x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Regression splines This type of splines allow the use of standard estimation methods, derived by minimizing the usual least square objective: N X y − α − f (x; β) + i=1 P X 2 γp zp = ||y − α − Xβ − Zγ||2 (3) p=1 Several different transformations for bj (x) (e.g. B-splines, natural splines), determining the mathematical properties Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - I 0.5 0.0 Splines 1.0 Knots and splines 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - II Splines and coefficients −0.49 1.09 2 4 4.45 13.29 6 8 4.26 0.5 0.0 Splines 1.0 1.01 0 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - III 5 −5 0 Terms 10 15 Sum of linear terms 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - IV −5 0 y 5 10 15 Estimated relationship 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - V −5 0 y 5 10 15 Estimated and true 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Problems and limitations In regression splines, the smoothness of the fitted curve is determined by: the degree of the spline the specific parameterization the number of knots the location of knots No general selection method for number and position of knots Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Number of knots 15 Degree of smoothness −5 0 y 5 10 df = 1 df = 3 df = 5 df = 7 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Knots location 15 Best fitting models −5 0 y 5 10 5k at eq.−spaced val 8k at eq.−spaced per 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Generalized additive models A general framework of smoothing methods is offered by generalized additive models (GAMs) GAMs extends traditional GLMs by allowing the linear predictor to depend linearly on unknown smooth functions. In the linear case: yi = α + f (xi ) + P X f (zip ) (4) p=1 where f are traditionally represented by non-parametric terms such as smoothing splines of lowess Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Penalty The idea is to define a flexible function and control the smoothness through a penalty term, usually on the second derivative The objective in (3) is modified to: N X y − α − f (x; β) + i=1 P X 2 Z f (zip ) + λ [f 00 (x)]2 dx (5) p=1 with λ as smoothing parameter Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Penalized splines However, traditional GAMs are limited by complex and computationally-heavy estimation methods Penalized splines offer an flexible and efficient version of GAM, based on low-rank basis transformations The objective in (5) can be re-written in matrix terms as: ||y − α − Xβ − Zγ||2 + λβ T Sβ (6) where S is a penalty matrix Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Smoothers Alternative smoothers available, differing by parameterization and penalty: Thin-plate splines Cubic splines P-splines Random-effects Markov random fields Soap film smooths ... Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - I Increasing knots and splines Splines 1.0 0.5 0.0 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - II 0.5 0.39 0.26 1.70 4.52 9.25 7.42 0.79 0.11 1.07 3.25 7.62 9.43 2.19 1.20 0.11 0.54 2.20 5.88 10.07 4.82 0.0 Splines 1.0 Splines and coefficients 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Graphical representation - III Sum of linear terms 15 Terms 10 5 0 −5 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Estimation Estimation concerns coefficients of penalized and unpenalized terms (α, β, γ) and smoothing parameters (λs) For the former, a penalized iteratively reweighted least squares (P-IRLS) scheme is used Estimation of λs is integrated through either outer iteration or performance iteration, using GCV, UBRE/AIC or REML Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Advantages Relatively low-rank basis and simplified penalties Completely parametric form Number and location of knots not critical Automatic smoothing selection Efficient computational methods Well-grounded theoretical framework Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Comparison Comparison of smoothing methods Fractional polynomials Regression splines Penalized splines y 10 5 0 0 2 4 6 8 10 x Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulations Comparing alternative methods: Fractional polynomials Regression splines Penalized splines Different shapes: Linear Decay Peak Complex Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulated shapes Decay 10 8 8 6 6 4 4 y y Linear 10 2 2 0 0 −2 −2 0 2 4 6 8 10 0 2 4 x Peak 8 10 8 10 Complex 10 10 8 8 6 6 4 4 y y 6 x 2 2 0 0 −2 −2 0 2 4 6 x Gasparrini A Smoothing with penalized splines 8 10 0 2 4 6 x LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulation results - I Fractional polynomials Decay 10 8 8 6 6 4 4 y y Linear 10 2 2 0 0 −2 −2 0 2 4 6 8 10 0 2 4 x Peak 8 10 8 10 Complex 10 10 8 8 6 6 4 4 y y 6 x 2 2 0 0 −2 −2 0 2 4 6 x Gasparrini A Smoothing with penalized splines 8 10 0 2 4 6 x LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulation results - II Regression splines Decay 10 8 8 6 6 4 4 y y Linear 10 2 2 0 0 −2 −2 0 2 4 6 8 10 0 2 4 x Peak 8 10 8 10 Complex 10 10 8 8 6 6 4 4 y y 6 x 2 2 0 0 −2 −2 0 2 4 6 x Gasparrini A Smoothing with penalized splines 8 10 0 2 4 6 x LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulation results - III Penalized splines Decay 10 8 8 6 6 4 4 y y Linear 10 2 2 0 0 −2 −2 0 2 4 6 8 10 0 2 4 x Peak 8 10 8 10 Complex 10 10 8 8 6 6 4 4 y y 6 x 2 2 0 0 −2 −2 0 2 4 6 x Gasparrini A Smoothing with penalized splines 8 10 0 2 4 6 x LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulation results - IV Statistics Linear Linear Decay Decay Peak Peak Complex Comples Gasparrini A Smoothing with penalized splines Fun GLM GAM GLM GAM GLM GAM GLM GAM (e)df 2.21 1.28 3.45 2.61 4.82 4.06 6.65 5.87 Bias 0.01 0.00 0.05 0.13 0.05 0.19 0.11 0.31 Cov 0.90 0.95 0.87 0.94 0.89 0.94 0.87 0.91 RMSE 0.82 0.66 0.89 0.76 0.93 0.84 1.02 0.94 LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Distributed lag non-linear models Statistical tools to model non-linear and lagged dependencies Defined by a cross-basis function of lagged exposures: s(xt−`0 , . . . , xt−` ) = L X f ·w (xt−` , `) (7) `=`0 The function is composed of an exposure response function f (x) and a lag-response function w (`) Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments An example 1.10 1.05 RR 0 1.00 5 0.95 La g 10 20 15 15 Tem p 10 erat Gasparrini A Smoothing with penalized splines ure 5 (C) 0 20 LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Tensor product basis Parameterized by a special tensor product: T s(xt−`0 , . . . , xt−` ) = (1T L−`0 +1 At )η = wt η (8) T At = (1T v` ⊗ Rt ) (C ⊗ 1vx ) (9) with where Rt and C are basis matrices for x and `, respectively Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Penalized DLNMs Question: what about a penalized version of DLNMs? Modify objective in (6) to: T ||y−α−Wη −Zγ||2 +η T λx 1T ⊗ S +λ S ⊗ 1 η (10) x ` ` v` vx with λx , λ` and Sx , S` as smoothing parameters and penalty matrices for each dimension Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulated surfaces Scenario 1 Scenario 2 Scenario 3 0.5 0.6 0.4 0.10 Outcome 0.4 0.3 Outcome Outcome 0.05 0.2 0.2 0.1 0.00 0.0 0.0 10 10 0 8 10 6 x 4 2 30 0 40 Gasparrini A Smoothing with penalized splines 20 g La 10 0 8 10 6 x 4 2 30 0 40 20 g La 0 8 10 6 x 4 2 30 20 g La 0 40 LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Simulation results Scenario 1 Scenario 2 Scenario 3 20 Lag 30 40 0 10 30 40 GAMps 0 2 4 6 x Gasparrini A Smoothing with penalized splines 8 10 0 10 20 Lag 30 40 GAMps Outcome 0.1 0.3 −0.1 −0.1 −0.10 Outcome 0.1 0.3 Outcome 0.05 0.20 GAMps 20 Lag 0.5 10 −0.1 Outcome 0.1 0.3 Outcome 0.1 0.3 −0.1 −0.10 0 GAMps 0.5 GAMps Outcome 0.05 0.20 GAMps 0 2 4 6 x 8 10 0 2 4 6 8 10 x LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments An application Lag−response for 29C Overall cumulative exposure−response 2.0 RR 1.5 1.2 1.1 RR 1.2 2.5 1.3 3.0 Exposure−lag−response 1.0 RR 1.0 1.1 1.0 g 10 La 20 Tem 10 pera ture 0.5 0.9 0 5 0.9 0 5 10 15 20 0 5 10 15 20 25 30 15 0 20 Gasparrini A Smoothing with penalized splines Lag Temperature LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments The R package mgcv Collection of functions implementing GAMs with penalized splines Written by Simon Wood, extensively documented Example of code: library(mgcv) model <- gam(y ~ s(x,bs="ps") + z, data, family=gaussian, method="REML") The function s determines the spline transformations and penalties Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments The R package dlnm Collection of functions implementing DLNMs Example of code: library(dlnm) cb <- crossbasis(x,lag=c(10), argvar=list(fun="bs",degree=2,knots=5,cen=0), arglag=list(fun="ns",knots=c(3,6),int=F)) model <- glm(y ~ s(x,bs="ps") + z, data, family=gaussian) pred <- crosspred(cb,model) plot(pred,"3d",xlab="x",ylab="Lag",zlab="Effect") Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Embedding dlnm and mgcv Example of code: library(dlnm) ; library(mgcv) cb <- crossbasis(x,lag=c(10), argvar=list(fun="ps"), arglag=list(fun="ps")) pen <- cbPen(cb) model <- model <- gam(y ~ cb + z, data, family=gaussian, method="REML",parapen=list(cb=list(pen))) pred <- crosspred(cb,model) plot(pred,"3d",xlab="x",ylab="Lag",zlab="Effect") Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Outline 1 The issue 2 Splines 3 A penalized approach 4 A comparison 5 An extension 6 Software 7 Some comments Gasparrini A Smoothing with penalized splines LSHTM Outline The issue Splines Penalized Comparison An extension Software Comments Some comments Penalized splines combine the flexibility of non-parametric methods with stability and simplicity of parametric smoothers Based on theoretically-grounded and computationally-efficient estimators Well implemented in the package mgcv in R Research still ongoing Gasparrini A Smoothing with penalized splines LSHTM
© Copyright 2024