Package `gelnet`

May 14, 2015
Version 1.1
Date 2015-05-13
License GPL (>= 3)
Title Generalized Elastic Nets
Description Implements several extensions of the elastic net regularization scheme. These extensions include individual feature penalties for the L1 term, feature-feature penalties for the L2 term, as well as translation coefficients for the latter.
Author Artem Sokolov
Maintainer Artem Sokolov <[email protected]>
Depends R (>= 3.1.0)
NeedsCompilation yes
Repository CRAN
Date/Publication 2015-05-14 08:30:37
R topics documented:
adj2lapl
adj2nlapl
gelnet.klr
gelnet.krr
gelnet.L1bin
gelnet.lin
gelnet.lin.obj
gelnet.logreg
gelnet.logreg.obj
L1.ceiling
adj2lapl
Generate a graph Laplacian
Description
Generates a graph Laplacian from the graph adjacency matrix.
Usage
adj2lapl(A)
Arguments
A
n-by-n adjacency matrix for a graph with n nodes
Details
A graph Laplacian is defined as: $l_{i,j} = \deg(v_i)$ if $i = j$; $l_{i,j} = -1$ if $i \neq j$ and $v_i$ is adjacent to $v_j$; and $l_{i,j} = 0$ otherwise.
Value
The n-by-n Laplacian matrix of the graph
See Also
adj2nlapl
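The definition in Details can be checked on a small graph using base R alone. The helper name `laplacian_sketch` below is hypothetical and is not part of the package; the sketch assumes a symmetric 0/1 adjacency matrix with no self-loops, in which case the Laplacian is the degree matrix minus the adjacency matrix.

```r
# Hypothetical helper (not part of gelnet): L = D - A,
# assuming A is a symmetric 0/1 adjacency matrix with a zero diagonal.
laplacian_sketch <- function(A) {
  diag(rowSums(A)) - A   # node degrees on the diagonal, -1 for each edge
}

# Path graph on three nodes: 1 - 2 - 3
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
L <- laplacian_sketch(A)
print(L)
```

Every row of the result sums to zero, which is a quick sanity check for any unnormalized Laplacian.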
adj2nlapl
Generate a normalized graph Laplacian
Description
Generates a normalized graph Laplacian from the graph adjacency matrix.
Usage
adj2nlapl(A)
Arguments
A
n-by-n adjacency matrix for a graph with n nodes
Details
A normalized graph Laplacian is defined as: $l_{i,j} = 1$ if $i = j$; $l_{i,j} = -1/\sqrt{\deg(v_i)\deg(v_j)}$ if $i \neq j$ and $v_i$ is adjacent to $v_j$; and $l_{i,j} = 0$ otherwise.
Value
The n-by-n Laplacian matrix of the graph
See Also
adj2lapl
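The normalized definition can be sketched in the same way. The helper `nlaplacian_sketch` is a hypothetical name, not a package function, and the sketch assumes a symmetric 0/1 adjacency matrix in which every node has at least one neighbor, so that no degree is zero.

```r
# Hypothetical helper (not part of gelnet): normalized Laplacian,
# assuming a symmetric 0/1 adjacency matrix with no isolated nodes.
nlaplacian_sketch <- function(A) {
  deg <- rowSums(A)
  NL <- -A / sqrt(outer(deg, deg))  # -1 / sqrt(deg(v_i) deg(v_j)) on edges, 0 elsewhere
  diag(NL) <- 1                     # ones on the diagonal
  NL
}

# Path graph on three nodes: 1 - 2 - 3 (degrees 1, 2, 1)
A <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0), nrow = 3, byrow = TRUE)
NL <- nlaplacian_sketch(A)
print(NL)
```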
gelnet.klr
Kernel logistic regression
Description
Learns a kernel logistic regression model for a binary classification task
Usage
gelnet.klr(K, y, lambda, max.iter = 100, eps = 1e-05, v.init = rep(0,
nrow(K)), b.init = 0.5)
Arguments
K
n-by-n matrix of pairwise kernel values over a set of n samples
y
n-by-1 vector of binary response labels
lambda
scalar, regularization parameter
max.iter
maximum number of iterations
eps
convergence precision
v.init
initial parameter estimate for the kernel weights
b.init
initial parameter estimate for the bias term
Details
The method operates by constructing iteratively re-weighted least squares approximations of the
log-likelihood loss function and then calling the kernel ridge regression routine to solve those approximations. The least squares approximations are obtained via the Taylor series expansion about
the current parameter estimates.
Value
A list with two elements:
v n-by-1 vector of kernel weights
b scalar, bias term for the model
See Also
gelnet.krr
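The re-weighting step described in Details can be illustrated with the textbook quantities for the logistic loss. This is a sketch under assumptions, not the package's code: the function name `irls_quantities_sketch` is hypothetical, the scores are taken to be s = K v + b, and any internal scaling or safeguarding the package applies is not shown.

```r
# Hypothetical helper: one re-weighting step for the logistic loss.
# s: current scores (assumed to be K %*% v + b), y: 0/1 labels.
irls_quantities_sketch <- function(s, y) {
  p <- 1 / (1 + exp(-s))   # current probability estimates
  a <- p * (1 - p)         # sample weights from the second-order Taylor term
  z <- s + (y - p) / a     # working response for the least squares problem
  list(a = a, z = z)
}

# Starting from all-zero scores, every sample gets weight 0.25
q <- irls_quantities_sketch(s = c(0, 0, 0), y = c(0, 1, 1))
print(q$a)
print(q$z)
```

The pair (z, a) would then play the role of the response and sample weights in a weighted regression solve, after which the scores are recomputed and the step repeats.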
gelnet.krr
Kernel ridge regression
Description
Learns a kernel ridge regression model.
Usage
gelnet.krr(K, y, a, lambda)
Arguments
K
n-by-n matrix of pairwise kernel values over a set of n samples
y
n-by-1 vector of response values
a
n-by-1 vector of sample weights
lambda
scalar, regularization parameter
Details
The entries in the kernel matrix K can be interpreted as dot products in some feature space $\phi$. The
corresponding weight vector can be retrieved via $w = \sum_i v_i \phi(x_i)$. However, new samples can be
classified without explicit access to the underlying feature space:
$$w^T \phi(x) + b = \sum_i v_i \phi^T(x_i) \phi(x) + b = \sum_i v_i K(x_i, x) + b$$
Value
A list with two elements:
v n-by-1 vector of kernel weights
b scalar, bias term for the model
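The identity in Details can be verified numerically for the linear kernel, where the feature map is the identity and K(x_i, x) is the ordinary dot product. The kernel weights below are arbitrary random numbers chosen for illustration only; they are not the output of a fitted model.

```r
set.seed(1)
X     <- matrix(rnorm(5 * 3), 5, 3)  # 5 training samples, 3 features
X_new <- matrix(rnorm(2 * 3), 2, 3)  # 2 new samples
v <- rnorm(5)                        # arbitrary kernel weights (illustration only)
b <- 0.1                             # arbitrary bias term

# Linear kernel: phi(x) = x, so K(x_i, x) = x_i^T x
w <- t(X) %*% v                              # explicit weights w = sum_i v_i x_i
pred_primal <- X_new %*% w + b               # w^T phi(x) + b
pred_kernel <- (X_new %*% t(X)) %*% v + b    # sum_i v_i K(x_i, x) + b

print(all.equal(pred_primal, pred_kernel))
```

For a non-linear kernel the explicit weight vector may not be computable, which is why the kernel form of the prediction is the one used in practice.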
gelnet.L1bin
A GELnet model with a requested number of non-zero weights
Description
Binary search to find an L1 penalty parameter value that yields the desired number of non-zero
weights in a GELnet model.
Usage
gelnet.L1bin(f.gelnet, nF, l1s, max.iter = 10)
Arguments
f.gelnet
a function that accepts one parameter, the L1 penalty value, and returns a typical
GELnet model (a list with w and b as its entries)
nF
the desired number of non-zero features
l1s
the right side of the search interval: search will start in [0, l1s]
max.iter
the maximum number of iterations of the binary search
Details
The method performs a simple binary search starting in [0, l1s], iteratively training a model using
the provided f.gelnet. At each iteration, the method checks whether the number of non-zero weights
in the model is higher or lower than the requested nF and adjusts the value of the L1 penalty term
accordingly. For linear regression problems, it is recommended to initialize l1s to the output of
L1.ceiling.
Value
The model with the desired number of non-zero weights and the corresponding value of the L1-norm parameter. Returned as a list with three elements:
w p-by-1 vector of p model weights
b scalar, bias term for the linear model
l1 scalar, the corresponding value of the L1-norm parameter
See Also
L1.ceiling
Examples
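library(gelnet)
# 100 samples in 20 dimensions with a random response
X <- matrix( rnorm(100*20), 100, 20 )
y <- rnorm(100)
# Right end of the search interval: smallest L1 penalty that zeroes out all weights
l1s <- L1.ceiling( X, y )
# Model-training function of the L1 penalty alone
f <- function( l1 ) {gelnet.lin( X, y, l1, l2 = 1 )}
# Search for a model with 10 non-zero weights (nF cannot exceed the 20 available features)
m <- gelnet.L1bin( f, nF = 10, l1s = l1s )
print( m$l1 )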
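The search logic itself can be sketched without training any model. In the sketch below, `count_nonzero` is a hypothetical stand-in for "train a model at this L1 value and count its non-zero weights", and `bisect_l1_sketch` is a hypothetical name; the package's own stopping and return behavior may differ.

```r
# Hypothetical stand-in for a trained model: a weight survives
# only if its score exceeds the L1 penalty in absolute value.
scores <- c(0.9, -0.7, 0.5, 0.3, -0.1)
count_nonzero <- function(l1) sum(abs(scores) > l1)

# Bisection on [0, l1s]: a larger penalty yields fewer non-zero weights.
bisect_l1_sketch <- function(count_fn, nF, l1s, max.iter = 10) {
  lo <- 0
  hi <- l1s
  for (iter in seq_len(max.iter)) {
    mid <- (lo + hi) / 2
    k <- count_fn(mid)
    if (k == nF) return(mid)   # requested sparsity reached
    if (k > nF) lo <- mid      # too many features: raise the penalty
    else hi <- mid             # too few features: lower the penalty
  }
  mid                          # best effort after max.iter iterations
}

l1 <- bisect_l1_sketch(count_nonzero, nF = 2, l1s = 1)
print(count_nonzero(l1))
```

The iteration cap matters because some feature counts may be unreachable, for example when several weights enter the model at the same penalty value.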
gelnet.lin
GELnet for linear regression
Description
Constructs a GELnet model for linear regression using coordinate descent.
Usage
gelnet.lin(X, y, l1, l2, a = rep(1, n), d = rep(1, p), P = diag(p),
m = rep(0, p), max.iter = 100, eps = 1e-05, w.init = rep(0, p),
b.init = sum(a * y)/sum(a), fix.bias = FALSE, silent = FALSE)
Arguments
X
n-by-p matrix of n samples in p dimensions
y
n-by-1 vector of response values
l1
coefficient for the L1-norm penalty
l2
coefficient for the L2-norm penalty
a
n-by-1 vector of sample weights
d
p-by-1 vector of feature weights
P
p-by-p feature association penalty matrix
m
p-by-1 vector of translation coefficients
max.iter
maximum number of iterations
eps
convergence precision
w.init
initial parameter estimate for the weights
b.init
initial parameter estimate for the bias term
fix.bias
set to TRUE to prevent the bias term from being updated (default: FALSE)
silent
set to TRUE to suppress run-time output to stdout (default: FALSE)
Details
The method operates through cyclical coordinate descent. The optimization is terminated after the
desired tolerance is achieved, or after a maximum number of iterations.
Value
A list with two elements:
w p-by-1 vector of p model weights
b scalar, bias term for the linear model
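For intuition, cyclical coordinate descent can be sketched in base R for the simplest setting: unit sample and feature weights, P equal to the identity, and m equal to zero, which reduces the problem to a standard elastic net with the loss scaled by 1/(2n). The names `soft_threshold` and `cd_sketch` are hypothetical, and the sketch is not the package's implementation.

```r
# Soft threshold operator: shrinks x toward zero by t and clips at zero.
soft_threshold <- function(x, t) sign(x) * pmax(abs(x) - t, 0)

# Hypothetical sketch, assuming a = 1, d = 1, P = I, m = 0.
cd_sketch <- function(X, y, l1, l2, max.iter = 1000, eps = 1e-10) {
  n <- nrow(X)
  p <- ncol(X)
  w <- rep(0, p)
  b <- mean(y)
  for (iter in seq_len(max.iter)) {
    w_old <- w
    for (j in seq_len(p)) {
      # Residual with the contribution of feature j left out
      r <- y - b - X[, -j, drop = FALSE] %*% w[-j]
      w[j] <- soft_threshold(sum(X[, j] * r) / n, l1) /
        (sum(X[, j]^2) / n + l2)
    }
    b <- mean(y - X %*% w)                      # unpenalized bias update
    if (max(abs(w - w_old)) < eps) break        # desired tolerance reached
  }
  list(w = w, b = b)
}

set.seed(42)
X <- matrix(rnorm(50 * 3), 50, 3)
y <- drop(X %*% c(1, -2, 0)) + rnorm(50, sd = 0.1)
fit <- cd_sketch(X, y, l1 = 0.1, l2 = 1)
print(fit$w)
```

With l1 set to zero the problem becomes ridge regression, so the result can be compared against the closed-form ridge solution on centered data as a correctness check.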
gelnet.lin.obj
Linear regression objective function value
Description
Evaluates the linear regression objective function value for a given model. See details.
Usage
gelnet.lin.obj(w, b, X, z, lambda1, lambda2, a = rep(1, nrow(X)), d = rep(1,
ncol(X)), P = diag(ncol(X)), m = rep(0, ncol(X)))
Arguments
w
p-by-1 vector of model weights
b
the model bias term
X
n-by-p matrix of n samples in p dimensions
z
n-by-1 response vector
lambda1
L1-norm penalty scaling factor
lambda2
L2-norm penalty scaling factor
a
n-by-1 vector of sample weights
d
p-by-1 vector of feature weights
P
p-by-p feature-feature penalty matrix
m
p-by-1 vector of translation coefficients
Details
Computes the objective function value according to
$$\frac{1}{2n} \sum_i a_i (z_i - (w^T x_i + b))^2 + R(w)$$
where
$$R(w) = \lambda_1 \sum_j d_j |w_j| + \frac{\lambda_2}{2} (w - m)^T P (w - m)$$
Value
The objective function value.
See Also
gelnet.lin
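The objective can be transcribed directly into base R, which is useful for checking a formula by hand. The function name `lin_obj_sketch` is hypothetical; its arguments mirror the Usage signature above, but it is a reading of the stated formula rather than the package's code.

```r
# Hypothetical transcription of the linear regression objective.
lin_obj_sketch <- function(w, b, X, z, lambda1, lambda2,
                           a = rep(1, nrow(X)), d = rep(1, ncol(X)),
                           P = diag(ncol(X)), m = rep(0, ncol(X))) {
  n <- nrow(X)
  res <- z - (drop(X %*% w) + b)                 # residuals z_i - (w^T x_i + b)
  loss <- sum(a * res^2) / (2 * n)               # weighted squared loss
  pen <- lambda1 * sum(d * abs(w)) +             # L1 term with feature weights
    lambda2 / 2 * drop(t(w - m) %*% P %*% (w - m))  # L2 term with P and m
  loss + pen
}

# Two samples, two features: loss = 0.25, L1 term = 1, L2 term = 2
X <- diag(2)
z <- c(1, 2)
val <- lin_obj_sketch(w = c(1, 1), b = 0, X, z, lambda1 = 0.5, lambda2 = 2)
print(val)
```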
gelnet.logreg
GELnet for logistic regression
Description
Constructs a GELnet model for logistic regression using the Newton method.
Usage
gelnet.logreg(X, y, l1, l2, d = rep(1, p), P = diag(p), m = rep(0, p),
max.iter = 100, eps = 1e-05, w.init = rep(0, p), b.init = 0.5,
silent = FALSE)
Arguments
X
n-by-p matrix of n samples in p dimensions
y
n-by-1 vector of binary response labels
l1
coefficient for the L1-norm penalty
l2
coefficient for the L2-norm penalty
d
p-by-1 vector of feature weights
P
p-by-p feature association penalty matrix
m
p-by-1 vector of translation coefficients
max.iter
maximum number of iterations
eps
convergence precision
w.init
initial parameter estimate for the weights
b.init
initial parameter estimate for the bias term
silent
set to TRUE to suppress run-time output to stdout (default: FALSE)
Details
The method operates by constructing iteratively re-weighted least squares approximations of the
log-likelihood loss function and then calling the linear regression routine to solve those approximations. The least squares approximations are obtained via the Taylor series expansion about the
current parameter estimates.
Value
A list with two elements:
w p-by-1 vector of p model weights
b scalar, bias term for the linear model
See Also
gelnet.lin
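As a rough illustration of the iteration described in Details, the sketch below runs iteratively re-weighted least squares for the unpenalized special case (both penalty coefficients equal to zero). Here `lm.wfit` from the stats package stands in for the package's linear regression routine, which is an assumption made so that the example runs in base R; the simulated data and the fixed count of 25 iterations are likewise choices made for illustration.

```r
set.seed(7)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 + x))   # simulated binary labels
X1 <- cbind(1, x)                      # design matrix with an intercept column

beta <- c(0, 0)                        # (bias, weight), starting at zero
for (iter in 1:25) {
  s <- drop(X1 %*% beta)               # current scores w^T x_i + b
  p <- plogis(s)                       # current probability estimates
  a <- p * (1 - p)                     # re-weighting of the samples
  z <- s + (y - p) / a                 # working response from the Taylor expansion
  beta <- lm.wfit(X1, z, a)$coefficients   # weighted least squares solve
}
print(unname(beta))
```

In this unpenalized setting the iterates should agree closely with the coefficients reported by `glm(y ~ x, family = binomial)`, which also relies on iteratively re-weighted least squares.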
gelnet.logreg.obj
Logistic regression objective function value
Description
Evaluates the logistic regression objective function value for a given model. Computes
the objective function value according to
$$-\frac{1}{n} \sum_i \left[ y_i s_i - \log(1 + \exp(s_i)) \right] + R(w)$$
where
$$s_i = w^T x_i + b$$
$$R(w) = \lambda_1 \sum_j d_j |w_j| + \frac{\lambda_2}{2} (w - m)^T P (w - m)$$
Usage
gelnet.logreg.obj(w, b, X, y, lambda1, lambda2, d = rep(1, ncol(X)),
P = diag(ncol(X)), m = rep(0, ncol(X)))
Arguments
w
p-by-1 vector of model weights
b
the model bias term
X
n-by-p matrix of n samples in p dimensions
y
n-by-1 binary response vector sampled from {0, 1}
lambda1
L1-norm penalty scaling factor
lambda2
L2-norm penalty scaling factor
d
p-by-1 vector of feature weights
P
p-by-p feature-feature penalty matrix
m
p-by-1 vector of translation coefficients
Value
The objective function value.
See Also
gelnet.logreg
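The formula in Description can be transcribed into base R in a few lines. The function name `logreg_obj_sketch` is hypothetical and the code is a reading of the stated formula, not the package's implementation; it uses `log1p(exp(s))`, which can overflow for very large scores.

```r
# Hypothetical transcription of the logistic regression objective.
logreg_obj_sketch <- function(w, b, X, y, lambda1, lambda2,
                              d = rep(1, ncol(X)), P = diag(ncol(X)),
                              m = rep(0, ncol(X))) {
  s <- drop(X %*% w) + b                        # scores s_i = w^T x_i + b
  loss <- -mean(y * s - log1p(exp(s)))          # negative mean log-likelihood
  pen <- lambda1 * sum(d * abs(w)) +            # L1 term with feature weights
    lambda2 / 2 * drop(t(w - m) %*% P %*% (w - m))  # L2 term with P and m
  loss + pen
}

# With all-zero weights and bias, every sample contributes log(2)
X <- matrix(c(1, 2, 3, 4), 2, 2)
val <- logreg_obj_sketch(w = c(0, 0), b = 0, X, y = c(0, 1),
                         lambda1 = 1, lambda2 = 1)
print(val)
```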
L1.ceiling
The largest meaningful value of the L1 parameter
Description
Computes the smallest value of the LASSO coefficient L1 that leads to an all-zero weight vector for
a given linear regression problem.
Usage
L1.ceiling(X, y, a = rep(1, nrow(X)), d = rep(1, ncol(X)))
Arguments
X
n-by-p matrix of n samples in p dimensions
y
n-by-1 response vector
a
n-by-1 vector of sample weights
d
p-by-1 vector of feature weights
Details
The cyclic coordinate descent updates the model weight $w_k$ using a soft threshold operator $S(\cdot, \lambda_1 d_k)$
that clips the value of the weight to zero whenever the absolute value of the first argument falls below $\lambda_1 d_k$. From here, it is straightforward to compute the smallest value of $\lambda_1$ such that all weights
are clipped to zero.
Value
The largest meaningful value of the L1 parameter (i.e., the smallest value that yields a model with
all zero weights)
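The argument in Details can be made concrete with a short base R sketch. It assumes the linear regression objective with the loss scaled by 1/(2n) and translation coefficients equal to zero, so that for an all-zero weight vector the first argument of S for feature k is the weighted correlation of that feature with the residual around the weighted mean of y. The names `soft_threshold`, `zero_model_args`, and `l1_ceiling_sketch` are hypothetical, and the package's own computation may differ in detail.

```r
# Soft threshold operator: clips to zero whenever |x| falls below t.
soft_threshold <- function(x, t) sign(x) * pmax(abs(x) - t, 0)

# First argument of S for each feature when all weights are zero
# (assumes m = 0 and the 1/(2n) scaling of the squared loss).
zero_model_args <- function(X, y, a = rep(1, nrow(X))) {
  b <- sum(a * y) / sum(a)                    # bias of the all-zero model
  drop(crossprod(X, a * (y - b))) / nrow(X)
}

# Smallest L1 value at which every weight is clipped to zero.
l1_ceiling_sketch <- function(X, y, a = rep(1, nrow(X)), d = rep(1, ncol(X))) {
  max(abs(zero_model_args(X, y, a)) / d)
}

set.seed(3)
X <- matrix(rnorm(40 * 4), 40, 4)
y <- rnorm(40)
lam <- l1_ceiling_sketch(X, y)
g <- zero_model_args(X, y)
print(soft_threshold(g, lam))        # every entry is clipped to zero
print(soft_threshold(g, 0.9 * lam))  # at least one entry survives
```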