This C++ template-based library provides linear regressors to gaml. The implemented linear regressor accepts inputs with an arbitrary number of dimensions (kernel-based inputs included) but produces a single output. The only algorithm implemented so far is the Equi-gradient algorithm for LASSO, as described in [Loth, 2007].
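For readers unfamiliar with LASSO, one common way to write the problem is given below. The symbols are assumptions made for this note and do not come from the library: y is the vector of target values, Phi is the matrix of feature values, beta is the weight vector and lambda is the regularization level. The exact scaling of lambda used by the library may differ from this formulation.

```latex
\hat{\beta}(\lambda) \;=\; \arg\min_{\beta}\; \frac{1}{2}\,\lVert y - \Phi\beta \rVert_2^2 \;+\; \lambda\,\lVert \beta \rVert_1
```

Larger values of lambda drive more weights to exactly zero, which is why LASSO can be used to select a small subset of the features.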
Download and documentation
The latest gaml-linear package can be found here as a gaml-linear-*.tar.gz archive.
The documentation (with examples) is here.
Fitting a noisy sinc using at most 55 + 1 features
The sample set is made of 200 points drawn randomly in [-10, 10], paired with their sinc image plus a noise in [-.1, .1]. The features are Gaussians with 11 centers regularly spaced in [-10, 10] and 5 variances in [.1, .5, 1, 2, 5], which gives 11 x 5 = 55 Gaussian features. With a target lambda of 0.1, LASSO selects 12 basis functions, which produces the following fit:
This illustration has been produced with the example example-001-lasso.cc of the library.
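The sinc setup described above can be sketched in plain C++ as follows. This is only an illustration and not the code of example-001-lasso.cc: the names sinc, Gaussian, make_features, phi and make_samples are hypothetical, and several details are assumptions (unnormalized sinc, uniform noise, Gaussians of the form exp(-(x - c)^2 / (2 v)), and the "+ 1" feature taken to be a constant term).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper names; none of this is the gaml-linear API.

// Unnormalized sinc (assumption), with the value at 0 set to its limit.
double sinc(double x) {
  return x == 0.0 ? 1.0 : std::sin(x) / x;
}

// One Gaussian feature, parameterized by its center and variance.
struct Gaussian {
  double center;
  double variance;
  double operator()(double x) const {
    assert(variance > 0.0);
    const double d = x - center;
    return std::exp(-d * d / (2.0 * variance));
  }
};

// 11 centers regularly spaced in [-10, 10] times 5 variances: 55 features.
std::vector<Gaussian> make_features() {
  const std::vector<double> variances = {0.1, 0.5, 1.0, 2.0, 5.0};
  std::vector<Gaussian> features;
  for (int i = 0; i < 11; ++i) {
    const double center = -10.0 + 2.0 * i;
    for (double v : variances) features.push_back({center, v});
  }
  return features;
}

// Feature vector of x: a constant 1 (assumed to be the "+ 1")
// followed by the values of the 55 Gaussians at x.
std::vector<double> phi(const std::vector<Gaussian>& features, double x) {
  std::vector<double> out;
  out.reserve(features.size() + 1);
  out.push_back(1.0);
  for (const auto& g : features) out.push_back(g(x));
  return out;
}

// n samples (x, sinc(x) + noise) with x uniform in [-10, 10]
// and noise uniform in [-0.1, 0.1] (uniformity is an assumption).
std::vector<std::pair<double, double>> make_samples(std::size_t n, unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> draw_x(-10.0, 10.0);
  std::uniform_real_distribution<double> draw_noise(-0.1, 0.1);
  std::vector<std::pair<double, double>> samples;
  samples.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    const double x = draw_x(gen);
    samples.emplace_back(x, sinc(x) + draw_noise(gen));
  }
  return samples;
}
```

Calling make_samples(200, seed) and applying phi to each abscissa gives a 200 x 56 design matrix, from which a LASSO solver can select a sparse subset of columns.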
We use the diabetes data, the problem classically considered for testing LARS and LASSO. With the library, you can access the full regularization path down to a minimal lambda. Below, we plot the weight associated with each dimension as a function of the L1 norm of the weights (as in [Efron 2004], and similar to what is obtained with scikit-learn). Caution: there is one difference around |beta|_1 = 2000. One weight should become slightly negative there, but it turns out that it gets activated only later. This differs from what one can see with the scikit-learn implementation and from the results in the paper of [Efron 2004], and it leads to noticeable differences at the end of the path.
This illustration has been produced with the example example-002-diabetes-lasso.cc of the library.
There is also the LARS algorithm:
This illustration has been produced with the example example-002-diabetes-lars.cc of the library. Caution: again, we notice a difference when a very small coefficient has to be activated. It is activated positively around 2400, while the scikit-learn implementation activates it negatively around 2200.
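The activation differences noted for both the LASSO and LARS paths can be located numerically rather than read off a plot. The sketch below is a possible way to do so, assuming the path has been exported as a sequence of weight vectors, one per step. The names l1_norm, Activation and first_activations are hypothetical and are not part of gaml-linear or scikit-learn.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers; not part of gaml-linear or scikit-learn.

// L1 norm of a weight vector: the abscissa used in the path plots.
double l1_norm(const std::vector<double>& w) {
  double s = 0.0;
  for (double v : w) s += std::fabs(v);
  return s;
}

// Where and how a weight first leaves zero along the path.
struct Activation {
  bool active;  // true if the weight is nonzero at some step
  double l1;    // L1 norm of the weights at the first such step
  int sign;     // +1 or -1 at that step, 0 if never active
};

// path[k] is the weight vector at step k of the regularization path.
// All steps are assumed to have the same dimension.
std::vector<Activation> first_activations(
    const std::vector<std::vector<double>>& path, double tol = 1e-12) {
  std::vector<Activation> out;
  if (path.empty()) return out;
  out.resize(path.front().size());  // value-initialized: inactive, 0, 0
  for (const auto& w : path) {
    assert(w.size() == out.size());
    const double norm = l1_norm(w);
    for (std::size_t j = 0; j < w.size(); ++j) {
      if (!out[j].active && std::fabs(w[j]) > tol) {
        out[j] = {true, norm, w[j] > 0.0 ? 1 : -1};
      }
    }
  }
  return out;
}
```

Running first_activations on the paths produced by two implementations and comparing the l1 and sign fields per coefficient makes a discrepancy such as "positive around 2400" versus "negative around 2200" explicit. Since the path is piecewise linear, the reported l1 is that of the first step where the weight is already nonzero; the actual entry point lies between the previous step and that one.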