Elastic net, LASSO, and LARS in Python

July 26, 2012

I’m currently looking for Python implementations of the LASSO and the elastic net, otherwise known as L1 and combined L1/L2 regularised linear regression respectively. The options seem to be scikit.learn and glmnet-python. The former offers coordinate descent or the LARS algorithm coded in pure Python (with Numpy obviously), whereas the latter just wraps Jerome Friedman’s Fortran code from the R glmnet package.
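For reference, here is a minimal sketch of what the scikit.learn side might look like. It assumes a recent release, where the package is imported as sklearn and the elastic net mixing parameter is called l1_ratio. The toy data, the alpha values and the dictionary names are illustrative placeholders, not the settings behind the timing comparison.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, LassoLars

# Small toy problem (illustrative only): 3 informative features out of 20.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_coef = np.zeros(20)
true_coef[:3] = [3.0, -2.0, 1.5]
y = X @ true_coef + 0.1 * rng.standard_normal(100)

# alpha and l1_ratio are arbitrary choices for this sketch.
models = {
    "lasso_cd": Lasso(alpha=0.1),                           # L1, coordinate descent
    "lasso_lars": LassoLars(alpha=0.1),                     # L1, LARS
    "elastic_net_cd": ElasticNet(alpha=0.1, l1_ratio=0.5),  # L1/L2, coordinate descent
}

# Fit each model and record how many coefficients it keeps non-zero.
nonzeros = {}
for name, model in models.items():
    model.fit(X, y)
    nonzeros[name] = int(np.count_nonzero(model.coef_))
    print(name, nonzeros[name])
```

All three estimators share the same fit/coef_ interface, so swapping solvers in a benchmark is a one-line change.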

Timing comparison

Runtime comparison between LASSO/Elastic net implementations from scikit.learn and glmnet-python. x-axis: number of features P. y-axis: time in seconds. Synthetic data with N=400 samples, P/10 non-zero coefficients sampled from N(0,9), and Gaussian noise with variance 0.01.
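The synthetic setup in the caption can be sketched roughly as follows. The caption does not say how the design matrix was drawn, so standard normal features are an assumption here, as are the function names make_sparse_regression and time_fit, the seed and the number of repeats. The solver being timed is ordinary least squares via np.linalg.lstsq, used purely as a stand-in so the sketch runs with Numpy alone.

```python
import time

import numpy as np


def make_sparse_regression(n_samples=400, n_features=100, seed=0):
    """Sparse linear model following the caption; X ~ N(0, 1) is an assumption."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, n_features))
    n_nonzero = max(1, n_features // 10)            # P/10 non-zero coefficients
    coef = np.zeros(n_features)
    support = rng.choice(n_features, size=n_nonzero, replace=False)
    coef[support] = rng.normal(0.0, 3.0, size=n_nonzero)  # N(0, 9) means std 3
    noise = rng.normal(0.0, 0.1, size=n_samples)          # variance 0.01 means std 0.1
    return X, X @ coef + noise, coef


def time_fit(fit, X, y, repeats=3):
    """Best wall-clock time in seconds over a few calls of fit(X, y)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fit(X, y)
        best = min(best, time.perf_counter() - start)
    return best


X, y, coef = make_sparse_regression(n_features=200)
# Placeholder solver; replace with the fit method of the estimator under test.
elapsed = time_fit(lambda A, b: np.linalg.lstsq(A, b, rcond=None), X, y)
print(f"P={X.shape[1]}: {elapsed:.4f} s")
```

To time a real estimator, pass its fit method instead of the lambda and loop over a range of P values.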

As you might expect, the Fortran code is significantly faster in general, although for large P the scikit.learn LARS implementation is competitive with glmnet, presumably because the Python overhead becomes less noticeable. Unfortunately, as far as I can see, scikit.learn does not include a LARS implementation for the elastic net.
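One possible workaround, which is not something either library provides as such, is the standard data-augmentation trick: the ridge part of the elastic net penalty can be folded into the least-squares term by appending sqrt(lambda2) times the identity to X and zeros to y, after which any LASSO solver, including LARS, applies. The sketch below assumes a recent sklearn and its documented objective scalings for ElasticNet and LassoLars, turns off the intercept so the extra rows do not disturb it, and uses arbitrary toy data and penalty values.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, LassoLars

# Toy data (illustrative only).
rng = np.random.default_rng(0)
n, p = 100, 30
X = rng.standard_normal((n, p))
w_true = np.zeros(p)
w_true[:3] = [3.0, -2.0, 1.5]
y = X @ w_true + 0.1 * rng.standard_normal(n)

alpha, l1_ratio = 0.1, 0.5  # arbitrary penalty settings for this sketch

# Reference solution: coordinate descent on the elastic net objective
#   1/(2n) ||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1 - l1_ratio)*||w||^2
enet = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, fit_intercept=False,
                  tol=1e-10, max_iter=100000)
enet.fit(X, y)

# Augmented LASSO problem: ||y_aug - X_aug w||^2 = ||y - Xw||^2 + lam2 * ||w||^2
lam2 = n * alpha * (1.0 - l1_ratio)
X_aug = np.vstack([X, np.sqrt(lam2) * np.eye(p)])
y_aug = np.concatenate([y, np.zeros(p)])

# LassoLars divides the squared loss by 2 * (n + p) on the augmented data,
# so the L1 weight is rescaled by n / (n + p) to match the objective above.
alpha_aug = alpha * l1_ratio * n / (n + p)
lars = LassoLars(alpha=alpha_aug, fit_intercept=False)
lars.fit(X_aug, y_aug)

max_diff = float(np.max(np.abs(enet.coef_ - lars.coef_)))
print("largest coefficient difference:", max_diff)
```

The augmented matrix has N + P rows, so for large P it costs extra memory, and whether it would change the timing picture is untested here.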