3.4 Elastic net (0 < α < 1)

The elastic net blends the two penalties: with 0 < α < 1 the objective mixes the lasso’s L1 term and the ridge’s L2 term. It keeps the lasso’s ability to select variables while borrowing the ridge’s stability. In particular, it tends to keep or drop groups of correlated predictors together, where the pure lasso tends to keep one member of the group, chosen more or less arbitrarily. Setting α = 0 recovers Ridge regression (L2, α = 0) exactly and α = 1 recovers Lasso (L1, α = 1); the interesting models live in between.
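For reference, the blend can be written out explicitly. The form below assumes the parameterisation used by the R glmnet package, with n observations and intercept β₀; whether this library scales the two terms in exactly the same way is an assumption.

```latex
\min_{\beta_0,\,\beta}\;
  \frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
  + \lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
                 + \alpha\,\lVert\beta\rVert_1\Bigr]
```

With α = 0 only the squared L2 term survives, and with α = 1 only the L1 term does, which matches the two limiting cases described above.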

On the running fixture, an α = 0.5 fit at a moderate λ both shrinks its coefficients (ridge-like) and drives at least one exactly to zero (lasso-like). At a shared λ, the number of coefficients it sets to zero falls between the ridge count (none) and the lasso count (the most), which is the characteristic in-between behaviour of the blend.
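The shrink-and-zero behaviour can be seen in isolation with a minimal sketch. It assumes standardized, orthonormal predictors and the glmnet-style parameterisation, in which each coefficient has the closed form S(z, λα) / (1 + λ(1 − α)), where S is soft-thresholding and z is the unpenalized coefficient. The names soft-threshold, enet-coef and count-zeros are hypothetical helpers, not part of the library, and the values in zs are made up for illustration rather than taken from the fixture.

```racket
#lang racket/base

;; Soft-thresholding: move z toward zero by t; exactly 0.0 when |z| <= t.
(define (soft-threshold z t)
  (cond [(> z t) (- z t)]
        [(< z (- t)) (+ z t)]
        [else 0.0]))

;; One-coordinate elastic-net solution under the stated assumptions.
;; The numerator is the lasso part (can hit zero); the denominator is
;; the ridge part (pure proportional shrinkage).
(define (enet-coef z lam alpha)
  (/ (soft-threshold z (* lam alpha))
     (+ 1.0 (* lam (- 1.0 alpha)))))

;; Count coefficients that are exactly zero after penalization.
(define (count-zeros zs lam alpha)
  (for/sum ([z (in-list zs)])
    (if (zero? (enet-coef z lam alpha)) 1 0)))

;; Illustrative unpenalized coefficients (made up, not from the fixture).
(define zs '(2.0 -0.4 0.2))

;; Same lambda, three values of alpha: ridge, blend, lasso.
(for ([alpha (in-list '(0.0 0.5 1.0))])
  (printf "alpha=~a zeros=~a\n" alpha (count-zeros zs 0.5 alpha)))
```

At λ = 0.5 the three runs zero 0, 1 and 2 of these coefficients respectively, so the α = 0.5 count sits between the ridge and lasso counts.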

<require> ::=
(require glmnet)

<provide> ::=
(provide run-example)

<data> ::=
(define X '((1.0 2.0  1.0)
            (2.0 1.0  4.0)
            (3.0 4.0  9.0)
            (4.0 3.0 16.0)
            (5.0 6.0 25.0)
            (6.0 5.0 36.0)))
(define y '(1.0 4.0 3.0 6.0 5.0 8.0))

elastic-net takes an explicit #:alpha. Compare its fit to the ridge and lasso fits at the same λ to see it land between them.

<fit> ::=
(define result (elastic-net X y #:alpha 0.5 #:lambda 0.5))

<run-example> ::=
(define (run-example)
  <data>
  <fit>
  result)

<*> ::=