3.7 Cox proportional hazards (survival)

The Cox family fits a survival model: instead of a label or a number, each observation carries a follow-up time and a 0/1 event indicator (1 = the event was observed, 0 = right-censored). Cox has no intercept because the baseline hazard is left unspecified, so the fit returns coefficients only. The coefficients are on the log relative-hazard scale, where a positive coefficient raises the hazard (shortens survival). The elastic-net knobs (α, λ) are the same as for every other family.
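For orientation, the model and objective can be written out as below. This is the textbook form of the Cox partial likelihood (assuming no tied event times) combined with the usual elastic-net penalty. The scaling constant c > 0 and the exact penalty parameterization are assumptions here, not something this page specifies.

```latex
% Hazard for a subject with predictors x; the baseline h_0(t) is left unspecified
h(t \mid x) = h_0(t)\, \exp(x^\top \beta)

% Log partial likelihood over observed events (\delta_i = 1), no ties assumed
\ell(\beta) = \sum_{i:\, \delta_i = 1}
  \Bigl[ x_i^\top \beta - \log \sum_{j:\, t_j \ge t_i} \exp(x_j^\top \beta) \Bigr]

% Elastic-net penalized objective (c > 0 is an unspecified scaling constant)
\hat\beta = \arg\min_\beta \; -c\,\ell(\beta)
  + \lambda \Bigl[ \alpha \lVert \beta \rVert_1
  + \tfrac{1-\alpha}{2} \lVert \beta \rVert_2^2 \Bigr]
```

Because h₀(t) cancels in every ratio inside ℓ(β), there is nothing for an intercept to estimate, which is why only coefficients are returned.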

In the synthetic cohort below x₁ is a risk factor: subjects with higher x₁ tend to have the event sooner, while the three lowest-risk subjects are censored; x₂ is noise. A lasso-penalized fit recovers a positive β₁ and shrinks β₂, the coefficient of x₂, to exactly 0.0.

<require> ::=

(require glmnet)

<provide> ::=

(provide run-example)

<data> ::=

; one row per subject: (x₁ x₂)
(define X '((0.5 1.0) (1.0 2.0) (1.5 1.0) (2.0 2.0)
            (2.5 1.0) (3.0 2.0) (3.5 1.0) (4.0 2.0)))
; follow-up times (higher-risk subjects fail sooner)
(define times '(12.0 10.0 11.0 8.0 6.0 7.0 4.0 3.0))
; 1 = event observed, 0 = right-censored
(define statuses '(0 0 0 1 1 1 1 1))

cox-fit takes the predictor matrix, the times, and the 0/1 statuses. There is no intercept, so cox-result-coefficients returns the whole model; β₁ comes back positive (risk rises with x₁) and the noise coefficient β₂ is exactly 0.0.

<fit> ::=

(define result (cox-fit X times statuses #:lambda 0.1))
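To see why a positive β₁ is favoured on this cohort, it can help to evaluate the unpenalized Cox log partial likelihood by hand. The following is a minimal sketch in plain Racket, independent of the library. The helpers dot and log-partial-likelihood are hypothetical names introduced here, the formula assumes no tied event times, and it is not a description of how cox-fit is implemented.

```racket
#lang racket/base

;; Same synthetic cohort as in the <data> chunk.
(define X '((0.5 1.0) (1.0 2.0) (1.5 1.0) (2.0 2.0)
            (2.5 1.0) (3.0 2.0) (3.5 1.0) (4.0 2.0)))
(define times '(12.0 10.0 11.0 8.0 6.0 7.0 4.0 3.0))
(define statuses '(0 0 0 1 1 1 1 1))

;; Hypothetical helper: inner product x·β for one subject.
(define (dot beta row)
  (for/sum ([b (in-list beta)] [x (in-list row)])
    (* b x)))

;; Hypothetical helper: log partial likelihood (no ties, no penalty).
;; Each observed event contributes η_i - log Σ exp(η_j), where the sum
;; runs over the risk set: subjects whose follow-up time is >= t_i.
(define (log-partial-likelihood beta)
  (define etas (map (lambda (row) (dot beta row)) X))
  (for/sum ([eta-i (in-list etas)]
            [t-i (in-list times)]
            [d-i (in-list statuses)]
            #:when (= d-i 1))
    (define risk-sum
      (for/sum ([eta-j (in-list etas)]
                [t-j (in-list times)]
                #:when (>= t-j t-i))
        (exp eta-j)))
    (- eta-i (log risk-sum))))

;; At β = 0 every event contributes -log(risk-set size); the five risk
;; sets have sizes 4, 6, 5, 7 and 8, so the total is -log(6720).
(log-partial-likelihood '(0.0 0.0))

;; A positive β₁ fits this cohort better than β = 0.
(> (log-partial-likelihood '(1.0 0.0))
   (log-partial-likelihood '(0.0 0.0))) ; => #t
```

The penalized fit trades this likelihood gain against the elastic-net penalty, which is what allows a coefficient that contributes little to be set to exactly 0.0.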

cox-relative-risk turns a fit into exp(x·β), the hazard relative to the baseline, so subjects can be ranked by risk; cox-linear-predictor returns the raw log relative hazard x·β:

(cox-relative-risk result X)
; => relative hazard per subject; larger = higher risk
(cox-linear-predictor result X)
; => log relative hazard x·β per subject
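The relationship between the two quantities can be reproduced by hand. The sketch below uses made-up coefficients (β₁ = 0.8, β₂ = 0.0), chosen only for illustration and not taken from the output of cox-fit. The helper names are hypothetical, and the sketch does not call the library.

```racket
#lang racket/base
(require racket/list) ; for range

;; Illustrative coefficients only; NOT the values returned by cox-fit.
(define beta '(0.8 0.0))

(define X '((0.5 1.0) (1.0 2.0) (1.5 1.0) (2.0 2.0)
            (2.5 1.0) (3.0 2.0) (3.5 1.0) (4.0 2.0)))

;; Hypothetical helper: log relative hazard x·β for one subject.
(define (linear-predictor beta row)
  (for/sum ([b (in-list beta)] [x (in-list row)])
    (* b x)))

;; Log relative hazards, then relative hazards exp(x·β).
(define etas (map (lambda (row) (linear-predictor beta row)) X))
(define risks (map exp etas))

;; Subject indices ordered from highest to lowest relative hazard.
(define ranking
  (sort (range (length X)) > #:key (lambda (i) (list-ref risks i))))

ranking ; => '(7 6 5 4 3 2 1 0)
```

With β₂ = 0.0 the ranking depends on x₁ alone, so the subject with the largest x₁ is ranked as highest risk.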

<run-example> ::=

(define (run-example)
  <data>
  <fit>
  result)

<*> ::=

<require>
<provide>
<run-example>
