3.7 Cox proportional hazards (survival)
The Cox family fits a survival model: instead of a label or a number,
each observation carries a follow-up time and a 0/1 event indicator
(1 = the event was observed, 0 = right-censored). A Cox model has no intercept; the baseline hazard absorbs it.
In the synthetic cohort below, x₁ is a risk factor: subjects with higher x₁ have the event sooner, while the low-risk subjects are censored; x₂ is noise. A lasso-penalized fit recovers a positive β₁ and sets β₂, the coefficient on x₂, to exactly 0.0.
(require glmnet)
(provide run-example)
(define X '((0.5 1.0) (1.0 2.0) (1.5 1.0) (2.0 2.0)
            (2.5 1.0) (3.0 2.0) (3.5 1.0) (4.0 2.0)))
; follow-up times (higher-risk subjects fail sooner)
(define times '(12.0 10.0 11.0 8.0 6.0 7.0 4.0 3.0))
; 1 = event observed, 0 = right-censored
(define statuses '(0 0 0 1 1 1 1 1))
cox-fit takes the predictor matrix, the times, and the 0/1 statuses. There is no intercept, so cox-result-coefficients is the whole model; β₁ comes back positive (risk rises with x₁) and the noise β₂ is exactly 0.0.
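For readers who want to see why no intercept appears, the standard Cox partial likelihood can be written out as below. This is textbook notation rather than anything taken from the library: h_0 (baseline hazard), δ_i (event indicator), and R(t_i) (risk set) are symbols introduced here for illustration, and the form shown assumes no tied event times, which holds for the cohort above.

```latex
% Hazard for a subject with predictor row x
h(t \mid x) = h_0(t)\,\exp(x \cdot \beta)

% Partial likelihood over observed events (\delta_i = 1), assuming no ties;
% R(t_i) = \{ j : t_j \ge t_i \} is the set of subjects still at risk at t_i
L(\beta) = \prod_{i:\,\delta_i = 1}
  \frac{\exp(x_i \cdot \beta)}{\sum_{j \in R(t_i)} \exp(x_j \cdot \beta)}

% A hypothetical intercept \beta_0 would cancel from every factor
\frac{e^{\beta_0}\exp(x_i \cdot \beta)}
     {\sum_{j \in R(t_i)} e^{\beta_0}\exp(x_j \cdot \beta)}
  = \frac{\exp(x_i \cdot \beta)}{\sum_{j \in R(t_i)} \exp(x_j \cdot \beta)}
```

Because the factor e^{β_0} divides out, the data carry no information about an intercept, and any overall level is absorbed into h_0(t).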
cox-relative-risk turns a fit into exp(x·β), the hazard relative to the baseline, so subjects can be ranked by risk; cox-linear-predictor returns the raw log relative hazard x·β:
(cox-relative-risk result X) ; => relative hazard per subject; larger = higher risk
(cox-linear-predictor result X)
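To make the two quantities concrete, here is a minimal plain-Racket sketch that computes x·β and exp(x·β) by hand and ranks subjects by risk. It does not call the library: the coefficient list beta is a hypothetical value chosen for illustration (not the output of cox-fit), and the helper names row-linear-predictor and row-relative-risk are invented for this sketch.

```racket
#lang racket/base
(require racket/list) ; for range

;; Predictor rows copied from the example above: (x1 x2) per subject.
(define X '((0.5 1.0) (1.0 2.0) (1.5 1.0) (2.0 2.0)
            (2.5 1.0) (3.0 2.0) (3.5 1.0) (4.0 2.0)))

;; HYPOTHETICAL coefficients for illustration only; not cox-fit output.
;; beta1 > 0 (risk rises with x1), beta2 = 0.0 (noise dropped by the lasso).
(define beta '(0.5 0.0))

;; x·β for one row: the log relative hazard.
(define (row-linear-predictor row)
  (for/sum ([x (in-list row)] [b (in-list beta)])
    (* x b)))

;; exp(x·β): the hazard relative to the baseline.
(define (row-relative-risk row)
  (exp (row-linear-predictor row)))

(define lps (map row-linear-predictor X))
(define risks (map row-relative-risk X))

;; Subject indices ordered from highest to lowest relative risk.
(define ranking
  (sort (range (length X)) > #:key (lambda (i) (list-ref risks i))))
```

Since exp is monotone, ranking by the relative risk and ranking by the linear predictor give the same order; with a positive β₁ and a zero β₂, that order simply follows x₁.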