2.5 Watching an evaluation set
Rather than letting train run the whole boosting loop, you can drive it
one round at a time and inspect a held-out metric after each round.
The two pieces are booster-update-one-iter!, which performs one boosting round, and eval-one-iter, which returns XGBoost’s metric line; parse-eval-line turns that line into a hash.
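To make the shape of that hash concrete, here is a minimal sketch of what a parser for the metric line might do. It assumes the line looks the way XGBoost usually prints it, a bracketed round number followed by tab-separated name:value pairs. The name sketch-parse-eval-line, the string keys, and the extra "iter" entry are all hypothetical; the binding's real parse-eval-line may differ.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical stand-in for parse-eval-line, not the binding's own code.
;; Assumes a line shaped like "[3]\ttrain-rmse:0.5\teval-rmse:0.75".
(define (sketch-parse-eval-line line)
  (define fields (string-split line "\t"))
  ;; The first field is the bracketed round number, e.g. "[3]".
  (define iter
    (string->number
     (cadr (regexp-match #rx"^\\[([0-9]+)\\]$" (car fields)))))
  ;; Each remaining field is "name:value"; split on the last colon.
  (define metrics
    (for/hash ([field (in-list (cdr fields))])
      (define m (regexp-match #rx"^(.*):([^:]*)$" field))
      (values (cadr m) (string->number (caddr m)))))
  ;; Storing the round number under "iter" is an assumption of this sketch.
  (hash-set metrics "iter" iter))
```

Calling it on "[3]\ttrain-rmse:0.5\teval-rmse:0.75" gives a hash in which "eval-rmse" maps to 0.75 and "iter" maps to 3.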
(require ffi/vector xgboost)
(provide run-example)
The data. Two non-overlapping splits of a y ≈ 2·x₀ + x₁ − x₂
dataset:
(define dtrain
  (make-dmatrix
   (f32vector 1.0 2.0 0.5
              2.0 1.0 1.5
              3.0 0.5 0.0
              0.5 3.0 2.0
              4.0 2.0 1.0
              1.5 1.5 0.5
              2.5 3.5 1.5
              0.0 1.0 0.0)
   #:nrow 8
   #:ncol 3
   #:labels (f32vector 3.5 3.5 6.5 2.0 9.0 4.0 7.0 1.0)))
(define deval
  (make-dmatrix
   (f32vector 2.0 0.5 0.5
              1.0 1.0 1.0
              3.5 1.0 0.5
              0.5 0.5 0.5)
   #:nrow 4
   #:ncol 3
   #:labels (f32vector 4.0 2.0 7.5 1.0)))
Set up the booster. Training with #:rounds 0 and an #:evals list builds the booster and binds both matrices into its cache (so the GC keeps them alive) without doing any boosting yet:
(define booster
  (train dtrain
         #:evals (list (cons "eval" deval))
         #:objective "reg:squarederror"
         #:max-depth 3
         #:eta 0.1
         #:verbosity 0
         #:rounds 0))
The loop. Each round, advance the booster and record the parsed metrics for both watched matrices. run-example returns the booster and the per-round history:
(define eval-set
  (list (cons "train" dtrain)
        (cons "eval" deval)))
(define history
  (for/list ([iter (in-range 30)])
    (booster-update-one-iter! booster iter dtrain)
    (parse-eval-line (eval-one-iter booster iter eval-set))))
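One thing per-round control makes possible is stopping before the fixed round count once the held-out metric stalls. The following is a minimal sketch of that idea, not part of the library: run-until-stalled and its keyword arguments are hypothetical names, and step! stands for any procedure that performs round iter and returns a lower-is-better metric. A vector of made-up numbers plays the role of the booster here.

```racket
#lang racket/base

;; Hypothetical helper, not part of the xgboost binding.
;; step! : round index -> held-out metric (lower is better).
;; Stops after max-rounds, or once `patience` consecutive rounds have
;; failed to improve on the best metric seen so far.
(define (run-until-stalled step! #:max-rounds max-rounds #:patience patience)
  (let loop ([iter 0] [best-iter -1] [best +inf.0] [history '()])
    ;; Rounds completed since the best one (0 before any round has run).
    (define since-best (- iter 1 best-iter))
    (cond
      [(or (>= iter max-rounds) (>= since-best patience))
       (values best-iter best (reverse history))]
      [else
       (define metric (step! iter))
       (if (< metric best)
           (loop (add1 iter) iter metric (cons metric history))
           (loop (add1 iter) best-iter best (cons metric history)))])))

;; Made-up metric sequence standing in for real boosting rounds.
(define fake-metrics (vector 3.0 2.0 1.5 1.6 1.7 1.4))

(define-values (best-iter best seen)
  (run-until-stalled (lambda (iter) (vector-ref fake-metrics iter))
                     #:max-rounds 6
                     #:patience 2))
;; best-iter is 2 and best is 1.5; round 5 is never run.
```

With the booster from this section, step! would wrap the booster-update-one-iter! and eval-one-iter calls and pull the held-out metric out of the parsed hash; which key to look up depends on what parse-eval-line actually produces.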
The harness "test/05-train-with-eval.rkt" prints the per-round table and the final metrics, and asserts that the evaluation RMSE falls over the course of training:
; iter  train-rmse  eval-rmse
;    0      3.8019     3.6960
;   29      0.0530     0.3327
(define (run-example)
  <r05-data>
  <r05-setup>
  <r05-loop>
  (values booster history))
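A check of the kind the harness is described as making could look roughly like the sketch below. It does not reproduce the harness; it uses the first and last rows of the table above as sample data, and it assumes that each history entry is a hash keyed by strings such as "eval-rmse". The name metric-fell? is hypothetical.

```racket
#lang racket/base
(require racket/list)

;; First and last rows of the table above. The string keys are an
;; assumption about what parse-eval-line produces.
(define sample-history
  (list (hash "train-rmse" 3.8019 "eval-rmse" 3.6960)
        (hash "train-rmse" 0.0530 "eval-rmse" 0.3327)))

;; True when the metric under `key` is lower in the last round
;; than in the first.
(define (metric-fell? history key)
  (< (hash-ref (last history) key)
     (hash-ref (first history) key)))

;; Fail loudly if the held-out metric did not improve.
(unless (metric-fell? sample-history "eval-rmse")
  (error 'check "evaluation RMSE did not fall"))
```

Comparing only the first and last rounds is the weakest useful check; it tolerates bumps in between, which suits a tiny dataset like this one.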