2.26 Parameter recipes


Several upstream XGBoost tutorials are really just parameter settings on top of ordinary training. This example collects four, all expressed through #:params: the DART booster ("booster=dart" with drop rates), monotonic constraints ("monotone_constraints"), feature-interaction constraints ("interaction_constraints"), and random-forest mode ("num_parallel_tree" with subsampling, one boosting round). The headline check is that the monotone model really is non-decreasing in the constrained feature.

<r28-require> ::=

(require ffi/vector
         racket/list
         xgboost)

<r28-provide> ::=

(provide run-example)

Synthetic data. A deterministic 200-row set (fixed random seed) in which y increases with feature 0, depends mildly on feature 1, and ignores feature 2, which is pure noise. A small noise term is also added to the label:

<r28-data> ::=

(define ncol 3)
(define (make-data)
  (random-seed 20260531)
  (define (rnd) (- (* 2.0 (random)) 1.0))
  (define xs (for/list ([_ (in-range 200)]) (list (rnd) (rnd) (rnd))))
  (values xs (for/list ([x (in-list xs)])
               (+ (* 2.0 (first x)) (* 0.5 (second x)) (* 0.2 (rnd))))))
(define (rows->dmatrix xs ys)
  (make-dmatrix (list->f32vector (map exact->inexact (append* xs)))
                #:nrow (length xs) #:ncol ncol
                #:labels (list->f32vector (map exact->inexact ys))))
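As a minimal sketch of the layout that rows->dmatrix hands over, the snippet below flattens two rows with append* and reads one cell back. It uses only ffi/vector and racket/list; the cell accessor is a hypothetical helper for illustration and is not part of the xgboost package. The point is that the buffer is row-major, so element (i, j) sits at offset i*ncol + j.

```racket
#lang racket/base
(require ffi/vector racket/list)

;; Two rows of three features each, shaped like the xs from make-data.
(define ncol 3)
(define rows '((0.1 0.2 0.3) (0.4 0.5 0.6)))

;; append* concatenates the rows in order, so the buffer is row-major.
(define flat (list->f32vector (map exact->inexact (append* rows))))

;; Hypothetical accessor (illustration only):
;; element (i, j) lives at offset i*ncol + j.
(define (cell i j)
  (f32vector-ref flat (+ (* i ncol) j)))

(f32vector-length flat) ; nrow * ncol elements
(cell 1 0)              ; row 1, feature 0, stored in single precision
```

Because the values are stored as 32-bit floats, reading them back gives the nearest single-precision value rather than the exact double that went in.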

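The recipes. Each is a thin wrapper over train with a different #:params. The first three share train-recipe (depth 3, eta 0.1, 30 rounds); random-forest mode calls train directly with one round, eta 1.0, and 20 parallel trees: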

<r28-recipes> ::=

(define (train-recipe xs ys extra)
  (train (rows->dmatrix xs ys)
         #:params (cons '(objective . "reg:squarederror") extra)
         #:max-depth 3 #:eta 0.1 #:verbosity 0 #:rounds 30))
(define (recipe-dart xs ys)
  (train-recipe xs ys '((booster . "dart") (rate_drop . "0.1") (skip_drop . "0.5"))))
(define (recipe-monotone xs ys)
  (train-recipe xs ys '((monotone_constraints . "(1,0,0)"))))
(define (recipe-interaction xs ys)
  (train-recipe xs ys '((interaction_constraints . "[[0,1],[2]]"))))
(define (recipe-random-forest xs ys)
  (train (rows->dmatrix xs ys)
         #:params '((objective . "reg:squarederror") (num_parallel_tree . "20")
                    (subsample . "0.8") (colsample_bynode . "0.8"))
         #:max-depth 4 #:eta 1.0 #:verbosity 0 #:rounds 1))
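The constraint values above are plain strings in XGBoost's own syntax. If the constraints start life as Racket lists, they could be rendered with small helpers like the ones sketched here. Both monotone-string and interaction-string are hypothetical names introduced for illustration; the xgboost package is not assumed to provide them.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helpers (illustration only, not part of the xgboost package).

;; One sign per feature: '(1 0 0) -> "(1,0,0)"
(define (monotone-string signs)
  (format "(~a)" (string-join (map number->string signs) ",")))

;; One list of feature indices per group: '((0 1) (2)) -> "[[0,1],[2]]"
(define (interaction-string groups)
  (format "[~a]"
          (string-join
           (for/list ([g (in-list groups)])
             (format "[~a]" (string-join (map number->string g) ",")))
           ",")))

;; The same association list that recipe-monotone passes as extra.
(define monotone-extra
  (list (cons 'monotone_constraints (monotone-string '(1 0 0)))))

(monotone-string '(1 0 0))        ; "(1,0,0)"
(interaction-string '((0 1) (2))) ; "[[0,1],[2]]"
```

Generating the strings keeps the number of entries in step with ncol when the feature count changes.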

Monotonicity probe. Sweep feature 0 from -1.0 to 1.0 in steps of 0.1 with the others held at 0, and count each step where the prediction drops by more than 1e-6:

<r28-probe> ::=

(define (monotone-violations bst)
  (define probe (for/list ([k (in-range 21)]) (list (+ -1.0 (* 0.1 k)) 0.0 0.0)))
  (define preds
    (predict-from-dense bst (list->f32vector (map exact->inexact (append* probe)))
                        #:nrow (length probe) #:ncol ncol))
  (for/sum ([a (in-list preds)] [b (in-list (cdr preds))])
    (if (< b (- a 1e-6)) 1 0)))
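The counting step can be tried without a trained model. The sketch below applies the same pairwise comparison to plain lists of numbers; count-decreases is a hypothetical stand-in for the loop inside monotone-violations and assumes a non-empty list.

```racket
#lang racket/base

;; Count adjacent pairs (a, b) where b falls below a by more than tol.
;; Assumes vals is a non-empty list of real numbers.
(define (count-decreases vals [tol 1e-6])
  (for/sum ([a (in-list vals)] [b (in-list (cdr vals))])
    (if (< b (- a tol)) 1 0)))

(count-decreases '(0.0 0.5 0.5 1.0))     ; 0: flat steps are allowed
(count-decreases '(0.0 0.5 0.4 1.0 0.9)) ; 2: two real drops
(count-decreases '(1.0 0.9999999))       ; 0: drop is within the tolerance
```

The tolerance keeps single-precision rounding noise from being reported as a violation, while a flat prediction still counts as non-decreasing.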

The run. Train all four, record each recipe's round count, a sample prediction, and a flag that no prediction is NaN, then probe the monotone model. run-example returns a hash with the per-recipe summary under 'summary and the violation count under 'violations (expected 0):

<r28-run> ::=

(define (run-example)
  (define-values (xs ys) (make-data))
  (define recipes
    (list (cons "dart" (recipe-dart xs ys))
          (cons "monotone" (recipe-monotone xs ys))
          (cons "interaction" (recipe-interaction xs ys))
          (cons "random-forest" (recipe-random-forest xs ys))))
  (define summary
    (for/list ([r (in-list recipes)])
      (define preds (predict (cdr r) (rows->dmatrix xs ys)))
      (list (car r) (booster-boosted-rounds (cdr r)) (car preds)
            (for/and ([p (in-list preds)]) (= p p)))))
  (define mono (cdr (assoc "monotone" recipes)))
  (hash 'summary summary 'violations (monotone-violations mono)))
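The last element of each summary row relies on (= p p), which is false only for NaN. The sketch below contrasts that with a stricter test built on rational?, which also rejects infinities. Both function names are hypothetical and for illustration only; whether a stricter check is wanted here is a judgment call, not something the example requires.

```racket
#lang racket/base

;; (= p p) fails only for NaN, so this is a NaN test, not a finiteness test.
(define (not-nan? p) (= p p))

;; rational? is #f for +nan.0, +inf.0 and -inf.0, and #t for other reals.
(define (finite-real? p) (and (real? p) (rational? p)))

(map not-nan?     (list 1.5 +nan.0 +inf.0)) ; '(#t #f #t)
(map finite-real? (list 1.5 +nan.0 +inf.0)) ; '(#t #f #f)
```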

The harness "test/28-param-recipes.rkt" prints each recipe’s round count and a sample prediction, and asserts that every recipe gives finite predictions and that the monotone constraint holds across the sweep (zero violations).

<*> ::=

<r28-require>
<r28-provide>
<r28-data>
<r28-recipes>
<r28-probe>
<r28-run>
