XGBoost
(require xgboost)    package: xgboost
This module provides the high-level Racket API for XGBoost. It accepts ordinary Racket data for common training and prediction workflows, while keeping native XGBoost handles behind opaque Racket values.
For lower-level access, use xgboost/ffi. For direct C FFI bindings, use xgboost/ffi/raw.
The xgboost/ffi module also exposes lower-level DMatrix constructors for URI loading, dense array-interface input, CSR, CSC, and columnar array-interface input. These are intended for callers that already work with native-style buffers or XGBoost JSON array-interface strings. It also exposes local DMatrix metadata and dataset operations such as feature names/types, uint info, row slicing, binary DMatrix saving, and quantile-cut inspection. Booster inspection APIs cover lifecycle queries, JSON config round-trips, attributes, feature info, model dumps, and feature scores. CPU inplace prediction APIs support dense, CSR, and columnar array-interface inputs for serving-style prediction without constructing a DMatrix first. Custom objective training is available by supplying gradient and Hessian vectors for one boosting iteration at a time.
1 Example
(require xgboost)

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5)
                  (2.0 1.0 1.5)
                  (3.0 0.5 0.0)
                  (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain
         #:objective "reg:squarederror"
         #:max-depth 2
         #:eta 0.2
         #:verbosity 0
         #:rounds 10))

(predict booster dtrain)
2 Data
procedure
(make-dmatrix data [#:nrow nrow #:ncol ncol #:missing missing #:labels labels #:weights weights]) → dmatrix?
data : any/c
nrow : (or/c #f exact-positive-integer?) = #f
ncol : (or/c #f exact-positive-integer?) = #f
missing : real? = +nan.0
labels : (or/c #f any/c) = #f
weights : (or/c #f any/c) = #f
data may be one of:
a list of row lists
a vector of row vectors
a flat row-major list, vector, or f32vector?, when both #:nrow and #:ncol are supplied
Rows must be rectangular. Labels and weights may be lists, vectors, or f32vector? values, and their lengths must match the inferred row count.
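The three accepted shapes of data can be sketched side by side. This is a minimal illustration that uses only make-dmatrix and dmatrix->list as documented in this section; the 2x3 matrix and its labels are made-up values.

```racket
#lang racket/base
(require xgboost
         ffi/vector) ; provides the f32vector constructor

;; 1. A list of row lists.
(define dm-rows
  (make-dmatrix '((1.0 2.0 0.5)
                  (2.0 1.0 1.5))
                #:labels '(1.0 0.0)))

;; 2. A vector of row vectors, with labels given as a vector.
(define dm-vecs
  (make-dmatrix (vector (vector 1.0 2.0 0.5)
                        (vector 2.0 1.0 1.5))
                #:labels (vector 1.0 0.0)))

;; 3. A flat row-major f32vector; #:nrow and #:ncol are both required here.
(define dm-flat
  (make-dmatrix (f32vector 1.0 2.0 0.5 2.0 1.0 1.5)
                #:nrow 2
                #:ncol 3
                #:labels (f32vector 1.0 0.0)))

;; Read the rows back as nested lists.
(dmatrix->list dm-rows)
```

The flat form avoids building nested containers when the data already lives in a contiguous buffer.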
DMatrix lifetimes are managed by Racket’s GC: the underlying XGBoost handle is reclaimed once the wrapper is unreachable. There is no public free operation; if you need deterministic release for a long-lived workload, import (submod xgboost/ffi unsafe).
procedure
(make-dmatrix-from-csr indptr indices data ncol [missing]) → dmatrix?
indptr : u64vector?
indices : u32vector?
data : f32vector?
ncol : exact-positive-integer?
missing : real? = +nan.0
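As a rough sketch of the CSR layout this constructor expects, the following encodes a small 3x3 matrix by hand. It assumes the standard CSR convention (row i owns the stored entries in positions indptr[i] up to but not including indptr[i+1]); the matrix values are invented for illustration.

```racket
#lang racket/base
(require xgboost
         ffi/vector) ; u64vector, u32vector, f32vector constructors

;; Matrix being encoded (entries shown as _ are simply not stored):
;;   row 0: 1.0   _   2.0
;;   row 1:  _    _   3.0
;;   row 2: 4.0  5.0   _
(define indptr  (u64vector 0 2 3 5))             ; row boundaries into indices/data
(define indices (u32vector 0 2 2 0 1))           ; column of each stored value
(define data    (f32vector 1.0 2.0 3.0 4.0 5.0)) ; stored values, row by row

;; ncol must be given explicitly because trailing empty columns
;; cannot be inferred from the indices alone.
(define dm (make-dmatrix-from-csr indptr indices data 3))

;; Labels can be attached afterwards with the metadata setters.
(dmatrix-set-label! dm '(0.0 1.0 0.0))
```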
procedure
(make-dmatrix-from-csc indptr indices data nrow [missing]) → dmatrix?
indptr : u64vector?
indices : u32vector?
data : f32vector?
nrow : exact-positive-integer?
missing : real? = +nan.0
procedure
(make-dmatrix-from-columnar columns [missing]) → dmatrix?
columns : (listof f32vector?)
missing : real? = +nan.0
procedure
(make-dmatrix-from-uri uri-or-path [#:format format #:silent? silent?]) → dmatrix?
uri-or-path : (or/c path-string? string?)
format : (or/c #f "libsvm" "csv") = #f
silent? : any/c = #t
procedure
dm : dmatrix?
procedure
dm : dmatrix?
procedure
(dmatrix->list dm) → (listof (listof real?))
dm : dmatrix?
procedure
(dmatrix-show dm [port]) → void?
dm : dmatrix? port : output-port? = (current-output-port)
procedure
(dmatrix-slice dm indices [#:allow-groups? allow-groups?]) → dmatrix?
dm : dmatrix?
indices : (or/c list? vector? s32vector?)
allow-groups? : any/c = #f
procedure
(dmatrix-save-binary! dm path [#:silent? silent?]) → void?
dm : dmatrix?
path : path-string?
silent? : any/c = #t
3 DMatrix Metadata
These helpers attach and retrieve the well-known XGBoost label/weight fields, ranking groups, and feature info. Label/weight/group setters accept lists, vectors, or typed vectors and coerce internally.
procedure
(dmatrix-set-label! dm xs) → void?
dm : dmatrix? xs : any/c
procedure
(dmatrix-set-weight! dm xs) → void?
dm : dmatrix? xs : any/c
procedure
(dmatrix-set-base-margin! dm xs) → void?
dm : dmatrix? xs : any/c
procedure
(dmatrix-set-label-lower-bound! dm xs) → void?
dm : dmatrix? xs : any/c
procedure
(dmatrix-set-label-upper-bound! dm xs) → void?
dm : dmatrix? xs : any/c
procedure
(dmatrix-set-group! dm sizes) → void?
dm : dmatrix? sizes : any/c
procedure
(dmatrix-set-feature-names! dm names) → void?
dm : dmatrix? names : (listof string?)
procedure
(dmatrix-set-feature-types! dm types) → void?
dm : dmatrix? types : (listof string?)
procedure
(dmatrix-label dm) → (listof real?)
dm : dmatrix?
procedure
(dmatrix-weight dm) → (listof real?)
dm : dmatrix?
procedure
(dmatrix-base-margin dm) → (listof real?)
dm : dmatrix?
procedure
(dmatrix-group-ptr dm) → (listof exact-nonnegative-integer?)
dm : dmatrix?
procedure
(dmatrix-feature-names dm) → (listof string?)
dm : dmatrix?
procedure
(dmatrix-feature-types dm) → (listof string?)
dm : dmatrix?
procedure
(dmatrix-quantile-cut dm) → (values (listof exact-nonnegative-integer?) f32vector?)
dm : dmatrix?
Quantile cuts exist only after at least one round of training with "tree_method" set to "hist"; calling this before training returns empty results.
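A short sketch of the metadata setters and getters in this section, using a made-up 3x2 matrix and invented feature names. It relies on the statement above that setters accept lists or vectors and coerce internally.

```racket
#lang racket/base
(require xgboost)

;; Build an unlabeled matrix first, then attach metadata.
(define dm (make-dmatrix '((1.0 2.0)
                           (3.0 4.0)
                           (5.0 6.0))))

(dmatrix-set-label! dm '(0.0 1.0 1.0))          ; list input
(dmatrix-set-weight! dm (vector 1.0 0.5 2.0))   ; vector input
(dmatrix-set-feature-names! dm '("age" "income"))

;; Read the fields back as ordinary Racket lists.
(dmatrix-label dm)
(dmatrix-weight dm)
(dmatrix-feature-names dm)
```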
4 Training
procedure
(train dtrain [#:params params #:rounds rounds #:evals evals #:objective objective #:objective-fn objective-fn #:eta eta #:max-depth max-depth #:num-class num-class #:eval-metric eval-metric #:verbosity verbosity]) → booster?
dtrain : dmatrix?
params : any/c = '()
rounds : exact-nonnegative-integer? = 10
evals : (listof (cons/c string? dmatrix?)) = '()
objective : (or/c #f any/c) = #f
objective-fn : (or/c #f (-> f32vector? dmatrix? any)) = #f
eta : (or/c #f any/c) = #f
max-depth : (or/c #f any/c) = #f
num-class : (or/c #f any/c) = #f
eval-metric : (or/c #f any/c) = #f
verbosity : (or/c #f any/c) = #f
params may be a hash or association list. Parameter keys may be strings, symbols, or keywords. Symbol and keyword keys are converted to XGBoost-style names by replacing hyphens with underscores. Parameter values are converted to strings before reaching the FFI layer.
Keyword conveniences such as #:objective and #:max-depth are merged after params, so they override entries with the same XGBoost parameter name.
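The key conversion and merge order described above might look like this in practice. The parameter names subsample and min_child_weight are standard XGBoost parameters; the data and the specific values are illustrative only.

```racket
#lang racket/base
(require xgboost)

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain
         ;; Mixed key styles in one params hash:
         #:params (hash 'max-depth 6            ; symbol key, sent as "max_depth"
                        "subsample" 0.8          ; string key, already XGBoost-style
                        '#:min-child-weight 1)   ; keyword key, sent as "min_child_weight"
         #:objective "reg:squarederror"
         #:max-depth 2   ; merged after params, so it replaces the 6 above
         #:verbosity 0
         #:rounds 5))

(booster-boosted-rounds booster)
```

Keeping shared defaults in #:params and per-call overrides in keywords makes the precedence explicit at the call site.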
When #:objective-fn is supplied, each round runs a Racket-side custom objective: train computes margin predictions and calls (objective-fn preds dtrain), which must return (values grad hess) as lists, vectors, or f32vector? values; those gradients are then fed into one boosting iteration. Use #:rounds 0 to construct an untrained Booster bound to dtrain (and any #:evals entries) for manual stepping with booster-update-one-iter! or booster-train-one-iter!.
The returned Booster retains references to dtrain and all evaluation DMatrices so native cache inputs remain live while the Booster is live.
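A minimal sketch of both training styles described above: a custom objective passed through #:objective-fn, and manual stepping after #:rounds 0. The objective shown is plain squared error, for which the gradient is (pred - label) and the Hessian is constant 1; the function name squared-error-objective is a hypothetical helper, not part of the library.

```racket
#lang racket/base
(require xgboost
         ffi/vector) ; in-f32vector for iterating over preds

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

;; Hypothetical helper: loss = 1/2 (pred - y)^2,
;; so grad = pred - y and hess = 1 for every row.
(define (squared-error-objective preds dm)
  (define labels (dmatrix-label dm))
  (define grad
    (for/list ([p (in-f32vector preds)]
               [y (in-list labels)])
      (- p y)))
  (define hess
    (for/list ([y (in-list labels)]) 1.0))
  (values grad hess))

;; Style 1: let train drive the rounds with the custom objective.
(define custom
  (train dtrain
         #:objective-fn squared-error-objective
         #:max-depth 2
         #:eta 0.2
         #:verbosity 0
         #:rounds 10))

;; Style 2: build an untrained Booster, then step it by hand
;; with the built-in objective.
(define manual
  (train dtrain
         #:objective "reg:squarederror"
         #:max-depth 2
         #:verbosity 0
         #:rounds 0))

(for ([i (in-range 3)])
  (booster-update-one-iter! manual i dtrain))

(predict custom dtrain)
(booster-boosted-rounds manual)
```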
procedure
(make-booster) → booster?
procedure
(booster-cache booster) → (listof dmatrix?)
booster : booster?
procedure
(booster-num-feature booster) → exact-nonnegative-integer?
booster : booster?
procedure
(booster-boosted-rounds booster) → exact-nonnegative-integer?
booster : booster?
procedure
(booster-reset! booster) → void?
booster : booster?
procedure
(booster-slice booster begin-layer end-layer [step]) → booster?
booster : booster?
begin-layer : exact-integer?
end-layer : exact-integer?
step : exact-positive-integer? = 1
procedure
(booster-set-param! booster key value) → void?
booster : booster? key : any/c value : any/c
procedure
(booster-update-one-iter! booster iter dtrain) → void?
booster : booster?
iter : exact-integer?
dtrain : dmatrix?
procedure
(booster-train-one-iter! booster iter dtrain grad hess) → void?
booster : booster?
iter : exact-integer?
dtrain : dmatrix?
grad : any/c
hess : any/c
5 Booster Attributes and Configuration
procedure
(booster-set-attr! booster key value) → void?
booster : booster? key : string? value : string?
procedure
(booster-attr booster key) → (or/c #f string?)
booster : booster? key : string?
procedure
(booster-attr-names booster) → (listof string?)
booster : booster?
procedure
(booster-delete-attr! booster key) → void?
booster : booster? key : string?
procedure
(booster-set-feature-names! booster names) → void?
booster : booster? names : (listof string?)
procedure
(booster-set-feature-types! booster types) → void?
booster : booster? types : (listof string?)
procedure
(booster-feature-names booster) → (listof string?)
booster : booster?
procedure
(booster-feature-types booster) → (listof string?)
booster : booster?
procedure
(booster-config booster) → string?
booster : booster?
procedure
(booster-set-config! booster config) → void?
booster : booster? config : string?
procedure
(booster-dump booster [#:format format #:with-stats? with-stats? #:feature-names feature-names #:feature-types feature-types]) → (listof string?)
booster : booster?
format : (or/c "text" "json" "dot") = "text"
with-stats? : any/c = #f
feature-names : (or/c #f (listof string?)) = #f
feature-types : (or/c #f (listof string?)) = #f
procedure
(booster-feature-score booster [#:importance-type importance-type #:feature-names feature-names #:config config]) → (hash/c symbol? any/c)
booster : booster?
importance-type : string? = "weight"
feature-names : (or/c #f (listof string?)) = #f
config : (or/c #f string?) = #f
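The inspection procedures in this section can be combined as in the sketch below. The attribute key "note" and its value are arbitrary, and "gain" is one of XGBoost's standard importance types; the exact shape of the hash returned by booster-feature-score is not specified here, so the example only prints it.

```racket
#lang racket/base
(require xgboost)

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain #:objective "reg:squarederror" #:max-depth 2
         #:verbosity 0 #:rounds 3))

;; Attributes are free-form string key/value pairs stored on the Booster.
(booster-set-attr! booster "note" "baseline")
(booster-attr booster "note")
(booster-attr-names booster)

;; One dump string per tree; with-stats adds per-node statistics.
(for-each displayln
          (booster-dump booster #:format "text" #:with-stats? #t))

;; Feature importance using the "gain" importance type.
(booster-feature-score booster #:importance-type "gain")
```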
6 Inplace Prediction
These variants run prediction directly against ordinary Racket data without constructing a DMatrix. They share the same #:output, #:iteration-end, and #:as keywords as predict.
procedure
(predict-from-dense booster data [#:nrow nrow #:ncol ncol #:missing missing #:output output #:iteration-end iteration-end #:as as]) → (or/c (listof real?) f32vector?)
booster : booster?
data : any/c
nrow : (or/c #f exact-positive-integer?) = #f
ncol : (or/c #f exact-positive-integer?) = #f
missing : real? = +nan.0
output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
iteration-end : exact-nonnegative-integer? = 0
as : (or/c 'list 'f32vector) = 'list
procedure
(predict-from-csr booster indptr indices data ncol [#:missing missing #:output output #:iteration-end iteration-end #:as as]) → (or/c (listof real?) f32vector?)
booster : booster?
indptr : u64vector?
indices : u32vector?
data : f32vector?
ncol : exact-positive-integer?
missing : real? = +nan.0
output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
iteration-end : exact-nonnegative-integer? = 0
as : (or/c 'list 'f32vector) = 'list
procedure
(predict-from-columnar booster columns [#:missing missing #:output output #:iteration-end iteration-end #:as as]) → (or/c (listof real?) f32vector?)
booster : booster?
columns : (listof f32vector?)
missing : real? = +nan.0
output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
iteration-end : exact-nonnegative-integer? = 0
as : (or/c 'list 'f32vector) = 'list
procedure
(predict booster dmat [#:output output #:iteration-end iteration-end #:as as]) → (or/c (listof real?) f32vector?)
booster : booster?
dmat : dmatrix?
output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
iteration-end : exact-nonnegative-integer? = 0
as : (or/c 'list 'f32vector) = 'list
By default, predictions are returned as a list. Pass #:as 'f32vector to receive an f32vector? copied from the XGBoost output instead.
#:iteration-end limits prediction to the first N boosting rounds; 0 means all available rounds.
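The keywords above might be combined as follows. This sketch uses only predict on a DMatrix; the training data is illustrative.

```racket
#lang racket/base
(require xgboost
         ffi/vector) ; f32vector? predicate

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain #:objective "reg:squarederror" #:max-depth 2
         #:eta 0.2 #:verbosity 0 #:rounds 10))

;; Default: transformed values from all rounds, returned as a list.
(define all-rounds (predict booster dtrain))

;; Use only the first 3 boosting rounds.
(define first-3 (predict booster dtrain #:iteration-end 3))

;; Raw margins, returned as an f32vector instead of a list.
(define margins (predict booster dtrain #:output 'margin #:as 'f32vector))

(list all-rounds first-3 (f32vector? margins))
```

Comparing first-3 against all-rounds is a quick way to see how much later rounds still change the predictions.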
7 Evaluation
procedure
(eval-one-iter booster iter evals) → string?
booster : booster? iter : exact-integer? evals : (listof (cons/c string? dmatrix?))
procedure
(parse-eval-line line) → (hash/c string? real?)
line : string?
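A brief sketch of evaluating one iteration and parsing the result. The exact text of the evaluation line and the key names in the parsed hash depend on XGBoost and the configured metric, so the example prints them rather than assuming a format; the dataset name "train" is arbitrary.

```racket
#lang racket/base
(require xgboost)

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain #:objective "reg:squarederror" #:eval-metric "rmse"
         #:max-depth 2 #:verbosity 0 #:rounds 5))

;; Each evals entry pairs a display name with a DMatrix.
(define line
  (eval-one-iter booster 4 (list (cons "train" dtrain))))

(displayln line)

;; Turn the raw line into a hash from metric name to value.
(define metrics (parse-eval-line line))
(for ([(name value) (in-hash metrics)])
  (printf "~a = ~a\n" name value))
```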
8 Model IO
procedure
(save-model booster path) → void?
booster : booster? path : path-string?
procedure
(load-model path) → booster?
path : path-string?
procedure
(save-model-to-bytes booster [#:format format]) → bytes?
booster : booster?
format : (or/c "json" "ubj") = "ubj"
procedure
(load-model-from-bytes data) → booster?
data : bytes?
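An in-memory round trip through the byte-string procedures could look like this. The data and training settings are illustrative; the sketch only shows that a reloaded model can be used for prediction.

```racket
#lang racket/base
(require xgboost)

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain #:objective "reg:squarederror" #:max-depth 2
         #:verbosity 0 #:rounds 5))

;; Serialize with the default "ubj" format and with "json".
(define ubj-bytes  (save-model-to-bytes booster))
(define json-bytes (save-model-to-bytes booster #:format "json"))

;; Rebuild a Booster from the JSON bytes and predict with it.
(define restored (load-model-from-bytes json-bytes))

(predict booster dtrain)
(predict restored dtrain)
```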
9 Booster Snapshots
save-model / load-model persist only the trained tree ensemble. Snapshots additionally capture XGBoost’s internal training caches, so a restored Booster can resume per-iteration updates in lockstep with the original.
procedure
(booster->bytes booster) → bytes?
booster : booster?
procedure
(bytes->booster data) → booster?
data : bytes?
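As a sketch of resuming training from a snapshot, the following captures a Booster after five rounds, restores it, and steps the restored copy once more. It assumes the restored Booster can be updated against the same dtrain that the original used.

```racket
#lang racket/base
(require xgboost)

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain #:objective "reg:squarederror" #:max-depth 2
         #:verbosity 0 #:rounds 5))

;; Capture the full training state, not just the tree ensemble.
(define snapshot (booster->bytes booster))

;; Restore and continue with round index 5 (the sixth round).
(define resumed (bytes->booster snapshot))
(booster-update-one-iter! resumed 5 dtrain)

(list (booster-boosted-rounds booster)
      (booster-boosted-rounds resumed))
```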
10 Version
procedure
procedure
11 Process Configuration
procedure
procedure
(xgboost-set-global-config! config) → void?
config : string?
procedure
(xgboost-register-log-callback! callback) → void?
callback : (-> string? any/c)
The callback receives log messages as strings. Since the registration is process-global, callers should treat it as shared mutable process state.
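One cautious way to use the two procedures above is to collect log messages into a list owned by the caller. The JSON key "verbosity" is a standard XGBoost global configuration option; whether any messages are actually emitted depends on XGBoost and the verbosity level, so the example makes no assumption about the count.

```racket
#lang racket/base
(require xgboost)

;; Messages are accumulated newest-first in this caller-owned list.
(define messages '())

(xgboost-register-log-callback!
 (lambda (msg)
   (set! messages (cons msg messages))))

;; Global configuration is passed as a JSON string.
(xgboost-set-global-config! "{\"verbosity\": 2}")

(define dtrain
  (make-dmatrix '((1.0 2.0 0.5) (2.0 1.0 1.5) (3.0 0.5 0.0) (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))

(define booster
  (train dtrain #:objective "reg:squarederror" #:max-depth 2 #:rounds 2))

(printf "captured ~a log message(s)\n" (length messages))
```

Because the registration is process-global, a library that installs a callback this way may replace one installed by the application, so it is usually safer to register once at program startup.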