XGBoost
1 Example
2 Data
dmatrix?
make-dmatrix
make-dmatrix-from-csr
make-dmatrix-from-csc
make-dmatrix-from-columnar
make-dmatrix-from-uri
dmatrix-rows
dmatrix-cols
dmatrix->list
dmatrix-show
dmatrix-slice
dmatrix-save-binary!
3 DMatrix Metadata
dmatrix-set-label!
dmatrix-set-weight!
dmatrix-set-base-margin!
dmatrix-set-label-lower-bound!
dmatrix-set-label-upper-bound!
dmatrix-set-group!
dmatrix-set-feature-names!
dmatrix-set-feature-types!
dmatrix-label
dmatrix-weight
dmatrix-base-margin
dmatrix-group-ptr
dmatrix-feature-names
dmatrix-feature-types
dmatrix-quantile-cut
4 Training
booster?
train
make-booster
booster-cache
booster-num-feature
booster-boosted-rounds
booster-reset!
booster-slice
booster-set-param!
booster-update-one-iter!
booster-train-one-iter!
5 Booster Attributes and Configuration
booster-set-attr!
booster-attr
booster-attr-names
booster-delete-attr!
booster-set-feature-names!
booster-set-feature-types!
booster-feature-names
booster-feature-types
booster-config
booster-set-config!
booster-dump
booster-feature-score
6 Inplace Prediction
predict-from-dense
predict-from-csr
predict-from-columnar
predict
7 Evaluation
eval-one-iter
parse-eval-line
8 Model IO
save-model
load-model
save-model-to-bytes
load-model-from-bytes
9 Booster Snapshots
booster->bytes
bytes->booster
10 Version
xgboost-version
xgboost-build-info
11 Process Configuration
xgboost-get-global-config
xgboost-set-global-config!
xgboost-register-log-callback!

XGBoost

 (require xgboost) package: xgboost

This module provides the high-level Racket API for XGBoost. It accepts ordinary Racket data for common training and prediction workflows, while keeping native XGBoost handles behind opaque Racket values.

For lower-level access, use xgboost/ffi. For direct C FFI bindings, use xgboost/ffi/raw.

The xgboost/ffi module also exposes lower-level DMatrix constructors for URI loading, dense array-interface input, CSR, CSC, and columnar array-interface input. These are intended for callers that already work with native-style buffers or XGBoost JSON array-interface strings. It also exposes local DMatrix metadata and dataset operations such as feature names/types, uint info, row slicing, binary DMatrix saving, and quantile-cut inspection. Booster inspection APIs cover lifecycle queries, JSON config round-trips, attributes, feature info, model dumps, and feature scores. CPU inplace prediction APIs support dense, CSR, and columnar array-interface inputs for serving-style prediction without constructing a DMatrix first. Custom objective training is available by supplying gradient and Hessian vectors for one boosting iteration at a time.

1 Example

(require xgboost)
 
(define dtrain
  (make-dmatrix '((1.0 2.0 0.5)
                  (2.0 1.0 1.5)
                  (3.0 0.5 0.0)
                  (0.5 3.0 2.0))
                #:labels '(3.5 3.5 6.5 2.0)))
 
(define booster
  (train dtrain
         #:objective "reg:squarederror"
         #:max-depth 2
         #:eta 0.2
         #:verbosity 0
         #:rounds 10))
 
(predict booster dtrain)

2 Data

procedure

(dmatrix? v)  boolean?

  v : any/c
Returns #t when v is a high-level DMatrix value, such as one produced by make-dmatrix or the other constructors in this section, and #f otherwise.

procedure

(make-dmatrix data    
  [#:nrow nrow    
  #:ncol ncol    
  #:missing missing    
  #:labels labels    
  #:weights weights])  dmatrix?
  data : any/c
  nrow : (or/c #f exact-positive-integer?) = #f
  ncol : (or/c #f exact-positive-integer?) = #f
  missing : real? = +nan.0
  labels : (or/c #f any/c) = #f
  weights : (or/c #f any/c) = #f
Creates an XGBoost DMatrix.

data may be one of:

  • a list of row lists

  • a vector of row vectors

  • a flat row-major list, vector, or f32vector?, when both #:nrow and #:ncol are supplied

Rows must be rectangular. Labels and weights may be lists, vectors, or f32vector? values, and their lengths must match the inferred row count.
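
The flat row-major form can be easier to picture with a small sketch. The helper below, rows->flat, is hypothetical and not part of xgboost; it only illustrates the layout that #:nrow and #:ncol describe, where entry (i, j) sits at index i * ncol + j.

```racket
#lang racket/base
(require ffi/vector)

;; Hypothetical helper (not part of xgboost): flatten rectangular rows
;; into a row-major f32vector and return the buffer with its shape.
(define (rows->flat rows)
  (define nrow (length rows))
  (define ncol (if (null? rows) 0 (length (car rows))))
  (for ([r (in-list rows)])
    (unless (= (length r) ncol)
      (error 'rows->flat "every row must have ~a entries" ncol)))
  (define flat (make-f32vector (* nrow ncol)))
  (for ([r (in-list rows)] [i (in-naturals)])
    (for ([x (in-list r)] [j (in-naturals)])
      ;; Row-major: entry (i, j) is stored at index i * ncol + j.
      (f32vector-set! flat (+ (* i ncol) j) (real->double-flonum x))))
  (values flat nrow ncol))

(define-values (flat nrow ncol)
  (rows->flat '((1.0 2.0 0.5)
                (2.0 1.0 1.5))))

;; Entry (row 1, column 2) lives at index 1 * 3 + 2 = 5.
(f32vector-ref flat 5)
```

Under these assumptions, the resulting buffer would be passed as (make-dmatrix flat #:nrow nrow #:ncol ncol).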

DMatrix lifetimes are managed by Racket’s GC: the underlying XGBoost handle is reclaimed once the wrapper is unreachable. There is no public free operation; if you need deterministic release for a long-lived workload, import (submod xgboost/ffi unsafe).

procedure

(make-dmatrix-from-csr indptr    
  indices    
  data    
  ncol    
  [missing])  dmatrix?
  indptr : u64vector?
  indices : u32vector?
  data : f32vector?
  ncol : exact-positive-integer?
  missing : real? = +nan.0
Creates a DMatrix from CSR storage: indptr holds row offsets, indices holds column indices, and data holds the corresponding values. Entries that are not stored in the sparse structure materialize as missing (the value of the missing argument).
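
As an illustration of the three CSR vectors, the following sketch builds them from dense rows. rows->csr is a hypothetical helper, not part of xgboost, and it assumes that zero cells are the ones to leave out. Because unstored entries are treated as missing, dropping zeros this way turns them into missing values rather than 0.0 once loaded.

```racket
#lang racket/base
(require ffi/vector)

;; Hypothetical helper (not part of xgboost): build CSR vectors from
;; dense rows, leaving out cells that are exactly zero.
(define (rows->csr rows)
  (define-values (rev-ptr rev-idx rev-val)
    (for/fold ([ptr '(0)] [idx '()] [val '()])
              ([row (in-list rows)])
      (define-values (idx* val*)
        (for/fold ([is idx] [vs val])
                  ([x (in-list row)]
                   [j (in-naturals)]
                   #:unless (zero? x))
          (values (cons j is) (cons x vs))))
      ;; Each row offset is the number of entries stored so far.
      (values (cons (length idx*) ptr) idx* val*)))
  (values (list->u64vector (reverse rev-ptr))
          (list->u32vector (reverse rev-idx))
          (list->f32vector (reverse rev-val))))

(define-values (indptr indices data)
  (rows->csr '((1.0 0.0 2.0)
               (0.0 0.0 3.0)
               (4.0 5.0 0.0))))

(u64vector->list indptr)  ; row i spans [indptr[i], indptr[i+1])
(u32vector->list indices) ; column index of each stored value
(f32vector->list data)    ; the stored values, row by row
```

These vectors match the argument types listed for make-dmatrix-from-csr, with ncol supplied separately (3 in this example).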

procedure

(make-dmatrix-from-csc indptr    
  indices    
  data    
  nrow    
  [missing])  dmatrix?
  indptr : u64vector?
  indices : u32vector?
  data : f32vector?
  nrow : exact-positive-integer?
  missing : real? = +nan.0
Creates a DMatrix from CSC storage: indptr holds column offsets, indices holds row indices, and data holds values.

procedure

(make-dmatrix-from-columnar columns    
  [missing])  dmatrix?
  columns : (listof f32vector?)
  missing : real? = +nan.0
Creates a DMatrix from column-major f32vector? columns. All columns must have the same length.
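
If the data starts out as row lists, the columnar form is a transpose. The sketch below uses a hypothetical helper, rows->columns, that is not part of xgboost; it only shows the shape of the columns argument.

```racket
#lang racket/base
(require ffi/vector)

;; Hypothetical helper (not part of xgboost): transpose row lists into
;; one f32vector per column.
(define (rows->columns rows)
  (if (null? rows)
      '()
      (map list->f32vector (apply map list rows))))

(define columns
  (rows->columns '((1.0 2.0 0.5)
                   (2.0 1.0 1.5))))

;; Three columns, each holding two values (one per row).
(map f32vector->list columns)
```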

procedure

(make-dmatrix-from-uri uri-or-path    
  [#:format format    
  #:silent? silent?])  dmatrix?
  uri-or-path : (or/c path-string? string?)
  format : (or/c #f "libsvm" "csv") = #f
  silent? : any/c = #t
Loads a DMatrix from a file path or XGBoost URI. When #:format is a string, it is appended to the URI as a query parameter; for example, pass #:format "libsvm" to load a LIBSVM-format file.

procedure

(dmatrix-rows dm)  exact-nonnegative-integer?

  dm : dmatrix?
Returns the row count of dm.

procedure

(dmatrix-cols dm)  exact-nonnegative-integer?

  dm : dmatrix?
Returns the column count of dm.

procedure

(dmatrix->list dm)  (listof (listof real?))

  dm : dmatrix?
Materializes dm as a list of row lists. Missing entries appear as +nan.0.

procedure

(dmatrix-show dm [port])  void?

  dm : dmatrix?
  port : output-port? = (current-output-port)
Writes a human-readable rendering of dm to port.

procedure

(dmatrix-slice dm    
  indices    
  [#:allow-groups? allow-groups?])  dmatrix?
  dm : dmatrix?
  indices : (or/c list? vector? s32vector?)
  allow-groups? : any/c = #f
Returns a fresh DMatrix containing the rows of dm selected by indices. Pass #:allow-groups? #t to slice a DMatrix that carries ranking group information.

procedure

(dmatrix-save-binary! dm    
  path    
  [#:silent? silent?])  void?
  dm : dmatrix?
  path : path-string?
  silent? : any/c = #t
Writes dm to path in XGBoost’s binary buffer format. Reload with make-dmatrix-from-uri.

3 DMatrix Metadata

These helpers attach and retrieve the well-known XGBoost label/weight fields, ranking groups, and feature info. Label/weight/group setters accept lists, vectors, or typed vectors and coerce internally.

procedure

(dmatrix-set-label! dm xs)  void?

  dm : dmatrix?
  xs : any/c
Sets the "label" float info field.

procedure

(dmatrix-set-weight! dm xs)  void?

  dm : dmatrix?
  xs : any/c
Sets the "weight" float info field.

procedure

(dmatrix-set-base-margin! dm xs)  void?

  dm : dmatrix?
  xs : any/c
Sets the "base_margin" float info field.

procedure

(dmatrix-set-label-lower-bound! dm xs)  void?

  dm : dmatrix?
  xs : any/c
Sets the "label_lower_bound" field used by AFT survival objectives.

procedure

(dmatrix-set-label-upper-bound! dm xs)  void?

  dm : dmatrix?
  xs : any/c
Sets the "label_upper_bound" field used by AFT survival objectives. Use +inf.0 for right-censored observations.
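
A minimal sketch of preparing the two bound lists follows. The (cons time observed?) record shape is an assumption made for this example. Setting both bounds to the observed time for uncensored records follows the usual XGBoost AFT convention rather than anything stated in this section.

```racket
#lang racket/base

;; Assumed record shape for this example: (cons time observed?), where
;; observed? is #f when the record is right-censored.
(define observations
  (list (cons 5.0 #t)
        (cons 8.0 #f)
        (cons 2.5 #t)))

;; Lower bound: the recorded time for every observation.
(define lower (map car observations))

;; Upper bound: the same time when the event was observed,
;; +inf.0 when the observation is right-censored.
(define upper
  (for/list ([o (in-list observations)])
    (if (cdr o) (car o) +inf.0)))
```

The two lists would then be passed to dmatrix-set-label-lower-bound! and dmatrix-set-label-upper-bound! respectively.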

procedure

(dmatrix-set-group! dm sizes)  void?

  dm : dmatrix?
  sizes : any/c
Sets the ranking group sizes. The cumulative "group_ptr" is maintained by XGBoost and read back via dmatrix-group-ptr.
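
The relationship between group sizes and the cumulative pointer can be sketched in plain Racket. group-sizes->ptr is a hypothetical helper, and the leading 0 is an assumption based on XGBoost's usual offset layout.

```racket
#lang racket/base

;; Hypothetical helper (not part of xgboost): running totals of the
;; group sizes, starting from an assumed leading offset of 0.
(define (group-sizes->ptr sizes)
  (reverse
   (for/fold ([acc '(0)]) ([s (in-list sizes)])
     (cons (+ s (car acc)) acc))))

;; Three query groups of 3, 2, and 4 rows.
(group-sizes->ptr '(3 2 4))
```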

procedure

(dmatrix-set-feature-names! dm names)  void?

  dm : dmatrix?
  names : (listof string?)

procedure

(dmatrix-set-feature-types! dm types)  void?

  dm : dmatrix?
  types : (listof string?)
Set the feature-name and feature-type metadata.

procedure

(dmatrix-label dm)  (listof real?)

  dm : dmatrix?

procedure

(dmatrix-weight dm)  (listof real?)

  dm : dmatrix?

procedure

(dmatrix-base-margin dm)  (listof real?)

  dm : dmatrix?

procedure

(dmatrix-group-ptr dm)  (listof exact-nonnegative-integer?)

  dm : dmatrix?

procedure

(dmatrix-feature-names dm)  (listof string?)

  dm : dmatrix?

procedure

(dmatrix-feature-types dm)  (listof string?)

  dm : dmatrix?
Read the corresponding metadata fields back as Racket data.

dmatrix-quantile-cut returns (values indptr data) for the quantile cuts XGBoost computed during hist-mode training. indptr holds per-feature offsets into data; the final entry of indptr equals the total cut count.

Quantile cuts exist only after at least one round of training with "tree_method" set to "hist"; calling this before training returns empty results.
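
To read the cuts per feature, the offsets can be used to split the flat values. The sketch below assumes both results have already been converted to lists; split-cuts is a hypothetical helper, and the sample numbers are made up for illustration.

```racket
#lang racket/base

;; Hypothetical helper (not part of xgboost): split a flat list of cut
;; values into one list per feature using CSR-style offsets.
(define (split-cuts indptr data)
  (for/list ([start (in-list indptr)]
             [end (in-list (cdr indptr))])
    (for/list ([k (in-range start end)])
      (list-ref data k))))

;; Made-up sample: feature 0 owns data[0..2), feature 1 owns data[2..5).
(split-cuts '(0 2 5) '(0.5 1.5 0.1 0.2 0.9))
```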

4 Training

procedure

(booster? v)  boolean?

  v : any/c
Returns #t when v is a high-level Booster value, such as one produced by train, make-booster, load-model, load-model-from-bytes, or bytes->booster, and #f otherwise.

procedure

(train dtrain    
  [#:params params    
  #:rounds rounds    
  #:evals evals    
  #:objective objective    
  #:objective-fn objective-fn    
  #:eta eta    
  #:max-depth max-depth    
  #:num-class num-class    
  #:eval-metric eval-metric    
  #:verbosity verbosity])  booster?
  dtrain : dmatrix?
  params : any/c = '()
  rounds : exact-nonnegative-integer? = 10
  evals : (listof (cons/c string? dmatrix?)) = '()
  objective : (or/c #f any/c) = #f
  objective-fn : (or/c #f (-> f32vector? dmatrix? any)) = #f
  eta : (or/c #f any/c) = #f
  max-depth : (or/c #f any/c) = #f
  num-class : (or/c #f any/c) = #f
  eval-metric : (or/c #f any/c) = #f
  verbosity : (or/c #f any/c) = #f
Trains a Booster for rounds boosting rounds.

params may be a hash or association list. Parameter keys may be strings, symbols, or keywords. Symbol and keyword keys are converted to XGBoost-style names by replacing hyphens with underscores. Parameter values are converted to strings before reaching the FFI layer.

Keyword conveniences such as #:objective and #:max-depth are merged after params, so they override entries with the same XGBoost parameter name.
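
The key and merge rules above can be mimicked in a few lines. This is a hypothetical re-creation for illustration, not the library's implementation; in particular, how values are formatted as strings is an assumption.

```racket
#lang racket/base
(require racket/string)

;; Strings pass through; symbols and keywords have hyphens replaced
;; with underscores.
(define (normalize-key k)
  (cond
    [(string? k) k]
    [(symbol? k) (string-replace (symbol->string k) "-" "_")]
    [(keyword? k) (string-replace (keyword->string k) "-" "_")]
    [else (error 'normalize-key "unsupported key: ~e" k)]))

;; Assumption: non-string values are rendered with display-style
;; formatting.
(define (normalize-value v)
  (if (string? v) v (format "~a" v)))

;; Later entries win, so keyword conveniences override params.
(define (merge-params params overrides)
  (for/fold ([h (hash)])
            ([p (in-list (append params overrides))])
    (hash-set h (normalize-key (car p)) (normalize-value (cdr p)))))

(define merged
  (merge-params '((max-depth . 2) ("eta" . 0.3))
                '((#:eta . 0.2))))
```

Here merged maps "max_depth" to "2" and "eta" to "0.2", because the override is applied after the params entry with the same XGBoost name.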

When #:objective-fn is supplied, each round runs a Racket-side custom objective: train computes margin predictions, calls (objective-fn preds dtrain), which must return (values grad hess) as lists, vectors, or f32vector? values, and feeds those gradients into one boosting iteration. Use #:rounds 0 to construct an untrained Booster bound to dtrain (and any #:evals entries) for manual stepping with booster-update-one-iter! or booster-train-one-iter!.
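
As a sketch of what a custom objective computes, the helper below returns the gradient and Hessian of the squared-error loss 1/2 (pred - label)^2. squared-error-grad-hess is a hypothetical name; labels are passed in explicitly so the example runs without a DMatrix.

```racket
#lang racket/base
(require ffi/vector)

;; Hypothetical helper: gradient and Hessian of 1/2 (pred - label)^2
;; with respect to pred, one entry per row.
(define (squared-error-grad-hess preds labels)
  (define grad
    (for/list ([p (in-f32vector preds)]
               [y (in-list labels)])
      (- p y)))
  (define hess
    (for/list ([_ (in-list labels)]) 1.0))
  (values grad hess))

(define-values (grad hess)
  (squared-error-grad-hess (f32vector 3.0 4.0 6.0)
                           '(3.5 3.5 6.5)))

;; A wrapper with the shape #:objective-fn expects would look like:
;;   (lambda (preds dtrain)
;;     (squared-error-grad-hess preds (dmatrix-label dtrain)))
```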

The returned Booster retains references to dtrain and all evaluation DMatrices so native cache inputs remain live while the Booster is live.

procedure

(make-booster)  booster?

Creates an empty Booster with no DMatrix cache. Useful for setting attributes ahead of training, loading a JSON config, or preparing a target for bytes->booster.

procedure

(booster-cache booster)  (listof dmatrix?)

  booster : booster?
Returns the list of DMatrices the Booster keeps alive (the training DMatrix followed by any #:evals DMatrices).

procedure

(booster-num-feature booster)  exact-nonnegative-integer?

  booster : booster?
Returns the number of features the Booster was trained on.

procedure

(booster-boosted-rounds booster)  exact-nonnegative-integer?

  booster : booster?
Returns how many boosting rounds the Booster has been trained for.

procedure

(booster-reset! booster)  void?

  booster : booster?
Releases internal training caches without resetting the trained model. booster-boosted-rounds is unchanged after reset.

procedure

(booster-slice booster    
  begin-layer    
  end-layer    
  [step])  booster?
  booster : booster?
  begin-layer : exact-integer?
  end-layer : exact-integer?
  step : exact-positive-integer? = 1
Returns a fresh Booster containing only the trees in the half-open layer range from begin-layer (inclusive) to end-layer (exclusive), taking every step-th layer. The result is independent of booster and shares no XGBoost-internal state.
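
The half-open range can be illustrated by listing the layer indices it covers. This sketch assumes step strides through the range the way XGBoost's slicing usually does; selected-layers is a hypothetical helper, not part of xgboost.

```racket
#lang racket/base

;; Hypothetical helper (not part of xgboost): the layer indices in
;; [begin-layer, end-layer), visited in strides of step.
(define (selected-layers begin-layer end-layer [step 1])
  (for/list ([i (in-range begin-layer end-layer step)]) i))

(selected-layers 0 3)    ; the first three layers
(selected-layers 2 10 3) ; every third layer starting at layer 2
```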

procedure

(booster-set-param! booster key value)  void?

  booster : booster?
  key : any/c
  value : any/c
Sets a single XGBoost parameter. Keys and values are coerced through the same rules as train’s #:params.

procedure

(booster-update-one-iter! booster    
  iter    
  dtrain)  void?
  booster : booster?
  iter : exact-integer?
  dtrain : dmatrix?
Runs one boosting round on dtrain using XGBoost’s built-in objective. iter is the index of the round being run.

procedure

(booster-train-one-iter! booster    
  iter    
  dtrain    
  grad    
  hess)  void?
  booster : booster?
  iter : exact-integer?
  dtrain : dmatrix?
  grad : any/c
  hess : any/c
Runs one custom-objective boosting round using user-supplied gradient and Hessian vectors. Both vectors must have one entry per row of dtrain (or per row times num-class for multiclass objectives).

5 Booster Attributes and Configuration

procedure

(booster-set-attr! booster key value)  void?

  booster : booster?
  key : string?
  value : string?

procedure

(booster-attr booster key)  (or/c #f string?)

  booster : booster?
  key : string?

procedure

(booster-attr-names booster)  (listof string?)

  booster : booster?

procedure

(booster-delete-attr! booster key)  void?

  booster : booster?
  key : string?
Read, write, list, and delete user-defined string attributes on booster. booster-attr returns #f when the key is absent.

procedure

(booster-set-feature-names! booster names)  void?

  booster : booster?
  names : (listof string?)

procedure

(booster-set-feature-types! booster types)  void?

  booster : booster?
  types : (listof string?)

procedure

(booster-feature-names booster)  (listof string?)

  booster : booster?

procedure

(booster-feature-types booster)  (listof string?)

  booster : booster?
Read and write the feature-name and feature-type metadata stored on booster.

procedure

(booster-config booster)  string?

  booster : booster?
Returns booster’s configuration as a JSON string. The shape is XGBoost’s wire format; treat it as opaque.

procedure

(booster-set-config! booster config)  void?

  booster : booster?
  config : string?
Loads a JSON configuration produced by booster-config.

procedure

(booster-dump booster    
  [#:format format    
  #:with-stats? with-stats?    
  #:feature-names feature-names    
  #:feature-types feature-types])  (listof string?)
  booster : booster?
  format : (or/c "text" "json" "dot") = "text"
  with-stats? : any/c = #f
  feature-names : (or/c #f (listof string?)) = #f
  feature-types : (or/c #f (listof string?)) = #f
Returns one string per tree in booster. Pass #:feature-names and #:feature-types together to substitute human-readable names into the dump.

procedure

(booster-feature-score booster 
  [#:importance-type importance-type 
  #:feature-names feature-names 
  #:config config]) 
  (hash/c symbol? any/c)
  booster : booster?
  importance-type : string? = "weight"
  feature-names : (or/c #f (listof string?)) = #f
  config : (or/c #f string?) = #f
Returns a hash with keys 'features, 'shape, and 'scores describing per-feature importance. Pass #:importance-type to choose "weight", "gain", or similar XGBoost importance modes.

6 Inplace Prediction

These variants run prediction directly against ordinary Racket data without constructing a DMatrix. They accept the same #:output, #:iteration-end, and #:as keywords as predict.

procedure

(predict-from-dense booster 
  data 
  [#:nrow nrow 
  #:ncol ncol 
  #:missing missing 
  #:output output 
  #:iteration-end iteration-end 
  #:as as]) 
  (or/c (listof real?) f32vector?)
  booster : booster?
  data : any/c
  nrow : (or/c #f exact-positive-integer?) = #f
  ncol : (or/c #f exact-positive-integer?) = #f
  missing : real? = +nan.0
  output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
  iteration-end : exact-nonnegative-integer? = 0
  as : (or/c 'list 'f32vector) = 'list
Predicts on a dense input shaped like make-dmatrix’s data argument.

procedure

(predict-from-csr booster 
  indptr 
  indices 
  data 
  ncol 
  [#:missing missing 
  #:output output 
  #:iteration-end iteration-end 
  #:as as]) 
  (or/c (listof real?) f32vector?)
  booster : booster?
  indptr : u64vector?
  indices : u32vector?
  data : f32vector?
  ncol : exact-positive-integer?
  missing : real? = +nan.0
  output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
  iteration-end : exact-nonnegative-integer? = 0
  as : (or/c 'list 'f32vector) = 'list
Predicts on CSR-encoded input.

procedure

(predict-from-columnar booster 
  columns 
  [#:missing missing 
  #:output output 
  #:iteration-end iteration-end 
  #:as as]) 
  (or/c (listof real?) f32vector?)
  booster : booster?
  columns : (listof f32vector?)
  missing : real? = +nan.0
  output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
  iteration-end : exact-nonnegative-integer? = 0
  as : (or/c 'list 'f32vector) = 'list
Predicts on column-major f32vector? columns.

procedure

(predict booster 
  dmat 
  [#:output output 
  #:iteration-end iteration-end 
  #:as as]) 
  (or/c (listof real?) f32vector?)
  booster : booster?
  dmat : dmatrix?
  output : (or/c 'value 'margin 'contribs 'approx-contribs 'interactions 'approx-interactions 'leaf) = 'value
  iteration-end : exact-nonnegative-integer? = 0
  as : (or/c 'list 'f32vector) = 'list
Runs prediction for dmat.

By default, predictions are returned as a list. Pass #:as 'f32vector to receive an f32vector copied from XGBoost’s output instead.

#:iteration-end limits prediction to the first N boosting rounds; 0 means all available rounds.

7 Evaluation

procedure

(eval-one-iter booster iter evals)  string?

  booster : booster?
  iter : exact-integer?
  evals : (listof (cons/c string? dmatrix?))
Evaluates booster at iter for the named DMatrices in evals, returning XGBoost’s metric line.

procedure

(parse-eval-line line)  (hash/c string? real?)

  line : string?
Parses an XGBoost metric line into a hash from "name-metric" strings to numeric values.
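
For intuition, a rough equivalent can be written in plain Racket. sketch-parse-eval-line is a hypothetical stand-in, not the library's parser, and it assumes the usual XGBoost layout of an iteration tag followed by whitespace-separated name-metric:value fields.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical stand-in for parse-eval-line. Assumes a line such as
;; "[3]\ttrain-rmse:0.25\tvalid-rmse:0.5".
(define (sketch-parse-eval-line line)
  (for/hash ([field (in-list (string-split line))]
             #:when (string-contains? field ":"))
    ;; The iteration tag has no colon and is skipped by the guard.
    (define parts (string-split field ":"))
    (values (car parts) (string->number (cadr parts)))))

(define metrics
  (sketch-parse-eval-line "[3]\ttrain-rmse:0.25\tvalid-rmse:0.5"))

(hash-ref metrics "valid-rmse")
```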

8 Model IO

procedure

(save-model booster path)  void?

  booster : booster?
  path : path-string?
Saves booster to path. XGBoost chooses the model format from the file extension (for example, .json for JSON or .ubj for UBJSON).

procedure

(load-model path)  booster?

  path : path-string?
Loads a Booster from path.

procedure

(save-model-to-bytes booster    
  [#:format format])  bytes?
  booster : booster?
  format : (or/c "json" "ubj") = "ubj"
Serializes booster to bytes in JSON or UBJSON format.

procedure

(load-model-from-bytes data)  booster?

  data : bytes?
Loads a Booster from bytes produced by save-model-to-bytes.

9 Booster Snapshots

save-model / load-model persist only the trained tree ensemble. Snapshots additionally capture XGBoost’s internal training caches, so a restored Booster can resume per-iteration updates in lockstep with the original.

procedure

(booster->bytes booster)  bytes?

  booster : booster?
Serializes booster’s full state, including training caches.

procedure

(bytes->booster data)  booster?

  data : bytes?
Reconstructs a Booster from bytes produced by booster->bytes. The restored Booster has an empty DMatrix cache; pass training data explicitly to booster-update-one-iter! when resuming.

10 Version

procedure

(xgboost-version)  string?

Returns the linked XGBoost version string.

procedure

(xgboost-build-info)  string?

Returns XGBoost build information as a JSON string.

11 Process Configuration

procedure

(xgboost-get-global-config)  string?

Returns XGBoost’s process-global configuration as a JSON string.

procedure

(xgboost-set-global-config! config)  void?

  config : string?
Sets XGBoost’s process-global configuration from a JSON string.
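
Since the configuration travels as a JSON string, Racket's json library is one way to build and inspect it. The sketch below only constructs and parses the string. The "verbosity" key is one of XGBoost's global settings, and any other key would need to be checked against the linked XGBoost version.

```racket
#lang racket/base
(require json)

;; Build a JSON configuration string from a Racket hash.
(define config-json
  (jsexpr->string (hash 'verbosity 0)))

;; Parse a configuration string back into a hash for inspection.
(define parsed (string->jsexpr config-json))

(hash-ref parsed 'verbosity)
```

A string built this way could be passed to xgboost-set-global-config!, and the string returned by xgboost-get-global-config could be read back with string->jsexpr.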

procedure

(xgboost-register-log-callback! callback)  void?

  callback : (-> string? any/c)
Registers a process-global XGBoost log callback.

The callback receives log messages as strings. Since the registration is process-global, callers should treat it as shared mutable process state.