2.29 CUDA regression


This example is Training a regressor (section 2.2) moved onto the GPU: it uses the same synthetic data and the same high-level train / predict calls, but passes "device=cuda" and "tree_method=hist" through #:params so that tree construction runs on an NVIDIA GPU. Running it requires a CUDA-enabled native library (built with ./scripts/build-so.sh linux-cuda or nix build .#cpp-cuda) and a physical GPU. The cuda-available? predicate gates the work, so the example skips gracefully on CPU-only builds.

<r24-require> ::=

(require ffi/vector
xgboost)

<r24-provide> ::=

(provide run-example cuda-available?)

The run. The only difference from the CPU regressor is the #:params device/tree_method pair. run-example returns a hash holding the prediction count, the training MSE, and an improved? flag that is true when the MSE is below 1.0:

<r24-run> ::=

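;; 8 rows of 3 feature values each, flattened into one vector
;; (matches #:nrow 8 #:ncol 3 in run-example).
(define features
  (f32vector 1.0 2.0 0.5   2.0 1.0 1.5   3.0 0.5 0.0   0.5 3.0 2.0
             4.0 2.0 1.0   1.5 1.5 0.5   2.5 3.5 1.5   0.0 1.0 0.0))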
(define labels (f32vector 3.5 3.5 6.5 2.0 9.0 4.0 7.0 1.0))

(define (run-example)
  (define dtrain (make-dmatrix features #:nrow 8 #:ncol 3 #:labels labels))
  (define booster
    (train dtrain
           #:objective "reg:squarederror"
           #:params '((device . "cuda") (tree_method . "hist"))
           #:max-depth 3 #:eta 0.1 #:verbosity 0 #:rounds 50))
  (define preds (predict booster dtrain #:as 'f32vector))
  (define n (f32vector-length labels))
  (define mse
    (/ (for/sum ([i (in-range n)])
         (expt (- (f32vector-ref preds i) (f32vector-ref labels i)) 2)) n))
  (hash 'prediction-count n 'mse mse 'improved? (< mse 1.0)))
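The MSE step in run-example can be tried on its own, without a GPU or the xgboost binding. The following is a minimal sketch in plain Racket; the helper name f32vector-mse is hypothetical and is not part of the library. It does the same arithmetic as the loop above, with a length check added as an assumption about how one might guard it.

```racket
#lang racket/base
;; Sketch only: `f32vector-mse` is a hypothetical helper, not an xgboost API.
(require ffi/vector)

;; Mean squared error between two f32vectors of equal, non-zero length.
(define (f32vector-mse preds labels)
  (define n (f32vector-length labels))
  (unless (= n (f32vector-length preds))
    (raise-arguments-error 'f32vector-mse
                           "prediction and label vectors differ in length"
                           "predictions" (f32vector-length preds)
                           "labels" n))
  ;; Sum the squared residuals, then divide by the number of rows.
  (/ (for/sum ([p (in-f32vector preds)]
               [y (in-f32vector labels)])
       (let ([d (- p y)]) (* d d)))
     n))
```

For example, (f32vector-mse (f32vector 1.0 2.0 5.0) (f32vector 1.0 2.0 3.0)) has a single non-zero residual of 2.0, so the result is 4.0 / 3, roughly 1.33.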

The harness "test/24-cuda-regression.rkt" runs the example only when cuda-available? is true, printing the MSE and asserting it drops below 1.0; on CPU-only builds it prints a skip notice.
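The gate-then-assert pattern described for the harness can be sketched as follows. This is not the contents of the real harness file: run-gated is a hypothetical helper, and the availability check and the example are passed in as zero-argument procedures so the sketch runs without the xgboost binding installed. It also assumes the result is a hash with an 'mse key, as run-example returns.

```racket
#lang racket/base
;; Sketch only: `run-gated` is a hypothetical helper, not part of the library.
;; `available?` and `run` are thunks supplied by the caller.
(define (run-gated available? run #:threshold [threshold 1.0])
  (cond
    [(available?)
     (define result (run))
     (define mse (hash-ref result 'mse))
     (printf "training MSE: ~a\n" mse)
     ;; Fail loudly if training did not bring the error below the threshold.
     (unless (< mse threshold)
       (error 'run-gated "MSE ~a is not below ~a" mse threshold))
     result]
    [else
     ;; CPU-only build or no device: report and do nothing else.
     (printf "skipped: CUDA is not available\n")
     'skipped]))
```

Assuming cuda-available? is callable with no arguments, a caller might write (run-gated cuda-available? run-example). Passing the predicate in, rather than calling it at module level, keeps the skip path testable on machines without a GPU.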

<*> ::=

<r24-require>
<r24-provide>
<r24-run>
