
2.29 CUDA regression

This is the Training a regressor example moved onto the GPU: the same synthetic data and the same high-level train / predict calls, but with device set to "cuda" and tree_method set to "hist" through #:params, so tree construction runs on an NVIDIA GPU. Running it requires a CUDA-enabled native library (built with ./scripts/build-so.sh linux-cuda or nix build .#cpp-cuda) and a physical NVIDIA GPU. cuda-available? gates the work, so the example is skipped gracefully on CPU-only builds.

(require ffi/vector
         xgboost)

(provide run-example cuda-available?)

The run. The only difference from the CPU regressor is the #:params device/tree-method pair. run-example returns a hash holding the prediction count, the training MSE, and an improved? flag that is true when the MSE is below 1.0:

(define features
  (f32vector 1.0 2.0 0.5   2.0 1.0 1.5   3.0 0.5 0.0   0.5 3.0 2.0
             4.0 2.0 1.0   1.5 1.5 0.5   2.5 3.5 1.5   0.0 1.0 0.0))
(define labels (f32vector 3.5 3.5 6.5 2.0 9.0 4.0 7.0 1.0))
 
(define (run-example)
  (define dtrain (make-dmatrix features #:nrow 8 #:ncol 3 #:labels labels))
  (define booster
    (train dtrain
           #:objective "reg:squarederror"
           #:params '((device . "cuda") (tree_method . "hist"))
           #:max-depth 3 #:eta 0.1 #:verbosity 0 #:rounds 50))
  (define preds (predict booster dtrain #:as 'f32vector))
  (define n (f32vector-length labels))
  (define mse
    (/ (for/sum ([i (in-range n)])
         (expt (- (f32vector-ref preds i) (f32vector-ref labels i)) 2)) n))
  (hash 'prediction-count n 'mse mse 'improved? (< mse 1.0)))
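
For reference, the mse value computed in run-example is the ordinary mean squared error over the training rows. In the formula below, the symbols are introduced here for illustration only: \hat{y}_i stands for (f32vector-ref preds i), y_i for (f32vector-ref labels i), and n for the length of labels, which is 8 for this data set.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=0}^{n-1} \left( \hat{y}_i - y_i \right)^2, \qquad n = 8
```

The improved? entry in the returned hash is simply the test \mathrm{MSE} < 1.0.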

The harness "test/24-cuda-regression.rkt" runs the example only when cuda-available? is true, printing the MSE and asserting it drops below 1.0; on CPU-only builds it prints a skip notice.
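
The harness file itself is not reproduced here, so the following is only a minimal sketch of the gating pattern it describes, not its actual contents. The helper name run-when-available is hypothetical. It takes the availability predicate and the example thunk as arguments, which lets the sketch run without a GPU or the xgboost library; the skip message and error text are also assumptions.

```racket
#lang racket/base

;; Hypothetical helper (not part of the xgboost binding): run a GPU example
;; only when the availability predicate says a CUDA build is usable.
;;   available?  : (-> boolean?)  e.g. cuda-available?
;;   run-example : (-> hash?)     must return a hash with an 'mse key
(define (run-when-available available? run-example)
  (cond
    [(available?)
     (define result (run-example))
     (define mse (hash-ref result 'mse))
     (printf "training MSE: ~a\n" mse)
     ;; Mirror the harness assertion: the MSE must drop below 1.0.
     (unless (< mse 1.0)
       (error 'cuda-regression "MSE ~a did not drop below 1.0" mse))
     result]
    [else
     ;; CPU-only build: report the skip instead of failing.
     (printf "skipping CUDA regression: no CUDA support detected\n")
     #f]))
```

With the exports from this section in scope, the call would presumably be (run-when-available cuda-available? run-example). Passing the predicate and the thunk in as arguments keeps the gating logic testable on machines without CUDA.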

<*> ::=