4 How-to guides
Task-oriented guides for competent users with a specific problem to solve. Each guide is self-contained; read only the one you need.
4.1 Debug a pipeline
This guide shows how to inspect a running pipeline’s events, filter them by ashlar name or event type, and correlate work across ashlars using trace and span ids. For the conceptual model see Observability.
4.1.1 Prerequisites
A pipeline that runs. If you don’t have one, work through Getting Started first.
A jq install (or equivalent) for filtering JSON on the command line.
4.1.2 1. Enable a receiver
Stone’s observability is a single logger, stone-logger. To see anything, subscribe a receiver and drain it from a thread. The pattern below writes each event to trace.jsonl as JSON:
    (require racket/logging stone/logging json)

    (define trace-out (open-output-file "trace.jsonl" #:exists 'append))
    (define receiver (make-log-receiver stone-logger 'info))

    ; stone-event builds hashes with symbol keys and arbitrary
    ; Racket values; write-json only accepts a narrow subset,
    ; so coerce everything on the way out.
    (define (->json-safe v)
      (cond
        [(or (string? v) (boolean? v) (exact-integer? v)) v]
        [(and (real? v) (rational? v)) (exact->inexact v)]
        [(symbol? v) (symbol->string v)]
        [(keyword? v) (keyword->string v)]
        [(path? v) (path->string v)]
        [(null? v) '()]
        [(hash? v)
         (for/hash ([(k val) (in-hash v)])
           (values (if (symbol? k) k (string->symbol (format "~a" k)))
                   (->json-safe val)))]
        [(list? v) (map ->json-safe v)]
        [(vector? v) (map ->json-safe (vector->list v))]
        [(void? v) 'null]
        [else (format "~v" v)]))

    (define logger-thread
      (thread
       (lambda ()
         (let loop ()
           (define evt (sync receiver))
           (with-handlers ([exn:fail? (lambda (_) (void))])
             (write-json (hash 'level (->json-safe (vector-ref evt 0))
                               'message (->json-safe (vector-ref evt 1))
                               'data (->json-safe (vector-ref evt 2))
                               'topic (->json-safe (vector-ref evt 3)))
                         trace-out)
             (newline trace-out)
             (flush-output trace-out))
           (loop)))))
The ->json-safe helper isn’t optional. stone-event emits Racket hashes keyed by symbols and holding arbitrary values (DAG nodes, message lists, paths); write-json will raise on anything outside its accepted subset.
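A quick way to see why the coercion matters is to ask Racket's json library directly. The sketch below uses only jsexpr? and jsexpr->string from json; the two payloads are made up for illustration and are not real stone-event output.

```racket
#lang racket/base
(require json)

;; Hypothetical payloads, shaped like the hashes described above.
(define raw-payload
  (hasheq 'event 'api-call                      ; symbol value: not valid JSON
          'file (string->path "trace.jsonl")))  ; path value: not valid JSON

(define coerced-payload
  (hasheq 'event "api-call"))                   ; strings only

;; jsexpr? reports whether the json library accepts a value.
(define raw-ok? (jsexpr? raw-payload))          ; #f
(define coerced-ok? (jsexpr? coerced-payload))  ; #t

;; Only the coerced payload can be serialized.
(define coerced-line (jsexpr->string coerced-payload))
```

Checking a suspicious value with jsexpr? in the REPL is usually faster than waiting for write-json to fail inside the logger thread, where the handler swallows the exception.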
4.1.3 2. Parameterize a trace id
Wrap run-pipeline in a parameterize so every event carries the same trace id:
    (define tid (generate-id "run-"))
    (define sid (generate-id "span-"))

    (parameterize ([current-trace-id tid]
                   [current-span-id sid])
      (run-pipeline my-pipeline (make-dag)))
Every event in this run will carry the same 'trace-id, making it easy to isolate a single run in a shared log file. Span ids are re-allocated per ashlar invocation by the framework wrapper.
4.1.4 3. Read the stream
With the trace file written, common debugging questions become jq queries. Each line is one event; the structured payload is under 'data.
    # Pretty-print every event
    jq '.' trace.jsonl

    # Only LLM API calls
    jq 'select(.data.event == "api-call")' trace.jsonl

    # Only error-level events
    jq 'select(.level == "error")' trace.jsonl

    # Everything emitted by one ashlar
    jq 'select(.data."ashlar-name" == "discover-project")' trace.jsonl

    # Event count per ashlar
    jq -s 'group_by(.data."ashlar-name")
           | map({ashlar: .[0].data."ashlar-name", count: length})' trace.jsonl
ashlar-name is hyphenated, so it must be quoted in jq paths. Same goes for trace-id, span-id, parent-span-id.
4.1.5 4. Enable debug-level events when you need them
'ashlar-start, 'ashlar-end, and 'middleware-run are emitted at debug level. A receiver subscribed at 'info won’t see them. When you need "did this ashlar actually run?", subscribe at debug:
(define receiver (make-log-receiver stone-logger 'debug))
Debug is noisy. Use it, get the answer, turn it back down.
4.1.6 5. Follow a specific trace-id
When you’re tailing a shared log file and one run misbehaved, filter by trace id:
    jq 'select(.data."trace-id" == "run-1775875329701-93248")' trace.jsonl
generate-id returns "<prefix><ms>-<random>", so the prefix you chose ("run-", "tdd-", whatever) is the fastest way to scope a query.
4.1.7 6. Read failure events
When any ashlar creates a failure node, stone/logging auto-emits an 'error-level event. No user code needs to opt in:
    jq 'select(.level == "error")' trace.jsonl
The 'event field of an error event is the failure kind (e.g. 'llm-parse-failed, 'loop-exhausted). The 'reason field is the message. For a given failure, the triple (span-id, event, reason) usually tells you which ashlar failed and why in one line.
4.1.8 7. Check the DAG after a run
The pipeline’s final DAG is still available after run-pipeline returns. When logs say "something failed" but you want to see what actually landed in the DAG, inspect it directly:
    (define-values (result final-dag)
      (parameterize ([current-trace-id tid]
                     [current-span-id sid])
        (run-pipeline my-pipeline (make-dag))))

    (for ([(id n) (in-hash (dag-nodes final-dag))])
      (displayln (list (node-type n) (node-content n))))
Useful when an ashlar produced a partial result before failing downstream — the partial result is still in final-dag, and no amount of log-scanning will show you its content the way dag-nodes will.
4.1.9 8. Test-time interception
For unit tests, use with-intercepted-logging to capture events in-process without writing to a file:
    (require racket/logging stone/logging)

    (define events (box '()))

    (with-intercepted-logging
      (lambda (v) (set-box! events (cons v (unbox events))))
      (lambda () (run-pipeline my-pipeline (make-dag)))
      #:logger stone-logger
      'info 'stone)
The #:logger stone-logger keyword is mandatory. Because stone-logger is defined with #:parent #f, its events never reach the root logger. A call that omits the keyword intercepts events sent to the current logger instead, so it captures nothing from Stone and the test passes for the wrong reason.
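The effect is easy to reproduce without Stone. The sketch below is a stand-in, not Stone code: demo-logger is a hypothetical parent-less logger built with make-logger, and capture-events is a helper invented for this example. It runs the same thunk twice, once with #:logger and once without.

```racket
#lang racket/base
(require racket/logging)

;; Stand-in for stone-logger: a logger with no parent.
(define demo-logger (make-logger 'demo #f))

;; Emit one info-level event on the parent-less logger.
(define (emit-one)
  (log-message demo-logger 'info 'demo "hello" (hasheq 'event 'api-call) #f))

;; Run emit-one under interception and return the captured event vectors.
(define (capture-events explicit-logger?)
  (define seen '())
  (define (remember! v) (set! seen (cons v seen)))
  (if explicit-logger?
      (with-intercepted-logging remember! emit-one #:logger demo-logger 'info)
      (with-intercepted-logging remember! emit-one 'info))
  (reverse seen))

(define with-keyword (capture-events #t))    ; one event vector
(define without-keyword (capture-events #f)) ; empty list
```

A test that asserts on the number of captured events, rather than only on the absence of errors, turns a forgotten keyword into a visible failure.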
4.1.10 Troubleshooting
Trace file is empty. Either no receiver is subscribed, or the process exited before the receiver thread drained its queue. The logger thread loops forever, so thread-wait on it never returns; give it time to drain, then kill it before closing the output port.
Events have ashlar-name: false. The ashlar was created without #:name or #:produces. The framework should auto-generate a name; if you see false the framework may be out of date relative to the pipeline.
with-intercepted-logging captures nothing. You forgot #:logger stone-logger.
Duplicate events from the same ashlar. An ashlar-meta can be composed into multiple positions in a pipeline, and each invocation emits its own events with distinct span-ids. The events aren’t duplicates; they’re separate runs of the same ashlar.
4.1.11 See also
Observability — the conceptual model.
Trace a run — the minimal setup for writing a trace file.
Handle failures (below) — writing pipelines that recover from the error events you see here.
4.2 Trace a run
Stone emits a stream of structured events on a dedicated logger, stone-logger, every time a pipeline runs. This guide shows you how to capture that stream to a JSONL file so you can inspect exactly what happened — which ashlars ran in what order, what each LLM call sent and received, where failures fired.
For the conceptual picture of Stone’s observability, see Observability.
4.2.1 Prerequisites
A working Stone pipeline — if you don’t have one yet, follow Getting Started first.
4.2.2 Steps
4.2.2.1 1. Require the logging pieces
Add two requires:
(require stone/logging racket/logging)
stone/logging re-exports stone-logger and the correlation parameters. racket/logging provides with-intercepted-logging and related helpers; make-log-receiver itself comes from racket/base.
4.2.2.2 2. Open a trace file and a log receiver
Before running the pipeline, open a file to write events to and subscribe a receiver to stone-logger at the level you want:
    (define trace-out (open-output-file "trace.jsonl" #:exists 'replace))
    (define receiver (make-log-receiver stone-logger 'info))
'info sees 'api-call, 'api-response, 'agent-ashlar-start, 'agent-ashlar-end, 'tool-dispatch, and all failure events. Use 'debug if you also want per-ashlar 'ashlar-start and 'ashlar-end events — noisy but useful when debugging composition.
4.2.2.3 3. Start a background thread to write events
    (define logger-thread
      (thread
       (lambda ()
         (let loop ()
           (define evt (sync receiver))
           (write (hasheq 'level (vector-ref evt 0)
                          'data (vector-ref evt 2))
                  trace-out)
           (newline trace-out)
           (flush-output trace-out)
           (loop)))))
The receiver delivers events as Racket log vectors #(level message data topic). Index 2 is the structured data hash that stone-event built — that’s what you want to write. The thread loops forever, syncing on the receiver and writing one line per event.
4.2.2.4 4. Set trace-id and span-id around the run
    (define-values (result final-dag)
      (parameterize ([current-trace-id (generate-id "run-")]
                     [current-span-id (generate-id "span-")])
        (run-pipeline pipeline (make-dag))))

    (displayln (node-text result))
current-trace-id correlates every event from this pipeline run; every ashlar’s span carries the same trace id. current-span-id is the root span; deeper ashlars get their own span ids with the root’s as parent.
4.2.2.5 5. Run and inspect
Run the file and open trace.jsonl. You should see lines like:
    #hasheq((level . info) (data . #hash((event . api-call) (ashlar-name . summarize) (trace-id . "run-...") (span-id . "span-...") ...)))
    #hasheq((level . info) (data . #hash((event . api-response) (ashlar-name . summarize) (usage . #hash((prompt_tokens . 42) ...)) ...)))
One 'api-call and one 'api-response for every LLM turn; lifecycle events for every ashlar that ran; error events for every failure node created.
4.2.3 Reading the trace
Once you have a trace.jsonl, the raco stone trace subcommands cover the canonical investigation flow:
Start with tally. raco stone trace tally trace.jsonl prints total event count and per-event-type counts. Use it as a first-pass health check — unexpected counts (e.g. lots of 'sse-trip-raised or 'hypothesis-mismatch) point at where to look next.
Then walk lifecycle. raco stone trace lifecycle trace.jsonl prints the ashlar-by-ashlar story of the run: starts, ends, tool dispatches, api-calls. One line per event, in chronological order. Use it to find the moment something went wrong.
Then dump payload. raco stone trace payload trace.jsonl --ashlar <name> --turn <n> shows what the agent actually sent on a specific turn: system prompt, every message, every tool call. This is what you read when the lifecycle view tells you a turn went sideways and you want to know why. --last picks the most recent matching payload.
"api-call-payload" events are only emitted at the 'debug log level. If payload returns No api-call-payload found, re-run your pipeline with 'debug level on the log receiver (or STONE_LOG_LEVEL=debug in environments that honor it) so the full message contents land in the trace.
For programmatic access to the trace from Racket code (custom inspection tools, ad-hoc scripts), see stone/trace.
4.2.4 Troubleshooting
Empty trace file. The receiver is probably attached to the wrong logger. stone-logger has #:parent #f so it doesn’t propagate to the root logger — a receiver on (current-logger) sees nothing. Use stone-logger explicitly.
Missing events. Raise the level on the receiver. 'info misses 'ashlar-start/'ashlar-end; switch to 'debug.
Trailing events lost. The background thread may still be draining when your program exits. If you care about every event, call flush-output on trace-out and sleep for a few tens of milliseconds before exit, or drain the receiver with sync/timeout in a tight loop.
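The sync/timeout drain can be sketched without Stone. Everything below is a stand-in: demo-logger is a hypothetical parent-less logger, and drain-receiver! is a helper name invented for this example. In a real program you would stop the background thread first (for example with kill-thread) so that two consumers are not competing for the same receiver.

```racket
#lang racket/base

;; Stand-in for stone-logger and its receiver.
(define demo-logger (make-logger 'demo #f))
(define receiver (make-log-receiver demo-logger 'info))

;; Pull every event that is already queued, without blocking.
;; Returns the number of events handed to `handle`.
(define (drain-receiver! rcv handle)
  (let loop ([n 0])
    (define evt (sync/timeout 0 rcv)) ; #f once the queue is empty
    (cond [evt (handle evt) (loop (add1 n))]
          [else n])))

;; Queue three events, then drain them into a string port.
(for ([i 3])
  (log-message demo-logger 'info 'demo (format "event ~a" i) #f #f))

(define out (open-output-string))
(define drained
  (drain-receiver! receiver
                   (lambda (evt)
                     (write (vector-ref evt 1) out)
                     (newline out))))
```

A zero timeout makes the loop stop as soon as the queue is empty, so it is safe to call right before closing the trace port.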
4.2.5 See also
Observability — the full design of Stone’s logging: correlation parameters, event vocabulary, failure auto-logging.
Debug a pipeline in the how-to section — using a live receiver to watch events during development.
4.3 Validate a topology
This guide shows how to run static validation on a Stone pipeline before executing it. Validation walks the topology tree, reads the metadata every ashlar carries, and reports structural mistakes — missing producers, unsafe lenses, match branches that don’t all produce the types downstream consumers rely on. It runs no ashlars, calls no LLMs, and finishes in milliseconds. For the conceptual model see Validation.
4.3.1 Prerequisites
A pipeline: a module that builds and exports an ashlar-meta value (see Getting Started).
Stone installed and visible to Racket.
4.3.2 1. Validate from the REPL or a script
Call validate-pipeline on an ashlar-meta value:
    (require stone/validate)

    (define result (validate-pipeline pipeline))

    (if (validation-ok? result)
        (displayln "valid")
        (for ([err (validation-errors result)])
          (displayln (format "[~a] ~a: ~a"
                             (validation-error-type err)
                             (or (validation-error-ashlar-name err) "?")
                             (validation-error-message err)))))
validate-pipeline returns a validation-result struct. validation-errors pulls the error list out of it; validation-ok? is true when the list contains no hard errors. Each validation-error has four fields: type, ashlar-name, queried-type, and message. The type field is a symbol ('missing-producer, 'maybe-unavailable, 'invalid-lens, or 'fanout-not-reduced) that tells you which class of problem the validator hit.
4.3.3 2. Validate from the command line
Stone provides a validate subcommand on raco stone. It takes a path to a pipeline file, loads it via dynamic-require, and runs validate-pipeline on the binding named pipeline:
    raco stone validate my-pipeline.rkt
The pipeline file must be a Racket module that provides a binding named exactly pipeline, bound to an ashlar-meta? value:
    (require stone stone/llm-ashlar stone/llm-client)
    (provide pipeline)

    (define caller (make-openai-caller #:url "http://localhost:8000"))

    (define pipeline
      (~> ashlar-an ashlar-b))
On success the CLI prints one line and exits 0:
    $ raco stone validate my-pipeline.rkt
    Pipeline is valid.
When the topology has hard errors, you get a header and one line per error formatted as [<type>] <ashlar-name>: <message>, and the CLI exits 1:
    $ raco stone validate broken.rkt
    Errors (1):
    [missing-producer] summarize: summarize queries 'draft but no upstream Stone produces it
Warnings ('maybe-unavailable) print in a separate Warnings (N): block and don’t change the exit code.
4.3.4 3. Interpret common errors
4.3.4.1 'missing-producer
Meaning. An ashlar declares #:queries '(foo) but no ashlar upstream produces a 'foo node.
Common causes.
Typo in the queried type name (querying 'configuration but producing 'config).
The producer lives in a sibling branch of an ashlar-match or ashlar-parallel that doesn’t always run.
The producer was removed or renamed during a refactor and the consumer wasn’t updated.
Fix. Rename the query to match an upstream producer, or ensure a producer exists on every path that reaches the consumer.
4.3.4.2 'maybe-unavailable
Meaning. A queried type is produced in some branches of an ashlar-match or ashlar-parallel but not every branch. The validator can’t tell at compose time which branch will execute, so it flags the type as uncertain.
Interpretation. This is a warning. If the runtime behavior of your match guarantees that the producing branch runs whenever the query is reached, the warning is safe to ignore. If not, it’s a real bug that will bite you the first time the other branch fires.
Fix. Move the producer outside the branch, add the producer to every branch (even as a defaulted stub), or handle the absence at runtime by checking dag-nearest-ancestor for #f.
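The distinction between the warning and the hard error can be pictured with a toy classifier. This is a deliberately simplified model, not Stone’s validator: classify-query is a hypothetical helper, each branch is reduced to the list of types it produces, and producers outside the composite are ignored.

```racket
#lang racket/base

;; branch-outputs: one list of produced type symbols per branch.
;; Returns 'available, 'maybe-unavailable, or 'missing-producer.
(define (classify-query type branch-outputs)
  (define hits
    (for/list ([produced (in-list branch-outputs)])
      (and (memq type produced) #t)))
  (cond [(andmap values hits) 'available]          ; every branch produces it
        [(ormap values hits) 'maybe-unavailable]   ; only some branches do
        [else 'missing-producer]))                 ; no branch does

;; Two hypothetical match branches.
(define branches '((draft summary) (summary)))

(define summary-status (classify-query 'summary branches)) ; 'available
(define draft-status (classify-query 'draft branches))     ; 'maybe-unavailable
(define config-status (classify-query 'config branches))   ; 'missing-producer
```

Adding a defaulted 'draft stub to the second branch is what moves the query from 'maybe-unavailable to 'available in this model.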
4.3.4.3 'invalid-lens
Meaning. An ashlar-match uses a lens extractor like (lens 'foo 'bar), and the previous ashlar’s JSON schema doesn’t have a foo property at the top level.
Common causes.
Typo in the lens path.
Schema change in the upstream ashlar that the lens wasn’t updated for.
Lens path expects a nested field but the schema is flat.
Fix. Update the lens to match the upstream schema, or update the schema to match the lens. If the extraction is inherently dynamic, move it into a regular ashlar body that can handle missing fields.
4.3.4.4 'fanout-not-reduced
Meaning. An ashlar-map or ashlar-parallel is positioned so its arbitrary "last lane" return value would flow to a downstream consumer without an ashlar-reduce between them. The rule exists to force an explicit collapse via dag-query-all inside a reducer so downstream ashlars see a principled aggregate instead of a source-order accident.
Fix. Pair the fan-out with an ashlar-reduce whose inner ashlar reads the lane outputs via dag-query-all and produces an aggregate. For a fan-out inside another composite (loop body, match branch, nested fan-out), wrap the inner fan-out in its own sequence ending with a reducer.
4.3.5 4. Integrate validation into your workflow
4.3.5.1 As a pre-run check
Add a short guard at the top of your entry point so a broken pipeline fails fast instead of eating LLM tokens:
    (define result (validate-pipeline pipeline))

    (unless (validation-ok? result)
      (for ([err (validation-errors result)])
        (displayln (format "VALIDATION: ~a" (validation-error-message err))))
      (exit 1))
4.3.5.2 As a CI gate
Run the CLI in a test step. It returns 0 on valid pipelines (and on warnings-only runs) and non-zero on hard errors. The whole pass is millisecond-fast:
    raco stone validate my-pipeline.rkt || exit 1
4.3.5.3 In tests
Use validate-pipeline inside a rackunit test alongside your other compose-time checks:
    (require rackunit stone/validate)

    (test-case "pipeline passes validation"
      (check-true (validation-ok? (validate-pipeline pipeline))))
4.3.6 5. Enumerate ashlars in a topology
enumerate-ashlars returns a flat list of every ashlar’s name. Use it for inventories, tables of contents, topology diagrams, or counts:
    (enumerate-ashlars pipeline)
    ; => '(ashlar-an ashlar-b ashlar-c ...)
enumerate-paths returns the distinct execution paths through the topology, each a list of ashlar names:
    (enumerate-paths pipeline)
    ; => '((ashlar-an ashlar-b ashlar-c)
    ;      (ashlar-an ashlar-b ashlar-d))
Sequences concatenate along a single path. Matches and parallels fork. Loops collapse to a single representative path — one iteration of the body.
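Those three rules can be pictured with a toy enumerator. The sketch below is not Stone’s implementation and does not use ashlar-meta values: the topology is a plain quoted list, and the seq, fork, and loop tags are invented for this example.

```racket
#lang racket/base
(require racket/list racket/match)

;; Toy topology: a symbol is a leaf ashlar name;
;; (seq t ...) is a sequence, (fork t ...) a match or parallel,
;; (loop t) a loop around a body.
(define (paths t)
  (match t
    [(? symbol? name) (list (list name))]
    ;; Sequences: extend every path so far with every path of the next part.
    [(list 'seq parts ...)
     (for/fold ([acc '(())]) ([p (in-list parts)])
       (for*/list ([prefix (in-list acc)]
                   [suffix (in-list (paths p))])
         (append prefix suffix)))]
    ;; Forks: each branch contributes its own paths.
    [(list 'fork branches ...) (append-map paths branches)]
    ;; Loops: one representative iteration of the body.
    [(list 'loop body) (paths body)]))

(define forked (paths '(seq a b (fork c d))))
;; => '((a b c) (a b d))

(define looped (paths '(seq a (loop (seq b c)))))
;; => '((a b c))
```

Because forks multiply paths inside a sequence, a pipeline with several matches in a row can have many more paths than ashlars.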
4.3.6.1 Orphan-ashlar guard
enumerate-ashlars returns every ashlar reachable from the pipeline root. An ashlar defined and exported but never wired into the pipeline won’t show up — which makes diffing that list against your declared ashlars a precise dead-code check:
    (require racket/set) ; list->set, set-member? (needed under racket/base)

    (define declared '(load-config classify implement summarize cleanup))
    (define wired (list->set (enumerate-ashlars pipeline)))
    (define orphans
      (filter (lambda (s) (not (set-member? wired s))) declared))

    (unless (null? orphans)
      (error 'my-pipeline "declared but not wired: ~a" orphans))
The validator can’t catch an orphan — an unused ashlar isn’t a structural error — but this adjacent check will.
4.3.7 Troubleshooting
"does not export ’pipeline’". The CLI calls dynamic-require on the file and looks for a binding named exactly pipeline. Rename your exported topology to pipeline, or add a thin alias (define pipeline my-real-topology) alongside the export.
Validator passes but the pipeline fails at runtime. Static validation only catches structural mismatches. LLM responses that don’t match the schema, network failures, tool errors, and bad healer output still happen at runtime — see Debug a pipeline.
'maybe-unavailable in output that should be fine. Check whether your ashlar-match branches all produce the same types. If they do but the validator still warns, restructure the match or accept the warning as a known soft signal.
4.3.8 See also
Validation — the full design.
Edge Primitives — the composition vocabulary being validated.
Debug a pipeline — inspecting runtime failures the validator can’t catch.
4.4 Write a custom ashlar
This guide shows how to wrap your own work as an ashlar with make-ashlar. Reach for it when the work isn’t a single LLM call and isn’t a multi-turn agent loop, but something you still want to compose into a pipeline: reading a configuration file, shelling out to a subprocess, parsing output, hitting an external API, or running any deterministic transformation whose result later ashlars want to query.
Because everything in Stone is an ashlar, there’s no second calling convention for "the plain-code bits." You write a (DAG -> node) function, hand it to make-ashlar, declare the metadata the validator needs, and it slots into every composition primitive the framework exposes.
4.4.1 Prerequisites
A working Stone pipeline. If you don’t have one, work through Getting Started first.
A basic understanding of typed nodes and DAG queries. See The DAG as Pipeline State if those terms are new.
4.4.2 1. The shape of a custom ashlar
A custom ashlar is make-ashlar applied to a function that takes a DAG and returns a node. The function reads whatever it needs from the DAG, does its work, and hands back one node.
    (require stone)

    (define my-ashlar
      (make-ashlar
       (lambda (dag)
         (typed-node dag 'my-type (hasheq 'result "something")))
       #:produces 'my-type
       #:name 'my-ashlar))
typed-node is a convenience over make-typed-node that defaults the parents to (dag-heads dag) — the DAG frontier the ashlar was just handed. That’s what you want 99% of the time. Reach for the full make-typed-node form when you need to attach custom meta or point at parents other than the current heads.
make-ashlar takes the function as its first positional argument and then keyword metadata. #:produces declares the node type the ashlar will emit; #:name is the symbol used in logs, validator output, and error messages.
4.4.3 2. Reading from the DAG
Use dag-nearest-ancestor to pull the nearest node of a given type on the current execution lineage, and dag-query-all to pull every node of a type. node-get and node-text are the shape-safe readers:
    (define extract-config-name
      (make-ashlar
       (lambda (dag)
         (define cfg (dag-nearest-ancestor dag 'project-config))
         (define name (node-get cfg 'language "unknown"))
         (typed-node dag 'config-name (hasheq 'name name)))
       #:produces 'config-name
       #:queries '(project-config)
       #:name 'extract-config-name))
node-get returns the default when the node is #f (nothing upstream), when the content isn’t a hash, or when the key is missing. For nested paths, use node-get*, as in (node-get* cfg 'deployment 'region); for plain-text output, use node-text.
Declare every type you read in #:queries. The validator uses the list to check that some upstream ashlar actually produces each type.
4.4.4 3. Producing a typed node
Two constructors. typed-node is the ergonomic form — takes the DAG and defaults parents to the current heads:
(typed-node dag 'my-type (hasheq 'key "value"))
make-typed-node is the full form — use it when you need custom meta or different parents:
    (make-typed-node (dag-heads dag)
                     'my-type
                     (hasheq 'key "value")
                     (hasheq 'tag "extra"))
Use hasheq rather than hash for content. Symbol keys are faster to look up, match the convention every other ashlar follows, and keep node-content reads on one discipline.
4.4.5 4. Handling failure
When an ashlar can’t complete its job, return a failure node instead of a typed one. make-failure-node builds it:
    (define check-file-exists
      (make-ashlar
       (lambda (dag)
         (define cfg (dag-nearest-ancestor dag 'config))
         (define path (node-get cfg 'file-path))
         (cond
           [(and path (file-exists? path))
            (typed-node dag 'file-ok (hasheq 'path path))]
           [else
            (make-failure-node (dag-heads dag)
                               'file-missing
                               (format "required file not found: ~a" path))]))
       #:produces 'file-ok
       #:queries '(config)
       #:name 'check-file-exists))
A failure node stops ~> immediately and exits an ashlar-loop. stone/logging auto-emits an 'error-level event with the failure kind and reason — see Observability. For recovery patterns see Handle failures in a pipeline.
4.4.6 5. Side effects are fine
A custom ashlar can write files, call subprocesses, or hit external APIs. The framework doesn’t try to make ashlars pure. Just make sure the returned node describes the result of the side effect so downstream ashlars can find it:
    (require racket/file) ; display-to-file (needed under racket/base)

    (define (make-write-file project-root)
      (make-ashlar
       (lambda (dag)
         (define spec (dag-nearest-ancestor dag 'file-spec))
         (define content (node-get spec 'body))
         (define rel-path (node-get spec 'path))
         (define abs-path (build-path project-root rel-path))
         (display-to-file content abs-path #:exists 'replace)
         (typed-node dag 'file-written
                     (hasheq 'path (path->string abs-path))))
       #:produces 'file-written
       #:queries '(file-spec)
       #:name 'write-file))
Parameterize factories like this one by accepting construction-time configuration — here, project-root — and returning the ashlar. The ashlar’s function closes over the config.
4.4.7 6. Choosing the right constructor
make-ashlar — deterministic work, side effects, any (DAG -> node) logic. This guide.
make-agent-ashlar — an LLM ashlar with optional middleware, tools, and multi-turn loops. Use for any work that involves an LLM call, whether it’s a single prompt-and-response or a multi-turn agent.
If your work fits make-agent-ashlar, use it — it handles retries, schema parsing, lifecycle events, and conversation attachment for you. Reach for make-ashlar only when the work doesn’t involve an LLM call.
4.4.8 7. Declaring metadata correctly
A checklist:
#:produces type — the symbol of the node type you will emit. Required for downstream validation.
#:queries '(type1 type2 ...) — every type the function reads. The validator checks these against upstream producers before the pipeline runs.
#:name 'ashlar-name — a symbol used in logs and error messages. Defaults to #:produces when omitted. Set it explicitly for readable traces.
4.4.9 Troubleshooting
Ashlar compiles but validate-pipeline reports 'missing-producer. You queried a type that nothing upstream produces. Fix the query name, add a producer, or restructure the topology.
Ashlar runs but downstream ashlars can’t find its output. Check #:produces. If the declared type doesn’t match what downstream ashlars query for, the emitted node is invisible to them.
Side effect fires but the pipeline claims failure. You returned a failure node after doing the side effect. Either check preconditions before the side effect and fail early, or return a success node that describes what the side effect did.
Ashlar function crashes with an unhandled exception. The framework doesn’t catch exceptions — they propagate up and abort the pipeline. Wrap risky calls in with-handlers and return a failure node from the handler.
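The with-handlers advice from the last item can be sketched as a small wrapper. The code below is a stand-in that runs without Stone: the failure struct is a hypothetical substitute for the node that make-failure-node would build, and call-or-failure is a helper name invented for this example.

```racket
#lang racket/base

;; Hypothetical stand-in for a Stone failure node.
(struct failure (kind reason) #:transparent)

;; Run thunk; on any exn:fail, return a failure value instead of raising.
(define (call-or-failure kind thunk)
  (with-handlers ([exn:fail? (lambda (e) (failure kind (exn-message e)))])
    (thunk)))

;; A risky call that raises, and one that succeeds.
(define crashed
  (call-or-failure 'read-failed
                   (lambda () (error 'read-config "boom"))))
(define succeeded
  (call-or-failure 'read-failed
                   (lambda () 42)))
```

Inside a real ashlar body the handler would return (make-failure-node (dag-heads dag) kind reason), following the Handling failure step above, so the pipeline sees an ordinary failure node rather than an exception.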
4.4.10 See also
Ashlars — the conceptual model.
The DAG as Pipeline State — typed nodes and queries.
Handle failures in a pipeline — recovery patterns.
Validate a topology — catching broken couplings before the pipeline runs.
4.5 Route on a node from earlier in the pipeline
ashlar-match’s procedure extractor receives the full work DAG, so the branch decision can come from any node reachable from a head — not just the immediately preceding one. Reach for this shape when a classifier runs early and intermediate steps land between it and the routing decision. For the conceptual model see Edge Primitives.
4.5.1 Prerequisites
A pipeline with a classifier ashlar that emits a typed node (for example, 'classification) and one or more intermediate ashlars between the classifier and the routing decision.
Comfort building ashlars with make-ashlar and composing with ~>. See Write a custom ashlar if either is new.
4.5.2 1. Use a procedure extractor, not a lens
A lens? extractor is applied to the latest head’s content, which is whatever the immediately preceding ashlar produced — not your classifier’s output. To reach an earlier node, pass a procedure instead. The procedure receives the work DAG and is free to walk anywhere in it.
4.5.3 2. Walk to the classifier’s node
Inside the extractor, call dag-nearest-ancestor with the classifier’s #:produces type. The walk follows the first-parent line back from the latest head until it finds a node of that type.
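A toy model of the first-parent walk may help when reasoning about which node the extractor will find. The sketch below does not use Stone’s DAG: toy-node and toy-nearest-ancestor are names invented for this example, and the walk starts from a single head node rather than from a DAG value.

```racket
#lang racket/base

;; Hypothetical node: a type symbol, a content hash, and a list of parents.
(struct toy-node (type content parents) #:transparent)

;; Follow the first parent back from head until a node of `type` is found.
;; Returns #f when the line runs out.
(define (toy-nearest-ancestor head type)
  (let loop ([n head])
    (cond [(not n) #f]
          [(eq? (toy-node-type n) type) n]
          [(null? (toy-node-parents n)) #f]
          [else (loop (car (toy-node-parents n)))])))

;; classifier -> gather-context, as in a two-step sequence.
(define classification
  (toy-node 'classification (hasheq 'kind "billing") '()))
(define context
  (toy-node 'context (hasheq 'notes "...") (list classification)))

(define found (toy-nearest-ancestor context 'classification))
(define missing (toy-nearest-ancestor context 'requirement)) ; #f
```

Only the first parent is followed at each step, which is why a node that sits on a second-parent edge is never found by this kind of walk.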
4.5.4 3. Project the routing key with node-get
Pull the field you want to branch on out of the classifier’s node with node-get (or the accessor your classification uses). Return that value from the extractor — ashlar-match runs the branch whose val equals it.
    (~> classify-request ; produces 'classification {kind: ...}
        gather-context   ; produces 'context; this is the head at match time
        (ashlar-match
         (lambda (d)
           (node-get (dag-nearest-ancestor d 'classification) 'kind))
         ["support" support-flow]
         ["billing" billing-flow]))
4.5.5 Troubleshooting
The match returns a 'match-failed failure node. Confirm the type passed to dag-nearest-ancestor matches the classifier’s #:produces symbol exactly, and that the classifier actually ran before the match (its output must be reachable on the first-parent line from the current head).
The extractor returns #f because the ancestor wasn’t found. dag-nearest-ancestor returns #f when no node of the requested type exists on the first-parent line, and node-get called on #f returns #f. Add an explicit #f branch, or guard the walk before projecting, rather than relying on the default 'match-failed error.
The wrong branch fires after a loop or fan-out. dag-nearest-ancestor walks the first-parent line, so an intermediate composite that rewrote the head can shift which 'classification node it finds. If a loop or fan-out sits between the classifier and the match, sanity-check the walk by printing the ancestor’s content in the extractor.
4.5.6 See also
stone/edge — the full ashlar-match signature.
Edge Primitives — why the extractor is shaped this way.
4.6 Add structured output to an LLM ashlar
This guide shows how to constrain an LLM ashlar’s output to a JSON schema so downstream ashlars can read typed fields off the resulting node. For the conceptual model see Ashlars.
4.6.1 Prerequisites
A working LLM ashlar. If you don’t have one, work through Getting Started first.
An LLM provider that supports JSON schema response format. Most OpenAI-compatible endpoints do, and Anthropic’s Messages API does through the caller shim.
If you’re running on vLLM and also need tools, the ashlar-pair pattern applies — a single agent can’t combine schema enforcement with tool calls on that provider. See The ashlar-pair pattern.
4.6.2 1. Define a schema
Use make-json-schema (re-exported from stone):
    (require stone)

    (define project-config-schema
      (make-json-schema
       "project_config"
       (hasheq 'language (hasheq 'type "string"
                                 'description "Programming language")
               'test_framework (hasheq 'type "string")
               'impl_paths (hasheq 'type "array"
                                   'items (hasheq 'type "string"))
               'test_paths (hasheq 'type "array"
                                   'items (hasheq 'type "string")))
       '("language" "test_framework")))
Three arguments: a name (surfaced to the provider), a hasheq mapping field names to property specs (standard JSON Schema vocabulary), and a list of required field names.
make-json-schema wraps your properties in the full response-format envelope (type: "json_schema", strict: #t, additionalProperties: #f) and returns a hash you can pass straight to #:response-format.
4.6.3 2. Set #:response-format on the ashlar
    (require stone stone/llm-ashlar)

    (define propose-config
      (make-agent-ashlar
       caller
       #:produces 'project-config
       #:queries '(requirement)
       #:max-turns 1
       #:middleware '()
       #:response-format project-config-schema
       #:system (lambda (dag)
                  "You propose TDD project configs from a feature description. Return JSON matching the project_config schema.")
       #:user (lambda (dag)
                (node-get (dag-nearest-ancestor dag 'requirement) 'text ""))
       #:name 'propose-config))
When the ashlar runs it forwards the response format to the caller, which forwards it to the API. On return, the ashlar parses the response text as JSON and produces a typed node whose content is the parsed hash. Your system prompt should still mention that JSON is expected; the schema constrains the shape, but telling the model what it’s looking at helps.
4.6.4 3. Read the structured output downstream
    (define use-config
      (make-ashlar
       (lambda (dag)
         (define cfg (dag-nearest-ancestor dag 'project-config))
         (define language (node-get cfg 'language))
         (define framework (node-get cfg 'test_framework))
         (typed-node dag 'setup-plan
                     (hasheq 'summary
                             (format "~a project using ~a" language framework))))
       #:produces 'setup-plan
       #:queries '(project-config)
       #:name 'use-config))
The parsed content is a Racket hash with symbol keys — JSON objects become hasheq values during parsing. Use (node-get cfg 'language) with a symbol, not a string.
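Racket’s own json library shows the same symbol-key behavior, which makes the pitfall easy to try in a REPL. Whether Stone parses responses with this library is an assumption; the JSON string below is a made-up response.

```racket
#lang racket/base
(require json)

;; A made-up response body matching the project_config schema.
(define cfg
  (string->jsexpr "{\"language\": \"racket\", \"test_framework\": \"rackunit\"}"))

;; Object keys come back as symbols, so look them up with symbols.
(define by-symbol (hash-ref cfg 'language #f))  ; "racket"

;; A string key finds nothing.
(define by-string (hash-ref cfg "language" #f)) ; #f
```

A lookup that silently returns the default is the usual symptom of using a string key, so it is worth checking the key type first when a field appears to be missing.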
4.6.5 4. Handle parse failures
If the LLM’s response can’t be parsed as JSON, or is empty, the ashlar doesn’t raise. It produces a failure node instead. The failure node’s kind is 'llm-parse-failed.
This matters because downstream ashlars that expect a 'project-config won’t see one: a ~> sequence stops on the first failure. To recover, wrap the ashlar in an ashlar-loop:
(define discover-config (ashlar-loop propose-config #:until has-project-config? #:max 3))
Where has-project-config? checks both the node type and the content shape:
(define (has-project-config? node)
  (and (equal? (node-type node) 'project-config)
       (hash? (node-content node))
       (hash-has-key? (node-content node) 'language)))
The predicate short-circuits on failure nodes (whose type is 'failure, not 'project-config) and the loop retries.
4.6.6 5. Fold a human step into the body when retries aren’t enough
If the LLM keeps failing to produce valid JSON, or the predicate keeps rejecting it, blind retries don’t help. Add a human-in-the- loop step as a sibling inside the body so it runs between iterations:
(define discover-config (ashlar-loop (~> propose-config (ashlar-match (lens 'needs-help?) [#t ask-for-config-details] [#f noop])) #:until has-project-config? #:max 5))
The match runs the ask only when the proposal couldn’t satisfy the schema on its own. See Add human interaction to a pipeline for the ask-human wiring and Edge Primitives for the body-fold pattern.
4.6.7 6. Auto-derived schema metadata
When you pass #:response-format, the ashlar-meta’s schema field is auto-populated from the JSON schema. This means validate-pipeline can check downstream lens accesses against the schema without you declaring #:schema separately.
An ashlar-match using (lens 'language) to branch on the config validates against the project_config schema’s language field. If you rename the field in the schema but forget to update the lens, validate-pipeline catches the mismatch before a single LLM call runs.
4.6.8 7. Dispatching to different node shapes with #:finalize
When the parsed JSON needs to dispatch between different node types — a pass produces one kind of node, a reject produces a failure node with structured feedback — reach for #:finalize. The hook receives the parsed content and returns either a typed node or a make-failure-node:
(define verdict-schema
  (make-json-schema
   "verdict"
   (hasheq 'ok (hasheq 'type "boolean")
           'reason (hasheq 'type "string"))
   '("ok" "reason")))

(define review-implementation
  (make-agent-ashlar
   caller
   #:produces 'review-passed
   #:response-format verdict-schema
   #:max-turns 1
   #:middleware '()
   #:finalize (lambda (parsed)
                (if (hash-ref parsed 'ok #f)
                    (make-typed-node '() 'review-passed parsed)
                    (make-failure-node '() 'review-rejected
                                       (hash-ref parsed 'reason "no reason given"))))
   #:system (lambda (dag) "Review the implementation…")
   #:user (lambda (dag) "")))
Without #:finalize, the parsed content would always become a 'review-passed node — even when ok was #f. With it, the reject path produces a failure node that downstream ~> recognizes and propagates. See Agents and Tools for the full #:finalize contract including conversation injection behavior.
4.6.9 Troubleshooting
Stone produces 'llm-parse-failed. The LLM response was empty or not valid JSON. Check that your provider supports response_format: json_schema. Check that the system prompt clearly asks for JSON. Wrap in an ashlar-loop to retry.
hash-ref fails downstream. The parsed content is a hash with symbol keys, not string keys.
Schema works locally but fails against a different provider. Different providers have slightly different JSON-schema support. If switching, test end to end — the contract sits between your caller and the remote API.
Agent ashlar never emits a final answer with tools attached. On vLLM, schema + tools usually fails — see The ashlar-pair pattern. On Anthropic, ensure #:decide halts once the tool phase is done.
4.6.10 See also
Ashlars — the single atomic unit.
Validation — static lens checking.
Agents and Tools — the full story on #:finalize, #:adversary, and #:heal-with.
The ashlar-pair pattern — when schema + tools don’t combine and how to split them.
4.7 Add human interaction to a pipeline
This guide shows how to pause a running pipeline for human input, how to wire a stdin frontend, and how to compose human ashlars with approval loops and learning loops. For the conceptual model see Ask Human.
4.7.1 Prerequisites
A working pipeline. If you don’t have one, work through Getting Started first.
Basic understanding of Racket channels (make-channel, channel-get, channel-put).
4.7.2 1. Add an ask-human ashlar
make-ask-human takes a channel bundle plus four keyword arguments: a formatter that builds the question, a #:name, a #:produces symbol for the answer node, and an optional #:queries list of node types the formatter reads. Building the channel inside call-with-collected-ask-human-channels auto-registers it.
(require stone stone/tools)

(define approve-channels
  (make-ask-human-channel 'ask-approve
                          (make-channel)    ; out
                          (make-channel)    ; in
                          (make-channel)))  ; cancel

(define ask-approve
  (make-ask-human
   approve-channels
   #:format-fn (lambda (dag)
                 (define proposal (dag-nearest-ancestor dag 'proposal))
                 (format "Approve proposal: ~a? (yes/no)" (node-get proposal 'summary)))
   #:name 'ask-approve
   #:produces 'human-response
   #:queries '(proposal)))
At run time the ashlar calls the formatter, pushes the resulting string onto the out channel, blocks on the in channel for an answer, and produces a typed node whose content is (hasheq 'response answer-string). Downstream ashlars and predicates read the answer with (node-get node 'response).
4.7.3 2. Wire a stdin frontend
A frontend is a long-lived thread that receives the pipeline’s list of channels, syncs on all their out channels, prompts the user, and writes the answer back to the matching in channel. The pipeline builder hands the list over — the frontend doesn’t need to know what ashlars exist.
Build the pipeline inside call-with-collected-ask-human-channels:
(require racket/list stone stone/tools)

(define-values (my-pipeline channels)
  (call-with-collected-ask-human-channels
   (lambda ()
     (define ch
       (make-ask-human-channel 'ask-approve (make-channel) (make-channel) (make-channel)))
     (define ask-approve (make-ask-human ch ...))
     (~> propose verify ask-approve))))
Nested call-with-collected-ask-human-channels calls auto-bubble: channels built anywhere inside the outermost call land in its returned list regardless of nesting depth. Same-name collisions raise at construction time.
Start the frontend with the collected list:
(define (start-stdin-frontend! channels)
  (thread
   (lambda ()
     (let loop ()
       ;; One event per channel; each result carries the reply and cancel
       ;; channels that belong to the question it delivers.
       (define ask-evts
         (map (lambda (ch)
                (handle-evt (ask-human-channel-out ch)
                            (lambda (question)
                              (list question
                                    (ask-human-channel-in ch)
                                    (ask-human-channel-cancel ch)))))
              channels))
       (define result (apply sync ask-evts))
       (define question (first result))
       (define in-ch (second result))
       (define cancel-ch (third result))
       (displayln (format "\n? ~a" question))
       (display "> ")
       (flush-output)
       (define answer (read-line (current-input-port)))
       (cond
         ;; End of input: cancel the pending question and stop the loop.
         [(eof-object? answer) (channel-put cancel-ch 'eof)]
         [else (channel-put in-ch answer)
               (loop)])))))

(start-stdin-frontend! channels)

(define-values (result final-dag)
  (run-pipeline my-pipeline (make-dag)))
The frontend syncs on all out channels simultaneously. When any one fires, the handle-evt wrapper tells the loop which in channel to reply on.
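The routing trick can be tried in isolation with plain Racket channels. In this sketch demo-channel is a hypothetical stand-in for Stone's channel bundle (it omits the cancel channel), and next-question is an illustrative helper that also guards against an empty channel list rather than calling sync with no events.

```racket
#lang racket/base

;; Hypothetical stand-in for an ask-human channel bundle: a name plus two
;; plain Racket channels. Stone's real struct also carries a cancel channel.
(struct demo-channel (name out in))

;; Block until any channel has a question; return (list name question in-ch).
;; Returns #f for an empty list instead of calling sync with no events.
(define (next-question chans)
  (if (null? chans)
      #f
      (apply sync
             (for/list ([ch (in-list chans)])
               (handle-evt (demo-channel-out ch)
                           (lambda (question)
                             (list (demo-channel-name ch)
                                   question
                                   (demo-channel-in ch))))))))

(define a (demo-channel 'ask-approve (make-channel) (make-channel)))
(define b (demo-channel 'ask-language (make-channel) (make-channel)))

;; Simulate one ashlar asking on b and then waiting for the reply.
(define asker
  (thread (lambda ()
            (channel-put (demo-channel-out b) "Which language?")
            (channel-get (demo-channel-in b)))))

(define picked (next-question (list a b)))
(channel-put (caddr picked) "racket")   ; reply on the matching in channel
(thread-wait asker)
```

Because each handle-evt closes over its own channel, the loop never has to search for the channel that fired.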
4.7.4 3. Use an approval loop pattern
Approval loops ask the human to confirm a result. The ask-human ashlar lives inside the body because the iteration isn’t done until the human has weighed in:
(define approve-or-retry (ashlar-loop (~> propose verify ask-approve) #:until approved? #:max 5))
Where propose drafts a change, verify runs static checks, ask-approve asks the human, and approved? inspects the response:
(define (approved? node)
  (and (equal? (node-type node) 'human-response)
       (regexp-match? #rx"(?i:^(y|yes|ok|approve))"
                      (node-get node 'response ""))))
On rejection the whole body runs again. This is the shape you want whenever the predicate can’t be satisfied without the human.
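One thing worth knowing about the approved? pattern is that it anchors only at the start, so it accepts any answer that begins with an affirmative token. The sketch below applies the same regular expression to plain strings and shows a stricter variant; the function names are illustrative, and the stricter rule is an assumption about what you may want rather than anything Stone requires.

```racket
#lang racket/base
(require racket/string)

;; The guide's pattern, applied to plain strings rather than nodes.
(define (affirmative-prefix? s)
  (regexp-match? #rx"(?i:^(y|yes|ok|approve))" s))

;; A stricter variant: trim whitespace and require the whole answer to
;; be one of the accepted tokens.
(define (affirmative-exact? s)
  (regexp-match? #px"^(?i:y|yes|ok|approve)$" (string-trim s)))

(define prefix-accepts-yellow (affirmative-prefix? "yellow"))
(define exact-accepts-yellow (affirmative-exact? "yellow"))
```

If an answer such as "yellow" or "okay, but fix the tests first" should not count as approval, the exact form is the safer predicate.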
4.7.5 4. Use a learning loop pattern
Learning loops ask the human only when the body couldn’t figure something out on its own. Fold an ashlar-match into the body so the ask only runs on the unhappy path:
(define discover-config
  (ashlar-loop
   (~> discover
       (ashlar-match (lens 'needs-help?)
                     [#t ask-for-language]
                     [#f noop]))
   #:until has-project-config?
   #:max 5))
discover tries to produce a project config from whatever is already in the DAG and sets 'needs-help? based on whether required fields are present. The ashlar-match fires ask-for-language on missing fields or noop on a clean pass. Iteration N+1’s discover sees the answer landed by iteration N’s ask.
4.7.6 5. Branch on a human response
For routing decisions, use ashlar-match with a lens extractor:
(define after-approval (ashlar-match (lens 'response) ["yes" proceed-ashlar] ["no" cancel-ashlar]))
Cases match exactly, so normalize the response upstream if you want to handle "y", "yes", "Y", and "ok" the same way — a small make-ashlar that rewrites the response into a canonical form, placed between the ask-human ashlar and the match, is the usual shape.
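The normalizing logic itself is ordinary string handling. A minimal sketch follows; normalize-response and its accepted spellings are hypothetical, and in a pipeline you would call it from inside the small make-ashlar mentioned above.

```racket
#lang racket/base
(require racket/string)

;; Hypothetical helper: map free-form answers onto the exact strings the
;; match branches expect. Anything unrecognised becomes "unknown".
(define (normalize-response raw)
  (define s (string-downcase (string-trim raw)))
  (cond
    [(member s '("y" "yes" "ok" "approve")) "yes"]
    [(member s '("n" "no" "reject")) "no"]
    [else "unknown"]))
```

If you adopt a catch-all value like "unknown", give the match a branch for it, since an unmatched value produces a 'match-failed failure.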
4.7.7 6. Cancel a pending question
A frontend can cancel a pending question by putting anything on the cancel channel:
(channel-put (ask-human-channel-cancel ch) 'cancelled)
The ask-human ashlar’s sync unblocks on the cancel branch and produces a node with an empty response string. Downstream predicates that check for an affirmation return false; code that needs to distinguish "cancelled" from "said no" can compare the response to "".
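The answer-or-cancel race is a two-branch sync. The sketch below models it with plain channels; await-answer and cancelled? are illustrative names, not Stone functions, and the hash stands in for the node content described above.

```racket
#lang racket/base

;; Wait for either an answer or a cancel signal. A cancel maps to an
;; empty response string, mirroring the behavior described above.
(define (await-answer in-ch cancel-ch)
  (sync (handle-evt in-ch (lambda (answer) (hasheq 'response answer)))
        (handle-evt cancel-ch (lambda (_) (hasheq 'response "")))))

;; Distinguish "cancelled" from any typed answer, including "no".
(define (cancelled? content)
  (equal? (hash-ref content 'response) ""))

(define in-ch (make-channel))
(define cancel-ch (make-channel))

;; Simulate a frontend cancelling the pending question.
(void (thread (lambda () (channel-put cancel-ch 'cancelled))))
(define content (await-answer in-ch cancel-ch))
```

Note that a user who presses Enter on an empty line would look the same as a cancel under this convention.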
4.7.8 7. Test with a scripted frontend
For unit tests, skip the stdin frontend and feed answers directly into the channels from a scripted thread. Start the thread before run-pipeline so the first question has a consumer waiting:
(require rackunit)

(test-case "pipeline handles rejection then approval"
  (define ch
    (make-ask-human-channel 'test (make-channel) (make-channel) (make-channel)))
  (define ask
    (make-ask-human ch
                    #:format-fn (lambda (dag) "approve?")
                    #:name 'ask-approve
                    #:produces 'human-response))
  ;; Scripted frontend: reject the first question, approve the second.
  (thread
   (lambda ()
     (channel-get (ask-human-channel-out ch))
     (channel-put (ask-human-channel-in ch) "no")
     (channel-get (ask-human-channel-out ch))
     (channel-put (ask-human-channel-in ch) "yes")))
  ;; my-pipeline is assumed to run `ask` inside an approval loop.
  (define-values (result dag)
    (run-pipeline my-pipeline (make-dag)))
  (check-true (approved? result)))
4.7.9 Troubleshooting
Frontend hangs at startup. (apply sync '()) blocks forever. If the pipeline has no ask-human ashlars the collected channels list is '(); either skip starting the frontend when the list is empty, or guard the sync.
Question is printed but no ashlar sees the answer. The frontend is writing to the wrong channel. ask-human-channel fields are (name out in cancel) in that order. The ashlar reads from in; the frontend must channel-put on in, not out.
Tests block on human input. The scripted frontend thread wasn’t started before run-pipeline.
Multiple ask-human ashlars in the same pipeline. Each ashlar gets its own channel and registers separately. The stdin frontend syncs on all of them at once — whichever fires first is routed to stdin and the others keep waiting.
4.7.10 See also
Ask Human — why human interaction is an ashlar.
Edge Primitives — the body-fold pattern.
Your First Orchestration — end-to-end tutorial.
Use the TUI (below) — a ready-made frontend.
4.8 Handle failures in a pipeline
This guide is a catalog of patterns for when an ashlar can’t do its work. Ashlars signal failure by returning a failure node rather than raising; the composition primitives recognize that shape and propagate it outward.
For the conceptual model, see Ashlars, Edge Primitives, and Agents and Tools (for the adversary + heal-with pattern on make-agent-ashlar).
4.8.1 Prerequisites
A working pipeline. If you don’t have one, work through Getting Started first.
Comfort building ashlars with make-ashlar. See Write a custom ashlar if that’s new.
4.8.2 1. Return a failure node from your ashlar
When an ashlar can’t complete its job, build a failure node with make-failure-node instead of a typed one. The signature is (make-failure-node parents kind reason [meta]): kind is a symbol naming the class of failure, reason is a human-readable message, and the optional meta is a hash of extra context.
(require stone)

(define check-file-exists
  (make-ashlar
   (lambda (dag)
     (define cfg (dag-nearest-ancestor dag 'config))
     (define path (node-get cfg 'file-path))
     (cond
       [(and path (file-exists? path))
        (typed-node dag 'file-ok (hasheq 'path path))]
       [else
        (make-failure-node (dag-heads dag)
                           'file-missing
                           (format "required file not found: ~a" path))]))
   #:produces 'file-ok
   #:queries '(config)
   #:name 'check-file-exists))
You don’t need to log the failure yourself. stone/logging installs a hook on failure-log-handler that make-failure-node fires on every call — every failure node becomes an 'error-level event in the trace automatically.
One subtlety: make-ashlar’s wrapper appends successful nodes to the DAG but leaves the DAG unchanged when the function returns a failure. The failure node is a value in flight — it propagates through the composition primitives but doesn’t land in the DAG itself.
4.8.3 2. Understand what propagates when
Each composition primitive has its own rule:
~> (sequence) — stops at the first failure and returns it. Subsequent ashlars in the chain are skipped.
ashlar-loop — a body failure exits immediately with that failure. A predicate never satisfied within #:max exits with a fresh 'loop-exhausted failure.
ashlar-match — no head node or no matching branch produces its own 'match-failed failure. A branch that itself returns a failure propagates through unchanged.
ashlar-map / ashlar-parallel — failing lanes are dropped from the DAG append step. The composite’s returned node is the last lane’s output; if that’s a failure, the composite reports it even though earlier successful lanes did land in the DAG. ashlar-map with an empty item list produces 'map-empty; ashlar-parallel with no lanes produces 'parallel-empty.
A failure at any depth propagates outward without exceptions. Every primitive above the failure gets a chance to recognize it, branch on it, or wrap it.
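The sequence rule can be modelled in a few lines of plain Racket. This is a toy model written for illustration, not Stone's implementation: a step is a function from the list of landed values to either a new value or a failure struct, and run-sequence is a hypothetical stand-in for ~>.

```racket
#lang racket/base

;; Toy failure value; Stone's failure nodes carry more than this.
(struct failure (kind reason) #:transparent)

;; Run steps left to right. Stop at the first failure and return it along
;; with whatever had landed so far; the failure itself is not appended.
(define (run-sequence steps landed)
  (cond
    [(null? steps) (values (if (null? landed) #f (car landed)) landed)]
    [else
     (define out ((car steps) landed))
     (if (failure? out)
         (values out landed)
         (run-sequence (cdr steps) (cons out landed)))]))

(define (step-a landed) 'a-done)
(define (step-b landed) (failure 'file-missing "required file not found"))
(define (step-c landed) 'c-done)   ; never runs in the example below

(define-values (result landed)
  (run-sequence (list step-a step-b step-c) '()))
```

After the run, result is the 'file-missing failure and landed holds only 'a-done, which mirrors a partial DAG alongside a failure result.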
4.8.4 3. Check for failure at the top level
run-pipeline always returns (values result dag). Check with failure-node? before trusting it:
(define-values (result final-dag)
  (run-pipeline my-pipeline (make-dag)))

(cond
  [(failure-node? result)
   (displayln (format "Pipeline failed: ~a — ~a"
                      (node-get result 'kind)
                      (node-get result 'reason)))
   (exit 1)]
  [else
   (displayln "Pipeline completed.")
   (displayln (node-content result))])
The failure node’s content is a hasheq with 'kind (a symbol like 'llm-parse-failed) and 'reason (a string). The final-dag is the partial result — inspect it with dag-nodes if you need to see what landed before the failure.
4.8.5 4. Retry with ashlar-loop
If an ashlar is flaky (an LLM reply that sometimes needs another pass, a predicate that needs more information), wrap it in an ashlar-loop:
(define retry-3x (ashlar-loop propose-config #:until has-required-fields? #:max 3))
Critical: this retries only when the body succeeded but the predicate said no. If the body returns a failure, the loop exits with it immediately. For retrying on body failure, see section 6.
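The three exits of the loop are easier to keep straight with a toy model. The sketch below is not Stone's implementation; run-loop is a hypothetical stand-in for ashlar-loop in which the body receives the attempt number and returns either a value or a failure struct.

```racket
#lang racket/base

(struct failure (kind reason) #:transparent)

;; Toy model of the loop rule:
;; - a body failure exits immediately with that failure
;; - a satisfied predicate exits with the body's output
;; - running out of attempts yields a fresh 'loop-exhausted failure
(define (run-loop body until? max-iterations)
  (let loop ([attempt 1])
    (cond
      [(> attempt max-iterations)
       (failure 'loop-exhausted
                (format "no success after ~a attempts" max-iterations))]
      [else
       (define out (body attempt))
       (cond
         [(failure? out) out]
         [(until? out) out]
         [else (loop (add1 attempt))])])))

;; Body succeeds every time, but only the third output satisfies the predicate.
(define retried (run-loop (lambda (n) n) (lambda (v) (= v 3)) 5))

;; Body fails on the first attempt: the loop does not retry.
(define calls 0)
(define failed-fast
  (run-loop (lambda (n)
              (set! calls (add1 calls))
              (failure 'llm-parse-failed "bad JSON"))
            (lambda (v) #t)
            5))

;; Predicate never satisfied within the limit.
(define exhausted (run-loop (lambda (n) n) (lambda (v) #f) 2))
```

In the failed-fast case the body runs exactly once, which is the behavior that surprises people who expect #:max to cover body failures.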
4.8.6 5. Recover by folding repair into the loop body
When an ashlar needs external help between attempts (usually human input, sometimes a cleanup step), fold the repair work into the loop body as a sibling ashlar. A simple sequence runs the repair every pass; an ashlar-match inside the body runs it only when the try actually failed.
; Always run the repair step (safe when it's cheap and idempotent)
(define discover-config
  (ashlar-loop
   (~> propose-config ask-for-missing-fields)
   #:until has-required-fields?
   #:max 5))

; Run the repair only when the body's output is incomplete
(define discover-config
  (ashlar-loop
   (~> propose-config
       (ashlar-match (lens 'complete?)
                     [#t noop]
                     [#f ask-for-missing-fields]))
   #:until has-required-fields?
   #:max 5))
The match-inside-body shape is the one to reach for when the repair has costs you only want to pay when you have to — asking a human a needless question, hitting a paid API, or firing a side effect that makes no sense on the happy path.
Iteration N+1’s body sees every node iteration N produced, so propose-config on the next pass has strictly more information to work with.
For conversation-level healing — where an LLM ashlar’s draft is rejected and the model needs feedback folded into its next turn — see make-agent-ashlar’s #:adversary and #:heal-with pair in Agents and Tools.
4.8.7 6. Retry on body failure, not just predicate miss
Sometimes you want to retry when the body itself fails — e.g., an LLM call that returned an unparseable response. ashlar-loop exits on body failure by default. The workaround is to reify the failure into data so the body always returns a typed node and the predicate inspects the content:
(define propose-with-retry
  (make-ashlar
   (lambda (dag)
     ; Bind the returned DAG to a fresh name: reusing `dag` here would
     ; shadow the parameter and fail before propose-config is called.
     (define-values (result next-dag) (propose-config dag))
     (cond
       [(failure-node? result)
        (typed-node next-dag 'proposal-attempt
                    (hasheq 'ok #f 'reason (node-get result 'reason)))]
       [else
        (typed-node next-dag 'proposal-attempt
                    (hasheq 'ok #t 'result (node-content result)))]))
   #:produces 'proposal-attempt
   #:name 'propose-with-retry))

(define retry-loop
  (ashlar-loop propose-with-retry
               #:until (lambda (node) (node-get node 'ok))
               #:max 5))
The wrapping ashlar catches the failure and turns it into a 'proposal-attempt node tagged ok: #f. The loop sees a successful body, runs the predicate, finds ok #f, and iterates. Use this when the failure is expected, recoverable by retry alone, and worth retrying a bounded number of times.
4.8.8 7. Read failure events from the trace
Because make-failure-node auto-logs, every failure emits an 'error-level event. Filter the trace file:
jq 'select(.level == "error")' trace.jsonl
Narrow by kind — the event field is the failure kind itself, not a generic 'failure:
jq 'select(.data.event == "llm-parse-failed")' trace.jsonl
Every failure event carries trace-id, span-id, and parent-span-id from the enclosing ashlar. See Debug a pipeline for the full vocabulary.
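If jq isn't available, the same two filters are short in Racket. The sketch below assumes the one-object-per-line layout written by the receiver in Debug a pipeline, with level at the top level and event under data. The helper names are illustrative, and the sample lines (including the ashlar-start event name) are made-up data, not real Stone output.

```racket
#lang racket/base
(require json)

;; Parse one JSON object per non-empty line.
(define (read-events in)
  (for/list ([line (in-lines in)]
             #:unless (string=? line ""))
    (string->jsexpr line)))

;; Equivalent of: jq 'select(.level == "error")'
(define (error-events events)
  (for/list ([e (in-list events)]
             #:when (equal? (hash-ref e 'level #f) "error"))
    e))

;; Equivalent of: jq 'select(.data.event == KIND)'
(define (events-of-kind events kind)
  (for/list ([e (in-list events)]
             #:when (let ([data (hash-ref e 'data #f)])
                      (and (hash? data)
                           (equal? (hash-ref data 'event #f) kind))))
    e))

;; Made-up sample trace standing in for trace.jsonl.
(define sample
  (string-append
   "{\"level\":\"info\",\"message\":\"start\",\"data\":{\"event\":\"ashlar-start\"}}\n"
   "{\"level\":\"error\",\"message\":\"failed\",\"data\":{\"event\":\"llm-parse-failed\"}}\n"))

(define events (read-events (open-input-string sample)))
```

To run it against a real trace, pass a port from open-input-file on trace.jsonl instead of the sample string.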
4.8.9 8. Fail fast vs recover
Fail fast. Every ashlar returns a failure when it can’t do its work, nothing catches anything, and the top-level caller prints the kind and reason and exits non-zero. Good for simple pipelines, tests, and one-off scripts.
Recover. Risky ashlars are wrapped in ashlar-loop with retries, body-level repair, or reified-failure patterns. Good for production pipelines where LLM flakiness or missing context is expected.
Recovery is localized, not global. The usual shape is a fail-fast outer pipeline with recovery islands around the ashlars that need them.
4.8.10 Troubleshooting
Pipeline exits with "Pipeline failed" but I can’t find the originating ashlar. Filter the trace with jq 'select(.level == "error")'; every failure event carries span-id, parent-span-id, and the enclosing ashlar’s name.
Stone returned a failure but it isn’t in the final DAG. Expected. make-ashlar’s wrapper leaves the DAG unchanged on failure.
ashlar-loop doesn’t retry when the body fails. Correct by design. For retrying on body failure, use the reified-failure pattern from section 6.
ashlar-map reports success even though some lanes failed. ashlar-map filters failing lanes out of the appended DAG but returns the last lane’s output. If the last lane happened to succeed, the top-level result looks fine. Use dag-query-all on the final DAG to inspect what landed from each lane.
'match-failed with no obvious cause. Either the DAG had no head node or the extractor returned a value that matches no branch. ashlar-match has no default — add explicit branches for every value you expect.
4.8.11 See also
Ashlars — failure as a value.
Edge Primitives — how each primitive propagates failures, and the body-fold pattern.
Agents and Tools — the adversary + heal-with pattern for conversation-level healing.
Debug a pipeline — reading the trace.
Write a custom ashlar — the make-ashlar shape this guide builds on.
4.9 Configure a caller
A caller is a function that knows how to talk to a specific LLM API. Stone ships two factories — make-openai-caller for OpenAI-compatible servers and make-anthropic-caller for Anthropic’s Messages API — and both accept an #:extra-body hash for passing provider-specific fields that Stone doesn’t bake into the signature.
This guide covers the configurations you’re most likely to need: picking the right factory, setting the model, disabling Qwen3.5’s default thinking mode, and passing provider-specific reasoning controls. For the design reasons these knobs exist, see Provider constraints.
4.9.1 Prerequisites
Stone installed and a pipeline that needs an LLM call.
Either an OpenAI-compatible endpoint (vLLM, ollama, OpenAI, Fireworks, Together, etc.) or an Anthropic API key.
4.9.2 1. OpenAI-compatible servers
For vLLM, ollama, LM Studio, LiteLLM, Fireworks, Together, Groq, DeepSeek, Qwen/DashScope, and OpenAI itself:
(require stone stone/llm-client)

(define caller (make-openai-caller #:url "http://localhost:8000"))
(default-model "Qwen/Qwen3.5-35B-A3B")
#:url can be the base URL or the full /v1/chat/completions URL; /v1/chat/completions is appended when missing. #:api-key defaults to empty — no Authorization header is sent. Set it if your provider needs one:
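The URL rule amounts to a suffix check. The sketch below is one way such normalization could work; normalize-chat-url is a hypothetical helper, and Stone's own handling may differ in edge cases such as trailing slashes.

```racket
#lang racket/base
(require racket/string)

(define chat-path "/v1/chat/completions")

;; Hypothetical helper illustrating the documented rule: leave a full
;; endpoint URL alone, otherwise append the chat completions path.
(define (normalize-chat-url url)
  (if (string-suffix? url chat-path)
      url
      (string-append (string-trim url "/" #:left? #f) chat-path)))

(define from-base (normalize-chat-url "http://localhost:8000"))
(define from-full (normalize-chat-url "https://api.openai.com/v1/chat/completions"))
```

Both forms end up pointing at the same kind of endpoint, so either spelling of #:url is fine in configuration.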
(define caller
  (make-openai-caller #:url "https://api.openai.com/v1/chat/completions"
                      #:api-key (getenv "OPENAI_API_KEY")))
(default-model "gpt-4o-mini")
4.9.3 2. Anthropic
(define caller
  (make-anthropic-caller #:api-key (getenv "ANTHROPIC_API_KEY")))
(default-model "claude-sonnet-4-6-20250514")
#:url defaults to https://api.anthropic.com/v1/messages — override if you’re proxying through a gateway.
4.9.4 3. Disable Qwen3.5 thinking mode
Qwen3.5-family models ship with reasoning enabled by default, which blows 60-second harness timeouts on schema-enforced outputs and interferes with tool calls. Disable it via #:extra-body:
(define caller
  (make-openai-caller
   #:url "http://localhost:8000"
   #:extra-body (hasheq 'chat_template_kwargs (hasheq 'enable_thinking #f))))
The caller closes over #:extra-body at construction time, so every request this caller sends carries the flag. No per-call plumbing.
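Closing over the hash at construction time can be pictured as below. This is a toy factory written for illustration: make-sketch-caller is hypothetical, it returns the request body instead of sending it, and it omits the reserved-key check that the real callers perform.

```racket
#lang racket/base

;; Toy caller factory: returns a function that builds a request body.
;; A real caller would send the body over HTTP; this one just returns it.
(define (make-sketch-caller #:extra-body [extra-body (hasheq)])
  (lambda (model messages)
    ;; Merge every extra-body entry into the base request.
    (for/fold ([body (hasheq 'model model 'messages messages)])
              ([(k v) (in-hash extra-body)])
      (hash-set body k v))))

(define fast-caller
  (make-sketch-caller
   #:extra-body (hasheq 'chat_template_kwargs (hasheq 'enable_thinking #f))))

;; Every request built by fast-caller carries the flag.
(define body (fast-caller "Qwen/Qwen3.5-35B-A3B" '()))
```

Because the hash is captured once, two callers built with different #:extra-body values stay independent of each other.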
If one pipeline needs thinking and another doesn’t, construct two callers and hand each to the ashlars that want it:
(define reasoning-caller (make-openai-caller #:url "..."))

(define fast-caller
  (make-openai-caller
   #:url "..."
   #:extra-body (hasheq 'chat_template_kwargs (hasheq 'enable_thinking #f))))
See Provider constraints for why the /no_think soft-switch from earlier Qwen releases isn’t an option on 3.5.
4.9.5 4. Provider-specific reasoning controls
Different providers expose reasoning as different request fields. Stone doesn’t normalize them. Use #:extra-body:
OpenAI o1/o3:
(make-openai-caller #:url "..." #:extra-body (hasheq 'reasoning_effort "low"))
xAI Grok: nested object with budget; check the provider’s docs for the exact shape.
DeepSeek: model-only. Pick deepseek-chat for no reasoning, deepseek-reasoner for chain-of-thought. No #:extra-body needed.
4.9.6 5. Reserved keys
Stone reserves the keys it uses to thread the model, messages, tools, and schemas. Trying to override them through #:extra-body raises at call time:
4.9.7 6. Multiple callers in one pipeline
Callers are values. Hand different callers to different ashlars when different parts of a pipeline want different providers, different models, or different thinking settings:
(define planning-caller
  (make-anthropic-caller #:api-key (getenv "ANTHROPIC_API_KEY")))

(define coding-caller
  (make-openai-caller
   #:url "http://localhost:8000"
   #:extra-body (hasheq 'chat_template_kwargs (hasheq 'enable_thinking #f))))

(define plan-step
  (make-agent-ashlar planning-caller
                     #:produces 'plan
                     #:model "claude-sonnet-4-6-20250514"
                     ...))

(define code-step
  (make-agent-ashlar coding-caller
                     #:produces 'implementation
                     #:model "Qwen/Qwen3.5-35B-A3B"
                     ...))
#:model on make-agent-ashlar overrides default-model for that ashlar. Use it to pin a specific ashlar to a specific model while the rest of the pipeline uses the default.
4.9.8 Troubleshooting
"reserved key X cannot be overridden". You tried to set a reserved field through #:extra-body. Use the corresponding Stone parameter instead (#:model or default-model, the agent ashlar’s #:response-format, the middleware list).
Anthropic caller accepts #:outbox but no events fire. Streaming isn’t implemented for the Anthropic caller yet — see Provider constraints.
Tools attached, schema set, no tool call ever fires on vLLM. The schema decoder is masking tool-use tokens. Use The ashlar-pair pattern.
Request times out on Qwen3.5 schema-enforced calls. Thinking mode is on by default. Add the chat_template_kwargs extra-body shown above.
4.9.9 See also
Provider constraints — the design of the #:extra-body escape hatch and the provider quirks it works around.
The ashlar-pair pattern — the shape to reach for when you hit the schema-plus-tools wall on vLLM.
4.10 Use the TUI
The Stone terminal UI displays pipeline progress across three panels — agents, DAG, and conversation — and routes human interaction through the same input line you type commands into. This guide shows how to launch a pipeline under the TUI, navigate the panels, answer ask-human prompts, and cancel a pending question. For the channel model behind human interaction see Ask Human.
4.10.1 Prerequisites
Stone installed and visible to raco.
A terminal that supports ANSI escape sequences, the alternate screen buffer, and mouse reporting. Any modern emulator will do.
An agent builder: a .stone/settings.rkt that provides a build-agent thunk, or a --url flag so a default builder is synthesized for you.
4.10.2 1. Launch the TUI
The packaged entry point is raco stone. It loads .stone/settings.rkt (walking up from the current directory to $HOME), merges any flags you pass, and launches the TUI with the resulting agent ashlar and the ask-human channels collected during construction.
raco stone --url http://localhost:8000 --model llama-3.1-70b
Flag | Purpose
--url <url> | LLM API endpoint URL.
--model <model> | Model identifier. Also updates default-model.
Both flags are #:once-each; flags override values from .stone/settings.rkt. If neither a config file nor --url produces an agent, the CLI prints a hint and exits without launching the TUI.
For a pipeline you assemble yourself, call run-tui directly. Wrap construction in call-with-collected-ask-human-channels so any ask-human ashlars register their channels:
(require stone/tui-main stone/tools)

(define-values (my-agent channels)
  (call-with-collected-ask-human-channels
   (lambda () (build-my-pipeline))))

(run-tui #:agent my-agent #:ask-channels channels)
4.10.3 2. Panel layout
+-------------------+-----------------------------+
| Agents            | Conversation                |
|                   |                             |
+-------------------+                             |
| DAG               |                             |
|                   |                             |
+-------------------+-----------------------------+
> input line
[i]nput [b]ranch [m]erge [d]etail [/]search [?]help
Agents (top-left) — one line per running agent with a status icon (● active, ◌ waiting, ✓ resolved, ⊘ halted) and the current turn number.
DAG (bottom-left) — structural nodes along the current head path, with ▶ markers for branches you can drill into. Selecting a branch and pressing Enter focuses that sub-history in the conversation panel; Esc drills back out.
Conversation (right) — the message history of the current view, rendered with a role prefix (>> user, << assistant, .. agent, ## tool). Streaming LLM output appears inline as tokens arrive.
Input line — > prompt, with a block cursor when active.
The focused panel is drawn with double-line borders; unfocused panels use single lines.
4.10.4 3. Keybindings
The TUI has three modes: normal, input, and modal.
4.10.4.1 Global
Key | Action
Ctrl+C | Interrupt the running agent thread.
Ctrl+\ | Cancel a pending ask-human question; agent keeps running.
4.10.4.2 Normal mode
Key | Action
q | Quit the TUI and restore the terminal.
Tab | Cycle focus: agents → DAG → conversation → agents.
1 | Focus agents.
2 | Focus DAG.
3 | Focus conversation.
j / k | Move selection or scroll depending on panel.
i | Enter input mode.
Enter | In DAG: drill into the selected branch.
Esc | In DAG with a non-empty stack: drill out one level.
+ / - | In conversation: expand / collapse thinking blocks.
4.10.4.3 Input mode
Key | Action
Enter | Submit. Answers a pending ask-human question or starts a new turn.
Esc | Cancel input mode and clear the buffer.
Backspace | Delete the character before the cursor.
Left / Right / Home / End | Move the cursor.
4.10.4.4 Modal mode
Tool permission prompts: y / n / Esc to approve / decline / dismiss.
4.10.5 4. Answer an ask-human prompt
When a make-ask-human ashlar fires, the TUI appends the question as an assistant node to the conversation panel (prefixed with [ask-user]), activates input mode, and records which ashlar is waiting so Enter routes the answer to the right in channel.
Type your answer and press Enter. The ashlar unblocks and produces a node whose content is (hasheq 'response <your-text>).
Channel discovery is explicit. run-tui takes an #:ask-channels keyword argument — the list returned by call-with-collected-ask-human-channels when the pipeline was built. The TUI syncs on every channel’s out, so adding a new ask-human ashlar inside the builder exposes it to the TUI automatically.
4.10.6 5. Cancel a pending question
Press Ctrl+\ to cancel an ask-human question without interrupting the rest of the pipeline. The TUI writes 'cancel to the ashlar’s cancel channel; the ashlar’s sync unblocks on the cancel branch and produces a node with content (hasheq 'response "").
Key | Effect on agent | Effect on pending question
Ctrl+C | Kills the agent thread. | Question is dropped.
Ctrl+\ | Left alone. | Cancel sent; ashlar produces empty response.
Reach for Ctrl+\ when the pipeline should continue but you want to refuse the specific prompt. Reach for Ctrl+C when you want the whole run to stop.
4.10.7 6. Exit cleanly
Press q from normal mode. The TUI kills the reader and agent threads, leaves the alternate screen buffer, un-hides the cursor, and returns control to the shell.
4.10.8 Troubleshooting
Terminal corrupted after a crash. Run reset, or stty sane; tput cnorm; tput rmcup, to restore cursor visibility and leave the alternate screen buffer.
Garbled box-drawing or status glyphs. The TUI uses Unicode box characters and small glyphs. Switch to a UTF-8-capable terminal.
Ask-human prompt never appears. The TUI only syncs on the channels passed in via #:ask-channels. An ashlar whose channel registers outside the builder handed to the TUI won’t be visible. Build such ashlars inside the same collector helper call.
Pipeline appears frozen. The TUI updates on events. If no events are flowing — the agent is blocked on a long LLM call — no panels redraw until the call returns. Enable the trace receiver from Trace a run to see whether the agent is actually working.
q does nothing. You’re in input mode. Press Esc first, then q.
4.10.9 See also
Add human interaction to a pipeline — wiring ask-human ashlars into a pipeline.
Ask Human — the channel model.
Observability — the event types that drive panel updates.
4.11 Test ashlars that use tools
This guide shows how to test an ashlar that calls tools — ask_user, a file reader, a shell runner — without either mocking the whole ashlar or shipping prose where a tool call belonged. The test harness is stone/test. For the conceptual model see The test harness.
The motivating failure mode: you wire an agent ashlar with an ask_user tool, run it against a real LLM, and the model writes the question as prose instead of emitting a tool call. The pipeline doesn’t pause for input; it continues with an assistant turn that looks like a question but never reaches the frontend. A unit test that asserts check-tool-called? catches this.
4.11.1 Prerequisites
An ashlar built with make-agent-ashlar and at least one make-tool middleware. If you don’t have one, work through Write a custom ashlar first.
Basic familiarity with rackunit.
4.11.2 1. Live test: does this ashlar actually call ask_user?
A live test exercises the real LLM. It catches behavior drift — the model changed its mind about whether a sentence warrants a tool call — that a mock test can’t see. Stub only the tool’s side effects; leave the LLM wired up.
(require rackunit
         stone
         stone/llm-client
         stone/llm-ashlar
         stone/test
         stone/tools)

;; Real LLM caller: the live test exercises the model itself.
(define caller (make-openai-caller #:url "http://localhost:8000"))

(define ask-user-tool
  (make-tool 'ask_user
             #:schema (hasheq 'name "ask_user"
                              'input_schema
                              (hasheq 'type "object"
                                      'properties (hasheq 'question (hasheq 'type "string"))
                                      'required '("question")))
             #:handler (lambda (args) (values "mocked answer" (hasheq)))))

(define discover
  (make-agent-ashlar caller
                     #:produces 'project-config
                     #:middleware (list ask-user-tool)
                     #:system (lambda (_) "You discover the project's language.")
                     #:user (lambda (_) "Find out what language this project uses.")
                     #:max-turns 5))

(test-case "discover asks the user when the project is empty"
  ;; Swap only the tool's handler; the schema the LLM sees is unchanged.
  (define stubbed
    (ashlar-with-tool-stub discover 'ask_user (stub-answer "racket")))
  (with-live-harness #:caller caller #:timeout 30
    (stubbed (make-dag))
    (check-tool-called? 'ask_user
                        "expected the model to emit an ask_user tool call, not prose")))
What each piece does:
make-openai-caller builds a real LLM caller.
ashlar-with-tool-stub walks the ashlar and swaps the named tool’s handler. The schema is preserved so the LLM sees the same tool signature; only the side-effecting handler is replaced.
stub-answer "racket" returns a handler that always emits that string with meta (hasheq 'stub #t).
with-live-harness installs a log receiver on stone-logger, binds harness-current-caller, and records every tool dispatch.
check-tool-called? asserts that at least one call of that name was recorded.
4.11.3 2. Mock test: pipeline wiring without network calls
A mock test scripts the LLM’s responses. It catches wiring bugs — the ashlar forgot to declare the tool, the decide function never loops — and runs in milliseconds.
(require rackunit
         stone
         stone/llm-ashlar
         stone/llm-types
         stone/test
         stone/tools)

(define ask-user-tool
  (make-tool 'ask_user
             #:schema (hasheq 'name "ask_user"
                              'input_schema
                              (hasheq 'type "object"
                                      'properties (hasheq 'question (hasheq 'type "string"))
                                      'required '("question")))
             #:handler (lambda (args) (values "racket" (hasheq)))))

;; Loop when any recommendation asks for it; otherwise continue.
(define (loop-if-recommended ctx recs)
  (if (ormap (lambda (r) (eq? (recommendation-type r) 'loop)) recs)
      (recommendation 'loop 'mock (hash))
      (recommendation 'continue 'mock (hash))))

(test-case "discover asks, then emits project-config"
  (with-mock-harness
    ;; One scripted response per LLM turn: a tool call, then text.
    #:responses (list (llm-tool-call "ask_user" #:question "language?")
                      (llm-text "done"))
    (define discover
      (make-agent-ashlar (harness-current-caller)
                         #:produces 'project-config
                         #:middleware (list ask-user-tool)
                         #:decide loop-if-recommended
                         #:system (lambda (_) "s")
                         #:user (lambda (_) "u")
                         #:max-turns 5))
    (discover (make-dag))
    (check-tool-called? 'ask_user)
    (check-tool-call-count 'ask_user 1)))
Key moves:
#:responses takes a list of llm-response values, one per LLM turn. Running off the end raises.
llm-tool-call builds a response whose only content is a single tool call. llm-text builds a text-only response. llm-multi combines prose with one or more tool calls.
harness-current-caller is the caller the harness installed. Bind your agent ashlar to it.
tool-calls, tool-calls-by-name, and tool-call-count are accessors on the recorded dispatches.
If you already have a caller you want to drive manually, pass it via #:caller instead:
(with-mock-harness #:caller my-caller (my-ashlar (make-dag)) (check-tool-called? 'ask_user))
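The "one response per turn, running off the end raises" behaviour of #:responses can be pictured with a small closure over a list. This is a plain-Racket sketch, not the harness's code: make-scripted-caller and next-response are hypothetical names, and the quoted lists stand in for real llm-response values.

```racket
#lang racket/base
;; Hypothetical sketch: none of these names are part of stone/test.

;; make-scripted-caller : (listof any) -> (-> any)
;; Returns a thunk that yields the next scripted response on each call
;; and raises once the script is exhausted.
(define (make-scripted-caller responses)
  (define remaining responses)
  (lambda ()
    (when (null? remaining)
      (error 'scripted-caller "ran out of scripted responses"))
    (define next (car remaining))
    (set! remaining (cdr remaining))
    next))

;; Placeholder data standing in for a tool-call turn and a text turn.
(define next-response
  (make-scripted-caller (list '(tool-call "ask_user") '(text "done"))))

(define first-turn (next-response))   ; the scripted tool call
(define second-turn (next-response))  ; the scripted text reply
;; A third call would raise: the script has only two turns.
```

Seen this way, the "ran out of scripted responses" failure is just a count mismatch between the script and the number of turns the agent took.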
4.11.4 3. Catch accidentally-unstubbed tools with #:strict-tools?
When a test forgets to ashlar-with-tool-stub a tool, the real handler runs. For an ask_user tool wired to a live frontend, that means the test blocks on stdin. #:strict-tools? turns this into a loud failure at the end of the harness block.
(test-case "strict-tools catches an unstubbed ask_user"
  (check-exn
   exn:fail?
   (lambda ()
     (with-mock-harness
       #:responses (list (llm-tool-call "ask_user" #:question "?")
                         (llm-text "done"))
       #:strict-tools? #t
       (define s
         (make-agent-ashlar (harness-current-caller)
                            #:produces 'x
                            #:middleware (list ask-user-tool) ; NOT stubbed!
                            #:decide loop-if-recommended
                            #:system (lambda (_) "s")
                            #:user (lambda (_) "u")
                            #:max-turns 3))
       (s (make-dag))))))
Turn #:strict-tools? #t on by default for new tests. Turn it off only when you deliberately want the real handler to run.
4.11.5 4. Bound test runtime with #:timeout
Live tests can hang. #:timeout runs the body on a thread, syncs on thread-dead-evt with a deadline, and breaks the thread if the deadline expires.
(with-live-harness #:caller caller #:timeout 30 (stubbed (make-dag)) (check-tool-called? 'ask_user))
Timeout is in seconds. with-mock-harness doesn’t take #:timeout — mock tests run synchronously.
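The thread-plus-deadline mechanism described above can be sketched with base Racket primitives alone. This is an illustration under assumptions, not the harness's implementation: run-with-timeout and its result symbols are hypothetical.

```racket
#lang racket/base
;; Hypothetical helper; not the harness's actual implementation.

;; run-with-timeout : (-> any) real -> (or/c 'finished 'timed-out)
;; Runs thunk on a fresh thread and waits up to `seconds` for it to die.
(define (run-with-timeout thunk seconds)
  (define worker (thread thunk))
  (cond
    ;; sync/timeout returns #f when the deadline passes first.
    [(sync/timeout seconds (thread-dead-evt worker)) 'finished]
    [else
     (break-thread worker)   ; deadline expired: interrupt the body
     'timed-out]))

(define quick (run-with-timeout (lambda () (void)) 5))

;; The handler swallows the break so the demo exits quietly.
(define slow
  (run-with-timeout
   (lambda () (with-handlers ([exn:break? void]) (sleep 60)))
   0.2))
```

Note that thread-dead-evt becomes ready whether the thread returned normally or died from an exception, so a helper like this cannot tell the two apart without extra bookkeeping.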
4.11.6 5. Reference: which assertion should I use?
| Assertion | Use when |
| check-tool-called? | At least one call of this name happened. |
|  | No call of this name happened. |
| check-tool-call-count | Exactly n calls happened. |
| tool-calls | Full list of records for custom asserts. |
| tool-calls-by-name | Records for one tool. |
| tool-call-count | Count for one tool. |
Each record is an immutable hasheq with keys 'name, 'input, 'result-text, and 'result-meta. Use (hash-ref r 'input) in a custom assertion when you care about the arguments:
(define calls (tool-calls-by-name 'ask_user))
(check-equal? (hash-ref (hash-ref (first calls) 'input) 'question)
              "What language does the project use?")
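For experimenting with custom assertions outside a harness, the record shape can be mimicked with hand-built hashes. The sketch below assumes 'name holds a symbol and 'input is a hash keyed by symbols. The read_file record, calls-named, and questions-asked are hypothetical names made up for the example.

```racket
#lang racket/base
;; Hand-built records standing in for what the harness would record.
;; Assumption: 'name is a symbol and 'input is a symbol-keyed hash.
(define sample-calls
  (list (hasheq 'name 'ask_user
                'input (hasheq 'question "What language does the project use?")
                'result-text "racket"
                'result-meta (hasheq 'stub #t))
        (hasheq 'name 'read_file          ; hypothetical second tool
                'input (hasheq 'path "info.rkt")
                'result-text "..."
                'result-meta (hasheq))))

;; calls-named : (listof hash) symbol -> (listof hash)
;; Keep only the records for one tool.
(define (calls-named calls name)
  (filter (lambda (r) (eq? (hash-ref r 'name) name)) calls))

;; questions-asked : (listof hash) -> (listof string)
;; Pull the 'question argument out of every ask_user record.
(define (questions-asked calls)
  (for/list ([r (in-list (calls-named calls 'ask_user))])
    (hash-ref (hash-ref r 'input) 'question)))

(define asked (questions-asked sample-calls))
```

A helper like questions-asked keeps the nested hash-ref noise out of the test body, which helps once a test checks several calls.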
4.11.7 6. Stub helpers
stub-answer s — returns a handler that always produces the string s with meta (hasheq 'stub #t).
stub-fn f — returns f unchanged. Reach for it when you want a handler that inspects args or records state itself.
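Read literally, the two descriptions above suggest helpers of roughly the following shape. This is a guess at the behaviour for illustration only: my-stub-answer, my-stub-fn, and recording-handler are hypothetical, and the two-value handler return (text, then meta) is inferred from the #:handler lambdas used earlier in this guide.

```racket
#lang racket/base
;; Hypothetical re-implementations for illustration; the real helpers
;; live in stone/test and may differ.

;; A handler takes the tool-call arguments and returns two values:
;; the result text and a metadata hash.
(define (my-stub-answer s)
  (lambda (args) (values s (hasheq 'stub #t))))

;; Identity: the caller supplies the whole handler.
(define (my-stub-fn f) f)

;; A stateful handler that records every question it is asked.
(define seen-questions '())
(define recording-handler
  (my-stub-fn
   (lambda (args)
     (set! seen-questions
           (cons (hash-ref args 'question #f) seen-questions))
     (values "racket" (hasheq)))))

(define-values (text meta)
  ((my-stub-answer "racket") (hasheq 'question "language?")))
(define-values (text2 meta2)
  (recording-handler (hasheq 'question "language?")))
```

The fixed-answer form suits tests that only care whether the tool was called. The function form suits tests that also need to inspect what the model passed in.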
4.11.8 7. Response builders
llm-text s — an llm-response with text s and no tool calls.
llm-tool-call name #:id id #:question q #:input h — a response whose only content is a single tool call. #:question is a shortcut that wraps the value in (hasheq 'question q).
llm-multi text-or-response call ... — combine prose with one or more tool calls in the same response.
4.11.9 8. When to use live vs. mock
Live tests catch behavior drift in prompts. Mock tests catch wiring bugs deterministically and fast. Write both: a single live test per ashlar to pin behavior, and a suite of mock tests for the wiring.
4.11.10 Troubleshooting
check-tool-called? fails but you can see the tool fire in logs. The recorder drains stone-logger at level 'info. If you muted the logger or swapped it, the events never reach the recorder.
with-live-harness times out but the ashlar seemed to finish. The body ran on a thread and threw — the harness syncs on thread-dead-evt, which fires on either clean exit or exception. Remove the timeout briefly to see the real error.
ashlar-with-tool-stub errors with "ashlar has no tool named X". The tool name isn’t in the ashlar’s middleware, and there’s no child subtree carrying it. Double-check the tool symbol matches (middleware-name ...).
with-mock-harness raises "ran out of scripted responses". The agent made more LLM calls than you supplied responses for. Either add more or tighten #:decide.
4.11.11 See also
The test harness — the conceptual model.
Add human interaction to a pipeline — the channel model behind the live ask_user tool.
Write a custom ashlar — the ashlar shape these tests exercise.
4.12 Use the designing-ashlars skill
The designing-ashlars skill is a Claude Code skill that walks you through scaffolding a new Stone pipeline collaboratively. It ships inside the Stone package and installs into your local ~/.claude/skills directory with one raco command. This guide shows how to install it, what to expect during a session, what gets written to disk, and when to reach for something else instead.
4.12.1 Prerequisites
Stone installed and visible to raco.
Claude Code installed, with skills enabled.
An empty directory if you’re starting greenfield, or an existing Stone project you want to extend.
4.12.2 1. Install the skill
From any directory, run:
raco stone install-skill
This copies the bundled skill payload from the Stone package into ~/.claude/skills/designing-ashlars, including a full snapshot of the Stone scribblings (reference, explanation, how-to, tutorials) under ~/.claude/skills/designing-ashlars/scribblings/. The skill agent reads from that local snapshot for primitive lookup, so it works identically whether you installed Stone from source or via raco pkg install stone. Claude Code picks the skill up the next time you start a session.
When you upgrade Stone, re-run install-skill to refresh the scribblings snapshot. The bare command (no --force) refreshes scribblings only and leaves the skill body unchanged, so it’s safe to run any time:
raco stone install-skill
To overwrite the skill body itself (e.g. after the designing-ashlars skill source changes upstream), pass --force:
raco stone install-skill --force
Pass --help to print usage; unknown flags are rejected with an error.
4.12.3 2. What happens in a session
Once installed, the skill triggers on phrases like "design an ashlar", "scaffold a Stone pipeline", or "I want Stone to do X". It runs four phases:
Intake. Five required fields, asked one at a time: goal, inputs, outputs, provider, and whether the pipeline ever needs to ask a human something. While you answer, the skill silently scans any existing Stone code so it can reuse node-type symbols rather than inventing parallel synonyms.
Decomposition. A locked English-and-pseudocode proposal: one-line-per-step list, the same shape as Stone topology code, plus a "decisions worth challenging" block. You accept whole, accept with renames, or restructure. After acceptance the decomposition is locked — no silent re-decomposition mid-implementation.
Per-step design and verify. For each ashlar in the decomposition, a six-beat micro-cycle: atomic-or-LLM, tools and middleware, adversary design, write the code, write the test, run the test. Never advance past a red test.
Final assembly. Caller wiring in main.rkt, an info.rkt reconciliation pass, an optional smoke run (gated on an explicit yes), and a hand-off summary listing every ashlar and every file produced.
4.12.4 3. What gets generated
The deliverable is a multi-file scaffold on disk, not a single self-contained .rkt blob:
ashlars.rkt — one (make-ashlar ...) or (make-agent-ashlar ...) per step, plus any adversaries.
topology.rkt — the ~> / ashlar-loop / ashlar-match form that ties the ashlars together, exporting a single pipeline binding.
main.rkt — the entry point: caller construction, the (default-model ...) call, ask-human channel wiring (if the pipeline takes human input), and a top-level run-pipeline call.
tests.rkt — contract tests for atomic ashlars, two-test adversary pairs (good draft passes, bad draft fails) for LLM steps with adversaries, and smoke tests for LLM steps without.
middleware.rkt — only generated if the design surfaced custom (make-tool ...) definitions; stays absent otherwise.
The four-file layout (five with middleware) is the default deliverable, not a deluxe option. It exists so you can iterate on prompts, topology, and tests independently without re-reading the whole pipeline.
4.12.5 4. When NOT to use this skill
The skill is for designing new pipelines. Reach for something else when:
You’re debugging an existing pipeline. See Debug a pipeline and Trace a run. The skill’s intake-then-design flow assumes there’s nothing running yet.
You’re extending a single ashlar’s body — adding one more match arm, tightening a prompt, swapping a model. That’s an edit, not a design; open the file and make the change.
You’re making a small topology edit — wrapping one step in ashlar-loop, inserting a guard, swapping ~> for a match. Use Validate a topology to check the result; the full design flow is overkill.
You have a conceptual question about Stone. Open the scribblings: raco docs stone. The skill produces code, not explanations.
4.12.6 See also
Ashlars — the conceptual model the skill is scaffolding against.
Validate a topology — used inside the skill’s Beat 4 to confirm couplings.
Debug a pipeline — what to do once the scaffold is running and something goes wrong.
4.13 Coming from LangChain / LlamaIndex / bare SDK
If you’ve written LLM pipelines before in Python, Stone looks unfamiliar twice over: the composition model differs from what chain-of-calls frameworks do, and the host language is Racket. This guide translates the mental models. It doesn’t teach Stone from scratch — for that, start with Getting Started. It’s for the case where you already know what you want to build and need the Stone vocabulary for the pieces you’re used to thinking in.
4.13.1 If you’re coming from LangChain / LangGraph
The closest mental mapping is LangGraph. Both frameworks separate topology from execution; both make coordination a first-class object. The differences are where it gets interesting.
4.13.1.1 State
| LangGraph | Stone |
| A graph state dict passed between nodes, mutable per-edge | An append-only typed DAG read by every ashlar, mutated only by appending |
| Reducers on state fields merge updates | No merges: every new node has an identity; history is intact |
| State is effectively flat (dict of fields) | State is a graph of typed nodes; order and lineage are visible |
In LangGraph, a node writes to the graph state by returning a partial dict that the framework merges. In Stone, an ashlar writes a single new typed node. The previous state isn’t overwritten — it’s still in the DAG, and subsequent ashlars can walk back to it via dag-nearest-ancestor or dag-query-all.
This matters when you need to retry, branch, or see what the pipeline has actually accumulated. LangGraph shows you the current state dict; Stone shows you the entire history.
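The contrast between merging into a dict and appending typed nodes can be made concrete with a toy model. The sketch below is not Stone's DAG: it models a simple linear chain, and node, append-node, and nearest are hypothetical names chosen for the example.

```racket
#lang racket/base
;; Toy model only; not Stone's DAG implementation.

(struct node (id type value parent) #:transparent)

;; append-node : (listof node) symbol any -> (listof node)
;; Returns a new history with one more node; existing nodes are untouched.
(define (append-node history type value)
  (define parent (if (null? history) #f (node-id (car history))))
  (cons (node (length history) type value parent) history))

;; nearest : (listof node) symbol -> (or/c node #f)
;; Walk from newest to oldest and return the first node of the given type.
(define (nearest history type)
  (for/first ([n (in-list history)]
              #:when (eq? (node-type n) type))
    n))

(define h0 '())
(define h1 (append-node h0 'draft "v1"))
(define h2 (append-node h1 'review "too long"))
(define h3 (append-node h2 'draft "v2"))

;; The newest draft wins a "nearest" lookup...
(define latest-draft (node-value (nearest h3 'draft)))

;; ...but the older draft is still there to inspect.
(define all-drafts
  (for/list ([n (in-list h3)]
             #:when (eq? (node-type n) 'draft))
    (node-value n)))
```

With a merged state dict, only "v2" would remain under the draft key. Here both drafts and the review that separated them stay visible.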
4.13.1.2 Edges and routing
| LangGraph | Stone |
| add_edge / add_conditional_edges build the graph | ~> sequences ashlars; ashlar-match branches on DAG content |
| Conditional routing via a predicate function | ashlar-match with an extractor/lens dispatches on the latest node |
| START and END nodes are explicit | Pipeline is just an ashlar; run-pipeline applies it to (make-dag) |
LangGraph’s add_conditional_edges is Stone’s ashlar-match. Both dispatch based on runtime data; the difference is that ashlar-match’s branches are collected statically, so the validator can walk them before the pipeline runs.
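Why statically collected branches help can be shown with a toy dispatcher whose branch table is plain data. This is a sketch under assumptions, not ashlar-match itself: match-step, run-match, and branch-tags are hypothetical names.

```racket
#lang racket/base
;; Toy model; not Stone's ashlar-match.

;; A match step holds an extractor and a branch table (tag -> procedure).
(struct match-step (extract branches))

;; run-match : pick a branch at run time from the extracted tag.
(define (run-match step input)
  (define tag ((match-step-extract step) input))
  (define branch (assq tag (match-step-branches step)))
  (unless branch
    (error 'run-match "no branch for ~a" tag))
  ((cdr branch) input))

;; branch-tags : enumerate every branch without running anything.
(define (branch-tags step)
  (map car (match-step-branches step)))

(define review-step
  (match-step (lambda (input) (hash-ref input 'verdict))
              (list (cons 'accept (lambda (input) 'publish))
                    (cons 'reject (lambda (input) 'revise)))))

(define tags (branch-tags review-step))
(define result (run-match review-step (hasheq 'verdict 'reject)))
```

Because the table exists before any input does, a checker can visit every branch up front. A routing function that computes its target on the fly offers nothing to enumerate.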
4.13.1.3 Chains
A LangChain chain (prompt | llm | parser) is usually one make-agent-ashlar in Stone — the parser lives inside the agent ashlar’s response-format machinery. A multi-step chain becomes a ~> of ashlars. A chain with memory becomes an ashlar that reads from the DAG; there is no separate memory object.
4.13.1.4 Agents
LangChain agents and LangGraph ReAct agents map onto make-agent-ashlar with a middleware list. Tools become make-tool middlewares. The agent’s decide function picks between 'loop (run another turn) and 'continue (exit with the current draft).
The adversary + heal-with pattern has no direct analog in LangChain. It’s closest to a ReAct agent with a validation step, but the rejection feedback enters the agent’s conversation rather than being injected as a new prompt. See Agents and Tools.
4.13.1.5 Streaming
LangChain/LangGraph streaming works out of the box in Python. Stone’s OpenAI-compatible caller streams by default; the Anthropic caller doesn’t yet. If streaming observability matters, stick to OpenAI-compat or wait for the Anthropic streaming adapter. See Provider constraints.
4.13.2 If you’re coming from LlamaIndex
LlamaIndex focuses on retrieval and query engines; its pipeline abstractions (QueryPipeline, Workflow) are adjacent to Stone’s territory.
| LlamaIndex | Stone |
| QueryPipeline with modules linked by input/output keys | ~> chain of ashlars coupled by node types |
| Workflow with event-driven step dispatch | ashlar-match / ashlar-loop dispatching on DAG content |
| Built-in vector stores, retrievers, rerankers | Bring your own; retrieval is just another ashlar |
Stone doesn’t ship RAG primitives. If your work is "index a corpus, retrieve, rerank, answer," LlamaIndex probably remains the faster path and you can call it from a Stone ashlar when you need pipeline-level composition.
Where Stone helps over LlamaIndex Workflow: the DAG is visible and replayable. Every intermediate result is a typed node with a stable identity; you can print it, compare two runs, or feed it into a downstream pipeline without extra serialization plumbing.
4.13.3 If you’re coming from a bare SDK
You have a script that calls the Anthropic or OpenAI SDK in a loop. It works. Why would you reach for Stone?
You wouldn’t — until the script grows any of:
A loop with a termination condition that depends on structured output.
A human-in-the-loop approval step that pauses execution.
Fan-out over a list of items followed by a reducer.
Multi-turn tool use with deterministic validation after each turn.
Replay of a failed run from the point of failure.
A typed trace of what the model saw and decided at each step.
Each of these is a state machine, and hand-rolling them around SDK calls gets expensive fast. Stone’s tax is learning the framework; your current tax is the state machines you already maintain. The break-even is roughly "do I want to do this pattern more than once in more than one place."
4.13.3.1 Mapping SDK calls to Stone
A single SDK call becomes one make-agent-ashlar with #:max-turns 1 and #:middleware '() — the single-shot pattern. The response text is available via node-text on the produced node.
A multi-turn tool-using agent becomes make-agent-ashlar with tool middlewares and a decide function. The framework handles the conversation list, tool schema injection, tool dispatch, and the turn loop. You don’t write that loop.
A retry wrapper becomes an ashlar-loop with a predicate. A fallback becomes an ashlar-match branching on the previous ashlar’s result.
4.13.4 Racket-specific things LangChain doesn’t have
A few Racket-native idioms Stone uses that don’t have Python analogs:
Parameters (default-model, current-trace-id) — dynamic-scope values that propagate through every call inside a parameterize body. Closest Python analog: contextvars.ContextVar.
Channels — the ask-human rendezvous uses Racket channels, which are synchronous from one side and asynchronous from the other. Closest Python analog: asyncio.Queue.
Keyword arguments — Stone uses #:like-this keyword args everywhere. They can appear in any order, and Racket also lets them be mixed in among the positional arguments; the examples in these guides put them after the positional ones by convention.
See Reading Racket for the minimum you need to parse Stone code.
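Two of these idioms can be tried in a few lines of base Racket. The sketch uses no Stone code: current-model, describe-call, and make-step are hypothetical names standing in for the kind of parameter and keyword-heavy constructor you will meet in Stone.

```racket
#lang racket/base

;; A parameter with a default value.
(define current-model (make-parameter "default-model"))

(define (describe-call)
  (string-append "calling " (current-model)))

(define outside (describe-call))
(define inside
  (parameterize ([current-model "other-model"])
    ;; describe-call sees the rebinding without receiving it as an argument.
    (describe-call)))
;; The rebinding ends with the parameterize body.
(define after (describe-call))

;; Keyword arguments: the order among keywords does not matter,
;; and a keyword argument can carry a default.
(define (make-step name #:produces produces #:max-turns [max-turns 1])
  (list name produces max-turns))

(define a (make-step 'discover #:produces 'config #:max-turns 5))
(define b (make-step 'discover #:max-turns 5 #:produces 'config))
(define c (make-step 'discover #:produces 'config)) ; max-turns defaults to 1
```

In Python terms, parameterize behaves like setting a ContextVar for the duration of a with block, and #:produces reads like a keyword-only argument.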
4.13.5 Why you’d choose Stone
Three reasons it’s worth the switch:
Compose-time validation. Every topology can be walked before it runs. Missing producer, typo’d query, lens that doesn’t match the schema — all caught in milliseconds without an LLM call.
Immutable, content-addressed history. Every run produces a DAG you can print, diff, replay, or feed into the next pipeline. When a production pipeline misbehaves, you open the DAG and see what happened instead of hunting through logs.
One layer. A six-turn tool-using agent and a three-line parser compose with the same primitive. You’re not juggling two frameworks (scaffold + agent runtime) and their interactions.
Three reasons you wouldn’t:
Ecosystem. LangChain/LlamaIndex have more integrations, more vector stores, more provider adapters. If your work is primarily gluing existing libraries, Python wins.
Streaming on Anthropic. Not yet implemented. If streaming to the Anthropic API matters today, use the SDK or wait.
You don’t need the invariants. One-off scripts rarely benefit from compose-time validation. Reach for the SDK.
4.13.6 See also
Why Stone — the full positioning.
Getting Started — build a two-ashlar pipeline.
Ashlars — the atomic unit, in depth.
The DAG as Pipeline State — the shared medium that replaces chain memory and graph state.