15 Data Sourcing

 (require xiden/source) package: xiden

A source is a value that implements gen:source. When used with fetch, a source produces an input port and an estimate of how many bytes that port can produce. Xiden uses sources to read data with safety limits. To tap a source means gaining a reference to the input port and estimate. To exhaust a source means gaining a reference to a contextual error value. We can also say a source is tapped or exhausted.

Note that these terms are linguistic conveniences. There is no value representing a tapped or exhausted state. The only difference is where control ends up in the program, and what references become available as a result of using fetch on a source.


tap/c : chaperone-contract?

(-> input-port?
    (or/c +inf.0 exact-positive-integer?)
    any/c)
A contract for a procedure used to tap a source.

The procedure is given an input port, and an estimate of the maximum number of bytes the port can produce. This estimate could be +inf.0 to allow unlimited reading, provided the user allows this in their configuration.

exhaust/c : chaperone-contract?

A contract for a procedure used when a source is exhausted.

The sole argument to the procedure depends on the source type.




source? : predicate/c


(fetch source tap exhaust) → any/c

  source : source?
  tap : tap/c
  exhaust : exhaust/c
gen:source is a generic interface that requires an implementation of fetch. source? returns #t for values that do so.

fetch attempts to tap source. If successful, fetch calls tap in tail position, passing the input port and the estimated maximum number of bytes that port is expected to produce.

Otherwise, fetch calls exhaust in tail position using a source-dependent argument.
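
The tap-or-exhaust protocol can be illustrated without Xiden itself. The following is a minimal sketch that models it with racket/generic. The names toy-source, toy-fetch, toy-bytes, toy-exhausted, and read-all are hypothetical and exist only for illustration; Xiden's actual gen:source definition may differ.

```racket
#lang racket/base
(require racket/generic racket/port)

;; Hypothetical stand-in for gen:source; not Xiden's definition.
(define-generics toy-source
  (toy-fetch toy-source tap exhaust))

;; Always tapped: passes an input port and an exact size estimate to tap.
(struct toy-bytes (data)
  #:methods gen:toy-source
  [(define (toy-fetch self tap exhaust)
     (tap (open-input-bytes (toy-bytes-data self))
          (bytes-length (toy-bytes-data self))))])

;; Always exhausted: passes its value to exhaust.
(struct toy-exhausted (value)
  #:methods gen:toy-source
  [(define (toy-fetch self tap exhaust)
     (exhaust (toy-exhausted-value self)))])

;; A tap procedure that reads the whole port and keeps the estimate.
(define (read-all in est-size)
  (list (port->bytes in) est-size))

(toy-fetch (toy-bytes #"hello") read-all values)   ; => '(#"hello" 5)
(toy-fetch (toy-exhausted 'gone) read-all values)  ; => 'gone
```

In both cases the caller decides what a tapped or exhausted source means by choosing the two procedures, which is why no value represents either state.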


(logged-fetch id source tap) → logged?

  id : any/c
  source : source?
  tap : tap/c


(struct $fetch $message (id errors))

  id : any/c
  errors : (listof $message?)
Returns a logged procedure that applies fetch to source and tap.

The computed value of the logged procedure is FAILURE if the source is exhausted. Otherwise, the value is what’s returned from tap.

The log will gain a ($fetch id errors) message, where errors is empty if the fetch is successful.

15.1 Defining Source Types


(define-source (id [field field-contract] ...) body ...)

Defines a new source type.

On expansion, define-source defines a new structure type using (struct id (field ...)). The type is created with a guard that enforces per-field contracts. Instances implement gen:source.

define-source injects four bindings into the lexical context of body: %tap, %fail, %fetch, and %src.

To understand how these injected bindings work together, let’s go through a few examples.

Use %tap to fulfil data with an input port and an estimated data length. In the simplest case, you can return constant data.

byte-source uses %tap like so:

(define-source (byte-source [data bytes?])
  (%tap (open-input-bytes data)
        (bytes-length data)))

Notice that data is used both to define a field (where it appears next to bytes?) and to reference the value contained in that field (within open-input-bytes and bytes-length).

Use %fail in tail position with error information to indicate a source was exhausted.

file-source uses %fail like so:

(define-source (file-source [path path-string?])
  (with-handlers ([exn:fail:filesystem? %fail])
    (%tap (open-input-file path)
          (+ (* 20 1024)
             (file-size path)))))

Note that %fail is an exhaust/c procedure, so it does not have to be given an exception as an argument.

%fetch is a recursive variant of fetch that uses %tap, but a possibly different exhaust/c procedure. This allows sources to control an entire fetch process and fall back to alternatives.

first-available-source uses a recursive fetch to iterate over its available sources until it has none left to check.

(define-source (first-available-source [available (listof source?)] [errors list?])
  (if (null? available)
      (%fail (reverse errors))
      (%fetch (car available)
              (λ (e)
                (%fetch (first-available-source (cdr available) (cons e errors))
                        %fail)))))

Finally, %src is just a reference to an instance of the structure containing each field.


(bind-recursive-fetch %tap %fail)
 → (->* (source?) (exhaust/c) any/c)
  %tap : tap/c
  %fail : exhaust/c
Returns a fetch-like procedure that accepts only a source and an optional procedure to mark the source exhausted (defaults to %fail).
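
As a rough illustration of the shape of such a procedure, here is a minimal sketch. It is not Xiden's implementation: make-recursive-fetch and toy-fetch are hypothetical names, and fetch is passed in explicitly so the example runs without Xiden.

```racket
#lang racket/base

;; Hypothetical: the real bind-recursive-fetch presumably closes over fetch.
;; Here fetch is an explicit argument so the sketch stands alone.
(define (make-recursive-fetch fetch %tap %fail)
  (λ (source [exhaust %fail])
    (fetch source %tap exhaust)))

;; Toy fetch for demonstration: byte strings are tapped, all else is exhausted.
(define (toy-fetch source tap exhaust)
  (if (bytes? source)
      (tap (open-input-bytes source) (bytes-length source))
      (exhaust source)))

(define %fetch
  (make-recursive-fetch toy-fetch
                        (λ (in est-size) (read-bytes est-size in))
                        (λ (v) (list 'exhausted v))))

(%fetch #"abc")                      ; => #"abc"
(%fetch 'missing)                    ; => '(exhausted missing)
(%fetch 'missing (λ (v) 'fallback))  ; => 'fallback
```

The tap procedure is fixed once, while the exhaust procedure can be replaced per call. That is what lets a source try an alternative before giving up.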

15.2 Source Types


(struct exhausted-source (value))

  value : any/c
A source that is always exhausted, meaning that (fetch (exhausted-source v) void values) returns v.


(struct byte-source (data))

  data : bytes?
A source that, when tapped, yields bytes directly from data.


(struct first-available-source (sources errors))

  sources : (listof source?)
  errors : list?
A source that, when tapped, yields bytes from the first tapped source.

If all sources for an instance are exhausted, then the instance is exhausted. As sources are visited, errors are functionally accumulated in errors.

The value produced for an exhausted first-available-source is the fully accumulated errors list, which holds an entry for every source that was visited.


(struct text-source (data)
    #:extra-constructor-name make-text-source)
  data : string?
A source that, when tapped, produces bytes from the given string.


(struct lines-source (suffix lines)
    #:extra-constructor-name make-lines-source)
  suffix : (or/c #f char? string?)
  lines : (listof string?)
A source that, when tapped, produces bytes from a string. The string is defined as the combination of all lines, such that each line has the given suffix. If suffix is #f, then it uses a conventional line ending according to (system-type 'os).

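(define src
  (lines-source "\r\n"
                '("#lang racket/base"
                  "(provide a)"
                  "(define a 1)")))

; consume can be any tap/c procedure. This one reads the port as a string.
(define (consume in est-size)
  (port->string in))

; "#lang racket/base\r\n(provide a)\r\n(define a 1)\r\n"
(fetch src consume void)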
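
The way the string is assembled can be approximated in plain Racket. This is a sketch under assumptions, not Xiden's code: join-lines is a hypothetical helper, and the choice of "\r\n" on Windows and "\n" elsewhere when suffix is #f is a guess at what "conventional line ending" means.

```racket
#lang racket/base

;; Hypothetical helper approximating how lines-source might build its string.
(define (join-lines suffix lines)
  (define ending
    (cond [(string? suffix) suffix]
          [(char? suffix) (string suffix)]
          ;; Assumed convention when suffix is #f.
          [(eq? (system-type 'os) 'windows) "\r\n"]
          [else "\n"]))
  ;; Every line, including the last, receives the ending.
  (apply string-append
         (for/list ([line (in-list lines)])
           (string-append line ending))))

(join-lines "\r\n" '("a" "b"))  ; => "a\r\nb\r\n"
(join-lines #\; '("a" "b"))     ; => "a;b;"
```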


(struct file-source (path)
    #:extra-constructor-name make-file-source)
  path : path-string?
A source that, when tapped, yields bytes from the file located at path.

If the source is exhausted, it yields a relevant exn:fail:filesystem exception.


(struct http-source (request-url)
    #:extra-constructor-name make-http-source)
  request-url : (or/c url? url-string?)
A source that, when tapped, yields bytes from an HTTP response body. The response comes from a GET request to request-url, and the body is only used for a 2xx response.

If the source is exhausted, it yields a relevant exception.

The behavior of the source is impacted by XIDEN_DOWNLOAD_MAX_REDIRECTS.


(struct http-mirrors-source (request-urls)
    #:extra-constructor-name make-http-mirrors-source)
  request-urls : (listof (or/c url-string? url?))
Like http-source, but tries each of the given URLs using first-available-source.

15.3 Source Expressions

The following procedures are useful for declaring sources in a package input.


(sources variant ...) → source?

  variant : (or/c string? source?)
Like first-available-source, but each string argument is coerced to a source using coerce-source.


(coerce-source variant) → source?

  variant : (or/c string? source?)
Returns variant if it is already a source.

Otherwise, returns (string->source variant) in terms of the plugin.


(from-catalogs query-string [url-templates])
 → (listof url-string?)
  query-string : string?
  url-templates : (listof string?) = (XIDEN_CATALOGS)
Returns a list of URL strings computed by URL-encoding query-string, and then replacing all occurrences of "$QUERY" in url-templates with the encoded string.
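
The substitution can be sketched with the standard library. The helper name expand-templates and the example URLs are hypothetical, and uri-encode from net/uri-codec is only an assumption about which encoding is applied.

```racket
#lang racket/base
(require net/uri-codec racket/string)

;; Hypothetical stand-in for from-catalogs with explicit templates.
(define (expand-templates query-string url-templates)
  ;; Assumption: the query is percent-encoded as a URI component.
  (define encoded (uri-encode query-string))
  (for/list ([template (in-list url-templates)])
    (string-replace template "$QUERY" encoded)))

(expand-templates "a b"
                  '("https://example.com/$QUERY"
                    "https://mirror.example.com/pkg?q=$QUERY"))
; => '("https://example.com/a%20b" "https://mirror.example.com/pkg?q=a%20b")
```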


(from-file relative-path-expr)

Expands to a complete path. relative-path-expr is a relative path, made complete with regard to the directory of the source file in which this expression appears.

Due to this behavior, from-file will return different results when the containing source file changes location on disk.

15.4 Untrusted Source Expressions


(struct $bad-source-eval $message (reason datum))

  reason : (or/c 'security 'invariant)
  datum : any/c


(eval-untrusted-source-expression datum [ns]) → logged?

  datum : any/c
  ns : namespace? = (current-namespace)
eval-untrusted-source-expression returns a logged procedure which evaluates (eval datum ns) in the context of a security guard. The security guard blocks all file operations (except 'exists), and all network operations.

If the evaluation produces a source, then the result of the logged procedure is that source, and no other messages will appear in the program log.

If the evaluation does not produce a source, then the result is FAILURE and the program log gains a ($bad-source-eval 'invariant datum).

If the evaluation is blocked by the security guard, then the result is FAILURE and the program log gains a ($bad-source-eval 'security datum).
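
A security guard with the restrictions described above can be sketched using make-security-guard. This is an illustration under assumptions, not Xiden's guard: blocked, deny, restrictive-guard, and call-guarded are hypothetical names, and returning 'security stands in for logging a $bad-source-eval message.

```racket
#lang racket/base

;; Hypothetical exception type used to signal a blocked operation.
(struct blocked exn:fail ())

(define (deny who)
  (raise (blocked (format "~a: blocked by security guard" who)
                  (current-continuation-marks))))

;; Allows only 'exists file checks and denies every network operation.
(define restrictive-guard
  (make-security-guard
   (current-security-guard)
   (λ (who path ops)
     (unless (equal? ops '(exists))
       (deny who)))
   (λ (who host port mode)
     (deny who))))

;; Runs thunk under the guard. Returns its value, or 'security when blocked.
(define (call-guarded thunk)
  (with-handlers ([blocked? (λ (e) 'security)])
    (parameterize ([current-security-guard restrictive-guard])
      (thunk))))

(call-guarded (λ () (file-exists? "some-file")))     ; allowed: returns a boolean
(call-guarded (λ () (open-input-file "some-file")))  ; => 'security
```

The guard is consulted before the file system is touched, so the second call is rejected whether or not some-file exists.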

15.5 Transferring Bytes

 (require xiden/port) package: xiden

xiden/port reprovides all bindings from racket/port, in addition to the bindings defined in this section.


(mebibytes->bytes mebibytes) → exact-nonnegative-integer?

  mebibytes : real?
Converts mebibytes to bytes, rounded up to the nearest exact integer.
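
The arithmetic amounts to multiplying by 1024 × 1024 and rounding up. A minimal sketch, using the hypothetical name mib->bytes to avoid suggesting this is Xiden's definition:

```racket
#lang racket/base

;; 1 MiB = 1024 * 1024 bytes. Hypothetical name; not Xiden's binding.
(define (mib->bytes mebibytes)
  (inexact->exact (ceiling (* mebibytes 1024 1024))))

(mib->bytes 1)    ; => 1048576
(mib->bytes 0.5)  ; => 524288
(mib->bytes 1/3)  ; => 349526
```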


(transfer bytes-source
  bytes-sink
  #:on-status on-status
  #:max-size max-size
  #:buffer-size buffer-size
  #:transfer-name transfer-name
  #:est-size est-size
  #:timeout-ms timeout-ms) → void?
  bytes-source : input-port?
  bytes-sink : output-port?
  on-status : (-> $transfer? any)
  max-size : (or/c +inf.0 exact-positive-integer?)
  buffer-size : exact-positive-integer?
  transfer-name : non-empty-string?
  est-size : (or/c +inf.0 real?)
  timeout-ms : (>=/c 0)
Like copy-port, except bytes are copied from bytes-source to bytes-sink, with at most buffer-size bytes at a time.

transfer applies on-status repeatedly and synchronously with $transfer messages.

transfer reads no more than N bytes from bytes-source, and will wait no longer than timeout-ms for the next available byte.

The value of N is computed using est-size and max-size. max-size is the prescribed upper limit for total bytes to copy. est-size is an estimate of the number of bytes that bytes-source will actually produce (this is typically not decided by the user). If (> est-size max-size), then the transfer will not start. Otherwise N is bound to est-size to hold bytes-source accountable for the estimate.

If est-size and max-size are both +inf.0, then transfer will not terminate if bytes-source does not produce eof.
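
The rules above can be condensed into a small copy loop. This is a simplified sketch, not the implementation of transfer: transfer-limit and copy-with-budget are hypothetical names, and the loop returns a status symbol where transfer would instead report $transfer messages through on-status.

```racket
#lang racket/base

;; Returns the read limit N, or #f if the transfer must not start.
(define (transfer-limit est-size max-size)
  (and (<= est-size max-size) est-size))

;; Hypothetical simplification of transfer: returns a status symbol
;; instead of reporting $transfer messages through on-status.
(define (copy-with-budget in out
                          #:est-size est-size
                          #:max-size max-size
                          #:buffer-size [buffer-size 4096]
                          #:timeout-ms [timeout-ms 1000])
  (define limit (transfer-limit est-size max-size))
  (define buffer (make-bytes buffer-size))
  (let loop ([total 0])
    (cond
      [(not limit) 'rejected]
      ;; Wait for the next available byte (or eof), up to the timeout.
      [(not (sync/timeout (/ timeout-ms 1000.0) in)) 'timeout]
      [else
       (define n (read-bytes-avail! buffer in))
       (cond
         [(eof-object? n) 'done]
         ;; The source produced more than it estimated.
         [(> (+ total n) limit) 'exceeded]
         [else
          (write-bytes buffer out 0 n)
          (loop (+ total n))])])))

(define out (open-output-bytes))
(copy-with-budget (open-input-bytes #"hello") out
                  #:est-size 5 #:max-size 10)  ; => 'done
(get-output-bytes out)                         ; => #"hello"
```

Because the limit is est-size and not max-size, a source that underestimates its own length is stopped even when the user's budget would have allowed the extra bytes.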


(struct $transfer $message ())
A message pertaining to a transfer status.


(struct $transfer:scope $transfer (name message))
  name : string?
  message : (and/c $transfer? (not/c $transfer:scope?))
Contains a $transfer message from a call to transfer where the transfer-name argument was bound to name.


(struct $transfer:progress $transfer (bytes-read max-size timestamp))
  bytes-read : exact-nonnegative-integer?
  max-size : (or/c +inf.0 exact-positive-integer?)
  timestamp : exact-positive-integer?
Represents progress transferring bytes to a local source.

Unless max-size is +inf.0, (/ bytes-read max-size) approaches 1. You can use this along with the timestamp (in seconds) to reactively compute an estimated time to complete.
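
One way to compute such an estimate is to compare two progress samples. The helper below is a hypothetical sketch that takes the bytes-read and timestamp values from an earlier and a later $transfer:progress message as plain numbers; it assumes the transfer continues at the observed average rate.

```racket
#lang racket/base

;; Hypothetical helper: estimates seconds remaining from two progress samples.
;; Returns #f when no rate can be measured yet.
(define (seconds-remaining bytes-then time-then bytes-now time-now max-size)
  (define elapsed (- time-now time-then))
  (define delta (- bytes-now bytes-then))
  (and (positive? elapsed)
       (positive? delta)
       ;; remaining bytes divided by bytes per second
       (/ (- max-size bytes-now) (/ delta elapsed))))

(seconds-remaining 0 100 500 105 1000)  ; => 5
(seconds-remaining 0 100 0 105 1000)    ; => #f
```

When max-size is +inf.0 the result is also +inf.0, which matches the caveat that the ratio is only meaningful for a finite max-size.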


(struct $transfer:budget $transfer ())
A message pertaining to a transfer space budget.


(struct $transfer:budget:exceeded $message (size))
  size : exact-positive-integer?
A request to transfer bytes was halted because the transfer read more bytes than the allowed maximum.



(struct $transfer:budget:rejected $message (proposed-max-size allowed-max-size))
  proposed-max-size : (or/c +inf.0 exact-positive-integer?)
  allowed-max-size : exact-positive-integer?
A request to transfer bytes never started because the transfer estimated proposed-max-size bytes, which exceeds the user’s maximum of allowed-max-size.



(struct $transfer:timeout $message (bytes-read wait-time))
  bytes-read : exact-nonnegative-integer?
  wait-time : (>=/c 0)
A request to transfer bytes was halted after bytes-read bytes because no more bytes were available after wait-time milliseconds.