FFmpeg Decoder Definitions

This module provides the direct FFmpeg-backed decoder layer used by the audio pipeline. It is deliberately small and stateful. A caller creates one decoder instance, opens one file on it, queries the selected audio stream, repeatedly asks for the next PCM block, and closes the instance again.

The module does not expose FFmpeg metadata. It only exposes the information needed for playback: stream count, sample rate, channel count, duration, bitrate, decoded PCM data, and sample positions. The output format is fixed: interleaved signed 32-bit PCM, four bytes per sample, using FFmpeg’s AV_SAMPLE_FMT_S32 sample format.
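To illustrate the fixed output format, the following minimal sketch reads individual samples from an interleaved signed 32-bit byte string. The helper pcm-sample-ref, the helper s32, and the demo buffer are hypothetical and not part of this module. The sketch assumes native byte order, which is how FFmpeg defines AV_SAMPLE_FMT_S32.

```racket
#lang racket/base

;; Hypothetical helper, not exported by the decoder module.
;; Reads the sample for `channel` in sample frame `frame` from an
;; interleaved signed 32-bit PCM byte string (native byte order assumed).
(define bytes-per-sample 4)

(define (pcm-sample-ref pcm channels frame channel)
  (define offset (* bytes-per-sample (+ (* frame channels) channel)))
  (integer-bytes->integer pcm #t (system-big-endian?)
                          offset (+ offset bytes-per-sample)))

;; Demo buffer standing in for decoder output: two stereo sample frames.
(define (s32 n) (integer->integer-bytes n 4 #t (system-big-endian?)))
(define demo-pcm (bytes-append (s32 100) (s32 -100) (s32 2000) (s32 -2000)))

;; Left channel of the second sample frame.
(pcm-sample-ref demo-pcm 2 1 0)
```

With real decoder output, the channel count would come from fmpg-audio-channels rather than a literal.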

The FFmpeg libraries are loaded when the module is required. The module checks that the runtime FFmpeg major versions are in the supported range configured by the implementation. This binding targets the FFmpeg library major versions used by FFmpeg 6, 7, and 8: libavutil 58 to 60, libavcodec 60 to 62, libavformat 60 to 62, and libswresample 4 to 6. Unsupported runtime versions fail early, before a decoder instance is used.
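The range check described above can be sketched as a simple table lookup. The names supported-majors and supported-major? are hypothetical; the module's own check may be structured differently, and only the version ranges are taken from the text.

```racket
#lang racket/base

;; Supported major-version ranges as listed in the text (inclusive).
(define supported-majors
  (hash 'avutil     '(58 . 60)
        'avcodec    '(60 . 62)
        'avformat   '(60 . 62)
        'swresample '(4 . 6)))

;; Hypothetical check: #t when `major` lies inside the range for `lib`,
;; #f for an out-of-range version or an unknown library symbol.
(define (supported-major? lib major)
  (define range (hash-ref supported-majors lib #f))
  (and range (<= (car range) major (cdr range))))

(supported-major? 'avcodec 61)
(supported-major? 'avutil 57)
```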

On Windows, the private library loader may download the bundled sound-library set into Racket’s add-on directory before the FFI libraries are opened. On Unix-like systems, the FFmpeg libraries are expected to be installed by the operating system or platform package manager and to be reachable by Racket’s FFI library search path.

1 Layering

This module is the low-level Racket FFI layer. It is normally wrapped by "ffmpeg-ffi.rkt" and then by "ffmpeg-decoder.rkt". The first wrapper adapts this module to the command protocol used by the audio decoder frontend. The second wrapper exposes the callback-oriented decoder interface used by the rest of the playback pipeline.

The distinction matters for buffer lifetime. At this level, fmpg-buffer returns the current buffer owned by the decoder instance. The adapter in "ffmpeg-ffi.rkt" copies that buffer before passing it to "ffmpeg-decoder.rkt". Code that uses this module directly must copy the buffer itself when the bytes must survive the next decoder operation.

2 Implementation strategy

This module talks directly to the FFmpeg shared libraries through Racket’s FFI. There is no C shim that hides FFmpeg’s structs or normalizes their layout. The price of that choice is that the Racket side must know enough of the relevant C struct layouts to read the fields used by the decoder. The benefit is that the binding remains a Racket module with direct access to the platform FFmpeg libraries.

2.1 Versioned C struct layouts

The module defines only partial FFmpeg structs. A partial definition includes the fields that are actually read by this decoder and enough preceding fields to compute their offsets. Fields that are not needed are represented only by their C type, or by a repetition count such as (6 _int). Tail fields after the last required member are not described.

The helper module "private/cstruct-helper.rkt" provides make-offsets and def-cstruct. The make-offsets form computes offsets for a sequence of C field types, while def-cstruct expands to a define-cstruct form whose public fields are placed at those explicit offsets. This keeps the actual accessors small while still accounting for skipped fields in the C layout.
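The idea behind computing offsets from a sequence of C field types can be sketched with ctype-sizeof and ctype-alignof from ffi/unsafe. This is not the implementation of make-offsets: the names align-up and field-offsets are hypothetical, and the sketch ignores repetition counts such as (6 _int).

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical offset calculation, independent of "private/cstruct-helper.rkt".
;; Rounds n up to the next multiple of alignment.
(define (align-up n alignment)
  (* alignment (quotient (+ n alignment -1) alignment)))

;; Returns the byte offset of each field when the fields are laid out in
;; order and each one starts at the next multiple of its own alignment.
(define (field-offsets types)
  (let loop ([types types] [pos 0] [acc '()])
    (if (null? types)
        (reverse acc)
        (let* ([t (car types)]
               [off (align-up pos (ctype-alignof t))])
          (loop (cdr types) (+ off (ctype-sizeof t)) (cons off acc))))))

;; A one-byte field followed by a 32-bit field forces three padding bytes.
(field-offsets (list _int8 _int32 _int16))
```

Knowing the offset of a field is enough to read it, which is why the skipped fields only need their C types and not their names.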

The right layout is selected when the module is required, after the runtime FFmpeg major versions have been read from the libraries. For the supported range, AVCodecParameters uses one layout for libavcodec major version 60 and another for major versions 61 and 62. Likewise, AVFrame uses one layout for libavutil major version 58 and another for major versions 59 and 60. The other partial structs used by this module are defined with a single layout across the supported versions.
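As a rough sketch of that selection, the following functions map a runtime major version to a layout tag. The function names and the tags are hypothetical; the module selects actual struct definitions, not symbols.

```racket
#lang racket/base

;; Hypothetical layout tags; only the version boundaries come from the text.
(define (avcodec-parameters-layout avcodec-major)
  (cond [(= avcodec-major 60) 'layout-60]
        [(<= 61 avcodec-major 62) 'layout-61-62]
        [else #f])) ; unsupported major version

(define (avframe-layout avutil-major)
  (cond [(= avutil-major 58) 'layout-58]
        [(<= 59 avutil-major 60) 'layout-59-60]
        [else #f])) ; unsupported major version

(avcodec-parameters-layout 62)
(avframe-layout 58)
```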

This is why the version check is performed before normal decoder use. The accessors are correct only for the FFmpeg major-version ranges for which the partial layouts were written. If a future FFmpeg major release changes a layout before one of the fields used here, the version range should be extended only after the affected partial definitions have been checked.

2.2 Sequential failure handling

Most FFmpeg calls report ordinary failure through C-style return values or null pointers. The implementation treats those results as normal control flow, not as exceptional Racket failures. The let/assert form is used for this pattern. It behaves like a sequential binding form: each binding can be checked immediately, and a failed check returns the specified failure value for the whole form.

That style is used for setup paths such as opening a file, selecting stream information, allocating the codec context, and initializing the resampler. It keeps the success path linear while still giving each FFmpeg return value or pointer a local check. Predicates such as a-!nullptr?, a-nullptr?, a-true?, and a->=? express the usual FFmpeg checks directly next to the binding that produced the value.
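The pattern can be illustrated with a small stand-in macro. The form let/check below is hypothetical and is not the syntax of let/assert, which is not documented here. It only shows the idea of checking each binding in sequence and returning a failure value at the first failed check.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical stand-in for the checked sequential binding pattern.
;; Each clause is [id expr ok?]; the first failed check returns `fail`.
(define-syntax let/check
  (syntax-rules ()
    [(_ () fail body ...) (let () body ...)]
    [(_ ([id expr ok?] rest ...) fail body ...)
     (let ([id expr])
       (if (ok? id)
           (let/check (rest ...) fail body ...)
           fail))]))

;; Demo with plain values standing in for FFmpeg pointers and return codes.
(define (open-demo path-ok? stream-index)
  (let/check ([ctx (and path-ok? 'format-context) values]
              [idx stream-index (lambda (i) (>= i 0))])
    0   ; failure value for the whole form
    1)) ; success path runs only when every check passed

(open-demo #t 0)
(open-demo #t -1)
```

The success path stays linear, and each check sits next to the expression that produced the value.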

For loops where decoding must stop immediately from a nested position, the module uses define/return from define-return. This gives functions such as fmpg-decode-next! and the internal resampler drain routine an explicit early-return continuation without using exceptions for normal FFmpeg outcomes. The two helpers are implementation dependencies; they are not re-exported by this module.
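An early return from a nested loop can be sketched with let/ec from racket/base. This only illustrates the control-flow idea; define/return may be implemented differently, and first-failure is a hypothetical function.

```racket
#lang racket/base

;; Hypothetical illustration of an explicit early-return continuation.
;; Returns the first negative status code found in nested batches, or #f.
(define (first-failure batches)
  (let/ec return
    (for ([batch (in-list batches)])
      (for ([status (in-list batch)])
        (when (negative? status)
          (return status)))) ; leaves both loops at once
    #f))

(first-failure '((0 0) (0 -5 0) (-7)))
```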

3 Decoder instances

A decoder instance is an opaque value returned by fmpg-init. Its structure type and predicate are not exported. Pass the value back to the functions in this module and do not inspect it directly. The contracts below therefore use any/c for the instance argument. Operationally, that argument must be a value returned by fmpg-init.

The instance owns native FFmpeg resources: a format context, a codec context, an audio frame, a resampler, and the Racket byte string used for the current PCM block. Finalizers are installed as a last line of defence, but callers should still call fmpg-close! explicitly when playback stops or when the file is no longer needed. Explicit close keeps the lifetime of native resources predictable.

procedure
(fmpg-init) → any/c

Creates a new decoder instance. The result is an opaque instance value, or #f if the instance could not be created.

Creating the instance does not open a file. Use fmpg-open-file! before querying stream information or decoding audio.

procedure
(fmpg-open-file! instance filename) → (integer-in 0 1)
instance : any/c
filename : (or/c path? string?)

Opens filename on instance, reads the stream information, selects the best audio stream, initializes the codec context, and initializes the resampler.

The function returns 1 on success and 0 on failure. On failure, partially initialized native state is closed again.

An instance can only have one file open. Close it with fmpg-close! before opening another file on the same instance. A non-string, non-path filename is treated as an open failure and returns 0.

procedure
(fmpg-close! instance) → void?
instance : any/c

Closes instance if it is open and releases the native FFmpeg resources owned by the instance. The stored audio information is reset. Calling this function with #f or with an already closed instance is harmless.

procedure
(fmpg-is-open instance) → (integer-in 0 1)
instance : any/c

Returns 1 when instance is ready for decoding and 0 otherwise. An instance is ready only after a file has been opened, a usable audio stream has been selected, and the decoder and resampler have been initialized.

4 Audio stream information

The decoder selects one audio stream for playback using FFmpeg’s best-stream selection. The stream count reports how many audio streams were found in the container, but decoding is performed only for the selected stream.

The term sample in this module means a sample frame: one time step in the audio stream, across all channels. For stereo 32-bit output, one sample frame therefore occupies (* 2 4) bytes in the returned PCM buffer.
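The arithmetic can be written out as a few small helpers. The names frame-size, frames->bytes, and bytes->frames are hypothetical; only the four-byte sample width is taken from the module's fixed output format.

```racket
#lang racket/base

(define bytes-per-sample 4) ; fixed 32-bit output width

;; Bytes occupied by one sample frame across all channels.
(define (frame-size channels)
  (* channels bytes-per-sample))

;; Convert between sample-frame counts and byte counts.
(define (frames->bytes frames channels)
  (* frames (frame-size channels)))

(define (bytes->frames byte-count channels)
  (quotient byte-count (frame-size channels)))

(frame-size 2)          ; stereo
(frames->bytes 1024 2)
```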

procedure
(fmpg-audio-stream-count instance) → exact-nonnegative-integer?
instance : any/c

Returns the number of audio streams in the open container. If the instance is not open, the result is 0.

procedure
(fmpg-audio-sample-rate instance) → exact-nonnegative-integer?
instance : any/c

Returns the selected audio stream’s sample rate. If the instance is not ready, the result is 0.

procedure
(fmpg-audio-channels instance) → exact-nonnegative-integer?
instance : any/c

Returns the selected audio stream’s channel count. If the instance is not ready, the result is 0.

procedure
(fmpg-audio-bits-per-sample instance) → exact-positive-integer?
instance : any/c

Returns the fixed output sample width in bits. The current output format is 32-bit signed PCM, so this function returns 32. The value is independent of the input file’s original sample format and does not depend on the instance state.

procedure
(fmpg-audio-bytes-per-sample instance) → exact-positive-integer?
instance : any/c

Returns the fixed output sample width in bytes. The current output format is 32-bit signed PCM, so this function returns 4. The value is independent of the input file’s original sample format and does not depend on the instance state.

procedure
(fmpg-duration-ms instance) → exact-integer?
instance : any/c

Returns the duration of the selected audio stream in milliseconds. If the stream duration is not available, the container duration is used as a fallback. If no duration can be determined, or when the instance is not ready, the result is -1.

procedure
(fmpg-duration-samples instance) → exact-integer?
instance : any/c

Returns the duration of the selected audio stream in sample frames. If the stream duration is not available, the container duration is used as a fallback. If no duration can be determined, or when the instance is not ready, the result is -1.

procedure
(fmpg-file-bitrate instance) → exact-integer?
instance : any/c

Returns the container bitrate in bits per second. If the bitrate is unavailable or if the instance is not open, the result is -1.

5 Decoding

Decoding is block oriented. Each call to fmpg-decode-next! clears the previous PCM block and attempts to produce the next decoded block for the selected audio stream. When the call returns 1, the block can be read with fmpg-buffer and described with the buffer query functions.

procedure
(fmpg-decode-next! instance) → (integer-in 0 1)
instance : any/c

Decodes until a block of PCM output is available or no more output can be produced. The function returns 1 when fmpg-buffer contains a non-empty PCM block. It returns 0 when the instance is not ready, when end of stream has been reached, or when FFmpeg reports an unrecoverable decode error.

The function does not distinguish end of stream from a decode failure. The intended playback loop treats 0 as no further PCM block available for this decoder instance.

Internally, decoding receives all currently available frames, reads packets for the selected audio stream, sends those packets to the codec, converts decoded frames through libswresample, and drains the resampler at end of stream. Non-selected packets are skipped.

procedure
(fmpg-seek-ms! instance target-pos-ms) → (integer-in 0 1)
instance : any/c
target-pos-ms : exact-nonnegative-integer?

Seeks the selected audio stream to target-pos-ms milliseconds and resets the decoder and resampler state. The function returns 1 on success and 0 on failure.

Seeking uses FFmpeg’s backward seek flag. After the seek, decoded audio before the requested target sample is discarded so the next buffer starts at, or as close as FFmpeg can provide to, the requested position.
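The discard step can be sketched as arithmetic on sample positions. The functions below are hypothetical and operate on plain numbers and byte strings. They are not the module's implementation, and they assume the fixed four bytes per sample.

```racket
#lang racket/base

;; Hypothetical: convert a millisecond target to a sample-frame position.
(define (ms->sample ms sample-rate)
  (quotient (* ms sample-rate) 1000))

;; Hypothetical: number of leading frames in the half-open block
;; [block-start, block-end) that lie before the target position.
(define (frames-to-discard block-start block-end target)
  (max 0 (min (- block-end block-start) (- target block-start))))

;; Drops the discarded frames from an interleaved 32-bit PCM block.
(define (trim-block pcm channels block-start block-end target)
  (define skip (frames-to-discard block-start block-end target))
  (subbytes pcm (* skip channels 4)))

(ms->sample 30000 44100)
(frames-to-discard 1000 2000 1500)
```

A block that ends before the target is discarded entirely, and a block that starts at or after the target is kept whole.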

6 Decoded buffers

The PCM buffer belongs to the decoder instance. It is replaced by the next call to fmpg-decode-next!, fmpg-seek-ms!, or fmpg-close!. Treat the returned byte string as read-only. Copy it if it must outlive the next decoder operation or if another component may mutate it.
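A minimal sketch of the copy, using a stand-in byte string instead of a real decoder buffer. The variables owned and valid-size are hypothetical placeholders for the results of fmpg-buffer and fmpg-buffer-size, and the overwrite only simulates a later decoder operation for the purpose of the demonstration.

```racket
#lang racket/base

;; Stand-in for the decoder-owned buffer and its valid byte count.
(define owned (bytes 1 2 3 4 5 6 7 8 0 0 0 0))
(define valid-size 8)

;; subbytes allocates a fresh byte string holding only the valid bytes.
(define kept (subbytes owned 0 valid-size))

;; Simulate a later decoder operation changing the owned storage
;; (an assumption for this demo, not the module's actual behavior).
(bytes-fill! owned 0)

;; The copy is unaffected.
kept
```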

procedure
(fmpg-buffer instance) → (or/c bytes? #f)
instance : any/c

Returns the current decoded PCM block as a byte string, or #f when no PCM block is available.

The byte string contains interleaved signed 32-bit samples. Its frame count is the difference between fmpg-buffer-end-sample and fmpg-buffer-start-sample, and its valid byte count is reported by fmpg-buffer-size.

procedure
(fmpg-buffer-size instance) → exact-nonnegative-integer?
instance : any/c

Returns the number of valid bytes in the current PCM buffer. If no decoder state is available, or if the size would not fit in the internal integer range, the function returns 0.

procedure
(fmpg-buffer-start-sample instance) → exact-nonnegative-integer?
instance : any/c

Returns the first sample frame represented by the current PCM buffer. If no decoder state is available, the result is 0.

procedure
(fmpg-buffer-end-sample instance) → exact-nonnegative-integer?
instance : any/c

Returns the half-open end position of the current PCM buffer: the first sample frame after the current buffer. The number of sample frames in the buffer is the end position minus fmpg-buffer-start-sample. If no decoder state is available, the result is 0.

procedure
(fmpg-sample-position instance) → exact-nonnegative-integer?
instance : any/c

Returns the decoder’s next sample-frame position after the current output. During normal decoding it is the same as fmpg-buffer-end-sample for the current buffer. After a seek, it is reset to the target position before new audio is decoded.

7 FFmpeg version information

procedure
(ffmpeg-version lib) → (list/c exact-nonnegative-integer? exact-nonnegative-integer? exact-nonnegative-integer?)
lib : (or/c 'avutil 'avcodec 'avformat 'swr 'swresample)

Returns the runtime version of one FFmpeg library as a three-element list containing the major, minor, and micro version numbers. The symbols 'swr and 'swresample both refer to libswresample.

The function raises an exception for an unknown library symbol.
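FFmpeg's C libraries report their version as one packed integer, with the major number above bit 16, the minor number in bits 8 to 15, and the micro number in the low 8 bits. The following sketch unpacks such a value into a three-element list. The names pack-version and unpack-version are hypothetical, and whether this module unpacks the value in exactly this way is an assumption.

```racket
#lang racket/base

;; Hypothetical helpers mirroring FFmpeg's packed version layout:
;; (major << 16) | (minor << 8) | micro.
(define (pack-version major minor micro)
  (bitwise-ior (arithmetic-shift major 16)
               (arithmetic-shift minor 8)
               micro))

(define (unpack-version v)
  (list (arithmetic-shift v -16)
        (bitwise-and (arithmetic-shift v -8) #xff)
        (bitwise-and v #xff)))

;; Round trip for an illustrative version number.
(unpack-version (pack-version 58 29 100))
```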

8 Use through the decoder frontend

The direct API above is normally wrapped by "ffmpeg-ffi.rkt" and by "ffmpeg-decoder.rkt". The frontend function ffmpeg-open returns a handle, or #f when the file does not exist. Its stream-info callback receives a mutable hash with at least these playback keys:

(list 'sample-rate
      'channels
      'bits-per-sample
      'bytes-per-sample
      'total-samples
      'duration)

The audio callback receives the same hash extended for the current buffer with these keys:

(list 'sample
      'current-time)

The hash is followed by a copied byte string and its valid byte count. The copy is made by "ffmpeg-ffi.rkt", not by the low-level buffer function itself.

The frontend’s seek function accepts a percentage of the stream and translates that percentage to a sample position. The adapter then translates the sample position to milliseconds and calls fmpg-seek-ms!. This is why the low-level module exposes millisecond seeking while the frontend exposes percentage seeking.
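The two translations can be sketched as plain arithmetic. The function names are hypothetical, and the rounding shown here (truncation toward zero) is an assumption; the frontend and the adapter may round differently.

```racket
#lang racket/base

;; Hypothetical: percentage of the stream to a sample-frame position.
(define (percent->sample percent total-samples)
  (inexact->exact (floor (* (/ percent 100) total-samples))))

;; Hypothetical: sample-frame position to milliseconds.
(define (sample->ms sample sample-rate)
  (quotient (* sample 1000) sample-rate))

;; 50% of a ten-second stream at 44100 Hz.
(define target-sample (percent->sample 50 441000))
(sample->ms target-sample 44100)
```

The resulting millisecond value is what would be passed to fmpg-seek-ms!.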

9 Example

The following example opens a file, decodes all PCM blocks, and reports their byte ranges and sample ranges. A real playback loop would pass each buffer to the audio output layer before requesting the next block.

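;; Assumes the bindings of this module have already been required.
(define dec (fmpg-init))

(when (and dec (= (fmpg-open-file! dec "track.ogg") 1))
  ;; Stream information is available once the file has been opened.
  (printf "~a Hz, ~a channels, ~a ms\n"
          (fmpg-audio-sample-rate dec)
          (fmpg-audio-channels dec)
          (fmpg-duration-ms dec))

  ;; Decode until fmpg-decode-next! returns 0 (end of stream or decode error).
  (let loop ()
    (when (= (fmpg-decode-next! dec) 1)
      ;; pcm is owned by the decoder; copy it if it must outlive this iteration.
      (define pcm (fmpg-buffer dec))
      (define size (fmpg-buffer-size dec))
      (define start (fmpg-buffer-start-sample dec))
      (define end (fmpg-buffer-end-sample dec))
      (printf "decoded ~a bytes, samples [~a, ~a)\n"
              size start end)
      (loop)))

  ;; Release the native FFmpeg resources explicitly.
  (fmpg-close! dec))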

After a successful seek, decoding continues in the same way. The following code assumes that dec is still open; it moves to 30 seconds and then requests the next decoded buffer.

(when (= (fmpg-seek-ms! dec 30000) 1)
  (when (= (fmpg-decode-next! dec) 1)
    (define pcm (fmpg-buffer dec))
    (define start (fmpg-buffer-start-sample dec))
    (printf "first buffer after seek starts at sample ~a\n" start)))
