Lexers

Jens Axel Søgaard <jensaxel@soegaard.net>

This manual documents the public APIs in the lexers packages.

The library currently provides reusable lexers for multiple applications. Syntax coloring is the first intended application, but the lexer APIs are also designed to support other consumers.

    1 Overview

      1.1 Token Helpers

      1.2 Profiles

    2 CSS

      2.1 CSS Returned Tokens

      2.2 CSS Derived Tokens

    3 HTML

      3.1 HTML Returned Tokens

      3.2 HTML Derived Tokens

    4 Markdown

      4.1 Markdown Returned Tokens

      4.2 Markdown Derived Tokens

    5 WAT

      5.1 WAT Returned Tokens

      5.2 WAT Derived Tokens

    6 Racket

      6.1 Racket Returned Tokens

      6.2 Racket Derived Tokens

    7 Scribble

      7.1 Scribble Returned Tokens

      7.2 Scribble Derived Tokens

    8 JavaScript

      8.1 JavaScript Returned Tokens

      8.2 JavaScript Derived Tokens

1 Overview

The public language modules currently available cover CSS, HTML, Markdown, WAT, Racket, Scribble, and JavaScript.

Each language module currently exposes two related kinds of API:

  • A projected token API intended for general consumers such as syntax coloring.

  • A derived-token API intended for richer language-specific inspection and testing.

The projected APIs are intentionally close to parser-tools/lex. They return bare symbols, token? values, and optional position-token? wrappers built from the actual parser-tools/lex structures, so existing parser-oriented tools can consume them more easily.
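
Because the projected values are ordinary parser-tools/lex structures, they can also be taken apart with the accessors from parser-tools/lex itself. The following is a minimal sketch, assuming the lexers-lib package is installed; the names css-lexer, wrapped, and inner are illustrative only, and the input string is the one used in the CSS examples of this manual.

```racket
#lang racket/base
;; Sketch: inspect a projected token with plain parser-tools/lex accessors.
(require (only-in parser-tools/lex
                  position-token-token
                  position-token-start-pos
                  position-offset
                  token-name
                  token-value)
         lexers/css) ; make-css-lexer

(define css-lexer (make-css-lexer #:profile 'coloring))
(define in (open-input-string "color: #fff;"))
(port-count-lines! in)

;; The 'coloring profile includes source positions by default,
;; so the first result is a position-token wrapping a token.
(define wrapped (css-lexer in))
(define inner (position-token-token wrapped))

(define first-name (token-name inner))   ; projected category
(define first-text (token-value inner))  ; matched source text
(define first-start
  (position-offset (position-token-start-pos wrapped)))
```

The helpers in lexers/token do the same unwrapping without requiring the caller to know whether positions are enabled, which is usually the more convenient choice.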

The current profile split is:

  • 'coloring keeps trivia, emits 'unknown for recoverable malformed input, and includes source positions by default.

  • 'compiler skips trivia by default, raises on malformed input, and includes source positions by default.

Across languages, the projected lexer constructors return one-argument port readers. Create the lexer once, call it repeatedly on the same input port, and stop when the result is an end-of-file token. The projected category symbols themselves, such as 'identifier, 'literal, and 'keyword, are intended to be the stable public API.
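
The create-once, call-repeatedly pattern described above might look like the following sketch. It assumes lexers-lib is installed; read-all-tokens is a hypothetical helper written for this illustration and is not part of the library.

```racket
#lang racket/base
;; Sketch: drain a projected lexer until it reports end of input.
(require lexers/css    ; make-css-lexer
         lexers/token) ; lexer-token-eof?, lexer-token-name

;; Hypothetical helper: collect every token before the end-of-file token.
(define (read-all-tokens lexer in)
  (let loop ([acc '()])
    (define tok (lexer in))
    (if (lexer-token-eof? tok)
        (reverse acc)
        (loop (cons tok acc)))))

(define css-lexer (make-css-lexer #:profile 'coloring))
(define in (open-input-string "color: #fff;"))
(port-count-lines! in)

;; Keep only the stable category symbols.
(define css-token-names
  (map lexer-token-name (read-all-tokens css-lexer in)))
```

Because lexer-token-eof? accepts both wrapped and unwrapped values, the same loop should work regardless of the #:source-positions setting.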

1.1 Token Helpers

The helper module lexers/token provides a small public API for inspecting wrapped or unwrapped projected token values without reaching directly into parser-tools/lex.

 (require lexers/token) package: lexers-lib

procedure

(lexer-token-name token) → symbol?

  token : (or/c symbol? token? position-token?)
Extracts the effective token category from a wrapped or unwrapped projected token value.

procedure

(lexer-token-value token) → any/c

  token : (or/c symbol? token? position-token?)
Extracts the effective token payload from a wrapped or unwrapped projected token value. For the bare end-of-file symbol, the result is #f.

procedure

(lexer-token-has-positions? token) → boolean?

  token : (or/c symbol? token? position-token?)
Determines whether a wrapped or unwrapped projected token value carries source positions.

procedure

(lexer-token-start token) → (or/c position? #f)

  token : (or/c symbol? token? position-token?)
Extracts the starting position from a wrapped projected token value. For unwrapped values, the result is #f.

procedure

(lexer-token-end token) → (or/c position? #f)

  token : (or/c symbol? token? position-token?)
Extracts the ending position from a wrapped projected token value. For unwrapped values, the result is #f.

procedure

(lexer-token-eof? token) → boolean?

  token : (or/c symbol? token? position-token?)
Determines whether a wrapped or unwrapped projected token value represents end of input.
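
As a small illustration of the unwrapped case, the sketch below tokenizes a string with source positions turned off and applies the helpers to the first result. It assumes lexers-lib is installed and borrows css-string->tokens from lexers/css; the names tokens, first-token, and summary are illustrative.

```racket
#lang racket/base
;; Sketch: token helpers applied to unwrapped projected tokens.
(require lexers/css    ; css-string->tokens
         lexers/token) ; lexer-token-* helpers

(define tokens
  (css-string->tokens "color: #fff;" #:source-positions #f))
(define first-token (car tokens))

;; Without the position-token wrapper there are no positions to report,
;; so the start accessor yields #f.
(define summary
  (list (lexer-token-has-positions? first-token)
        (lexer-token-name first-token)
        (lexer-token-value first-token)
        (lexer-token-start first-token)))

;; The bare end-of-file symbol carries no payload.
(define eof-value (lexer-token-value 'eof))
```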

1.2 Profiles

The public projected APIs currently support the same profile names:

  • 'coloring

  • 'compiler

The current defaults are:

Profile     Trivia   Source Positions   Malformed Input
'coloring   'keep    #t                 emit unknown tokens
'compiler   'skip    #t                 raise an exception

For the keyword arguments accepted by make-css-lexer, css-string->tokens, make-html-lexer, html-string->tokens, make-javascript-lexer, javascript-string->tokens, make-markdown-lexer, markdown-string->tokens, make-racket-lexer, racket-string->tokens, make-scribble-lexer, scribble-string->tokens, make-wat-lexer, and wat-string->tokens:

  • #:profile selects the named default bundle.

  • #:trivia 'profile-default means “use the trivia policy from the selected profile”.

  • #:source-positions 'profile-default means “use the source-position setting from the selected profile”.

  • An explicit #:trivia or #:source-positions value overrides the selected profile default.
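
The override rule can be sketched as follows: the 'compiler profile skips trivia by default, while an explicit #:trivia 'keep should bring whitespace tokens back. This assumes lexers-lib is installed; has-whitespace? is a hypothetical helper for this sketch, and the CSS source string is an arbitrary well-formed example.

```racket
#lang racket/base
;; Sketch: an explicit #:trivia value overriding the profile default.
(require lexers/css    ; css-string->tokens
         lexers/token) ; lexer-token-name

(define source "a { color: red; }")

;; Hypothetical helper: does any token project as 'whitespace?
(define (has-whitespace? tokens)
  (and (memq 'whitespace (map lexer-token-name tokens)) #t))

;; Profile default: 'compiler skips trivia.
(define compiler-default
  (css-string->tokens source #:profile 'compiler))

;; Explicit override: keep trivia while staying in the 'compiler profile.
(define compiler-keep
  (css-string->tokens source #:profile 'compiler #:trivia 'keep))
```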

2 CSS

 (require lexers/css) package: lexers-lib

The projected CSS API has two entry points:

procedure

(make-css-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
 → (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Constructs a streaming CSS lexer.

The result is a procedure of one argument, an input port. Each call reads the next token from the port and returns one projected token value.

When #:source-positions is true, each result is a position-token? whose payload is either a bare symbol such as 'eof or a token? carrying a projected category such as 'identifier, 'literal, 'comment, or 'unknown.

When #:source-positions is false, the result is either a bare symbol or a token? directly.

The intended use is to create the lexer once, then call it repeatedly on the same port until it returns an end-of-file token.

Examples:
> (define lexer
    (make-css-lexer #:profile 'coloring))
> (define in
    (open-input-string "color: #fff;"))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'identifier "color") (position 1 1 0) (position 6 1 5))

 (position-token (token 'delimiter ":") (position 6 1 5) (position 7 1 6))

 (position-token (token 'whitespace " ") (position 7 1 6) (position 8 1 7))

 (position-token (token 'literal "#fff") (position 8 1 7) (position 12 1 11)))

procedure

(css-string->tokens source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
 → (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Tokenizes an entire CSS string using the projected token API.

This is a convenience wrapper over make-css-lexer. It opens a string port, enables line counting, repeatedly calls the port-based lexer until end-of-file, and returns the resulting token list.
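
One possible use of the returned list is to turn it into category and offset triples for a syntax colorer. The sketch below assumes lexers-lib is installed; token->span and spans are hypothetical names, and the offsets are the position-offset values from parser-tools/lex.

```racket
#lang racket/base
;; Sketch: convert projected CSS tokens into (category start end) spans.
(require (only-in parser-tools/lex position-offset)
         lexers/css    ; css-string->tokens
         lexers/token) ; lexer-token-* helpers

;; Hypothetical helper: one span per positioned token.
(define (token->span tok)
  (list (lexer-token-name tok)
        (position-offset (lexer-token-start tok))
        (position-offset (lexer-token-end tok))))

;; The default 'coloring profile includes source positions;
;; the guard skips any value that does not carry them.
(define spans
  (for/list ([tok (in-list (css-string->tokens "color: #fff;"))]
             #:when (lexer-token-has-positions? tok))
    (token->span tok)))
```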

2.1 CSS Returned Tokens

The projected CSS API returns values in the same general shape as parser-tools/lex:

  • The end of input is reported as 'eof, either directly or inside a position-token?.

  • Most ordinary results are token? values whose token-name is a projected category and whose token-value contains language-specific text or metadata.

  • When #:source-positions is true, each result is wrapped in a position-token?.

  • When #:source-positions is false, results are returned without that outer wrapper.

Common projected CSS categories include:

  • 'whitespace

  • 'comment

  • 'identifier

  • 'literal

  • 'delimiter

  • 'unknown

  • 'eof

In 'coloring mode, whitespace and comments are kept, and recoverable malformed input is returned as 'unknown. In 'compiler mode, whitespace and comments are skipped by default, and malformed input raises an exception instead of producing an 'unknown token.

For the current CSS scaffold, token-value normally preserves the original source text of the emitted token. In particular:

  • For 'identifier, the value is the matched identifier text, such as "color" or "--brand-color".

  • For 'literal, the value is the matched literal text, such as "#fff", "12px", "url(foo.png)", or "rgb(".

  • For 'comment and 'whitespace, the value is the original comment or whitespace text when those categories are kept.

  • For 'delimiter, the value is the matched delimiter text, such as ":", ";", or "{".

  • For 'unknown in the tolerant 'coloring profile, the value is the malformed input text that could not be accepted.

Examples:
> (define inspect-lexer
    (make-css-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string "color: #fff;"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'identifier

> (lexer-token-value first-token)

"color"

> (position-offset (lexer-token-start first-token))

1

> (position-offset (lexer-token-end first-token))

6

procedure

(make-css-derived-lexer)

 → (input-port? . -> . (or/c 'eof css-derived-token?))
Constructs a streaming CSS lexer for the derived-token layer.

The result is a procedure of one argument, an input port. Each call reads the next raw CSS token from the port, computes its CSS-specific derived classifications, and returns one derived token value. At end of input, it returns 'eof.

The intended use is the same as for make-css-lexer: create the lexer once, then call it repeatedly on the same port until it returns 'eof.

Examples:
> (define derived-lexer
    (make-css-derived-lexer))
> (define derived-in
    (open-input-string "color: #fff;"))
> (port-count-lines! derived-in)
> (list (derived-lexer derived-in)
        (derived-lexer derived-in)
        (derived-lexer derived-in)
        (derived-lexer derived-in))

(list

 (css-derived-token

  (css-raw-token 'ident-token "color" (position 1 1 0) (position 6 1 5))

  '(property-name-candidate selector-token))

 (css-derived-token

  (css-raw-token 'colon-token ":" (position 6 1 5) (position 7 1 6))

  '())

 (css-derived-token

  (css-raw-token 'whitespace-token " " (position 7 1 6) (position 8 1 7))

  '())

 (css-derived-token

  (css-raw-token 'hash-token "#fff" (position 8 1 7) (position 12 1 11))

  '(color-literal selector-token)))

procedure

(css-string->derived-tokens source)

 → (listof css-derived-token?)
  source : string?
Tokenizes an entire CSS string into derived CSS token values.

This is a convenience wrapper over make-css-derived-lexer. It opens a string port, enables line counting, repeatedly calls the derived lexer until it returns 'eof, and returns the resulting list of derived tokens.

procedure

(css-derived-token? v) → boolean?

  v : any/c
Recognizes derived CSS token values returned by make-css-derived-lexer and css-string->derived-tokens.

procedure

(css-derived-token-tags token) → (listof symbol?)

  token : css-derived-token?
Returns the CSS-specific classification tags attached to a derived CSS token.

procedure

(css-derived-token-has-tag? token tag) → boolean?

  token : css-derived-token?
  tag : symbol?
Determines whether a derived CSS token carries a given classification tag.

procedure

(css-derived-token-text token) → string?

  token : css-derived-token?
Returns the exact source text corresponding to a derived CSS token.

procedure

(css-derived-token-start token) → position?

  token : css-derived-token?
Returns the starting source position for a derived CSS token.

procedure

(css-derived-token-end token) → position?

  token : css-derived-token?
Returns the ending source position for a derived CSS token.
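
The accessors above combine naturally into small queries. As a sketch, assuming lexers-lib is installed, the following collects the text of every token tagged 'property-name; the names derived and property-names are illustrative.

```racket
#lang racket/base
;; Sketch: pick out CSS property names through the derived-token layer.
(require lexers/css)

(define derived
  (css-string->derived-tokens
   ".foo { color: red; background: rgb(1 2 3); }"))

;; Keep the source text of each token carrying the 'property-name tag.
(define property-names
  (for/list ([tok (in-list derived)]
             #:when (css-derived-token-has-tag? tok 'property-name))
    (css-derived-token-text tok)))
```

The #:when clause only relies on the result of css-derived-token-has-tag? being true or false, so it does not depend on the exact true value returned.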

2.2 CSS Derived Tokens

A derived CSS token pairs one raw CSS token with a small list of CSS-specific classification tags. This layer is more precise than the projected consumer-facing categories and is meant for inspection, testing, and language-aware tools.

The current CSS scaffold may attach tags such as:

  • 'at-rule-name

  • 'color-literal

  • 'color-function

  • 'selector-token

  • 'property-name

  • 'declaration-value-token

  • 'function-name

  • 'gradient-function

  • 'custom-property-name

  • 'property-name-candidate

  • 'string-literal

  • 'numeric-literal

  • 'length-dimension

  • 'malformed-token

Examples:
> (define derived-tokens
    (css-string->derived-tokens ".foo { color: red; background: rgb(1 2 3); }"))
> (map (lambda (token)
         (list (css-derived-token-text token)
               (css-derived-token-tags token)
               (css-derived-token-has-tag? token 'selector-token)
               (css-derived-token-has-tag? token 'property-name)
               (css-derived-token-has-tag? token 'declaration-value-token)
               (css-derived-token-has-tag? token 'color-literal)
               (css-derived-token-has-tag? token 'function-name)
               (css-derived-token-has-tag? token 'color-function)
               (css-derived-token-has-tag? token 'custom-property-name)
               (css-derived-token-has-tag? token 'string-literal)
               (css-derived-token-has-tag? token 'numeric-literal)
               (css-derived-token-has-tag? token 'length-dimension)))
       derived-tokens)

'(("." () #f #f #f #f #f #f #f #f #f #f)

  ("foo"

   (property-name-candidate selector-token)

   (selector-token)

   #f

   #f

   #f

   #f

   #f

   #f

   #f

   #f

   #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("{" () #f #f #f #f #f #f #f #f #f #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("color"

   (property-name-candidate property-name)

   #f

   (property-name)

   #f

   #f

   #f

   #f

   #f

   #f

   #f

   #f)

  (":" () #f #f #f #f #f #f #f #f #f #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("red"

   (property-name-candidate declaration-value-token)

   #f

   #f

   (declaration-value-token)

   #f

   #f

   #f

   #f

   #f

   #f

   #f)

  (";" () #f #f #f #f #f #f #f #f #f #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("background"

   (property-name-candidate property-name)

   #f

   (property-name)

   #f

   #f

   #f

   #f

   #f

   #f

   #f

   #f)

  (":" () #f #f #f #f #f #f #f #f #f #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("rgb"

   (function-name color-function declaration-value-token)

   #f

   #f

   (declaration-value-token)

   #f

   (function-name color-function declaration-value-token)

   (color-function declaration-value-token)

   #f

   #f

   #f

   #f)

  ("(" () #f #f #f #f #f #f #f #f #f #f)

  ("1"

   (numeric-literal declaration-value-token)

   #f

   #f

   (declaration-value-token)

   #f

   #f

   #f

   #f

   #f

   (numeric-literal declaration-value-token)

   #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("2"

   (numeric-literal declaration-value-token)

   #f

   #f

   (declaration-value-token)

   #f

   #f

   #f

   #f

   #f

   (numeric-literal declaration-value-token)

   #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("3"

   (numeric-literal declaration-value-token)

   #f

   #f

   (declaration-value-token)

   #f

   #f

   #f

   #f

   #f

   (numeric-literal declaration-value-token)

   #f)

  (")" () #f #f #f #f #f #f #f #f #f #f)

  (";" () #f #f #f #f #f #f #f #f #f #f)

  (" " () #f #f #f #f #f #f #f #f #f #f)

  ("}" () #f #f #f #f #f #f #f #f #f #f))

value

css-profiles : immutable-hash?

The profile defaults used by the CSS lexer.

3 HTML

 (require lexers/html) package: lexers-lib

The projected HTML API has two entry points:

procedure

(make-html-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
 → (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Constructs a streaming HTML lexer.

The result is a procedure of one argument, an input port. Each call reads the next token from the port and returns one projected token value.

The projected HTML token stream includes ordinary markup tokens and inline delegated tokens from embedded <style> and <script> bodies.

When #:source-positions is true, each result is a position-token?. When it is false, the result is either a bare symbol or a token? directly.

Examples:
> (define lexer
    (make-html-lexer #:profile 'coloring))
> (define in
    (open-input-string "<section id=main>Hi</section>"))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'delimiter "<") (position 1 1 0) (position 2 1 1))

 (position-token

  (token 'identifier "section")

  (position 2 1 1)

  (position 9 1 8))

 (position-token (token 'whitespace " ") (position 9 1 8) (position 10 1 9))

 (position-token

  (token 'identifier "id")

  (position 10 1 9)

  (position 12 1 11)))

procedure

(html-string->tokens source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
 → (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Tokenizes an entire HTML string using the projected token API.

This is a convenience wrapper over make-html-lexer.
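
A brief sketch of the wrapper in use, assuming lexers-lib is installed: it tokenizes a small fragment without source positions and keeps the text of every token that projects as 'identifier. The names tokens and identifier-texts are illustrative.

```racket
#lang racket/base
;; Sketch: list the identifier-like pieces of an HTML fragment.
(require lexers/html   ; html-string->tokens
         lexers/token) ; lexer-token-name, lexer-token-value

(define tokens
  (html-string->tokens "<section id=main>Hi</section>"
                       #:source-positions #f))

;; Tag names and attribute names project as 'identifier.
(define identifier-texts
  (for/list ([tok (in-list tokens)]
             #:when (eq? (lexer-token-name tok) 'identifier))
    (lexer-token-value tok)))
```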

3.1 HTML Returned Tokens

Common projected HTML categories include:

  • 'comment

  • 'keyword

  • 'identifier

  • 'literal

  • 'operator

  • 'delimiter

  • 'unknown

  • 'eof

For the current HTML scaffold:

  • tag names and attribute names project as 'identifier

  • attribute values, text nodes, entities, and delegated CSS/JS literals project as 'literal

  • punctuation such as <, </, >, />, and embedded interpolation boundaries project as 'delimiter or 'operator

  • comments project as 'comment

  • doctype/declaration markup projects as 'keyword

Examples:
> (define inspect-lexer
    (make-html-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string "<!doctype html><main id=\"app\">Hi &amp; bye</main>"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'keyword

> (lexer-token-value first-token)

"<!doctype html>"

> (position-offset (lexer-token-start first-token))

1

> (position-offset (lexer-token-end first-token))

16

procedure

(make-html-derived-lexer)
Constructs a streaming HTML lexer for the derived-token layer.

procedure

(html-string->derived-tokens source)

 → (listof html-derived-token?)
  source : string?
Tokenizes an entire HTML string into derived HTML token values.

procedure

(html-derived-token? v) → boolean?

  v : any/c
Recognizes derived HTML token values returned by make-html-derived-lexer and html-string->derived-tokens.

procedure

(html-derived-token-tags token) → (listof symbol?)

  token : html-derived-token?
Returns the HTML-specific classification tags attached to a derived HTML token.

procedure

(html-derived-token-has-tag? token tag) → boolean?

  token : html-derived-token?
  tag : symbol?
Determines whether a derived HTML token carries a given classification tag.

procedure

(html-derived-token-text token) → string?

  token : html-derived-token?
Returns the exact source text corresponding to a derived HTML token.

procedure

(html-derived-token-start token) → position?

  token : html-derived-token?
Returns the starting source position for a derived HTML token.

procedure

(html-derived-token-end token) → position?

  token : html-derived-token?
Returns the ending source position for a derived HTML token.
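
As an illustration of these accessors, the sketch below recovers the body of an embedded <style> element by concatenating the text of every token tagged 'embedded-css. It assumes lexers-lib is installed and that the short fragment used here is delegated to the CSS lexer in the same way as the larger example in this manual; derived and embedded-css-text are illustrative names.

```racket
#lang racket/base
;; Sketch: reassemble the embedded CSS from an HTML fragment.
(require lexers/html)

(define derived
  (html-string->derived-tokens
   "<style>.hero { color: #c33; }</style>"))

;; Concatenate the exact source text of the delegated CSS tokens.
(define embedded-css-text
  (apply string-append
         (for/list ([tok (in-list derived)]
                    #:when (html-derived-token-has-tag? tok 'embedded-css))
           (html-derived-token-text tok))))
```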

3.2 HTML Derived Tokens

The current HTML scaffold may attach tags such as:

  • 'html-tag-name

  • 'html-closing-tag-name

  • 'html-attribute-name

  • 'html-attribute-value

  • 'html-text

  • 'html-entity

  • 'html-doctype

  • 'comment

  • 'embedded-css

  • 'embedded-javascript

  • 'malformed-token

Delegated CSS and JavaScript body tokens keep their reusable semantic tags and gain an additional language marker such as 'embedded-css or 'embedded-javascript.

Examples:
> (define derived-tokens
    (html-string->derived-tokens
     "<!doctype html><section id=main class=\"card\">Hi &amp; bye<style>.hero { color: #c33; }</style><script>const root = document.querySelector(\"#app\");</script></section>"))
> (map (lambda (token)
         (list (html-derived-token-text token)
               (html-derived-token-tags token)
               (html-derived-token-has-tag? token 'html-tag-name)
               (html-derived-token-has-tag? token 'html-attribute-name)
               (html-derived-token-has-tag? token 'html-attribute-value)
               (html-derived-token-has-tag? token 'html-text)
               (html-derived-token-has-tag? token 'html-entity)
               (html-derived-token-has-tag? token 'embedded-css)
               (html-derived-token-has-tag? token 'embedded-javascript)))
       derived-tokens)

'(("<!doctype html>" (keyword html-doctype) #f #f #f #f #f #f #f)

  ("<" (delimiter) #f #f #f #f #f #f #f)

  ("section" (identifier html-tag-name) (html-tag-name) #f #f #f #f #f #f)

  (" " (whitespace) #f #f #f #f #f #f #f)

  ("id"

   (identifier html-attribute-name)

   #f

   (html-attribute-name)

   #f

   #f

   #f

   #f

   #f)

  ("=" (operator) #f #f #f #f #f #f #f)

  ("main"

   (literal html-attribute-value)

   #f

   #f

   (html-attribute-value)

   #f

   #f

   #f

   #f)

  (" " (whitespace) #f #f #f #f #f #f #f)

  ("class"

   (identifier html-attribute-name)

   #f

   (html-attribute-name)

   #f

   #f

   #f

   #f

   #f)

  ("=" (operator) #f #f #f #f #f #f #f)

  ("\"card\""

   (html-attribute-value literal)

   #f

   #f

   (html-attribute-value literal)

   #f

   #f

   #f

   #f)

  (">" (delimiter) #f #f #f #f #f #f #f)

  ("Hi " (literal html-text) #f #f #f (html-text) #f #f #f)

  ("&amp;" (literal html-entity) #f #f #f #f (html-entity) #f #f)

  (" bye" (literal html-text) #f #f #f (html-text) #f #f #f)

  ("<" (delimiter) #f #f #f #f #f #f #f)

  ("style" (identifier html-tag-name) (html-tag-name) #f #f #f #f #f #f)

  (">" (delimiter) #f #f #f #f #f #f #f)

  ("." (embedded-css delimiter) #f #f #f #f #f (embedded-css delimiter) #f)

  ("hero"

   (embedded-css identifier property-name-candidate selector-token)

   #f

   #f

   #f

   #f

   #f

   (embedded-css identifier property-name-candidate selector-token)

   #f)

  (" " (embedded-css whitespace) #f #f #f #f #f (embedded-css whitespace) #f)

  ("{" (embedded-css delimiter) #f #f #f #f #f (embedded-css delimiter) #f)

  (" " (embedded-css whitespace) #f #f #f #f #f (embedded-css whitespace) #f)

  ("color"

   (embedded-css identifier property-name-candidate property-name)

   #f

   #f

   #f

   #f

   #f

   (embedded-css identifier property-name-candidate property-name)

   #f)

  (":" (embedded-css delimiter) #f #f #f #f #f (embedded-css delimiter) #f)

  (" " (embedded-css whitespace) #f #f #f #f #f (embedded-css whitespace) #f)

  ("#c33"

   (embedded-css literal color-literal declaration-value-token)

   #f

   #f

   #f

   #f

   #f

   (embedded-css literal color-literal declaration-value-token)

   #f)

  (";" (embedded-css delimiter) #f #f #f #f #f (embedded-css delimiter) #f)

  (" " (embedded-css whitespace) #f #f #f #f #f (embedded-css whitespace) #f)

  ("}" (embedded-css delimiter) #f #f #f #f #f (embedded-css delimiter) #f)

  ("</" (delimiter) #f #f #f #f #f #f #f)

  ("style" (identifier html-closing-tag-name) #f #f #f #f #f #f #f)

  (">" (delimiter) #f #f #f #f #f #f #f)

  ("<" (delimiter) #f #f #f #f #f #f #f)

  ("script" (identifier html-tag-name) (html-tag-name) #f #f #f #f #f #f)

  (">" (delimiter) #f #f #f #f #f #f #f)

  ("const"

   (embedded-javascript keyword)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript keyword))

  (" "

   (embedded-javascript whitespace)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript whitespace))

  ("root"

   (embedded-javascript identifier declaration-name)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript identifier declaration-name))

  (" "

   (embedded-javascript whitespace)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript whitespace))

  ("="

   (embedded-javascript operator)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript operator))

  (" "

   (embedded-javascript whitespace)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript whitespace))

  ("document"

   (embedded-javascript identifier)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript identifier))

  ("."

   (embedded-javascript delimiter)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript delimiter))

  ("querySelector"

   (embedded-javascript identifier method-name property-name)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript identifier method-name property-name))

  ("("

   (embedded-javascript delimiter)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript delimiter))

  ("\"#app\""

   (embedded-javascript literal string-literal)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript literal string-literal))

  (")"

   (embedded-javascript delimiter)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript delimiter))

  (";"

   (embedded-javascript delimiter)

   #f

   #f

   #f

   #f

   #f

   #f

   (embedded-javascript delimiter))

  ("</" (delimiter) #f #f #f #f #f #f #f)

  ("script" (identifier html-closing-tag-name) #f #f #f #f #f #f #f)

  (">" (delimiter) #f #f #f #f #f #f #f)

  ("</" (delimiter) #f #f #f #f #f #f #f)

  ("section" (identifier html-closing-tag-name) #f #f #f #f #f #f #f)

  (">" (delimiter) #f #f #f #f #f #f #f))

value

html-profiles : immutable-hash?

The profile defaults used by the HTML lexer.

4 Markdown

 (require lexers/markdown) package: lexers-lib

The projected Markdown API has two entry points:

The first Markdown implementation is a handwritten, parser-lite, GitHub-flavored Markdown lexer. It is line-oriented and can delegate raw HTML and known fenced-code languages to the existing HTML, CSS, JavaScript, Racket, Scribble, and WAT lexers.

procedure

(make-markdown-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
 → (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Constructs a streaming Markdown lexer.

The result is a procedure of one argument, an input port. Each call reads the next projected Markdown token from the port and returns one projected token value.

When #:source-positions is true, each result is a position-token?. When it is false, the result is either a bare symbol or a token? directly.

The intended use is to create the lexer once, then call it repeatedly on the same port until it returns an end-of-file token.

Examples:
> (define lexer
    (make-markdown-lexer #:profile 'coloring))
> (define in
    (open-input-string "# Title\n\n```js\nconst x = 1;\n```\n"))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'delimiter "#") (position 1 1 0) (position 2 1 1))

 (position-token (token 'whitespace " ") (position 2 1 1) (position 3 1 2))

 (position-token (token 'literal "Title") (position 3 1 2) (position 8 1 7))

 (position-token (token 'whitespace "\n") (position 8 1 7) (position 9 2 0)))

procedure

(markdown-string->tokens source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
 → (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Tokenizes an entire Markdown string using the projected token API.

This is a convenience wrapper over make-markdown-lexer.
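
A short sketch of the wrapper, assuming lexers-lib is installed: it tokenizes the same input as the streaming example without source positions and pairs each category with its text. The names tokens, pairs, and first-three are illustrative.

```racket
#lang racket/base
;; Sketch: (category text) pairs for a small Markdown document.
(require lexers/markdown ; markdown-string->tokens
         lexers/token)   ; lexer-token-name, lexer-token-value

(define tokens
  (markdown-string->tokens "# Title\n\n```js\nconst x = 1;\n```\n"
                           #:source-positions #f))

(define pairs
  (for/list ([tok (in-list tokens)])
    (list (lexer-token-name tok) (lexer-token-value tok))))

;; The heading marker, the space after it, and the heading text.
(define first-three
  (list (car pairs) (cadr pairs) (caddr pairs)))
```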

4.1 Markdown Returned Tokens

Common projected Markdown categories include:

  • 'whitespace

  • 'identifier

  • 'literal

  • 'keyword

  • 'operator

  • 'delimiter

  • 'comment

  • 'unknown

  • 'eof

For the current Markdown scaffold:

  • ordinary prose, inline code text, code-block text, and link or image payload text project mostly as 'literal

  • language names and delegated name-like tokens project as 'identifier or 'keyword, depending on the delegated lexer

  • structural markers such as heading markers, list markers, brackets, pipes, backticks, and fence delimiters project as 'delimiter

  • comments only appear through delegated embedded HTML

  • recoverable malformed constructs project as 'unknown in 'coloring mode and raise in 'compiler mode

For source continuity, the derived Markdown stream preserves the newline after a fenced-code info string as an explicit whitespace token before the code body. Incomplete fenced-code blocks are tokenized best-effort instead of raising an internal error.

Examples:
> (define inspect-lexer
    (make-markdown-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string "# Title\n\nText with <span class=\"x\">hi</span>\n"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'delimiter

> (lexer-token-value first-token)

"#"

> (position-offset (lexer-token-start first-token))

1

> (position-offset (lexer-token-end first-token))

2

procedure

(make-markdown-derived-lexer)

  (input-port? . -> . (or/c 'eof markdown-derived-token?))
Constructs a streaming Markdown lexer for the derived-token layer.

procedure

(markdown-string->derived-tokens source)

  (listof markdown-derived-token?)
  source : string?
Tokenizes an entire Markdown string into derived Markdown token values.

procedure

(markdown-derived-token? v)  boolean?

  v : any/c
Recognizes derived Markdown token values returned by make-markdown-derived-lexer and markdown-string->derived-tokens.

procedure

(markdown-derived-token-tags token)  (listof symbol?)

  token : markdown-derived-token?
Returns the Markdown-specific classification tags attached to a derived Markdown token.

procedure

(markdown-derived-token-has-tag? token tag)  boolean?

  token : markdown-derived-token?
  tag : symbol?
Determines whether a derived Markdown token carries a given classification tag.

procedure

(markdown-derived-token-text token)  string?

  token : markdown-derived-token?
Returns the exact source text corresponding to a derived Markdown token.

procedure

(markdown-derived-token-start token)  position?

  token : markdown-derived-token?
Returns the starting source position for a derived Markdown token.

procedure

(markdown-derived-token-end token)  position?

  token : markdown-derived-token?
Returns the ending source position for a derived Markdown token.

4.2 Markdown Derived Tokens

The current Markdown scaffold may attach tags such as:

  • 'markdown-text

  • 'markdown-heading-marker

  • 'markdown-heading-text

  • 'markdown-blockquote-marker

  • 'markdown-list-marker

  • 'markdown-task-marker

  • 'markdown-thematic-break

  • 'markdown-code-span

  • 'markdown-code-fence

  • 'markdown-code-block

  • 'markdown-code-info-string

  • 'markdown-emphasis-delimiter

  • 'markdown-strong-delimiter

  • 'markdown-strikethrough-delimiter

  • 'markdown-link-text

  • 'markdown-link-destination

  • 'markdown-link-title

  • 'markdown-image-marker

  • 'markdown-autolink

  • 'markdown-table-pipe

  • 'markdown-table-alignment

  • 'markdown-table-cell

  • 'markdown-escape

  • 'markdown-hard-line-break

  • 'embedded-html

  • 'embedded-css

  • 'embedded-javascript

  • 'embedded-racket

  • 'embedded-scribble

  • 'embedded-wat

  • 'malformed-token

Delegated raw HTML and recognized fenced-code languages keep their reusable derived tags and gain Markdown embedding markers such as 'embedded-html, 'embedded-javascript, 'embedded-racket, or 'embedded-wat.

Examples:
> (define derived-tokens
    (markdown-string->derived-tokens
     "# Title\n\n- [x] done\n\n```js\nconst x = 1;\n```\n\nText <span class=\"x\">hi</span>\n"))
> (map (lambda (token)
         (list (markdown-derived-token-text token)
               (markdown-derived-token-tags token)))
       derived-tokens)

'(("#" (delimiter markdown-heading-marker))

  (" " (whitespace))

  ("Title" (literal markdown-heading-text))

  ("\n" (whitespace))

  ("\n" (whitespace))

  ("-" (delimiter markdown-list-marker))

  (" " (whitespace))

  ("[x]" (delimiter markdown-task-marker))

  (" " (whitespace))

  ("done" (literal markdown-text))

  ("\n" (whitespace))

  ("\n" (whitespace))

  ("```" (delimiter markdown-code-fence))

  ("js" (identifier markdown-code-info-string))

  ("\n" (whitespace))

  ("const" (keyword embedded-javascript markdown-code-block))

  (" " (whitespace embedded-javascript markdown-code-block))

  ("x" (identifier declaration-name embedded-javascript markdown-code-block))

  (" " (whitespace embedded-javascript markdown-code-block))

  ("=" (operator embedded-javascript markdown-code-block))

  (" " (whitespace embedded-javascript markdown-code-block))

  ("1" (literal numeric-literal embedded-javascript markdown-code-block))

  (";" (delimiter embedded-javascript markdown-code-block))

  ("\n" (whitespace embedded-javascript markdown-code-block))

  ("```" (delimiter markdown-code-fence))

  ("\n" (whitespace))

  ("\n" (whitespace))

  ("Text " (literal markdown-text))

  ("<" (delimiter embedded-html))

  ("span" (identifier html-tag-name embedded-html))

  (" " (whitespace embedded-html))

  ("class" (identifier html-attribute-name embedded-html))

  ("=" (operator embedded-html))

  ("\"x\"" (literal html-attribute-value embedded-html))

  (">" (delimiter embedded-html))

  ("hi" (literal html-text embedded-html))

  ("</" (delimiter embedded-html))

  ("span" (identifier html-closing-tag-name embedded-html))

  (">" (delimiter embedded-html))

  ("\n" (whitespace)))
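
The tag accessors make it straightforward to pull out one kind of token from a derived stream. The following is a minimal sketch, not part of the library: texts-with-tag is a hypothetical helper, and the module path lexers/markdown is assumed from the lexers/wat and lexers/racket pattern.

```racket
#lang racket/base
;; Sketch: select derived Markdown tokens by classification tag.
;; Assumption: the Markdown API is exported from lexers/markdown.
(require lexers/markdown)

;; Hypothetical helper: the source text of every token carrying `tag`.
(define (texts-with-tag tokens tag)
  (for/list ([token (in-list tokens)]
             #:when (markdown-derived-token-has-tag? token tag))
    (markdown-derived-token-text token)))

(define tokens
  (markdown-string->derived-tokens "# Title\n\n- [x] done\n"))

(define heading-texts (texts-with-tag tokens 'markdown-heading-text))
(define task-markers (texts-with-tag tokens 'markdown-task-marker))
```

Going by the example output above, heading-texts should hold the single string "Title" and task-markers the single string "[x]".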

value

markdown-profiles : immutable-hash?

The profile defaults used by the Markdown lexer.

5 WAT

 (require lexers/wat) package: lexers-lib

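The projected WAT API has two entry points:

  • make-wat-lexer, which constructs a streaming lexer that reads from an input port.

  • wat-string->tokens, which tokenizes an entire string.

The first WAT implementation is a handwritten lexer for the WebAssembly text format. It targets WAT only, not binary .wasm files.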

procedure

(make-wat-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
  (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Constructs a streaming WAT lexer.

The result is a procedure of one argument, an input port. Each call reads the next projected WAT token from the port and returns one projected token value.

When #:source-positions is true, each result is a position-token?. When it is false, the result is either a bare symbol or a token? directly.

The intended use is to create the lexer once, then call it repeatedly on the same port until it returns an end-of-file token.

The streaming port readers emit tokens incrementally. They do not buffer the entire remaining input before producing the first token.

Examples:
> (define lexer
    (make-wat-lexer #:profile 'coloring))
> (define in
    (open-input-string "(module (func (result i32) (i32.const 42)))"))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'delimiter "(") (position 1 1 0) (position 2 1 1))

 (position-token (token 'keyword "module") (position 2 1 1) (position 8 1 7))

 (position-token (token 'whitespace " ") (position 8 1 7) (position 9 1 8))

 (position-token (token 'delimiter "(") (position 9 1 8) (position 10 1 9)))
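
The "create once, call repeatedly until end-of-file" pattern can be sketched as a small loop. This is an illustration only: drain is a hypothetical helper, and it assumes that lexer-token-eof? and lexer-token-name are exported from lexers/token.

```racket
#lang racket/base
;; Sketch: drain a streaming WAT lexer into a list of tokens.
;; Assumption: the token helpers are exported from lexers/token.
(require lexers/wat lexers/token)

;; Hypothetical helper: call `lexer` on `in` until it reports end of
;; input. The end-of-file token itself is left out of the result.
(define (drain lexer in)
  (let loop ([acc '()])
    (define tok (lexer in))
    (if (lexer-token-eof? tok)
        (reverse acc)
        (loop (cons tok acc)))))

(define in
  (open-input-string "(module (func (result i32) (i32.const 42)))"))
(port-count-lines! in)

(define tokens (drain (make-wat-lexer #:profile 'coloring) in))
(define names (map lexer-token-name tokens))
```

Because the lexer is created once and the port keeps its own read position, each call continues where the previous one stopped.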

procedure

(wat-string->tokens source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
  (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Tokenizes an entire WAT string using the projected token API.

This is a convenience wrapper over make-wat-lexer.

5.1 WAT Returned Tokens

Common projected WAT categories include:

  • 'whitespace

  • 'comment

  • 'identifier

  • 'keyword

  • 'literal

  • 'delimiter

  • 'unknown

  • 'eof

For the current WAT scaffold:

  • form names, type names, and instruction names project as 'keyword

  • $-prefixed names and remaining word-like names project as 'identifier

  • strings and numeric literals project as 'literal

  • parentheses project as 'delimiter

  • comments project as 'comment

  • malformed input projects as 'unknown in 'coloring mode and raises in 'compiler mode

Projected and derived token text preserve the exact source slice, including whitespace and comments.

Examples:
> (define inspect-lexer
    (make-wat-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string ";; line comment\n(module (func (result i32) (i32.const 42)))"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'comment

> (lexer-token-value first-token)

";; line comment"

> (position-offset (lexer-token-start first-token))

1

> (position-offset (lexer-token-end first-token))

16

procedure

(make-wat-derived-lexer)

  (input-port? . -> . (or/c 'eof wat-derived-token?))
Constructs a streaming WAT lexer for the derived-token layer.

procedure

(wat-string->derived-tokens source)

  (listof wat-derived-token?)
  source : string?
Tokenizes an entire WAT string into derived WAT token values.

procedure

(wat-derived-token? v)  boolean?

  v : any/c
Recognizes derived WAT token values returned by make-wat-derived-lexer and wat-string->derived-tokens.

procedure

(wat-derived-token-tags token)  (listof symbol?)

  token : wat-derived-token?
Returns the WAT-specific classification tags attached to a derived WAT token.

procedure

(wat-derived-token-has-tag? token tag)  boolean?

  token : wat-derived-token?
  tag : symbol?
Determines whether a derived WAT token carries a given classification tag.

procedure

(wat-derived-token-text token)  string?

  token : wat-derived-token?
Returns the exact source text corresponding to a derived WAT token.

procedure

(wat-derived-token-start token)  position?

  token : wat-derived-token?
Returns the starting source position for a derived WAT token.

procedure

(wat-derived-token-end token)  position?

  token : wat-derived-token?
Returns the ending source position for a derived WAT token.
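
Since derived token text preserves the exact source slice, concatenating the text of every derived token should reproduce the input. A minimal sketch of that check, using only the accessors documented above (the names source, rebuilt, and round-trips? are illustrative):

```racket
#lang racket/base
;; Sketch: verify that derived WAT tokens cover the whole source.
(require lexers/wat)

(define source "(module (func $answer (result i32) i32.const 42))")
(define tokens (wat-string->derived-tokens source))

;; Join the text of every token, in order, back into one string.
(define rebuilt
  (apply string-append (map wat-derived-token-text tokens)))

(define round-trips? (string=? rebuilt source))
```

A check like this can be useful in tests for tools that rewrite or re-color source text and must not drop characters.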

5.2 WAT Derived Tokens

The current WAT scaffold may attach tags such as:

  • 'wat-form

  • 'wat-type

  • 'wat-instruction

  • 'wat-identifier

  • 'wat-string-literal

  • 'wat-numeric-literal

  • 'comment

  • 'whitespace

  • 'malformed-token

Examples:
> (define derived-tokens
    (wat-string->derived-tokens
     "(module (func $answer (result i32) i32.const 42))"))
> (map (lambda (token)
         (list (wat-derived-token-text token)
               (wat-derived-token-tags token)))
       derived-tokens)

'(("(" (delimiter))

  ("module" (keyword wat-form))

  (" " (whitespace))

  ("(" (delimiter))

  ("func" (keyword wat-form))

  (" " (whitespace))

  ("$answer" (identifier wat-identifier))

  (" " (whitespace))

  ("(" (delimiter))

  ("result" (keyword wat-form))

  (" " (whitespace))

  ("i32" (keyword wat-type))

  (")" (delimiter))

  (" " (whitespace))

  ("i32.const" (keyword wat-instruction))

  (" " (whitespace))

  ("42" (literal wat-numeric-literal))

  (")" (delimiter))

  (")" (delimiter)))

value

wat-profiles : immutable-hash?

The profile defaults used by the WAT lexer.

6 Racket

 (require lexers/racket) package: lexers-lib

The projected Racket API has two entry points:

  • make-racket-lexer, which constructs a streaming lexer that reads from an input port.

  • racket-string->tokens, which tokenizes an entire string.

This lexer is adapter-backed. It uses the lexer from syntax-color/racket-lexer as its raw engine and adapts that output into the public lexers projected and derived APIs.

When a source starts with "#lang at-exp", the adapter switches to the Scribble lexer family in Racket mode so that @ forms are tokenized as Scribble escapes instead of ordinary symbol text.

procedure

(make-racket-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
  (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Constructs a streaming Racket lexer.

The result is a procedure of one argument, an input port. Each call reads the next token from the port and returns one projected token value.

When #:source-positions is true, each result is a position-token?. When it is false, the result is either a bare symbol or a token? directly.

The intended use is to create the lexer once, then call it repeatedly on the same port until it returns an end-of-file token.

Examples:
> (define lexer
    (make-racket-lexer #:profile 'coloring))
> (define in
    (open-input-string "#:x \"hi\""))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'literal "#:x") (position 1 1 0) (position 4 1 3))

 (position-token (token 'whitespace " ") (position 4 1 3) (position 5 1 4))

 (position-token (token 'literal "\"hi\"") (position 5 1 4) (position 9 1 8)))

procedure

(racket-string->tokens source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
  (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Tokenizes an entire Racket string using the projected token API.

This is a convenience wrapper over make-racket-lexer.

6.1 Racket Returned Tokens

Common projected Racket categories include:

  • 'whitespace

  • 'comment

  • 'identifier

  • 'literal

  • 'delimiter

  • 'unknown

  • 'eof

For the current adapter:

  • comments and sexp comments project as 'comment

  • whitespace projects as 'whitespace

  • strings, constants, and hash-colon keywords project as 'literal

  • symbols, other, and no-color tokens project as 'identifier

  • parentheses project as 'delimiter

  • lexical errors project as 'unknown in 'coloring mode and raise in 'compiler mode

Projected and derived Racket token text preserve the exact consumed source slice, including multi-semicolon comment headers such as ;;;.

Examples:
> (define inspect-lexer
    (make-racket-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string "#;(+ 1 2) #:x"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'comment

> (lexer-token-value first-token)

"#;"

procedure

(make-racket-derived-lexer)

  (input-port? . -> . (or/c 'eof racket-derived-token?))
Constructs a streaming Racket lexer for the derived-token layer.

procedure

(racket-string->derived-tokens source)

  (listof racket-derived-token?)
  source : string?
Tokenizes an entire Racket string into derived Racket token values.

procedure

(racket-derived-token? v)  boolean?

  v : any/c
Recognizes derived Racket token values returned by make-racket-derived-lexer and racket-string->derived-tokens.

procedure

(racket-derived-token-tags token)  (listof symbol?)

  token : racket-derived-token?
Returns the Racket-specific classification tags attached to a derived Racket token.

procedure

(racket-derived-token-has-tag? token tag)  boolean?

  token : racket-derived-token?
  tag : symbol?
Determines whether a derived Racket token carries a given classification tag.

procedure

(racket-derived-token-text token)  string?

  token : racket-derived-token?
Returns the exact source text corresponding to a derived Racket token.

procedure

(racket-derived-token-start token)  position?

  token : racket-derived-token?
Returns the starting source position for a derived Racket token.

procedure

(racket-derived-token-end token)  position?

  token : racket-derived-token?
Returns the ending source position for a derived Racket token.

6.2 Racket Derived Tokens

The current Racket adapter may attach tags such as:

  • 'racket-comment

  • 'racket-sexp-comment

  • 'racket-whitespace

  • 'racket-constant

  • 'racket-string

  • 'racket-symbol

  • 'racket-parenthesis

  • 'racket-hash-colon-keyword

  • 'racket-commented-out

  • 'racket-datum

  • 'racket-open

  • 'racket-close

  • 'racket-continue

  • 'racket-usual-special-form

  • 'racket-definition-form

  • 'racket-binding-form

  • 'racket-conditional-form

  • 'racket-error

  • 'scribble-text for "#lang at-exp" text regions

  • 'scribble-command-char for @ in "#lang at-exp" sources

  • 'scribble-command for command names such as @bold in "#lang at-exp" sources

  • 'scribble-body-delimiter

  • 'scribble-optional-delimiter

  • 'scribble-racket-escape

The “usual special form” tags are heuristic. They are meant to help ordinary Racket tooling recognize common built-in forms such as define, define-values, if, and let, but they are not guarantees about expanded meaning. In particular, a token whose text is "define" may still receive 'racket-usual-special-form even in a program where define has been rebound, because the lexer does not perform expansion or binding resolution.

Examples:
> (define derived-tokens
    (racket-string->derived-tokens "#;(+ 1 2) #:x \"hi\""))
> (map (lambda (token)
         (list (racket-derived-token-text token)
               (racket-derived-token-tags token)))
       derived-tokens)

'(("#;" (comment racket-sexp-comment racket-continue))

  ("(" (delimiter racket-parenthesis racket-open comment racket-commented-out))

  ("+" (identifier racket-symbol racket-datum comment racket-commented-out))

  (" "

   (whitespace racket-whitespace racket-continue comment racket-commented-out))

  ("1" (literal racket-constant racket-datum comment racket-commented-out))

  (" "

   (whitespace racket-whitespace racket-continue comment racket-commented-out))

  ("2" (literal racket-constant racket-datum comment racket-commented-out))

  (")"

   (delimiter racket-parenthesis racket-close comment racket-commented-out))

  (" " (whitespace racket-whitespace racket-continue))

  ("#:x" (literal racket-hash-colon-keyword racket-datum))

  (" " (whitespace racket-whitespace racket-continue))

  ("\"hi\"" (literal racket-string racket-datum)))

> (define at-exp-derived-tokens
    (racket-string->derived-tokens "#lang at-exp racket\n(define x @bold{hi})\n"))
> (map (lambda (token)
         (list (racket-derived-token-text token)
               (racket-derived-token-tags token)))
       at-exp-derived-tokens)

'(("#lang at-exp" (identifier racket-other racket-datum))

  (" " (whitespace racket-whitespace racket-continue))

  ("racket" (identifier racket-symbol racket-datum))

  ("\n" (whitespace racket-whitespace racket-continue))

  ("(" (delimiter racket-parenthesis racket-open))

  ("define"

   (identifier

    racket-symbol

    racket-datum

    racket-usual-special-form

    racket-definition-form))

  (" " (whitespace racket-whitespace racket-continue))

  ("x" (identifier racket-symbol racket-datum))

  (" " (whitespace racket-whitespace racket-continue))

  ("@" (delimiter racket-parenthesis racket-datum scribble-command-char))

  ("bold" (identifier racket-symbol racket-datum scribble-command))

  ("{" (delimiter racket-parenthesis racket-open scribble-body-delimiter))

  ("hi" (literal scribble-text racket-continue))

  ("}" (delimiter racket-parenthesis racket-close scribble-body-delimiter))

  (")" (delimiter racket-parenthesis racket-close))

  ("\n" (whitespace racket-whitespace racket-continue)))
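
The heuristic nature of the special-form tags can be observed with a source that rebinds define as an ordinary variable. This is a sketch for experimentation; the names source and tagged are illustrative, and the exact set of tagged tokens is not stated in this document.

```racket
#lang racket/base
;; Sketch: see which tokens get 'racket-usual-special-form when
;; `define` is rebound as a plain variable.
(require lexers/racket)

(define source "(let ([define 1]) (+ define 1))")

;; Text of every token the adapter tags as a usual special form.
(define tagged
  (for/list ([token (in-list (racket-string->derived-tokens source))]
             #:when (racket-derived-token-has-tag?
                     token 'racket-usual-special-form))
    (racket-derived-token-text token)))
```

If tagged contains "define" here, that illustrates the caveat above: the tag reflects the token text, not the binding in scope.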

value

racket-profiles : immutable-hash?

The profile defaults used by the Racket lexer.

7 Scribble

 (require lexers/scribble) package: lexers-lib

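The projected Scribble API has two entry points:

  • make-scribble-lexer, which constructs a streaming lexer that reads from an input port.

  • scribble-string->tokens, which tokenizes an entire string.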

This lexer is adapter-backed. It uses syntax-color/scribble-lexer as its raw engine and adapts that output into the public lexers projected and derived APIs.

The first implementation defaults to Scribble’s inside/text mode via make-scribble-inside-lexer. Command-character customization is intentionally deferred.

procedure

(make-scribble-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
  (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Constructs a streaming Scribble lexer.

The result is a procedure of one argument, an input port. Each call reads the next token from the port and returns one projected token value.

When #:source-positions is true, each result is a position-token?. When it is false, the result is either a bare symbol or a token? directly.

The intended use is to create the lexer once, then call it repeatedly on the same port until it returns an end-of-file token.

Examples:
> (define lexer
    (make-scribble-lexer #:profile 'coloring))
> (define in
    (open-input-string "@title{Hi}\nText"))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'delimiter "@") (position 1 1 0) (position 2 1 1))

 (position-token (token 'identifier "title") (position 2 1 1) (position 7 1 6))

 (position-token (token 'delimiter "{") (position 7 1 6) (position 8 1 7))

 (position-token (token 'literal "Hi") (position 8 1 7) (position 10 1 9)))

procedure

(scribble-string->tokens source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions]) 
  (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
Tokenizes an entire Scribble string using the projected token API.

This is a convenience wrapper over make-scribble-lexer.

7.1 Scribble Returned Tokens

Common projected Scribble categories include:

  • 'whitespace

  • 'comment

  • 'identifier

  • 'literal

  • 'delimiter

  • 'unknown

  • 'eof

For the current adapter:

  • text, strings, and constants project as 'literal

  • whitespace projects as 'whitespace

  • symbol and other tokens project as 'identifier

  • parentheses, the command character, and body or optional delimiters project as 'delimiter

  • lexical errors project as 'unknown in 'coloring mode and raise in 'compiler mode

For source fidelity, the Scribble adapter preserves the exact source slice for projected and derived token text, including whitespace spans that contain one or more newlines.

Examples:
> (define inspect-lexer
    (make-scribble-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string "@title{Hi}"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'delimiter

> (lexer-token-value first-token)

"@"

procedure

(make-scribble-derived-lexer)

  (input-port? . -> . (or/c 'eof scribble-derived-token?))
Constructs a streaming Scribble lexer for the derived-token layer.

procedure

(scribble-string->derived-tokens source)

  (listof scribble-derived-token?)
  source : string?
Tokenizes an entire Scribble string into derived Scribble token values.

procedure

(scribble-derived-token? v)  boolean?

  v : any/c
Recognizes derived Scribble token values returned by make-scribble-derived-lexer and scribble-string->derived-tokens.

procedure

(scribble-derived-token-tags token)  (listof symbol?)

  token : scribble-derived-token?
Returns the Scribble-specific classification tags attached to a derived Scribble token.

procedure

(scribble-derived-token-has-tag? token tag)  boolean?

  token : scribble-derived-token?
  tag : symbol?
Determines whether a derived Scribble token carries a given classification tag.

procedure

(scribble-derived-token-text token)  string?

  token : scribble-derived-token?
Returns the exact source text corresponding to a derived Scribble token.

procedure

(scribble-derived-token-start token)  position?

  token : scribble-derived-token?
Returns the starting source position for a derived Scribble token.

procedure

(scribble-derived-token-end token)  position?

  token : scribble-derived-token?
Returns the ending source position for a derived Scribble token.

7.2 Scribble Derived Tokens

The current Scribble adapter may attach tags such as:

  • 'scribble-comment

  • 'scribble-whitespace

  • 'scribble-text

  • 'scribble-string

  • 'scribble-constant

  • 'scribble-symbol

  • 'scribble-parenthesis

  • 'scribble-other

  • 'scribble-error

  • 'scribble-command

  • 'scribble-command-char

  • 'scribble-body-delimiter

  • 'scribble-optional-delimiter

  • 'scribble-racket-escape

These tags describe reusable Scribble structure, not presentation. In particular, 'scribble-command only means that a symbol-like token is being used as a command name after "@". It does not mean the lexer has inferred higher-level document semantics for commands such as title or itemlist.

Examples:
> (define derived-tokens
    (scribble-string->derived-tokens
     "@title{Hi}\n@racket[(define x 1)]"))
> (map (lambda (token)
         (list (scribble-derived-token-text token)
               (scribble-derived-token-tags token)))
       derived-tokens)

'(("@" (delimiter scribble-parenthesis scribble-command-char))

  ("title" (identifier scribble-symbol scribble-command))

  ("{" (delimiter scribble-parenthesis scribble-body-delimiter))

  ("Hi" (literal scribble-text))

  ("}" (delimiter scribble-parenthesis scribble-body-delimiter))

  ("\n" (whitespace scribble-whitespace))

  ("@" (delimiter scribble-parenthesis scribble-command-char))

  ("racket" (identifier scribble-symbol scribble-command))

  ("[" (delimiter scribble-parenthesis scribble-optional-delimiter))

  ("(" (delimiter scribble-parenthesis scribble-racket-escape))

  ("define" (scribble-racket-escape))

  (" " (whitespace scribble-whitespace scribble-racket-escape))

  ("x" (identifier scribble-symbol scribble-racket-escape))

  (" " (whitespace scribble-whitespace scribble-racket-escape))

  ("1" (literal scribble-constant scribble-racket-escape))

  (")" (delimiter scribble-parenthesis scribble-racket-escape))

  ("]"

   (delimiter

    scribble-parenthesis

    scribble-optional-delimiter

    scribble-racket-escape)))
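
Because 'scribble-command marks only the command name, a tool can list the commands used in a document by filtering on that tag. A minimal sketch (command-names is an illustrative name, not a library binding):

```racket
#lang racket/base
;; Sketch: list the command names used in a Scribble source.
(require lexers/scribble)

(define tokens
  (scribble-string->derived-tokens "@title{Hi}\n@racket[(define x 1)]"))

;; Keep the text of tokens tagged as command names after "@".
(define command-names
  (for/list ([token (in-list tokens)]
             #:when (scribble-derived-token-has-tag? token 'scribble-command))
    (scribble-derived-token-text token)))
```

Going by the example output above, command-names should be the list of "title" and "racket". The lexer attaches no further meaning to either name.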

value

scribble-profiles : immutable-hash?

The profile defaults used by the Scribble lexer.

8 JavaScript

 (require lexers/javascript) package: lexers-lib

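The projected JavaScript API has two entry points:

  • make-javascript-lexer, which constructs a streaming lexer that reads from an input port.

  • javascript-string->tokens, which tokenizes an entire string.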

procedure

(make-javascript-lexer [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions 
  #:jsx? jsx?]) 
  (input-port? . -> . (or/c symbol? token? position-token?))
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
  jsx? : boolean? = #f
Constructs a streaming JavaScript lexer.

The result is a procedure of one argument, an input port. Each call reads the next token from the port and returns one projected token value.

When #:source-positions is true, each result is a position-token? whose payload is either a bare symbol such as 'eof or a token? carrying a projected category such as 'keyword, 'identifier, 'literal, 'operator, 'comment, or 'unknown.

When #:source-positions is false, the result is either a bare symbol or a token? directly.

The intended use is to create the lexer once, then call it repeatedly on the same port until it returns an end-of-file token.

When #:jsx? is true, the lexer accepts a small JSX extension inside JavaScript expressions. The projected token categories remain the same, while the derived-token API exposes JSX-specific structure.

Examples:
> (define lexer
    (make-javascript-lexer #:profile 'coloring))
> (define in
    (open-input-string "const x = 1;"))
> (port-count-lines! in)
> (list (lexer in)
        (lexer in)
        (lexer in)
        (lexer in))

(list

 (position-token (token 'keyword "const") (position 1 1 0) (position 6 1 5))

 (position-token (token 'whitespace " ") (position 6 1 5) (position 7 1 6))

 (position-token (token 'identifier "x") (position 7 1 6) (position 8 1 7))

 (position-token (token 'whitespace " ") (position 8 1 7) (position 9 1 8)))

procedure

(javascript-string->tokens 
  source 
  [#:profile profile 
  #:trivia trivia 
  #:source-positions source-positions 
  #:jsx? jsx?]) 
  (listof (or/c symbol? token? position-token?))
  source : string?
  profile : (or/c 'coloring 'compiler) = 'coloring
  trivia : (or/c 'profile-default 'keep 'skip)
   = 'profile-default
  source-positions : (or/c 'profile-default boolean?)
   = 'profile-default
  jsx? : boolean? = #f
Tokenizes an entire JavaScript string using the projected token API.

This is a convenience wrapper over make-javascript-lexer. It opens a string port, enables line counting, repeatedly calls the port-based lexer until end-of-file, and returns the resulting token list.

8.1 JavaScript Returned Tokens

The projected JavaScript API uses the same output shape as the other lexers:

  • The end of input is reported as 'eof, either directly or inside a position-token?.

  • Ordinary results are usually token? values whose token-name is a projected category and whose token-value contains language-specific text or metadata.

  • When #:source-positions is true, each result is wrapped in a position-token?.

  • When #:source-positions is false, results are returned without that outer wrapper.

Common projected JavaScript categories include:

  • 'whitespace

  • 'comment

  • 'keyword

  • 'identifier

  • 'literal

  • 'operator

  • 'delimiter

  • 'unknown

  • 'eof

In 'coloring mode, whitespace and comments are kept, and recoverable malformed input is returned as 'unknown. In 'compiler mode, whitespace and comments are skipped by default, and malformed input raises an exception instead of producing an 'unknown token.
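
The difference between the two profiles can be seen by comparing the category names each one produces for the same source. This is a sketch under assumptions: category-names is a hypothetical helper, and lexer-token-name is assumed to be exported from lexers/token and to accept every projected token value, including the end-of-file result.

```racket
#lang racket/base
;; Sketch: compare projected categories under the two profiles.
;; Assumption: lexer-token-name comes from lexers/token.
(require lexers/javascript lexers/token)

(define source "const x = 1; // note")

;; Hypothetical helper: the category name of every projected token.
(define (category-names profile)
  (map lexer-token-name
       (javascript-string->tokens source #:profile profile)))

(define coloring-names (category-names 'coloring))
(define compiler-names (category-names 'compiler))
```

With the default trivia setting, coloring-names should include 'whitespace and 'comment entries, while compiler-names should contain neither.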

For the current JavaScript scaffold, token-value also preserves the original source text of the emitted token. In particular:

  • For 'keyword and 'identifier, the value is the matched identifier text, such as "const" or "name".

  • For 'literal, the value is the matched literal text, such as "1" or "\"hello\"".

  • For 'comment and 'whitespace, the value is the original comment or whitespace text when those categories are kept.

  • For 'operator and 'delimiter, the value is the matched character text, such as "=", ";", or "(".

  • For 'unknown in tolerant ('coloring) mode, the value is the malformed input text that could not be accepted.

Examples:
> (define inspect-lexer
    (make-javascript-lexer #:profile 'coloring))
> (define inspect-in
    (open-input-string "const x = 1;"))
> (port-count-lines! inspect-in)
> (define first-token
    (inspect-lexer inspect-in))
> (lexer-token-has-positions? first-token)

#t

> (lexer-token-name first-token)

'keyword

> (lexer-token-value first-token)

"const"

> (position-offset (lexer-token-start first-token))

1

> (position-offset (lexer-token-end first-token))

6

procedure

(make-javascript-derived-lexer [#:jsx? jsx?])

  (input-port? . -> . (or/c 'eof javascript-derived-token?))
  jsx? : boolean? = #f
Constructs a streaming JavaScript lexer for the derived-token layer.

The result is a procedure of one argument, an input port. Each call reads the next raw JavaScript token from the port, computes its JavaScript-specific derived classifications, and returns one derived token value. At end of input, it returns 'eof.

The intended use is the same as for make-javascript-lexer: create the lexer once, then call it repeatedly on the same port until it returns 'eof.

Examples:
> (define derived-lexer
    (make-javascript-derived-lexer))
> (define derived-in
    (open-input-string "const x = 1;"))
> (port-count-lines! derived-in)
> (list (derived-lexer derived-in)
        (derived-lexer derived-in)
        (derived-lexer derived-in)
        (derived-lexer derived-in))

(list

 (javascript-derived-token

  (javascript-raw-token

   'identifier-token

   "const"

   (position 1 1 0)

   (position 6 1 5))

  '(keyword))

 (javascript-derived-token

  (javascript-raw-token

   'whitespace-token

   " "

   (position 6 1 5)

   (position 7 1 6))

  '())

 (javascript-derived-token

  (javascript-raw-token

   'identifier-token

   "x"

   (position 7 1 6)

   (position 8 1 7))

  '(identifier declaration-name))

 (javascript-derived-token

  (javascript-raw-token

   'whitespace-token

   " "

   (position 8 1 7)

   (position 9 1 8))

  '()))

procedure

(javascript-string->derived-tokens source 
  [#:jsx? jsx?]) 
  (listof javascript-derived-token?)
  source : string?
  jsx? : boolean? = #f
Tokenizes an entire JavaScript string into derived JavaScript token values.

This is a convenience wrapper over make-javascript-derived-lexer. It opens a string port, enables line counting, repeatedly calls the derived lexer until it returns 'eof, and returns the resulting list of derived tokens.
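
The #:jsx? option can be exercised through this wrapper. The sketch below collects tokens tagged 'jsx-tag-name from a small JSX expression; the names source and jsx-tag-names are illustrative, and the exact tags produced for this input are not stated in this document.

```racket
#lang racket/base
;; Sketch: look for JSX tag names in a derived JavaScript stream.
(require lexers/javascript)

(define source "const el = <b>hi</b>;")

;; Text of every token the derived layer tags as a JSX tag name.
(define jsx-tag-names
  (for/list ([token (in-list (javascript-string->derived-tokens
                              source #:jsx? #t))]
             #:when (javascript-derived-token-has-tag? token 'jsx-tag-name))
    (javascript-derived-token-text token)))
```

The projected categories stay the same with JSX enabled, so the derived layer is where a tool would look to tell tag names apart from ordinary identifiers.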

procedure

(javascript-derived-token? v)  boolean?

  v : any/c
Recognizes derived JavaScript token values returned by make-javascript-derived-lexer and javascript-string->derived-tokens.

procedure

(javascript-derived-token-tags token)

  token : javascript-derived-token?
Returns the JavaScript-specific classification tags attached to a derived JavaScript token.

procedure

(javascript-derived-token-has-tag? token    
  tag)  boolean?
  token : javascript-derived-token?
  tag : symbol?
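Determines whether a derived JavaScript token carries a given classification tag. In the examples in this section, a match is reported as the tail of the tag list beginning at tag (the value memq would return) rather than as #t, so treat the result as a generalized boolean.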

procedure

(javascript-derived-token-text token)

  token : javascript-derived-token?
Returns the exact source text corresponding to a derived JavaScript token.

procedure

(javascript-derived-token-start token)

  token : javascript-derived-token?
Returns the starting source position for a derived JavaScript token.

procedure

(javascript-derived-token-end token)

  token : javascript-derived-token?
Returns the ending source position for a derived JavaScript token.
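
Judging from the example output earlier in this section, where "const" runs from (position 1 1 0) to (position 6 1 5), the start and end positions appear to delimit a half-open span. The following minimal sketch computes a token's length from two such positions. The pos struct is a hypothetical stand-in that mirrors the printed (position offset line column) shape so the block runs with racket/base alone; with real tokens, accessors such as position-offset would be applied to the results of the start and end procedures instead.

```racket
#lang racket/base

;; Stand-in for the position values printed in the examples:
;; (position offset line column). Hypothetical; not the library's struct.
(struct pos (offset line col) #:transparent)

;; Length of a token in characters, treating the end offset as exclusive.
(define (span-length start end)
  (- (pos-offset end) (pos-offset start)))

;; "const" in the example starts at offset 1 and ends at offset 6.
(span-length (pos 1 1 0) (pos 6 1 5))   ; => 5, the length of "const"
```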

8.2 JavaScript Derived Tokens

A derived JavaScript token pairs one raw JavaScript token with a small list of JavaScript-specific classification tags. This layer is more precise than the projected consumer-facing categories and is meant for inspection, testing, and language-aware tools.

The current JavaScript scaffold may attach tags such as:

  • 'keyword

  • 'identifier

  • 'declaration-name

  • 'parameter-name

  • 'object-key

  • 'property-name

  • 'method-name

  • 'private-name

  • 'static-keyword-usage

  • 'string-literal

  • 'numeric-literal

  • 'regex-literal

  • 'template-literal

  • 'template-chunk

  • 'template-interpolation-boundary

  • 'jsx-tag-name

  • 'jsx-closing-tag-name

  • 'jsx-attribute-name

  • 'jsx-text

  • 'jsx-interpolation-boundary

  • 'jsx-fragment-boundary

  • 'comment

  • 'malformed-token
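
Because one token can carry several of these tags at once, a tool that needs a single label per token has to choose among them. The following is a minimal sketch of one way to do that, assuming a priority order in which more specific tags win. Both tag-priority and primary-tag are hypothetical names, the ordering is an illustration rather than anything the library defines, and only a subset of the tags is listed.

```racket
#lang racket/base

;; Hypothetical priority order: more specific tags come first.
(define tag-priority
  '(malformed-token comment keyword method-name property-name
    declaration-name parameter-name object-key private-name
    string-literal numeric-literal regex-literal identifier))

;; Return the first tag in priority order that `tags` contains, else #f.
(define (primary-tag tags)
  (for/first ([t (in-list tag-priority)]
              #:when (memq t tags))
    t))

(primary-tag '(identifier method-name property-name)) ; => 'method-name
(primary-tag '(keyword static-keyword-usage))         ; => 'keyword
(primary-tag '())                                     ; => #f
```

With live tokens, the argument would be the result of javascript-derived-token-tags.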

Examples:
> (define derived-tokens
    (javascript-string->derived-tokens
     "class Box { static create() { return this.value; } #secret = 1; }\nfunction wrap(name) { return name; }\nconst item = obj.run();\nconst data = { answer: 42 };\nconst greeting = `a ${name} b`;\nreturn /ab+c/i;"))
> (map (lambda (token)
         (list (javascript-derived-token-text token)
               (javascript-derived-token-tags token)
               (javascript-derived-token-has-tag? token 'keyword)
               (javascript-derived-token-has-tag? token 'identifier)
               (javascript-derived-token-has-tag? token 'declaration-name)
               (javascript-derived-token-has-tag? token 'parameter-name)
               (javascript-derived-token-has-tag? token 'object-key)
               (javascript-derived-token-has-tag? token 'property-name)
               (javascript-derived-token-has-tag? token 'method-name)
               (javascript-derived-token-has-tag? token 'private-name)
               (javascript-derived-token-has-tag? token 'static-keyword-usage)
               (javascript-derived-token-has-tag? token 'numeric-literal)
               (javascript-derived-token-has-tag? token 'regex-literal)
               (javascript-derived-token-has-tag? token 'template-literal)
               (javascript-derived-token-has-tag? token 'template-chunk)
               (javascript-derived-token-has-tag? token 'template-interpolation-boundary)))
       derived-tokens)

'(("class" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("Box" (identifier declaration-name) #f (identifier declaration-name) (declaration-name) #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("{" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("static" (keyword static-keyword-usage) (keyword static-keyword-usage) #f #f #f #f #f #f #f (static-keyword-usage) #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("create" (identifier) #f (identifier) #f #f #f #f #f #f #f #f #f #f #f #f)
  ("(" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (")" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("{" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("return" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("this" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("." () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("value" (identifier property-name) #f (identifier property-name) #f #f #f (property-name) #f #f #f #f #f #f #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("}" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("#secret" (private-name) #f #f #f #f #f #f #f (private-name) #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("=" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("1" (numeric-literal) #f #f #f #f #f #f #f #f #f (numeric-literal) #f #f #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("}" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("\n" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("function" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("wrap" (identifier declaration-name) #f (identifier declaration-name) (declaration-name) #f #f #f #f #f #f #f #f #f #f #f)
  ("(" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("name" (identifier parameter-name) #f (identifier parameter-name) #f (parameter-name) #f #f #f #f #f #f #f #f #f #f)
  (")" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("{" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("return" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("name" (identifier) #f (identifier) #f #f #f #f #f #f #f #f #f #f #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("}" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("\n" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("const" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("item" (identifier declaration-name) #f (identifier declaration-name) (declaration-name) #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("=" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("obj" (identifier) #f (identifier) #f #f #f #f #f #f #f #f #f #f #f #f)
  ("." () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("run" (identifier method-name property-name) #f (identifier method-name property-name) #f #f #f (property-name) (method-name property-name) #f #f #f #f #f #f #f)
  ("(" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (")" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("\n" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("const" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("data" (identifier declaration-name) #f (identifier declaration-name) (declaration-name) #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("=" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("{" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("answer" (identifier object-key) #f (identifier object-key) #f #f (object-key) #f #f #f #f #f #f #f #f #f)
  (":" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("42" (numeric-literal) #f #f #f #f #f #f #f #f #f (numeric-literal) #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("}" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("\n" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("const" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("greeting" (identifier declaration-name) #f (identifier declaration-name) (declaration-name) #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("=" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("`" (template-literal) #f #f #f #f #f #f #f #f #f #f #f (template-literal) #f #f)
  ("a " (template-literal template-chunk) #f #f #f #f #f #f #f #f #f #f #f (template-literal template-chunk) (template-chunk) #f)
  ("${" (template-literal template-interpolation-boundary) #f #f #f #f #f #f #f #f #f #f #f (template-literal template-interpolation-boundary) #f (template-interpolation-boundary))
  ("name" (identifier) #f (identifier) #f #f #f #f #f #f #f #f #f #f #f #f)
  ("}" (template-literal template-interpolation-boundary) #f #f #f #f #f #f #f #f #f #f #f (template-literal template-interpolation-boundary) #f (template-interpolation-boundary))
  (" b" (template-literal template-chunk) #f #f #f #f #f #f #f #f #f #f #f (template-literal template-chunk) (template-chunk) #f)
  ("`" (template-literal) #f #f #f #f #f #f #f #f #f #f #f (template-literal) #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("\n" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("return" (keyword) (keyword) #f #f #f #f #f #f #f #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f #f #f #f #f #f #f #f #f)
  ("/ab+c/i" (regex-literal) #f #f #f #f #f #f #f #f #f #f (regex-literal) #f #f #f)
  (";" () #f #f #f #f #f #f #f #f #f #f #f #f #f #f))

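
Tag lists like those in the output above lend themselves to simple filtering. The following is a minimal sketch of selecting token texts by tag. The rows are copied from that output and reduced to (text tags) pairs, and texts-with-tag is a hypothetical helper, not part of the library. With live tokens, javascript-derived-token-text and javascript-derived-token-has-tag? would take the place of car and the memq test.

```racket
#lang racket/base

;; Rows copied from the example output, reduced to (text tags).
(define rows
  '(("class" (keyword))
    ("Box" (identifier declaration-name))
    ("static" (keyword static-keyword-usage))
    ("value" (identifier property-name))
    ("run" (identifier method-name property-name))
    ("answer" (identifier object-key))))

;; Hypothetical helper: texts of all rows whose tag list contains `tag`.
(define (texts-with-tag rows tag)
  (for/list ([row (in-list rows)]
             #:when (memq tag (cadr row)))
    (car row)))

(texts-with-tag rows 'property-name)    ; => '("value" "run")
(texts-with-tag rows 'keyword)          ; => '("class" "static")
```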

Examples:
> (define jsx-derived-tokens
    (javascript-string->derived-tokens
     "const el = <Button kind=\"primary\">Hello {name}</Button>;\nconst frag = <>ok</>;"
     #:jsx? #t))
> (map (lambda (token)
         (list (javascript-derived-token-text token)
               (javascript-derived-token-tags token)
               (javascript-derived-token-has-tag? token 'jsx-tag-name)
               (javascript-derived-token-has-tag? token 'jsx-closing-tag-name)
               (javascript-derived-token-has-tag? token 'jsx-attribute-name)
               (javascript-derived-token-has-tag? token 'jsx-text)
               (javascript-derived-token-has-tag? token 'jsx-interpolation-boundary)
               (javascript-derived-token-has-tag? token 'jsx-fragment-boundary)))
       jsx-derived-tokens)

'(("const" (keyword) #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("el" (identifier declaration-name) #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("=" () #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("<" () #f #f #f #f #f #f)
  ("Button" (identifier jsx-tag-name) (jsx-tag-name) #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("kind" (identifier jsx-attribute-name) #f #f (jsx-attribute-name) #f #f #f)
  ("=" () #f #f #f #f #f #f)
  ("\"primary\"" (string-literal) #f #f #f #f #f #f)
  (">" () #f #f #f #f #f #f)
  ("Hello " (jsx-text) #f #f #f (jsx-text) #f #f)
  ("{" (jsx-interpolation-boundary) #f #f #f #f (jsx-interpolation-boundary) #f)
  ("name" (identifier) #f #f #f #f #f #f)
  ("}" (jsx-interpolation-boundary) #f #f #f #f (jsx-interpolation-boundary) #f)
  ("</" () #f #f #f #f #f #f)
  ("Button" (identifier jsx-closing-tag-name) #f (jsx-closing-tag-name) #f #f #f #f)
  (">" () #f #f #f #f #f #f)
  (";" () #f #f #f #f #f #f)
  ("\n" () #f #f #f #f #f #f)
  ("const" (keyword) #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("frag" (identifier declaration-name) #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("=" () #f #f #f #f #f #f)
  (" " () #f #f #f #f #f #f)
  ("<>" (jsx-fragment-boundary) #f #f #f #f #f (jsx-fragment-boundary))
  ("ok" (jsx-text) #f #f #f (jsx-text) #f #f)
  ("</>" (jsx-fragment-boundary) #f #f #f #f #f (jsx-fragment-boundary))
  (";" () #f #f #f #f #f #f))


value

javascript-profiles : immutable-hash?

The profile defaults used by the JavaScript lexer.