
String Tools

Jens Axel Søgaard <jensaxel@soegaard.net>

 (require string-tools) package: string-tools-lib

1 String Functions

This section documents string-processing procedures provided by this package. They focus on practical operations such as splitting, substring counting, and character counting with explicit index bounds.

The procedures are intended to complement racket/string, so a typical setup is:

(require racket/string string-tools)

Procedures that are close in purpose to existing Racket procedures use distinct names (for example, string-at instead of string-ref) to avoid name clashes. Some procedure names follow the SRFI 13 naming tradition. If you need both libraries at once, import SRFI 13 with a prefix, for example:

(require (prefix-in srfi: srfi/13) string-tools)

Before diving into the individual procedures, consider skimming Extended Examples. It works through three tasks: parsing and analyzing structured log lines, cleaning and validating CSV-like imported rows, and normalizing and patching an INI-like configuration text.

1.1 Conventions

These conventions apply throughout the string procedures in this section.

  • For procedures that accept start and end, negative indices count from the end of the string, and indices are clamped to valid bounds.

  • start is included and end is not included when selecting a substring.

  • A value of -1 denotes the index of the last character. Because end is not included, using end as -1 stops just before the last character.

  • In string-slice/step, #f for start or end means the bound is omitted and defaults according to the step direction.

  • In procedures that accept a character matcher, a matcher may be a character, a character set, a string (treated as a character set), or a unary predicate on characters.
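
As a rough illustration of the index conventions above, the following sketch normalizes and clamps a pair of bounds using only racket/base. The names normalize-index and slice-sketch are hypothetical and are not part of string-tools; the library's actual implementation may differ.

```racket
#lang racket/base

;; Hypothetical helper (not part of string-tools): map a possibly negative,
;; possibly out-of-range index onto the valid range 0..len.
(define (normalize-index i len)
  (define j (if (negative? i) (+ len i) i)) ; negative indices count from the end
  (max 0 (min len j)))                      ; clamp instead of raising an error

;; Hypothetical slice built on the helper: start is included, end is not.
(define (slice-sketch s [start 0] [end (string-length s)])
  (define len (string-length s))
  (define a (normalize-index start len))
  (define b (normalize-index end len))
  (if (<= b a) "" (substring s a b)))

(slice-sketch "abcdef" -3 -1)    ; "de": -1 as end stops before the last character
(slice-sketch "abcdef" -100 100) ; "abcdef": both bounds are clamped
```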

1.2 Function Index

Use this overview as a quick map from task to procedure family.

Access and Slicing
  • Access: string-at
  • Slicing: string-slice, string-slice/step
  • Split/Replace: string-split-at, string-replace-range

Character Counting
  • Counting: string-count, string-count-lines

Needle Counting
  • Needles: string-count-needle

Index Search and Trimming
  • Index/Skip: string-index, string-index-right, string-skip, string-skip-right
  • Trimming: string-trim-both, string-trim-left, string-trim-right

Search and Partitioning
  • Needle Search: string-find-needle, string-find-last-needle, string-find-all-needle
  • Partitioning: string-partition, string-partition-right, string-between

Prefix and Suffix
  • Normalize: string-remove-prefix, string-remove-suffix, string-ensure-prefix, string-ensure-suffix
  • Common Parts: string-common-prefix, string-common-suffix

Lines
  • Line Ops: string-lines, string-line-start-indices, string-normalize-newlines, string-chomp, string-chop-newline, string-ensure-ends-with-newline
  • Tabs/Width: string-expand-tabs, string-display-width

Construction and Transformation
  • Case/Map: string-capitalize, string-swapcase, string-map, string-map!
  • Transform: string-repeat, string-reverse, string-rot13, string-pluralize, string-singularize, string-intersperse

Escaping and Cleaning
  • Quoting: string-quote, string-unquote
  • Visible Escapes: string-escape-visible, string-unescape-visible
  • JSON/Regexp/ANSI: string-escape-json, string-unescape-json, string-escape-regexp, string-strip-ansi, string-squeeze

Tokenization and Scanning
  • Tokenize/Fields: string-tokenize, string-fields
  • Scan: string-scan

Formatting and Layout
  • Layout: string-wrap, string-indent, string-dedent, string-elide

Similarity and Distance
  • Metrics: string-levenshtein, string-jaro-winkler, string-similarity

Case Conversion and Predicates
  • Predicates: string-blank?, string-ascii?, string-digit?

1.3 Splitting and Slicing

This subsection covers positional extraction and replacement operations, from safe single-character access to stepped slicing and split-at-index workflows.

ℹ️ Think of string-slice as a nicer substring.

procedure

(string-slice s [start end])  string?

  s : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Compared to substring, this procedure accepts negative indices and clamps out-of-range indices instead of raising bounds errors.

Negative indices count backward from the end of the string, and out-of-range indices are clamped to the nearest valid position. If the normalized end is less than or equal to the normalized start, the result is the empty string.

Examples:
> (string-slice "abcdef")

"abcdef"

> (string-slice "abcdef" 1 4)

"bcd"

> (string-slice "abcdef" -3 -1)

"de"

> (string-slice "abcdef" -100 100)

"abcdef"

> (string-slice "abcdef" 4 2)

""

Related: string-slice/step, string-at.

procedure

(string-slice/step s [start end step])  string?

  s : string?
  start : (or/c exact-integer? #f) = #f
  end : (or/c exact-integer? #f) = #f
  step : exact-integer? = 1
Like string-slice, but also supports stepping.

⚠️ Gotcha: With negative step, omitted bounds (#f) behave differently from explicit negative indices such as -1.

When step is positive, traversal is left to right. When step is negative, traversal is right to left. A zero step raises an exception.

If start or end is #f, the bound is treated as omitted and defaults according to the step direction.

Indices may be negative and are clamped to the string bounds.

Examples:
> (string-slice/step "abcdef")

"abcdef"

> (string-slice/step "abcdef" 0 6 2)

"ace"

> (string-slice/step "abcdef" 5 #f -2)

"fdb"

> (string-slice/step "abcdef" #f #f -1)

"fedcba"

Related: string-slice, string-at.
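
The gotcha about omitted bounds is easier to see in code. The sketch below assumes Python-style stepped slicing, which reproduces the documented examples; slice/step-sketch is a hypothetical name, and the library may treat edge cases differently.

```racket
#lang racket/base

;; Sketch only, assuming Python-style semantics; not the library's code.
(define (slice/step-sketch s [start #f] [end #f] [step 1])
  (when (zero? step)
    (raise-argument-error 'slice/step-sketch "nonzero step" step))
  (define len (string-length s))
  ;; Range a bound may be clamped to. With a negative step, -1 means
  ;; "one position before index 0".
  (define lo (if (positive? step) 0 -1))
  (define hi (if (positive? step) len (sub1 len)))
  (define (norm i)
    (define j (if (negative? i) (+ len i) i))
    (max lo (min hi j)))
  ;; Omitted bounds (#f) default according to the direction of travel.
  (define a (cond [start (norm start)] [(positive? step) 0] [else (sub1 len)]))
  (define b (cond [end (norm end)] [(positive? step) len] [else -1]))
  (define keep? (if (positive? step) < >))
  (list->string
   (let loop ([i a])
     (if (keep? i b)
         (cons (string-ref s i) (loop (+ i step)))
         '()))))

(slice/step-sketch "abcdef" 5 #f -2)  ; "fdb"
(slice/step-sketch "abcdef" #f #f -1) ; "fedcba"
(slice/step-sketch "abcdef" #f -1 -1) ; "" in this sketch
```

In this sketch an omitted end with a negative step means "continue through index 0", whereas an explicit -1 is first normalized to the last index, so the traversal stops before it starts.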

ℹ️ Think of string-at as a safer string-ref.

procedure

(string-at s i [default])  any/c

  s : string?
  i : exact-integer?
  default : any/c = #f
Returns the character at index i.

Indices are clamped to the string bounds, and negative indices count from the end of the string.

⚠️ Gotcha: For non-empty strings, out-of-range indices are clamped, so default is only used when s is empty.

If s is empty, default is returned.

Examples:
> (string-at "abc" 0)

#\a

> (string-at "abc" -1)

#\c

> (string-at "abc" 3)

#\c

> (string-at "abc" -10)

#\a

> (string-at "" 0 #\x)

#\x

Related: string-slice, string-slice/step.

procedure

(string-split-at s i ...)  (listof string?)

  s : string?
  i : exact-integer?
Splits s at the given indices and returns the resulting substrings as a list.

Indices may be negative and are clamped to the string bounds. The indices may be given in any order and may contain duplicates; they are sorted and deduplicated before splitting.

The returned list contains the substrings of s between successive cut positions, including the beginning and end of the string.

Examples:
> (string-split-at "abcdef" 2 4)

'("ab" "cd" "ef")

> (string-split-at "abcdef" 4 2)

'("ab" "cd" "ef")

> (string-split-at "abc" 1 1 2)

'("a" "b" "c")

> (string-split-at "abc")

'("abc")

> (string-split-at "abc" 0)

'("" "abc")

> (string-split-at "abc" -1)

'("ab" "c")

> (string-split-at "abc" 3)

'("abc" "")

If no indices are provided, the result is a list containing s itself.

If exactly one index is provided, the result is a two-element list consisting of the prefix and suffix at that index.

An exception is raised if any index is not an exact integer.
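
The sort-and-deduplicate behavior described above can be sketched as follows. This is a minimal illustration using racket/base and racket/list; split-at-sketch is a hypothetical name and the code is not taken from the library.

```racket
#lang racket/base
(require racket/list) ; remove-duplicates

;; Hypothetical sketch: normalize every index, sort and deduplicate the
;; cut positions, then cut the string between successive positions.
(define (split-at-sketch s . indices)
  (define len (string-length s))
  (define (norm i)
    (max 0 (min len (if (negative? i) (+ len i) i))))
  (define cuts (sort (remove-duplicates (map norm indices)) <))
  (let loop ([from 0] [cuts cuts])
    (if (null? cuts)
        (list (substring s from)) ; the final piece runs to the end of s
        (cons (substring s from (car cuts))
              (loop (car cuts) (cdr cuts))))))

(split-at-sketch "abcdef" 4 2) ; '("ab" "cd" "ef")
(split-at-sketch "abc" 1 1 2)  ; '("a" "b" "c")
```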

procedure

(string-replace-range s    
  start    
  end    
  replacement)  string?
  s : string?
  start : exact-integer?
  end : exact-integer?
  replacement : string?
Replaces the selected portion of s with replacement.

The replaced portion starts at start and continues up to end, excluding the character at end.

Indices may be negative and are clamped to the string bounds.

Examples:
> (string-replace-range "abcdef" 2 4 "XY")

"abXYef"

> (string-replace-range "abcdefgh" 2 6 "X")

"abXgh"

> (string-replace-range "abcdefgh" 2 4 "WXYZ")

"abWXYZefgh"

> (string-replace-range "abcdef" -4 -2 "XY")

"abXYef"

> (string-replace-range "abcdef" 4 2 "XY")

"abXYef"

1.4 Character Counting

This subsection covers character-wise counting with flexible matching criteria, including character, character-set, string, and predicate forms.

procedure

(string-count s to-count [start end])  exact-nonnegative-integer?

  s : string?
  to-count : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Counts characters in s that satisfy to-count between start and end.

Indices may be negative and are clamped to the string bounds.

If to-count is a procedure, it is applied to each character as a predicate. If it is a character set, each character is tested for membership. If it is a character, char=? is used. If it is a string, the string is converted to a character set.

An exception is raised if start or end is not an exact integer, or if to-count is not a character, character set, string, or unary procedure.

Examples:
> (string-count "banana" #\a)

3

> (string-count "banana" "an")

5

> (require string-tools/char-set)
> (string-count "banana" (make-char-set #\a #\n))

5

> (string-count "a1b2c3" char-numeric?)

3

> (string-count "banana" #\a 2 6)

2

> (string-count "banana" #\a -5 -1)

2

This procedure provides character counting with flexible matching criteria.
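
One way to picture the matcher rules is as a conversion from any accepted matcher to a character predicate. The sketch below uses only racket/base and leaves out character sets, since those come from the library itself; matcher->predicate and count-sketch are hypothetical names.

```racket
#lang racket/base

;; Hypothetical sketch: turn a matcher into a unary predicate on characters.
;; Character sets are omitted here because they are library-specific.
(define (matcher->predicate m)
  (cond
    [(char? m) (lambda (c) (char=? c m))]
    ;; A string is treated as the set of its characters.
    [(string? m) (lambda (c) (for/or ([x (in-string m)]) (char=? c x)))]
    [(procedure? m) m]
    [else (raise-argument-error 'matcher->predicate
                                "char, string, or predicate" m)]))

;; Count the characters of s accepted by the matcher (whole string only).
(define (count-sketch s m)
  (define match? (matcher->predicate m))
  (for/sum ([c (in-string s)])
    (if (match? c) 1 0)))

(count-sketch "banana" #\a)           ; 3
(count-sketch "banana" "an")          ; 5
(count-sketch "a1b2c3" char-numeric?) ; 3
```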

1.5 Needle Counting

This subsection groups substring-occurrence counting utilities for bounded regions of a string. Use these procedures when you need non-overlapping needle counts rather than per-character counting.

procedure

(string-count-needle s needle [start end])

  exact-nonnegative-integer?
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Counts the non-overlapping occurrences of needle in s within the range from start to end.

The search begins at start and stops before end: start is included and end is not.

Indices may be negative and are clamped to the string bounds.

Occurrences are counted from left to right and do not overlap.

If needle is the empty string, the result is the number of insertion positions in the selected substring, that is, its length plus one.

An exception is raised if start or end is not an exact integer.

Examples:
> (string-count-needle "banana" "na")

2

> (string-count-needle "aaaa" "aa")

2

> (string-count-needle "aaaa" "aaa")

1

> (string-count-needle "banana" "na" 3 6)

1

> (string-count-needle "banana" "na" -4 -1)

1

> (string-count-needle "abc" "")

4

> (string-count-needle "abc" "" 1 3)

3

This procedure is intended to complement the string-search utilities in racket/string by providing a direct substring-count operation.
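
The left-to-right, non-overlapping rule can be sketched with a simple scan that jumps past each match. The code below is a plain racket/base illustration over the whole string; count-needle-sketch is a hypothetical name and not the library's implementation.

```racket
#lang racket/base

;; Hypothetical sketch of non-overlapping substring counting.
(define (count-needle-sketch s needle)
  (define n (string-length s))
  (define k (string-length needle))
  (if (zero? k)
      (add1 n) ; empty needle: one insertion position per character, plus the end
      (let loop ([i 0] [count 0])
        (cond
          [(> (+ i k) n) count]
          ;; On a match, jump past it so that matches cannot overlap.
          [(string=? (substring s i (+ i k)) needle)
           (loop (+ i k) (add1 count))]
          [else (loop (add1 i) count)]))))

(count-needle-sketch "aaaa" "aa")  ; 2, not 3
(count-needle-sketch "aaaa" "aaa") ; 1
```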

1.6 Index Search and Trimming

This subsection combines left-to-right and right-to-left index/skip operations with matcher-driven trimming over bounded substring regions.

procedure

(string-index s to-find [start end])

  (or/c exact-nonnegative-integer? #f)
  s : string?
  to-find : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Searches s from left to right and returns the first index from start to end (end not included) whose character matches to-find. Returns #f when no match is found.

Indices may be negative and are clamped to the string bounds.

Matching rules:
  • If to-find is a character, char=? is used.

  • If it is a character set, membership is tested.

  • If it is a string, the string is converted to a character set.

  • If it is a procedure, the procedure is used as a predicate.

Examples:
> (string-index "banana" #\a)

1

> (string-index "banana" "nz")

2

> (string-index "banana" (make-char-set #\n #\z))

2

> (string-index "a1b2c3" char-numeric?)

1

> (string-index "banana" #\a -5 -1)

1

This procedure searches from left to right with configurable matching criteria.

procedure

(string-index-right s to-find [start end])

  (or/c exact-nonnegative-integer? #f)
  s : string?
  to-find : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Searches s from right to left and returns the first matching index encountered from start to end (end not included). Returns #f when no match is found.

Indices may be negative and are clamped to the string bounds.

The right-to-left search starts at (sub1 end).

Examples:
> (string-index-right "banana" #\a)

5

> (string-index-right "banana" "nz")

4

> (string-index-right "banana" (make-char-set #\n #\z))

4

> (string-index-right "a1b2c3" char-numeric?)

5

> (string-index-right "banana" #\a -5 -1)

3

This procedure searches from right to left with configurable matching criteria.

procedure

(string-skip s to-skip [start end])

  (or/c exact-nonnegative-integer? #f)
  s : string?
  to-skip : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Like string-index, but with the matching criterion inverted: it searches left to right from start to end (end not included) and returns the index of the first character that does not match to-skip, or #f if every character in the range matches.

Indices may be negative and are clamped to the string bounds.

Examples:
> (string-skip "   abc" #\space)

3

> (string-skip "aaab" "a")

3

> (string-skip "123x5" char-numeric?)

3

> (string-skip "   abc" #\space -4 100)

3

This procedure provides left-to-right skipping using the complement criterion.

procedure

(string-skip-right s to-skip [start end])

  (or/c exact-nonnegative-integer? #f)
  s : string?
  to-skip : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Like string-index-right, but with the matching criterion inverted: it searches right to left from start to end (end not included) and returns the index of the first character encountered that does not match to-skip, or #f if every character in the range matches.

Indices may be negative and are clamped to the string bounds.

Examples:
> (string-skip-right "abc   " #\space)

2

> (string-skip-right "baaa" "a")

0

> (string-skip-right "5x321" char-numeric?)

1

> (string-skip-right "abc   " #\space -100 -1)

2

This procedure provides right-to-left skipping using the complement criterion.
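
Skipping from both ends is also the core of trimming. The sketch below shows that relationship with plain racket/base code that takes a character predicate and scans the whole string; skip-sketch, skip-right-sketch, and trim-both-sketch are hypothetical names, not library procedures.

```racket
#lang racket/base

;; Index of the first character that does NOT satisfy pred, or #f.
(define (skip-sketch s pred)
  (for/first ([c (in-string s)]
              [i (in-naturals)]
              #:unless (pred c))
    i))

;; Index of the last character that does NOT satisfy pred, or #f.
(define (skip-right-sketch s pred)
  (for/first ([i (in-range (sub1 (string-length s)) -1 -1)]
              #:unless (pred (string-ref s i)))
    i))

;; Trimming keeps everything between the two skip positions.
(define (trim-both-sketch s [pred char-whitespace?])
  (define left (skip-sketch s pred))
  (if left
      (substring s left (add1 (skip-right-sketch s pred)))
      "")) ; every character matched, so nothing is kept

(skip-sketch "   abc" char-whitespace?)       ; 3
(skip-right-sketch "abc   " char-whitespace?) ; 2
(trim-both-sketch "   abc  ")                 ; "abc"
```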

procedure

(string-trim-left s [to-trim start end])  string?

  s : string?
  to-trim : (or/c char? char-set? string? (-> char? any/c))
   = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Returns a copy of the substring from start to end (end not included), after removing matching characters from the left side.

Indices may be negative and are clamped to the string bounds.

If to-trim is omitted, whitespace is trimmed. If to-trim is a string, it is converted to a character set.

Examples:
> (string-trim-left "   abc  ")

"abc  "

> (string-trim-left "aaab" #\a)

"b"

> (string-trim-left "aaab" "a")

"b"

> (string-trim-left "abbaXYZ" "ab")

"XYZ"

> (string-trim-left "123x5" char-numeric?)

"x5"

> (string-trim-left "xxabcxx" #\x 2 7)

"abcxx"

> (string-trim-left "xxabcxx" #\x -5 -1)

"abcx"

> (string-trim-left "abcde" #\x 1 3)

"bc"

procedure

(string-trim-right s [to-trim start end])  string?

  s : string?
  to-trim : (or/c char? char-set? string? (-> char? any/c))
   = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Returns a copy of the substring from start to end (end not included), after removing matching characters from the right side.

Indices may be negative and are clamped to the string bounds.

If to-trim is omitted, whitespace is trimmed. If to-trim is a string, it is converted to a character set.

Examples:
> (string-trim-right "   abc  ")

"   abc"

> (string-trim-right "baaa" #\a)

"b"

> (string-trim-right "baaa" "a")

"b"

> (string-trim-right "XYZabba" "ab")

"XYZ"

> (string-trim-right "5x321" char-numeric?)

"5x"

> (string-trim-right "xxabcxx" #\x 1 6)

"xabc"

> (string-trim-right "xxabcxx" #\x -5 -1)

"abc"

ℹ️ Similar to string-trim, but this procedure uses the matcher conventions in this module for to-trim.

procedure

(string-trim-both s [to-trim start end])  string?

  s : string?
  to-trim : (or/c char? char-set? string? (-> char? any/c))
   = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Returns a copy of the substring from start to end (end not included), after removing matching characters from both the left and right sides.

Indices may be negative and are clamped to the string bounds.

If to-trim is omitted, whitespace is trimmed. If to-trim is a string, it is converted to a character set.

Examples:
> (string-trim-both "   abc  ")

"abc"

> (string-trim-both "aaabaa" #\a)

"b"

> (string-trim-both "aaabaa" "a")

"b"

> (string-trim-both "abbaXYZabba" "ab")

"XYZ"

> (string-trim-both "123x5" char-numeric?)

"x"

> (string-trim-both "xxabcxx" #\x 1 6)

"abc"

> (string-trim-both "xxabcxx" #\x -5 -1)

"abc"

1.7 Substring Search and Partitioning

This subsection groups substring search and partitioning helpers that return indices, ranges, or before/needle/after splits.

procedure

(string-find-needle s needle [start end])

  (or/c exact-nonnegative-integer? #f)
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Searches s from left to right for needle from start to end (end not included). Returns the index of the first match, or #f if no match is found.

Indices may be negative and are clamped to the string bounds.

If needle is the empty string, the result is start.

Examples:
> (string-find-needle "banana" "na")

2

> (string-find-needle "banana" "na" 3 6)

4

> (string-find-needle "banana" "na" -4 -1)

2

> (string-find-needle "abc" "")

0

This procedure provides direct substring search.

Related: string-find-all-needle, string-scan.

procedure

(string-find-last-needle s needle [start end])

  (or/c exact-nonnegative-integer? #f)
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Searches s from right to left for needle from start to end (end not included). Returns the index of the last match, or #f if no match is found.

Indices may be negative and are clamped to the string bounds.

If needle is the empty string, the result is end.

Examples:
> (string-find-last-needle "banana" "na")

4

> (string-find-last-needle "banana" "na" 0 5)

2

> (string-find-last-needle "banana" "na" -4 -1)

2

> (string-find-last-needle "abc" "")

3

This procedure is a right-to-left substring search companion to string-find-needle.

procedure

(string-find-all-needle s 
  needle 
  [start 
  end 
  #:overlap? overlap? 
  #:ranges? ranges?]) 
  
(listof (or/c exact-nonnegative-integer?
              (cons/c exact-nonnegative-integer?
                      exact-nonnegative-integer?)))
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  overlap? : boolean? = #f
  ranges? : boolean? = #f
Returns all matches of needle from start to end (end not included).

By default, returns start indices. When #:ranges? is true, returns (cons start end) pairs for each match.

When #:overlap? is true, overlapping matches are included.

Indices may be negative and are clamped to the string bounds.

Examples:
> (string-find-all-needle "banana" "na")

'(2 4)

> (string-find-all-needle "banana" "na" #:ranges? #t)

'((2 . 4) (4 . 6))

> (string-find-all-needle "aaaa" "aa" #:overlap? #t)

'(0 1 2)

Related: string-find-needle, string-scan.
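
The difference between overlapping and non-overlapping search comes down to how far the scan advances after a match. The following racket/base sketch illustrates this over the whole string; find-all-sketch is a hypothetical name, and rejecting the empty needle is a choice made for this sketch only.

```racket
#lang racket/base

;; Hypothetical sketch: collect the start index of every match.
(define (find-all-sketch s needle #:overlap? [overlap? #f])
  (define n (string-length s))
  (define k (string-length needle))
  ;; Sketch-only restriction: an empty needle would never advance the scan.
  (when (zero? k)
    (raise-argument-error 'find-all-sketch "non-empty string" needle))
  (let loop ([i 0])
    (cond
      [(> (+ i k) n) '()]
      [(string=? (substring s i (+ i k)) needle)
       ;; Advance by 1 to allow overlaps, or by the needle length to forbid them.
       (cons i (loop (+ i (if overlap? 1 k))))]
      [else (loop (add1 i))])))

(find-all-sketch "aaaa" "aa")                ; '(0 2)
(find-all-sketch "aaaa" "aa" #:overlap? #t)  ; '(0 1 2)
```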

procedure

(string-partition s needle [start end])  
string? string? string?
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Searches for the first occurrence of needle from start to end (end not included), and returns three values:

  • the substring before the match

  • the matched separator

  • the substring after the match

Indices may be negative and are clamped to the string bounds.

If no match is found, the second and third values are empty strings, and the first value is the selected substring.

Examples:
> (call-with-values (λ () (string-partition "a:b:c" ":")) list)

'("a" ":" "b:c")

> (call-with-values (λ () (string-partition "abc" ":")) list)

'("abc" "" "")

> (call-with-values (λ () (string-partition "banana" "na")) list)

'("ba" "na" "na")

> (call-with-values (λ () (string-partition "banana" "na" -4 -1)) list)

'("" "na" "n")
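
Because three values are returned, define-values is often a more convenient way to receive them than call-with-values. The sketch below shows the pattern with a hypothetical stand-in, partition-sketch, written with racket/base regexps; it works on the whole string and is not the library's implementation.

```racket
#lang racket/base

;; Hypothetical stand-in for first-match partitioning.
(define (partition-sketch s needle)
  ;; regexp-quote makes the needle match literally.
  (define m (regexp-match-positions (regexp-quote needle) s))
  (if m
      (values (substring s 0 (caar m)) ; text before the match
              needle                   ; the matched separator
              (substring s (cdar m)))  ; text after the match
      (values s "" "")))

;; Bind all three results at once.
(define-values (key sep value) (partition-sketch "name=a=b" "="))
key   ; "name"
value ; "a=b"
```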

procedure

(string-partition-right s needle [start end])

  
string? string? string?
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Searches for the last occurrence of needle from start to end (end not included), and returns three values:

  • the substring before the match

  • the matched separator

  • the substring after the match

Indices may be negative and are clamped to the string bounds.

If no match is found, the second and third values are empty strings, and the first value is the selected substring.

Examples:
> (call-with-values (λ () (string-partition-right "a:b:c" ":")) list)

'("a:b" ":" "c")

> (call-with-values (λ () (string-partition-right "abc" ":")) list)

'("abc" "" "")

> (call-with-values (λ () (string-partition-right "banana" "na")) list)

'("bana" "na" "")

> (call-with-values (λ () (string-partition-right "banana" "na" -4 -1)) list)

'("" "na" "n")

procedure

(string-between s 
  left 
  right 
  [start 
  end 
  #:left-match left-match 
  #:right-match right-match 
  #:include-left? include-left? 
  #:include-right? include-right?]) 
  (or/c string? #f)
  s : string?
  left : (or/c char? string?)
  right : (or/c char? string?)
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  left-match : (or/c 'first 'last) = 'first
  right-match : (or/c 'first 'last) = 'first
  include-left? : boolean? = #f
  include-right? : boolean? = #f
Returns the substring between left and right, or #f if delimiters are not found with the selected matching options.

Delimiters may be strings or single characters.

left-match and right-match choose whether each delimiter uses its first or last match in the selected bounds. The include-left? and include-right? options control whether delimiters are included.

Indices may be negative and are clamped to the string bounds.

Examples:
> (string-between "a[b]c" "[" "]")

"b"

> (string-between "a[b]c[d]e" "[" "]" #:left-match 'last)

"d"

> (string-between "a[b]c[d]e" "[" "]" #:right-match 'last)

"b]c[d"

> (string-between "a[b]c" "[" "]" #:include-left? #t #:include-right? #t)

"[b]"

> (string-between "a[b]c" #\[ #\])

"b"

1.8 Prefix and Suffix Utilities

This subsection provides small prefix/suffix primitives for normalization and path/key shaping, including remove-if-present and ensure-if-missing forms.
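
As a rough picture of the remove-if-present and ensure-if-missing forms, the sketch below builds prefix versions on top of string-prefix? from racket/string, plus a common-prefix scan. The -sketch names are hypothetical, and the library's own definitions may differ.

```racket
#lang racket/base
(require racket/string) ; string-prefix?

;; Remove the prefix only when it is present.
(define (remove-prefix-sketch s prefix)
  (if (string-prefix? s prefix)
      (substring s (string-length prefix))
      s))

;; Add the prefix only when it is missing.
(define (ensure-prefix-sketch s prefix)
  (if (string-prefix? s prefix)
      s
      (string-append prefix s)))

;; Longest common prefix: advance while both strings agree.
(define (common-prefix-sketch a b)
  (define n
    (let loop ([i 0])
      (if (and (< i (string-length a))
               (< i (string-length b))
               (char=? (string-ref a i) (string-ref b i)))
          (loop (add1 i))
          i)))
  (substring a 0 n))

(remove-prefix-sketch "foobar" "foo")    ; "bar"
(ensure-prefix-sketch "bar" "foo")       ; "foobar"
(common-prefix-sketch "foobar" "foobaz") ; "fooba"
```

Both ensure and remove are idempotent in this sketch: applying either one twice gives the same result as applying it once, provided the remaining string does not itself start with the prefix again.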

procedure

(string-remove-prefix s prefix)  string?

  s : string?
  prefix : string?
If s starts with prefix, returns s without that prefix. Otherwise returns s unchanged.

Examples:
> (string-remove-prefix "foobar" "foo")

"bar"

> (string-remove-prefix "foobar" "bar")

"foobar"

procedure

(string-remove-suffix s suffix)  string?

  s : string?
  suffix : string?
If s ends with suffix, returns s without that suffix. Otherwise returns s unchanged.

Examples:
> (string-remove-suffix "foobar" "bar")

"foo"

> (string-remove-suffix "foobar" "foo")

"foobar"

procedure

(string-ensure-prefix s prefix)  string?

  s : string?
  prefix : string?
If s does not start with prefix, returns a new string with prefix prepended. Otherwise returns s unchanged.

Examples:
> (string-ensure-prefix "bar" "foo")

"foobar"

> (string-ensure-prefix "foobar" "foo")

"foobar"

procedure

(string-ensure-suffix s suffix)  string?

  s : string?
  suffix : string?
If s does not end with suffix, returns a new string with suffix appended. Otherwise returns s unchanged.

Examples:
> (string-ensure-suffix "foo" "bar")

"foobar"

> (string-ensure-suffix "foobar" "bar")

"foobar"

procedure

(string-common-prefix a b)  string?

  a : string?
  b : string?
Returns the longest common prefix of a and b.

Examples:
> (string-common-prefix "foobar" "foobaz")

"fooba"

> (string-common-prefix "abc" "xyz")

""

procedure

(string-common-suffix a b)  string?

  a : string?
  b : string?
Returns the longest common suffix of a and b.

Examples:
> (string-common-suffix "foobar" "xxbar")

"bar"

> (string-common-suffix "abc" "xyz")

""

1.9 Lines

This subsection groups line-oriented utilities for text and file processing, including line splitting, counting, newline normalization, and display-column handling.

procedure

(string-lines s)  (listof string?)

  s : string?
Splits s into lines.

Line separators recognized are #\newline, #\return, and the two-character sequence #\return followed by #\newline.

If s ends with a line separator, no extra trailing empty line is added.

Examples:
> (string-lines "")

'()

> (string-lines "a\nb")

'("a" "b")

> (string-lines "a\r\nb")

'("a" "b")

> (string-lines "a\rb")

'("a" "b")

> (string-lines "a\n")

'("a")

procedure

(string-count-lines s)  exact-positive-integer?

  s : string?
Counts the number of lines in s.

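A line boundary is a #\newline, a #\return, or a #\return followed by #\newline; the two-character sequence counts as a single boundary. A trailing line separator starts a final empty line that is included in the count, unlike in string-lines, where it adds no extra line.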

Examples:
> (string-count-lines "")

1

> (string-count-lines "a\nb")

2

> (string-count-lines "a\r\nb")

2

> (string-count-lines "a\n")

2

procedure

(string-line-start-indices s)

  (listof exact-nonnegative-integer?)
  s : string?
Returns a list of character offsets for the start of each line in s.

Line boundaries follow the same rules as string-count-lines.

Examples:
> (string-line-start-indices "")

'(0)

> (string-line-start-indices "a\nb")

'(0 2)

> (string-line-start-indices "a\r\nb")

'(0 3)

> (string-line-start-indices "a\n")

'(0 2)
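
The boundary rules can be sketched as a single scan that treats a #\return followed by #\newline as one separator. The code below uses only racket/base; line-starts-sketch is a hypothetical name and this is not the library's implementation.

```racket
#lang racket/base

;; Hypothetical sketch: collect the offset at which each line starts.
(define (line-starts-sketch s)
  (define n (string-length s))
  (let loop ([i 0] [starts '(0)]) ; the first line always starts at 0
    (cond
      [(>= i n) (reverse starts)]
      ;; "\r\n" is one boundary: the next line starts two characters later.
      [(and (char=? (string-ref s i) #\return)
            (< (add1 i) n)
            (char=? (string-ref s (add1 i)) #\newline))
       (loop (+ i 2) (cons (+ i 2) starts))]
      ;; A lone "\n" or "\r" is also a boundary.
      [(memv (string-ref s i) '(#\newline #\return))
       (loop (add1 i) (cons (add1 i) starts))]
      [else (loop (add1 i) starts)])))

(line-starts-sketch "a\r\nb") ; '(0 3)
(line-starts-sketch "a\n")    ; '(0 2)
```

In this sketch the number of start offsets equals the line count in the sense of string-count-lines, so the empty string has one line and "a\n" has two.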

procedure

(string-normalize-newlines s)  string?

  s : string?
Converts line endings in s so that every newline sequence becomes a single #\newline.

Both #\return and #\return followed by #\newline are normalized to #\newline.

Examples:
> (string-normalize-newlines "a\r\nb")

"a\nb"

> (string-normalize-newlines "a\rb")

"a\nb"

> (string-normalize-newlines "\r\n\rx\r")

"\n\nx\n"

procedure

(string-expand-tabs s    
  [#:tab-width tab-width    
  #:start-column start-column])  string?
  s : string?
  tab-width : exact-positive-integer? = 8
  start-column : exact-nonnegative-integer? = 0
Replaces tab characters in s with spaces according to tab stops.

Tabs advance to the next tab stop determined by tab-width. Newline and return characters reset the running column to zero.
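
The tab-stop arithmetic is a running column plus a remainder calculation. The sketch below is a racket/base illustration of that idea, assuming every non-tab character is one column wide; expand-tabs-sketch is a hypothetical name, not the library's code.

```racket
#lang racket/base

;; Hypothetical sketch of tab expansion with a running column.
(define (expand-tabs-sketch s #:tab-width [w 8] #:start-column [col0 0])
  (define out (open-output-string))
  (for/fold ([col col0]) ([c (in-string s)])
    (cond
      [(char=? c #\tab)
       ;; Distance from the current column to the next multiple of w.
       (let ([pad (- w (remainder col w))])
         (write-string (make-string pad #\space) out)
         (+ col pad))]
      ;; Line breaks reset the running column.
      [(memv c '(#\newline #\return)) (write-char c out) 0]
      ;; Assumption: every other character occupies one column.
      [else (write-char c out) (add1 col)]))
  (get-output-string out))

(expand-tabs-sketch "ab\tcd" #:tab-width 4)                ; "ab  cd"
(expand-tabs-sketch "\t" #:tab-width 4 #:start-column 2)   ; "  "
```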

Examples:
> (string-expand-tabs "a\tb")

"a       b"

> (string-expand-tabs "ab\tcd" #:tab-width 4)

"ab  cd"

> (string-expand-tabs "\t" #:tab-width 4 #:start-column 2)

"  "

procedure

(string-display-width s 
  [#:tab-width tab-width 
  #:start-column start-column]) 
  exact-nonnegative-integer?
  s : string?
  tab-width : exact-positive-integer? = 8
  start-column : exact-nonnegative-integer? = 0
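Returns an approximate monospace display column reached at the end of s. With start-column at 0, this is the display width of the final line of s; as the last example shows, a nonzero start-column is included in the result.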

ASCII printable characters count as width 1. Tabs advance to the next tab stop. Newline and return reset the running column to zero.

Examples:
> (string-display-width "a\tb")

9

> (string-display-width "a\nbc")

2

> (string-display-width "\t" #:tab-width 4 #:start-column 2)

4

procedure

(string-chomp s)  string?

  s : string?
Removes one trailing newline from s when present. The removed newline may be either "\n" or "\r\n".

If no trailing newline is present, s is returned unchanged.

Examples:
> (string-chomp "abc")

"abc"

> (string-chomp "abc\n")

"abc"

> (string-chomp "abc\r\n")

"abc"

> (string-chomp "abc\n\n")

"abc\n"

procedure

(string-chop-newline s)  string?

  s : string?
Alias of string-chomp.

1.10 String Construction and Transformation

This subsection collects string-building and transformation utilities, from repetition and case conversion to simple linguistic and mapping helpers.

procedure

(string-repeat s n)  string?

  s : string?
  n : exact-nonnegative-integer?
Returns a string consisting of s repeated n times.

Examples:
> (string-repeat "ab" 0)

""

> (string-repeat "ab" 3)

"ababab"

procedure

(string-reverse s)  string?

  s : string?
Returns s with its characters in reverse order.

Examples:
> (string-reverse "")

""

> (string-reverse "abc")

"cba"

procedure

(string-capitalize s)  string?

  s : string?
Returns a string where the first character is uppercased and the remaining characters are lowercased.

Use string-upcase when every character should be uppercased. Use string-titlecase for title-casing behavior across words.

Examples:
> (string-capitalize "")

""

> (string-capitalize "hello world")

"Hello world"

> (string-capitalize "hELLO WORLD")

"Hello world"

procedure

(string-swapcase s)  string?

  s : string?
Returns a string where uppercase letters are converted to lowercase and lowercase letters are converted to uppercase. Non-letter characters are unchanged.

Examples:
> (string-swapcase "")

""

> (string-swapcase "AbC")

"aBc"

> (string-swapcase "hello WORLD")

"HELLO world"

procedure

(string-rot13 s)  string?

  s : string?
Applies ROT13 to ASCII letters in s.
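
ROT13 shifts each ASCII letter 13 places within its own case and leaves everything else alone, which is why applying it twice restores the input. A minimal racket/base sketch, with the hypothetical names rot13-char and rot13-sketch:

```racket
#lang racket/base

;; Rotate one character by 13 places if it is an ASCII letter.
(define (rot13-char c)
  (define (rotate base)
    (integer->char
     (+ base (modulo (+ (- (char->integer c) base) 13) 26))))
  (cond
    [(char<=? #\a c #\z) (rotate (char->integer #\a))]
    [(char<=? #\A c #\Z) (rotate (char->integer #\A))]
    [else c])) ; digits, punctuation, and non-ASCII characters are unchanged

(define (rot13-sketch s)
  (list->string (map rot13-char (string->list s))))

(rot13-sketch "Hello, World!")         ; "Uryyb, Jbeyq!"
(rot13-sketch (rot13-sketch "Racket")) ; "Racket"
```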

Examples:
> (string-rot13 "Hello, World!")

"Uryyb, Jbeyq!"

> (string-rot13 (string-rot13 "Racket"))

"Racket"

> (string-rot13 (string-rot13 "uryyb"))

"uryyb"

procedure

(string-pluralize s)  string?

  s : string?
Returns a simple English-ish plural form of s using lightweight heuristics.

Examples:
> (string-pluralize "cat")

"cats"

> (string-pluralize "box")

"boxes"

> (string-pluralize "city")

"cities"

procedure

(string-singularize s)  string?

  s : string?
Returns a simple English-ish singular form of s using lightweight heuristics.

⚠️ Gotcha: This is heuristic, not full linguistic inflection.

Examples:
> (string-singularize "cats")

"cat"

> (string-singularize "boxes")

"box"

> (string-singularize "cities")

"city"

procedure

(string-ensure-ends-with-newline s)  string?

  s : string?
Ensures that s ends with #\newline, adding one when needed.

Examples:
> (string-ensure-ends-with-newline "")

"\n"

> (string-ensure-ends-with-newline "abc")

"abc\n"

> (string-ensure-ends-with-newline "abc\n")

"abc\n"

procedure

(string-map proc s [start end])  string?

  proc : (-> char? char?)
  s : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Applies proc to each character in s from start to end and returns the resulting string.

This procedure does not mutate s.

Indices may be negative, in which case they count back from the end of s. Out-of-range indices are clamped to the string bounds.

Examples:
> (string-map char-upcase "abc")

"ABC"

> (string-map char-upcase "abcdef" 1 4)

"aBCDef"

> (string-map char-upcase "abcdef" -4 -1)

"abCDEf"

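The third example is consistent with resolving a negative index by adding the string length and then clamping the result to the range 0 through (string-length s). A minimal sketch of that rule follows. It assumes this is how indices are resolved; normalize-index is a hypothetical helper, not part of string-tools.

```racket
#lang racket/base

;; Hypothetical helper: map a possibly negative index onto [0, len].
;; Negative values count back from the end; out-of-range values are clamped.
(define (normalize-index i len)
  (define shifted (if (negative? i) (+ len i) i))
  (max 0 (min len shifted)))

(define len (string-length "abcdef"))
(normalize-index -4 len)    ; 2
(normalize-index -1 len)    ; 5
(normalize-index 100 len)   ; 6
(normalize-index -100 len)  ; 0
```

With start resolved to 2 and end to 5, the characters at positions 2, 3, and 4 are mapped, which matches "abCDEf".
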
procedure

(string-map! proc s [start end])  void?

  proc : (-> char? char?)
  s : (and/c string? (not/c immutable?))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
Applies proc to each character in s from start to end, mutating s in place.

Indices may be negative, in which case they count back from the end of s. Out-of-range indices are clamped to the string bounds.

Examples:
> (define m (string-copy "abcdef"))
> (string-map! char-upcase m -4 -1)
> m

"abCDEf"

procedure

(string-intersperse sep xs)  string?

  sep : string?
  xs : (listof string?)
Joins the strings in xs, inserting sep between consecutive elements.

Examples:
> (string-intersperse "," '())

""

> (string-intersperse "," '("a"))

"a"

> (string-intersperse "," '("a" "b" "c"))

"a,b,c"

1.11 Escaping and Cleaning🔗ℹ

This subsection groups escaping and cleanup utilities for both human-visible text and machine-oriented string formats such as quoted literals and JSON string content.

procedure

(string-escape-visible s)  string?

  s : string?
Returns a display-oriented escaped version of s, where common control characters are replaced by visible escape sequences.

Escapes include "\\n", "\\r", "\\t", "\\b", "\\f", and "\\\\". Other ASCII control characters are rendered as "\\xNN".

Examples:
> (string-escape-visible "\n\t\r")

"\\n\\t\\r"

> (string-escape-visible "\\x")

"\\\\x"

> (string-escape-visible (string #\nul #\rubout))

"\\x00\\x7F"

> (string-unescape-visible (string-escape-visible "a\n\tb"))

"a\n\tb"

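The "\\xNN" form for other ASCII control characters can be sketched as two uppercase hexadecimal digits, which agrees with the "\\x00\\x7F" example above. control-char->visible is a hypothetical helper written in plain racket/base, not the library's implementation.

```racket
#lang racket/base

;; Render a character code as \xNN with two uppercase hex digits.
(define (control-char->visible c)
  (define hex (string-upcase (number->string (char->integer c) 16)))
  (string-append "\\x" (if (= (string-length hex) 1) "0" "") hex))

(control-char->visible #\nul)     ; "\\x00"
(control-char->visible #\rubout)  ; "\\x7F"
```
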
procedure

(string-unescape-visible s)  string?

  s : string?
Parses visible escape sequences in s and returns the corresponding string with escaped characters restored.

Recognized escapes include "\\n", "\\r", "\\t", "\\b", "\\f", "\\\\", "\\xNN", and "\\x...;".

Examples:
> (string-unescape-visible "\\n\\t")

"\n\t"

> (string-unescape-visible "\\x00\\x7F")

"\u0000\u007F"

> (string-unescape-visible "\\x3BB;")

"λ"

> (string-unescape-visible (string-escape-visible "a\n\tb"))

"a\n\tb"

Related: string-quote, string-unquote.

procedure

(string-quote s [#:quote-char quote-char])  string?

  s : string?
  quote-char : char? = #\"
Wraps s in quotes and escapes embedded quote, backslash, and common control characters.

Examples:
> (string-quote "He said \"hi\"")

"\"He said \\\"hi\\\"\""

> (string-quote "a'b" #:quote-char #\')

"'a\\'b'"

> (string-unquote (string-quote "a\nb"))

"a\nb"

Related: string-unquote, string-escape-visible, string-escape-json.

procedure

(string-unquote s [#:quote-char quote-char])  string?

  s : string?
  quote-char : char? = #\"
Removes matching outer quotes from s and unescapes the quoted content.

An exception is raised when outer quotes are missing or escapes are malformed.

Examples:
> (string-unquote "\"a\\nb\"")

"a\nb"

> (string-unquote "'a\\'b'" #:quote-char #\')

"a'b"

> (string-unquote (string-quote "He said \"hi\""))

"He said \"hi\""

Related: string-quote, string-unescape-visible, string-unescape-json.

procedure

(string-escape-regexp s)  string?

  s : string?
Escapes s so it can be used as a literal regular-expression pattern.

Examples:
> (string-escape-regexp "a+b")

"a\\+b"

> (regexp-match? (regexp (string-escape-regexp "a+b")) "a+b")

#t

procedure

(string-escape-json s)  string?

  s : string?
Escapes s as JSON string content, without adding outer quotes.

Examples:
> (string-escape-json "\"\\/\n")

"\\\"\\\\\\/\\n"

> (string-escape-json (string #\nul #\u001F))

"\\u0000\\u001F"

> (string-unescape-json (string-escape-json "hello\nλ"))

"hello\nλ"

Related: string-unescape-json, string-quote.

procedure

(string-unescape-json s)  string?

  s : string?
Unescapes JSON string content, including "\\uXXXX" escapes and surrogate pairs.

Examples:
> (string-unescape-json "\\u0041\\u03BB")

"Aλ"

> (string-unescape-json "\\uD83D\\uDE00")

"😀"

> (string-unescape-json (string-escape-json "hello\nλ"))

"hello\nλ"

Related: string-escape-json, string-unquote.
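
The surrogate-pair example relies on the standard UTF-16 decoding rule: a high surrogate and a low surrogate together encode one code point above U+FFFF. A small sketch of that arithmetic in plain racket/base (surrogates->char is a hypothetical helper, not part of string-tools):

```racket
#lang racket/base

;; Combine a UTF-16 high/low surrogate pair into one Unicode character.
(define (surrogates->char hi lo)
  (integer->char
   (+ #x10000
      (arithmetic-shift (- hi #xD800) 10)
      (- lo #xDC00))))

;; The pair from the example above decodes to U+1F600.
(char->integer (surrogates->char #xD83D #xDE00))  ; 128512, i.e. #x1F600
```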

procedure

(string-strip-ansi s)  string?

  s : string?
Removes common terminal ANSI/VT control sequences from s, including color/style control sequences and OSC metadata sequences.

Examples:
> (string-strip-ansi "\e[31mred\e[0m")

"red"

> (string-strip-ansi "a\e]0;title\ab")

"ab"

procedure

(string-squeeze s [to-squeeze])  string?

  s : string?
  to-squeeze : (or/c char? char-set? string? (procedure-arity-includes/c 1))
   = char-whitespace?
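Collapses each run of consecutive matching characters in s into a single character. A run may mix different matching characters; in the second example below, a run of tab, space, and newlines collapses to its first character.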

If to-squeeze is a character, characters equal to it are squeezed. If it is a character set, characters in the set are squeezed. If it is a string, the string is treated as a character set. If it is a procedure, it is used as the character predicate.

Examples:
> (string-squeeze "a   b    c" #\space)

"a b c"

> (string-squeeze "a\t \n\nb")

"a\tb"

> (string-squeeze "baaaana" "a")

"bana"

1.12 Tokenization and Scanning🔗ℹ

This subsection groups parsing-oriented helpers that split text into tokens or fields and scan text for successive match ranges.

procedure

(string-tokenize s    
  [to-separate    
  start    
  end    
  #:quote quote    
  #:escape escape])  (listof string?)
  s : string?
  to-separate : (or/c char? char-set? string? (procedure-arity-includes/c 1))
   = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  quote : (or/c #f char?) = #f
  escape : (or/c #f char?) = #\\
Splits s into non-empty tokens using to-separate as a character matcher.

If quote is a character, separators inside quoted text are ignored. If escape is a character, the character that follows it is treated literally.

Indices may be negative, in which case they count back from the end of s. Out-of-range indices are clamped to the string bounds.

Examples:
> (string-tokenize "  a  b   c  ")

'("a" "b" "c")

> (string-tokenize "a,b,c" #\,)

'("a" "b" "c")

> (string-tokenize "a,\"b,c\",d" #\, #:quote #\")

'("a" "b,c" "d")

> (string-tokenize "a,b\\,c,d" #\, #:escape #\\)

'("a" "b,c" "d")

> (string-tokenize "abc def ghi" char-whitespace? -7 -1)

'("def" "gh")

procedure

(string-fields s 
  [to-separate 
  start 
  end 
  #:quote quote 
  #:escape escape 
  #:widths widths 
  #:include-rest? include-rest?]) 
  (listof string?)
  s : string?
  to-separate : (or/c char? char-set? string? (procedure-arity-includes/c 1))
   = #\,
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  quote : (or/c #f char?) = #f
  escape : (or/c #f char?) = #\\
  widths : (or/c #f (listof exact-positive-integer?)) = #f
  include-rest? : boolean? = #f
Splits s into fields.

In delimiter mode, empty fields are preserved. In fixed-width mode (#:widths), the widths list defines field lengths from left to right. When #:include-rest? is true in fixed-width mode, one additional field contains any remaining substring.

Indices may be negative, in which case they count back from the end of s. Out-of-range indices are clamped to the string bounds.

Examples:
> (string-fields "a,b,,c," #\,)

'("a" "b" "" "c" "")

> (string-fields "a,\"b,c\",d" #\, #:quote #\")

'("a" "b,c" "d")

> (string-fields "abcdefgh" #\, #:widths '(2 3 2))

'("ab" "cde" "fg")

> (string-fields "abcdefgh" #\, #:widths '(2 3 2) #:include-rest? #t)

'("ab" "cde" "fg" "h")

> (string-fields "a,b,c,d" #\, -5 -1)

'("b" "c" "")
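
Fixed-width mode can be pictured as cutting consecutive substrings of the given lengths. The sketch below is a plain racket/base illustration, not the library's code. fixed-width-fields is a hypothetical helper, and two behaviors are assumptions: a field that would run past the end of the string is cut short, and no extra field is produced when nothing remains.

```racket
#lang racket/base

;; Hypothetical helper: cut s into consecutive fields of the given widths.
;; Assumption: a field that would run past the end of s is cut short.
(define (fixed-width-fields s widths #:include-rest? [include-rest? #f])
  (define len (string-length s))
  (let loop ([pos 0] [ws widths] [acc '()])
    (cond
      [(pair? ws)
       (define end (min len (+ pos (car ws))))
       (loop end (cdr ws) (cons (substring s pos end) acc))]
      ;; Assumption: the rest field is added only when characters remain.
      [(and include-rest? (< pos len))
       (reverse (cons (substring s pos) acc))]
      [else (reverse acc)])))

(fixed-width-fields "abcdefgh" '(2 3 2))                      ; '("ab" "cde" "fg")
(fixed-width-fields "abcdefgh" '(2 3 2) #:include-rest? #t)   ; '("ab" "cde" "fg" "h")
```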

procedure

(string-scan s 
  matcher 
  [start 
  end 
  #:overlap? overlap?]) 
  
(-> (or/c (cons/c exact-nonnegative-integer?
                  exact-nonnegative-integer?)
           #f))
  s : string?
  matcher : (or/c string? char? char-set? (procedure-arity-includes/c 1))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  overlap? : boolean? = #f
Returns a generator. Each call produces the next match range as a (cons start end) pair; once the matches are exhausted, the generator returns #f.

If matcher is a string, it is treated as a substring needle. If matcher is a character, character set, or predicate, matching characters are returned as one-character ranges.

Indices may be negative, in which case they count back from the end of s. Out-of-range indices are clamped to the string bounds.

Examples:
> (define (collect-ranges g)
    (let loop ([acc '()])
      (define v (g))
      (if v (loop (cons v acc)) (reverse acc))))
> (collect-ranges (string-scan "banana" "na"))

'((2 . 4) (4 . 6))

> (collect-ranges (string-scan "aaaa" "aa" #:overlap? #t))

'((0 . 2) (1 . 3) (2 . 4))

> (collect-ranges (string-scan "abc123x" char-numeric?))

'((3 . 4) (4 . 5) (5 . 6))

> (collect-ranges (string-scan "banana" #\a -5 -1))

'((1 . 2) (3 . 4))

Related: string-find-all-needle, string-find-needle.
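
The effect of #:overlap? on substring needles can be sketched without the library. In the illustration below, the search resumes one character after a match when overlapping is allowed, and just past the match otherwise. needle-ranges is a hypothetical helper; unlike string-scan it returns a list eagerly instead of a generator, and it ignores the start and end arguments.

```racket
#lang racket/base

;; Hypothetical helper: collect (start . end) ranges where needle occurs in s.
(define (needle-ranges s needle #:overlap? [overlap? #f])
  (define n (string-length needle))
  (define last-start (- (string-length s) n))
  (let loop ([i 0] [acc '()])
    (cond
      ;; An empty needle or an exhausted string ends the search.
      [(or (zero? n) (> i last-start)) (reverse acc)]
      [(string=? (substring s i (+ i n)) needle)
       (loop (if overlap? (+ i 1) (+ i n))
             (cons (cons i (+ i n)) acc))]
      [else (loop (+ i 1) acc)])))

(needle-ranges "banana" "na")               ; '((2 . 4) (4 . 6))
(needle-ranges "aaaa" "aa")                 ; '((0 . 2) (2 . 4))
(needle-ranges "aaaa" "aa" #:overlap? #t)   ; '((0 . 2) (1 . 3) (2 . 4))
```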

1.13 Formatting and Layout🔗ℹ

This subsection covers presentation-oriented text shaping, including wrapping, indentation normalization, and width-constrained truncation.

procedure

(string-wrap s    
  width    
  [#:mode mode    
  #:preserve-words? preserve-words?])  string?
  s : string?
  width : exact-positive-integer?
  mode : (or/c 'soft 'hard) = 'soft
  preserve-words? : boolean? = #t
Wraps s to the requested line width.

In 'soft mode, wrapping prefers whitespace boundaries. In 'hard mode, lines are split exactly at width characters.

When preserve-words? is true in soft mode, long words are kept intact instead of being split.

Examples:
> (string-wrap "alpha beta gamma" 10)

"alpha\nbeta gamma"

> (string-wrap "supercalifragilistic" 8)

"supercalifragilistic"

> (string-wrap "supercalifragilistic" 8 #:preserve-words? #f)

"supercal\nifragili\nstic"

> (string-wrap "abcdefghij" 4 #:mode 'hard)

"abcd\nefgh\nij"

procedure

(string-indent s n-or-prefix)  string?

  s : string?
  n-or-prefix : (or/c exact-nonnegative-integer? string?)
Indents every line in s.

If n-or-prefix is a nonnegative integer, that many spaces are used. If it is a string, that string is used as the line prefix.

Examples:
> (string-indent "a\nb" 2)

"  a\n  b"

> (string-indent "a\nb" "-> ")

"-> a\n-> b"

procedure

(string-dedent s)  string?

  s : string?
Removes common leading indentation from all non-blank lines in s.

Indentation is measured using leading spaces and tabs.

Examples:
> (string-dedent "  a\n  b")

"a\nb"

> (string-dedent "    a\n      b")

"a\n  b"

> (string-dedent "  a\n\n  b")

"a\n\nb"

procedure

(string-elide s    
  width    
  [#:where where    
  #:ellipsis ellipsis])  string?
  s : string?
  width : exact-nonnegative-integer?
  where : (or/c 'left 'right 'middle) = 'right
  ellipsis : string? = "..."
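Truncates s to at most width characters, replacing the removed characters with ellipsis. The ellipsis counts toward width, and a string that already fits within width is returned unchanged.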

The where option chooses whether truncation happens on the left, right, or in the middle.

Examples:
> (string-elide "abcdef" 5)

"ab..."

> (string-elide "abcdef" 5 #:where 'left)

"...ef"

> (string-elide "abcdef" 5 #:where 'middle)

"a...f"

> (string-elide "abcdef" 6 #:ellipsis "..")

"abcdef"

1.14 Similarity and Distance🔗ℹ

This subsection provides string similarity and distance metrics useful for ranking candidates, fuzzy matching, and suggestion-style diagnostics.

procedure

(string-levenshtein a b)  exact-nonnegative-integer?

  a : string?
  b : string?
Computes the Levenshtein edit distance between a and b. The result is the minimum number of insertions, deletions, and substitutions needed to transform one string into the other.

Time complexity is O((string-length a) * (string-length b)).

Examples:
> (string-levenshtein "kitten" "sitting")

3

> (string-levenshtein "flaw" "lawn")

2

> (string-levenshtein "abc" "abc")

0
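
The quadratic bound comes from the classic dynamic-programming table, in which each cell depends on three neighbors. The sketch below keeps only two rows of that table. It is an illustration in plain racket/base under the standard definition of the distance, not the library's implementation; levenshtein-sketch is a hypothetical name.

```racket
#lang racket/base

;; Two-row dynamic-programming sketch. Row i, column j holds the distance
;; between the first i characters of a and the first j characters of b.
(define (levenshtein-sketch a b)
  (define m (string-length a))
  (define n (string-length b))
  (define last-row
    (for/fold ([prev (build-vector (+ n 1) values)]) ; row 0: 0, 1, ..., n
              ([i (in-range 1 (+ m 1))])
      (define cur (make-vector (+ n 1) i))           ; column 0 of row i is i
      (for ([j (in-range 1 (+ n 1))])
        (define cost
          (if (char=? (string-ref a (- i 1)) (string-ref b (- j 1))) 0 1))
        (vector-set! cur j
                     (min (+ 1 (vector-ref prev j))               ; deletion
                          (+ 1 (vector-ref cur (- j 1)))          ; insertion
                          (+ cost (vector-ref prev (- j 1))))))   ; substitution
      cur))
  (vector-ref last-row n))

(levenshtein-sketch "kitten" "sitting")  ; 3
(levenshtein-sketch "flaw" "lawn")       ; 2
```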

procedure

(string-jaro-winkler a 
  b 
  [#:prefix-scale prefix-scale]) 
  inexact-real?
  a : string?
  b : string?
  prefix-scale : real? = 0.1
Computes the Jaro-Winkler similarity score between a and b. Scores are in the range from 0.0 to 1.0, where larger values indicate stronger similarity. The prefix-scale value must be between 0.0 and 0.25.

Time complexity is approximately O((string-length a) * (string-length b)) in the worst case.

Examples:
> (string-jaro-winkler "martha" "marhta")

0.9611111111111111

> (string-jaro-winkler "martha" "xyz")

0.0

procedure

(string-similarity a b)  inexact-real?

  a : string?
  b : string?
Returns a suggestion-oriented similarity score between a and b. This is currently an alias of string-jaro-winkler.

Example:
> (string-similarity "dixon" "dicksonx")

0.8133333333333332

1.15 Case Conversion and Predicates🔗ℹ

This subsection provides lightweight whole-string predicates for whitespace, ASCII, and digit checks.

procedure

(string-blank? s)  boolean?

  s : string?
Returns #t if every character in s is whitespace.

Examples:
> (string-blank? "")

#t

> (string-blank? " \t\n")

#t

> (string-blank? " a ")

#f

procedure

(string-ascii? s)  boolean?

  s : string?
Returns #t if every character in s is an ASCII character.

Examples:
> (string-ascii? "")

#t

> (string-ascii? "ABC123!?")

#t

> (string-ascii? "café")

#f

procedure

(string-digit? s)  boolean?

  s : string?
Returns #t if every character in s is an ASCII digit (#\0 through #\9).

Examples:
> (string-digit? "")

#t

> (string-digit? "0123456789")

#t

> (string-digit? "12a3")

#f
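
All three predicates return #t for the empty string because "every character" holds vacuously when there are no characters. The same behavior falls out of for/and, as this plain racket/base sketch shows (all-chars? and ascii-digit? are hypothetical helpers, not part of string-tools):

```racket
#lang racket/base

;; for/and over zero characters never sees a failing test, so it yields #t.
(define (all-chars? pred s)
  (for/and ([c (in-string s)])
    (pred c)))

;; ASCII digits only, matching the range #\0 through #\9.
(define (ascii-digit? c)
  (char<=? #\0 c #\9))

(all-chars? char-whitespace? "")       ; #t
(all-chars? char-whitespace? " \t\n")  ; #t
(all-chars? ascii-digit? "12a3")       ; #f
```

Callers that need to reject empty input can combine a predicate with a length check.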

2 Character Sets🔗ℹ

This section documents the character-set utilities used by string-count and available directly through string-tools/char-set.

Conceptually, a character set represents a collection of characters with membership operations and set operations such as union, intersection, and difference. It is useful when you want to classify characters efficiently and reuse that classification across multiple string-processing steps.

Character sets are represented with a hybrid structure: an ASCII bit mask for codepoints 0 through 127, plus a normalized collection of non-ASCII inclusive ranges. This representation gives fast membership tests for common ASCII text while keeping non-ASCII sets compact.

 (require string-tools/char-set) package: string-tools-lib

procedure

(char-set? v)  boolean?

  v : any/c
Returns #t when v is a character set value.

Examples:
> (require string-tools/char-set)
> (char-set? (make-char-set #\a #\b))

#t

> (char-set? "ab")

#f

value

empty-char-set : char-set?

The empty character set.

Examples:
> (require string-tools/char-set)
> (char-set-size empty-char-set)

0

> (char-set-member? empty-char-set #\a)

#f

procedure

(make-char-set ch ...)  char-set?

  ch : char?
Builds a character set containing the given characters.

Examples:
> (require string-tools/char-set)
> (make-char-set #\a #\b #\a)

(char-set 475368975085586025561263702016 '#())
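
The integer in the printed result matches the ASCII bit mask described in the section introduction, with bit n set when the character with code point n is a member. A minimal sketch, assuming that bit layout (ascii-mask is a hypothetical helper, not part of string-tools/char-set):

```racket
#lang racket/base

;; Set one bit per ASCII code point: bit 97 for #\a, bit 98 for #\b.
(define (ascii-mask . chars)
  (for/fold ([mask 0]) ([c (in-list chars)])
    (bitwise-ior mask (arithmetic-shift 1 (char->integer c)))))

(ascii-mask #\a #\b #\a)  ; 475368975085586025561263702016

;; Membership is a single bit test.
(bitwise-bit-set? (ascii-mask #\a #\b) (char->integer #\b))  ; #t
```

Adding #\a twice sets the same bit twice, which is why duplicates do not change the set.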

procedure

(list->char-set xs)  char-set?

  xs : (listof char?)
Builds a character set from a list of characters.

Examples:
> (require string-tools/char-set)
> (define cs (list->char-set (list #\a #\b #\a)))
> (char-set-size cs)

2

> (char-set-member? cs #\b)

#t

procedure

(string->char-set s)  char-set?

  s : string?
Builds a character set from the distinct characters in s.

Examples:
> (require string-tools/char-set)
> (define cs (string->char-set "banana"))
> (char-set-size cs)

3

> (char-set-member? cs #\n)

#t

procedure

(char-set-add cs ch)  char-set?

  cs : char-set?
  ch : char?
Returns a character set containing all characters in cs and ch.

Examples:
> (require string-tools/char-set)
> (define cs (char-set-add empty-char-set #\x))
> (char-set-member? cs #\x)

#t

procedure

(char-set-add-range cs lo-ch hi-ch)  char-set?

  cs : char-set?
  lo-ch : char?
  hi-ch : char?
Returns a character set containing cs plus all characters from lo-ch to hi-ch, inclusive.

Examples:
> (require string-tools/char-set)
> (define letters (char-set-add-range empty-char-set #\a #\z))
> (char-set-member? letters #\m)

#t

> (char-set-member? letters #\A)

#f

procedure

(char-set-member? cs ch)  boolean?

  cs : char-set?
  ch : char?
Checks whether ch is in cs.

Examples:
> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (char-set-member? vowels #\e)

#t

> (char-set-member? vowels #\y)

#f

procedure

(char-set-union a b)  char-set?

  a : char-set?
  b : char-set?
Returns the union of a and b.

Examples:
> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (define y      (make-char-set #\y))
> (char-set-member? (char-set-union vowels y) #\y)

#t

procedure

(char-set-intersection a b)  char-set?

  a : char-set?
  b : char-set?
Returns the intersection of a and b.

Examples:
> (require string-tools/char-set)
> (define a (make-char-set #\a #\b #\c))
> (define b (make-char-set #\b #\c #\d))
> (char-set-size (char-set-intersection a b))

2

procedure

(char-set-difference a b)  char-set?

  a : char-set?
  b : char-set?
Returns the set difference a - b.

Examples:
> (require string-tools/char-set)
> (define letters (char-set-add-range empty-char-set #\a #\f))
> (define vowels  (make-char-set #\a #\e))
> (char-set-member? (char-set-difference letters vowels) #\b)

#t

> (char-set-member? (char-set-difference letters vowels) #\a)

#f

procedure

(char-set-size cs)  exact-nonnegative-integer?

  cs : char-set?
Returns the number of characters in cs.

Examples:
> (require string-tools/char-set)
> (char-set-size (make-char-set #\a #\b #\a))

2

Examples:
> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (char-set-member? vowels #\e)

#t

> (char-set-size vowels)

5

> (define letters (char-set-add-range empty-char-set #\a #\z))
> (char-set-size (char-set-difference letters vowels))

21

3 Extended Examples🔗ℹ

This section presents three end-to-end workflows: log analysis, CSV-like import cleaning, and configuration normalization with patching.

3.1 Logs🔗ℹ

This extended example uses a small synthetic log and shows a full normalize-parse-analyze pipeline. Each log line uses the format:

ts level service=... request_id=... msg="..."

Here, ts is the timestamp.

Prepare a small synthetic log input.

> (define raw-log
    (string-append
     "2026-02-21T22:10:00Z INFO service=api request_id=abc123 msg=\"start\"\r\n"
     "\e[31m2026-02-21T22:10:01Z ERROR service=api request_id=abc123 msg=\"timeout\"\e[0m\r\n"
     "2026-02-21T22:10:02Z WARN service=worker request_id=def456 msg=\"retrying\"\n"
     "2026-02-21T22:10:03Z INFO service=api request_id=abc123 msg=\"done\"\r"))

Normalize line endings and remove ANSI terminal escapes.

> (define cleaned-log
    (string-strip-ansi (string-normalize-newlines raw-log)))

Turn text into non-empty log lines and inspect quick counts.

> (define lines
    (filter (λ (s) (not (string-blank? s)))
            (string-lines cleaned-log)))
> (length lines)

4

> (string-count-needle cleaned-log "ERROR")

1

Define a parser that turns one line into a record.

> (define (line->record line)
    (define fs         (string-fields line #\space))
    (define ts         (list-ref fs 0))
    (define level      (list-ref fs 1))
    (define service    (string-between line "service=" " "))
    (define request-id (string-between line "request_id=" " "))
    (define msg        (string-between line "msg=\"" "\"" #:right-match 'last))
    (list ts level service request-id msg))

Parse all lines and inspect the first parsed record.

> (define records (map line->record lines))
> (car records)

'("2026-02-21T22:10:00Z" "INFO" "api" "abc123" "start")

Select all ERROR records.

> (define error-records
    (filter (λ (r) (string=? (list-ref r 1) "ERROR")) records))
> error-records

'(("2026-02-21T22:10:01Z" "ERROR" "api" "abc123" "timeout"))
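
As one further analysis step, records can be tallied per level with an immutable hash. The sketch below is self-contained: the four records are written out by hand in the shape line->record produces for the synthetic log, and count-by-level is a hypothetical helper that uses only racket/base.

```racket
#lang racket/base

;; Records in the (ts level service request-id msg) shape used above.
(define records
  '(("2026-02-21T22:10:00Z" "INFO"  "api"    "abc123" "start")
    ("2026-02-21T22:10:01Z" "ERROR" "api"    "abc123" "timeout")
    ("2026-02-21T22:10:02Z" "WARN"  "worker" "def456" "retrying")
    ("2026-02-21T22:10:03Z" "INFO"  "api"    "abc123" "done")))

;; Hypothetical helper: map each level to the number of records carrying it.
(define (count-by-level rs)
  (for/fold ([counts (hash)]) ([r (in-list rs)])
    (hash-update counts (list-ref r 1) add1 0)))

(define level-counts (count-by-level records))
(hash-ref level-counts "INFO")   ; 2
(hash-ref level-counts "ERROR")  ; 1
```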

3.2 CSV-Like Import Cleaning🔗ℹ

This example shows a small CSV-like import pipeline with quoted fields, whitespace cleanup, and row-level validation diagnostics.

Prepare a small CSV-like input and split it into rows.

> (define raw-csv
    (string-append
     "id,name,socre\r\n"
     "1,\"Alice\",98\r\n"
     "2,\" Bob  \",87\r\n"
     "x,\"Mallory\",91\r\n"
     "4,\"Eve\",9a\r\n"
     "5,\"\",100\r\n"))
> (define rows
    (string-lines (string-normalize-newlines raw-csv)))

Parse header and data rows.

> (define header (string-fields (car rows) #\, #:quote #\"))
> (define data-rows (cdr rows))

Validate header names and suggest likely intended names.

> (define expected-header '("id" "name" "score"))
> (define (best-column-suggestion col)
    (define-values (best-name best-score)
      (for/fold ([best-name #f] [best-score -1.0]) ([cand (in-list expected-header)])
        (define score (string-similarity col cand))
        (if (> score best-score)
            (values cand score)
            (values best-name best-score))))
    (if (and best-name (>= best-score 0.7))
        best-name
        #f))
> (define header-diagnostics
    (for/list ([col (in-list header)] #:unless (member col expected-header))
      (define suggestion (best-column-suggestion col))
      (if suggestion
          (string-append "unknown column " col "; did you mean " suggestion "?")
          (string-append "unknown column " col))))
> header-diagnostics

'("unknown column socre; did you mean score?")

Define a small field normalizer used during import.

> (define (clean-field s)
    (string-trim-both (string-squeeze s #\space) #\space))

Parse each row as CSV-like fields and inspect parsed rows.

> (define parsed
    (for/list ([row (in-list data-rows)])
      (for/list ([field (in-list (string-fields row #\, #:quote #\"))])
        (clean-field field))))
> parsed

'(("1" "Alice" "98")

  ("2" "Bob" "87")

  ("x" "Mallory" "91")

  ("4" "Eve" "9a")

  ("5" "" "100"))

Validate rows: id and score must be digits; name must be non-blank.

> (define (row-error fs)
    (define id    (list-ref fs 0))
    (define name  (list-ref fs 1))
    (define score (list-ref fs 2))
    (cond
      [(not (string-digit? id))    "invalid id"]
      [(string-blank? name)        "blank name"]
      [(not (string-digit? score)) "invalid score"]
      [else #f]))

Keep diagnostics for rows that fail validation.

> (define diagnostics
    (for/list ([row (in-list data-rows)]
               [fs  (in-list parsed)]
               #:when (row-error fs))
      (list (row-error fs)
            (string-escape-visible row))))
> header

'("id" "name" "socre")

> diagnostics

'(("invalid id" "x,\"Mallory\",91")

  ("invalid score" "4,\"Eve\",9a")

  ("blank name" "5,\"\",100"))

3.3 Config Normalization and Patching🔗ℹ

This example parses an INI-like configuration text, validates keys, suggests fixes for unknown keys, patches one value in place, and emits normalized output with a final newline.

Prepare and normalize a small INI-like input.

> (define raw-config
    (string-append
     "; demo config\r\n"
     "host = example.org\r\n"
     "port = 8080\r\n"
     "timeout = 30\r\n"
     "retris = 2\r\n"
     "mode fast\r\n"))
> (define normalized (string-normalize-newlines raw-config))
> (define lines      (string-lines normalized))
> (define expected-keys '("host" "port" "timeout" "retries" "mode"))

Define helpers for comment detection and line parsing.

> (define (comment-line? t)
    (memv (string-at t 0 #f) '(#\# #\;)))
> (define (parse-config-line line)
    (define t (string-trim-both line))
    (cond
      [(string-blank? t)
       #f]
      [(comment-line? t)
       #f]
      [else
       (define-values (lhs sep rhs) (string-partition t "="))
       (if (string=? sep "")
           (list 'invalid (string-escape-visible line))
           (list (string-trim-both lhs)
                 (string-trim-both rhs)))]))

Parse all lines and inspect the intermediate representation.

> (define parsed-lines
    (filter (λ (x) x)
            (map parse-config-line lines)))
> parsed-lines

'(("host" "example.org")

  ("port" "8080")

  ("timeout" "30")

  ("retris" "2")

  (invalid "mode fast"))

Validate keys and produce diagnostics with similarity-based suggestions.

> (define (best-key-suggestion k)
    (define-values (best-name best-score)
      (for/fold ([best-name #f] [best-score -1.0]) ([cand (in-list expected-keys)])
        (define score (string-similarity k cand))
        (if (> score best-score)
            (values cand score)
            (values best-name best-score))))
    (if (and best-name (>= best-score 0.7))
        best-name
        #f))
> (define diagnostics
    (for/list ([entry (in-list parsed-lines)]
               #:when
               (or (eq? (car entry) 'invalid)
                   (and (string? (car entry))
                        (not (member (car entry) expected-keys)))))
      (cond
        [(eq? (car entry) 'invalid)
         (string-append "malformed line: " (cadr entry))]
        [else
         (define key (car entry))
         (define sug (best-key-suggestion key))
         (if sug
             (string-append "unknown key " key "; did you mean " sug "?")
             (string-append "unknown key " key))])))
> diagnostics

'("unknown key retris; did you mean retries?" "malformed line: mode fast")

Patch one setting in place and normalize the final output.

> (define old-timeout "timeout = 30")
> (define i           (string-find-needle normalized old-timeout))
> (define patched
    (if i
        (string-replace-range normalized
                              i
                              (+ i (string-length old-timeout))
                              "timeout = 45")
        normalized))
> (define final-config
    (string-ensure-ends-with-newline patched))
> (displayln final-config)

; demo config

host = example.org

port = 8080

timeout = 45

retris = 2

mode fast