
String Tools

Jens Axel Søgaard <jensaxel@soegaard.net>

1 String Functions

This section documents string-processing procedures provided by this package. They focus on practical operations such as splitting, substring counting, and character counting with explicit index bounds.

The procedures are intended to complement racket/string, so a typical setup is:

(require racket/string string-tools)

Procedures that are close in purpose to existing Racket procedures use distinct names (for example, string-at instead of string-ref) to avoid name clashes. Some procedure names follow the SRFI 13 naming tradition. If you need both libraries at once, import SRFI 13 with a prefix, for example:

(require (prefix-in srfi: srfi/13) string-tools)

Before diving into the individual procedures, consider skimming Extended Examples. It walks through three tasks: parsing and analyzing structured log lines, cleaning and validating CSV-like imported rows, and normalizing and patching an INI-like configuration text.

1.1 Conventions

These conventions apply throughout the string procedures in this section.

For procedures that accept start and end, negative indices count from the end of the string, and indices are clamped to valid bounds.
start is included and end is not included when selecting a substring.
A value of -1 denotes the index of the last character. Because end is not included, using end as -1 stops just before the last character.
In string-slice/step, #f for start or end means the bound is omitted and defaults according to the step direction.
In procedures that accept a character matcher, a matcher may be a character, a character set, a string (treated as a character set), or a unary predicate on characters.
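
The index rules above can be modeled in a few lines of plain Racket. The following is a minimal sketch, not the package's implementation: normalize-index and slice-model are hypothetical names introduced here only to illustrate negative-index normalization, clamping, and the half-open start/end range.

```racket
#lang racket/base

;; Hypothetical helper (not exported by string-tools):
;; map a possibly negative index to a position in [0, len].
(define (normalize-index i len)
  (define j (if (negative? i) (+ len i) i)) ; negative counts from the end
  (max 0 (min len j)))                      ; clamp to valid bounds

;; Hypothetical model of a slice: start is included, end is excluded.
(define (slice-model s [start 0] [end (string-length s)])
  (define len (string-length s))
  (define a (normalize-index start len))
  (define b (normalize-index end len))
  (if (<= b a) "" (substring s a b)))

(slice-model "abcdef" -3 -1)    ; -3 normalizes to 3, -1 to 5
(slice-model "abcdef" -100 100) ; both bounds are clamped
```

Under this model, an end of -1 normalizes to the index of the last character, which is then excluded from the result.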

1.2 Function Index

Use this overview as a quick map from task to procedure family.

Access and Slicing

Access	string-at
Slicing	string-slice string-slice/step
Split/Replace	string-split-at string-replace-range

Character Counting

Counting	string-count string-count-lines

Needle Counting

Needles	string-count-needle

Index Search and Trimming

Index/Skip	string-index string-index-right string-skip string-skip-right
Trimming	string-trim-both string-trim-left string-trim-right

Search and Partitioning

Needle Search	string-find-needle string-find-last-needle string-find-all-needle
Partitioning	string-partition string-partition-right string-between

Prefix and Suffix

Normalize	string-remove-prefix string-remove-suffix string-ensure-prefix string-ensure-suffix
Common Parts	string-common-prefix string-common-suffix

Lines

Line Ops	string-lines string-line-start-indices string-normalize-newlines string-chomp string-chop-newline string-ensure-ends-with-newline
Tabs/Width	string-expand-tabs string-display-width

Construction and Transformation

Case/Map	string-capitalize string-swapcase string-map string-map!
Transform	string-repeat string-reverse string-rot13 string-pluralize string-singularize string-intersperse

Escaping and Cleaning

Quoting	string-quote string-unquote
Visible Escapes	string-escape-visible string-unescape-visible
JSON/Regexp/ANSI	string-escape-json string-unescape-json string-escape-regexp string-strip-ansi string-squeeze

Tokenization and Scanning

Tokenize/Fields	string-tokenize string-fields
Scan	string-scan

Formatting and Layout

Layout	string-wrap string-indent string-dedent string-elide

Similarity and Distance

Metrics	string-levenshtein string-jaro-winkler string-similarity

Case Conversion and Predicates

Predicates	string-blank? string-ascii? string-digit?

1.3 Splitting and Slicing

This subsection covers positional extraction and replacement operations, from safe single-character access to stepped slicing and split-at-index workflows.

ℹ️ Think of string-slice as a nicer substring.

procedure
(string-slice s [start end]) → string?
  s : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Compared to substring, this procedure accepts negative indices and clamps out-of-range indices instead of raising bounds errors.

Negative indices count backward from the end of the string, and out-of-range indices are clamped to the nearest valid position. If the normalized end is less than or equal to the normalized start, the result is the empty string.

Examples:

> (string-slice "abcdef")
"abcdef"
> (string-slice "abcdef" 1 4)
"bcd"
> (string-slice "abcdef" -3 -1)
"de"
> (string-slice "abcdef" -100 100)
"abcdef"
> (string-slice "abcdef" 4 2)
""

Related: string-slice/step, string-at.

procedure
(string-slice/step s [start end step]) → string?
  s : string?
  start : (or/c exact-integer? #f) = #f
  end : (or/c exact-integer? #f) = #f
  step : exact-integer? = 1

Like string-slice, but also supports stepping.

⚠️ Gotcha: With negative step, omitted bounds (#f) behave differently from explicit negative indices such as -1.

When step is positive, traversal is left to right. When step is negative, traversal is right to left. A zero step raises an exception.

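If start or end is #f, the bound is treated as omitted and defaults according to the step direction: with a positive step, the slice runs from the beginning of the string to its end; with a negative step, it runs from the last character back through the first.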

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-slice/step "abcdef")
"abcdef"
> (string-slice/step "abcdef" 0 6 2)
"ace"
> (string-slice/step "abcdef" 5 #f -2)
"fdb"
> (string-slice/step "abcdef" #f #f -1)
"fedcba"

Related: string-slice, string-at.
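
The difference between omitted and explicit bounds is easier to see in a small model. The sketch below assumes Python-style stepped-slice semantics, which reproduce the documented examples; slice/step-model is a hypothetical name, and whether the package treats every edge case identically is an assumption.

```racket
#lang racket/base

;; Hypothetical model of stepped slicing (not the package's code).
(define (slice/step-model s [start #f] [end #f] [step 1])
  (when (zero? step)
    (raise-argument-error 'slice/step-model "nonzero step" step))
  (define len (string-length s))
  ;; Normalize an explicit index, then clamp it to [lo, hi].
  (define (norm i lo hi)
    (define j (if (negative? i) (+ len i) i))
    (max lo (min hi j)))
  (define-values (a b)
    (if (positive? step)
        ;; Left to right: omitted bounds mean "from 0" and "to len".
        (values (if start (norm start 0 len) 0)
                (if end (norm end 0 len) len))
        ;; Right to left: omitted bounds mean "from the last character"
        ;; and "through index 0" (modeled as an exclusive end of -1).
        (values (if start (norm start -1 (sub1 len)) (sub1 len))
                (if end (norm end -1 (sub1 len)) -1))))
  (list->string
   (for/list ([i (in-range a b step)])
     (string-ref s i))))

(slice/step-model "abcdef" 5 #f -2)  ; visits indices 5, 3, 1
(slice/step-model "abcdef" #f -1 -1) ; explicit -1 end: nothing is visited
```

In this model an explicit end of -1 normalizes to the last index, which is also where a right-to-left traversal starts, so the result is empty; an omitted end instead runs through the first character.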

ℹ️ Think of string-at as a safer string-ref.

procedure
(string-at s i [default]) → any/c
  s : string?
  i : exact-integer?
  default : any/c = #f

Returns the character at index i.

Indices are clamped to the string bounds, and negative indices count from the end of the string.

⚠️ Gotcha: For non-empty strings, out-of-range indices are clamped, so default is only used when s is empty.

If s is empty, default is returned.

Examples:

> (string-at "abc" 0)
#\a
> (string-at "abc" -1)
#\c
> (string-at "abc" 3)
#\c
> (string-at "abc" -10)
#\a
> (string-at "" 0 #\x)
#\x

Related: string-slice, string-slice/step.

procedure
(string-split-at s i ...) → (listof string?)
s : string?
i : exact-integer?

Splits s at the given indices and returns the resulting substrings as a list.

Indices may be negative and are clamped to the string bounds. The indices may be given in any order and may contain duplicates; they are sorted and deduplicated before splitting.

The returned list contains the substrings of s between successive cut positions, including the beginning and end of the string.

Examples:

> (string-split-at "abcdef" 2 4)
'("ab" "cd" "ef")
> (string-split-at "abcdef" 4 2)
'("ab" "cd" "ef")
> (string-split-at "abc" 1 1 2)
'("a" "b" "c")
> (string-split-at "abc")
'("abc")
> (string-split-at "abc" 0)
'("" "abc")
> (string-split-at "abc" -1)
'("ab" "c")
> (string-split-at "abc" 3)
'("abc" "")

If no indices are provided, the result is a list containing s itself.

If exactly one index is provided, the result is a two-element list consisting of the prefix and suffix at that index.

An exception is raised if any index is not an exact integer.
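
The normalize, sort, deduplicate, and cut steps described above can be sketched in plain Racket. This is a minimal illustration under the stated rules, not the package's implementation; split-at-model is a hypothetical name.

```racket
#lang racket/base
(require racket/list) ; for remove-duplicates

;; Hypothetical model of splitting at a set of cut positions.
(define (split-at-model s . indices)
  (define len (string-length s))
  ;; Negative indices count from the end; all indices are clamped.
  (define (norm i)
    (max 0 (min len (if (negative? i) (+ len i) i))))
  ;; Sort and deduplicate the cut positions.
  (define cuts (sort (remove-duplicates (map norm indices)) <))
  ;; Emit the piece before each cut, then the remainder.
  (let loop ([from 0] [cuts cuts])
    (if (null? cuts)
        (list (substring s from))
        (cons (substring s from (car cuts))
              (loop (car cuts) (cdr cuts))))))

(split-at-model "abcdef" 4 2) ; order of the indices does not matter
(split-at-model "abc" 0)      ; a cut at 0 yields an empty first piece
```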

procedure
(string-replace-range s start end replacement) → string?
  s : string?
  start : exact-integer?
  end : exact-integer?
  replacement : string?

Replaces the selected portion of s with replacement.

The replaced portion starts at start and continues up to end, excluding the character at end.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-replace-range "abcdef" 2 4 "XY")
"abXYef"
> (string-replace-range "abcdefgh" 2 6 "X")
"abXgh"
> (string-replace-range "abcdefgh" 2 4 "WXYZ")
"abWXYZefgh"
> (string-replace-range "abcdef" -4 -2 "XY")
"abXYef"
> (string-replace-range "abcdef" 4 2 "XY")
"abXYef"

1.4 Character Counting

This subsection covers character-wise counting with flexible matching criteria, including character, character-set, string, and predicate forms.

procedure
(string-count s to-count [start end]) → exact-nonnegative-integer?
  s : string?
  to-count : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Counts characters in s that satisfy to-count between start and end.

Indices may be negative and are clamped to the string bounds.

If to-count is a procedure, it is applied to each character as a predicate. If it is a character set, each character is tested for membership. If it is a character, char=? is used. If it is a string, the string is converted to a character set.

An exception is raised if start or end is not an exact integer, or if to-count is not a character, character set, string, or unary procedure.

Examples:

> (string-count "banana" #\a)
3
> (string-count "banana" "an")
5
> (require string-tools/char-set)
> (string-count "banana" (make-char-set #\a #\n))
5
> (string-count "a1b2c3" char-numeric?)
3
> (string-count "banana" #\a 2 6)
2
> (string-count "banana" #\a -5 -1)
2

This procedure provides character counting with flexible matching criteria.
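
One way to picture the matcher rules is as a conversion from each accepted form to a predicate. The sketch below is illustrative only: matcher->predicate and count-model are hypothetical names, and the character-set case is left out because it depends on string-tools/char-set.

```racket
#lang racket/base

;; Hypothetical helper: turn a matcher into a unary predicate.
;; Character sets are omitted from this sketch.
(define (matcher->predicate m)
  (cond
    [(char? m) (lambda (c) (char=? c m))]
    ;; A string is treated as a set of characters.
    [(string? m) (lambda (c) (for/or ([x (in-string m)]) (char=? c x)))]
    [(procedure? m) m]
    [else (raise-argument-error 'matcher->predicate
                                "(or/c char? string? procedure?)" m)]))

;; Hypothetical model of counting over the whole string.
(define (count-model s m)
  (define pred (matcher->predicate m))
  (for/sum ([c (in-string s)])
    (if (pred c) 1 0)))

(count-model "banana" "an")         ; every #\a and #\n counts
(count-model "a1b2c3" char-numeric?) ; predicate form
```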

1.5 Needle Counting

This subsection groups substring-occurrence counting utilities for bounded regions of a string. Use these procedures when you need non-overlapping needle counts rather than per-character counting.

procedure
(string-count-needle s needle [start end])
→ exact-nonnegative-integer?
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Counts the non-overlapping occurrences of needle in s, restricted to the substring from start to end (end not included).

The search begins at start and stops before end, which defaults to the length of s. In other words, start is included and end is not included.

Indices may be negative and are clamped to the string bounds.

Occurrences are counted from left to right and do not overlap.

If needle is the empty string, the result is the number of insertion positions in the selected substring, which is one more than its length.

An exception is raised if start or end is not an exact integer.

Examples:

> (string-count-needle "banana" "na")
2
> (string-count-needle "aaaa" "aa")
2
> (string-count-needle "aaaa" "aaa")
1
> (string-count-needle "banana" "na" 3 6)
1
> (string-count-needle "banana" "na" -4 -1)
1
> (string-count-needle "abc" "")
4
> (string-count-needle "abc" "" 1 3)
3

This procedure is intended to complement the string-search utilities in racket/string by providing a direct substring-count operation.
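
Non-overlapping counting means that after a match the scan resumes just past the matched text. The following sketch shows that rule over a whole string; count-needle-model is a hypothetical name and the code is not the package's implementation.

```racket
#lang racket/base

;; Hypothetical model of left-to-right, non-overlapping counting.
(define (count-needle-model s needle)
  (define n (string-length needle))
  (define len (string-length s))
  (if (zero? n)
      ;; Empty needle: one insertion position per gap, including both ends.
      (add1 len)
      (let loop ([i 0] [count 0])
        (cond
          [(> (+ i n) len) count]
          [(string=? (substring s i (+ i n)) needle)
           (loop (+ i n) (add1 count))] ; jump past the match
          [else (loop (add1 i) count)]))))

(count-needle-model "aaaa" "aa")  ; matches at 0 and 2 only
(count-needle-model "aaaa" "aaa") ; the second candidate would overlap
```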

1.6 Index Search and Trimming

This subsection combines left-to-right and right-to-left index/skip operations with matcher-driven trimming over bounded substring regions.

procedure
(string-index s to-find [start end])
→ (or/c exact-nonnegative-integer? #f)
  s : string?
  to-find : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Searches s from left to right and returns the first index from start to end (end not included) whose character matches to-find. Returns #f when no match is found.

Indices may be negative and are clamped to the string bounds.

Matching rules:

If to-find is a character, char=? is used.
If it is a character set, membership is tested.
If it is a string, the string is converted to a character set.
If it is a procedure, the procedure is used as a predicate.

Examples:

> (string-index "banana" #\a)
1
> (string-index "banana" "nz")
2
> (string-index "banana" (make-char-set #\n #\z))
2
> (string-index "a1b2c3" char-numeric?)
1
> (string-index "banana" #\a -5 -1)
1

This procedure searches from left to right with configurable matching criteria.

procedure
(string-index-right s to-find [start end])
→ (or/c exact-nonnegative-integer? #f)
  s : string?
  to-find : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Searches s from right to left and returns the first matching index encountered, that is, the rightmost match in the range from start to end (end not included). Returns #f when no match is found.

Indices may be negative and are clamped to the string bounds.

The right-to-left search starts at (sub1 end).

Examples:

> (string-index-right "banana" #\a)
5
> (string-index-right "banana" "nz")
4
> (string-index-right "banana" (make-char-set #\n #\z))
4
> (string-index-right "a1b2c3" char-numeric?)
5
> (string-index-right "banana" #\a -5 -1)
3

This procedure searches from right to left with configurable matching criteria.

procedure
(string-skip s to-skip [start end])
→ (or/c exact-nonnegative-integer? #f)
  s : string?
  to-skip : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Like string-index, but uses the complement of the matching criteria: it searches left to right for the first character from start to end (end not included) that does not match to-skip.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-skip "   abc" #\space)
3
> (string-skip "aaab" "a")
3
> (string-skip "123x5" char-numeric?)
3
> (string-skip "   abc" #\space -4 100)
3

This procedure provides left-to-right skipping using the complement criterion.
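
The relationship between index and skip can be sketched with predicates alone: skipping is an index search with the predicate negated. index-model and skip-model are hypothetical names used only for this illustration, and bounds handling is omitted.

```racket
#lang racket/base

;; Hypothetical model: index of the first character satisfying pred, or #f.
(define (index-model s pred)
  (for/first ([c (in-string s)]
              [i (in-naturals)]
              #:when (pred c))
    i))

;; Hypothetical model: index of the first character NOT satisfying pred.
(define (skip-model s pred)
  (index-model s (lambda (c) (not (pred c)))))

(index-model "a1b2c3" char-numeric?) ; first digit
(skip-model "123x5" char-numeric?)   ; first non-digit
(skip-model "123" char-numeric?)     ; nothing left after skipping
```

When every character matches, the skip search has nothing to report, which is why the result contract includes #f.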

procedure
(string-skip-right s to-skip [start end])
→ (or/c exact-nonnegative-integer? #f)
  s : string?
  to-skip : (or/c char? char-set? string? (-> char? any/c))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Like string-index-right, but uses the complement of the matching criteria: it searches right to left for the first character in the substring from start to end (end not included) that does not match to-skip.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-skip-right "abc " #\space)
2
> (string-skip-right "baaa" "a")
0
> (string-skip-right "5x321" char-numeric?)
1
> (string-skip-right "abc " #\space -100 -1)
2

This procedure provides right-to-left skipping using the complement criterion.

procedure
(string-trim-left s [to-trim start end]) → string?
  s : string?
  to-trim : (or/c char? char-set? string? (-> char? any/c)) = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Returns a copy of the substring from start to end (end not included), after removing matching characters from the left side.

Indices may be negative and are clamped to the string bounds.

If to-trim is omitted, whitespace is trimmed. If to-trim is a string, it is converted to a character set.

Examples:

> (string-trim-left " abc ")
"abc "
> (string-trim-left "aaab" #\a)
"b"
> (string-trim-left "aaab" "a")
"b"
> (string-trim-left "abbaXYZ" "ab")
"XYZ"
> (string-trim-left "123x5" char-numeric?)
"x5"
> (string-trim-left "xxabcxx" #\x 2 7)
"abcxx"
> (string-trim-left "xxabcxx" #\x -5 -1)
"abcx"
> (string-trim-left "abcde" #\x 1 3)
"bc"

procedure
(string-trim-right s [to-trim start end]) → string?
  s : string?
  to-trim : (or/c char? char-set? string? (-> char? any/c)) = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Returns a copy of the substring from start to end (end not included), after removing matching characters from the right side.

Indices may be negative and are clamped to the string bounds.

If to-trim is omitted, whitespace is trimmed. If to-trim is a string, it is converted to a character set.

Examples:

> (string-trim-right " abc ")
" abc"
> (string-trim-right "baaa" #\a)
"b"
> (string-trim-right "baaa" "a")
"b"
> (string-trim-right "XYZabba" "ab")
"XYZ"
> (string-trim-right "5x321" char-numeric?)
"5x"
> (string-trim-right "xxabcxx" #\x 1 6)
"xabc"
> (string-trim-right "xxabcxx" #\x -5 -1)
"abc"

ℹ️ Similar to string-trim, but this procedure uses the matcher conventions in this module for to-trim.

procedure
(string-trim-both s [to-trim start end]) → string?
  s : string?
  to-trim : (or/c char? char-set? string? (-> char? any/c)) = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Returns a copy of the substring from start to end (end not included), after removing matching characters from both the left and right sides.

Indices may be negative and are clamped to the string bounds.

If to-trim is omitted, whitespace is trimmed. If to-trim is a string, it is converted to a character set.
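
Trimming on both sides amounts to skipping matching characters from the left, then from the right without crossing the left position. The sketch below uses a predicate matcher only and ignores start and end; trim-both-model is a hypothetical name, not the package's implementation.

```racket
#lang racket/base

;; Hypothetical model of two-sided trimming with a predicate matcher.
(define (trim-both-model s [pred char-whitespace?])
  (define len (string-length s))
  ;; First position from the left whose character does not match.
  (define left
    (let loop ([i 0])
      (if (and (< i len) (pred (string-ref s i)))
          (loop (add1 i))
          i)))
  ;; One past the last non-matching character, never moving left of `left`.
  (define right
    (let loop ([j len])
      (if (and (> j left) (pred (string-ref s (sub1 j))))
          (loop (sub1 j))
          j)))
  (substring s left right))

(trim-both-model "  abc  ")
(trim-both-model "123x5" char-numeric?)
```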

Examples:

> (string-trim-both " abc ")
"abc"
> (string-trim-both "aaabaa" #\a)
"b"
> (string-trim-both "aaabaa" "a")
"b"
> (string-trim-both "abbaXYZabba" "ab")
"XYZ"
> (string-trim-both "123x5" char-numeric?)
"x"
> (string-trim-both "xxabcxx" #\x 1 6)
"abc"
> (string-trim-both "xxabcxx" #\x -5 -1)
"abc"

1.7 Substring Search and Partitioning

This subsection groups substring search and partitioning helpers that return indices, ranges, or before/needle/after splits.

procedure
(string-find-needle s needle [start end])
→ (or/c exact-nonnegative-integer? #f)
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Searches s from left to right for needle from start to end (end not included). Returns the index of the first match, or #f if no match is found.

Indices may be negative and are clamped to the string bounds.

If needle is the empty string, the result is start.

Examples:

> (string-find-needle "banana" "na")
2
> (string-find-needle "banana" "na" 3 6)
4
> (string-find-needle "banana" "na" -4 -1)
2
> (string-find-needle "abc" "")
0

This procedure provides direct substring search.

Related: string-find-all-needle, string-scan.

procedure
(string-find-last-needle s needle [start end])
→ (or/c exact-nonnegative-integer? #f)
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Searches s from right to left for needle from start to end (end not included). Returns the index of the last match, or #f if no match is found.

Indices may be negative and are clamped to the string bounds.

If needle is the empty string, the result is end.

Examples:

> (string-find-last-needle "banana" "na")
4
> (string-find-last-needle "banana" "na" 0 5)
2
> (string-find-last-needle "banana" "na" -4 -1)
2
> (string-find-last-needle "abc" "")
3

This procedure is a right-to-left substring search companion to string-find-needle.

procedure
(string-find-all-needle s needle [start end #:overlap? overlap? #:ranges? ranges?])
→ (listof (or/c exact-nonnegative-integer?
              (cons/c exact-nonnegative-integer?
                      exact-nonnegative-integer?)))
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  overlap? : boolean? = #f
  ranges? : boolean? = #f

Returns all matches of needle from start to end (end not included).

By default, returns start indices. When #:ranges? is true, returns (cons start end) pairs for each match.

When #:overlap? is true, overlapping matches are included.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-find-all-needle "banana" "na")
'(2 4)
> (string-find-all-needle "banana" "na" #:ranges? #t)
'((2 . 4) (4 . 6))
> (string-find-all-needle "aaaa" "aa" #:overlap? #t)
'(0 1 2)

Related: string-find-needle, string-scan.
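
The effect of #:overlap? comes down to how far the scan advances after a match: by one character when overlaps are allowed, by the needle length otherwise. The following sketch illustrates this for start indices only; find-all-model is a hypothetical name, and its treatment of an empty needle is an assumption of the sketch rather than documented behavior.

```racket
#lang racket/base

;; Hypothetical model of collecting all match positions.
(define (find-all-model s needle #:overlap? [overlap? #f])
  (define n (string-length needle))
  (define len (string-length s))
  ;; Advance by at least 1 so the loop terminates even for an empty needle.
  (define advance (if overlap? 1 (max 1 n)))
  (let loop ([i 0])
    (cond
      [(> (+ i n) len) '()]
      [(string=? (substring s i (+ i n)) needle)
       (cons i (loop (+ i advance)))]
      [else (loop (add1 i))])))

(find-all-model "aaaa" "aa")                ; non-overlapping matches
(find-all-model "aaaa" "aa" #:overlap? #t)  ; every starting position
```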

procedure
(string-partition s needle [start end])
→ string? string? string?
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Searches for the first occurrence of needle from start to end (end not included), and returns three values:

the substring before the match
the matched separator
the substring after the match

Indices may be negative and are clamped to the string bounds.

If no match is found, the second and third values are empty strings, and the first value is the selected substring.

Examples:

> (call-with-values (λ () (string-partition "a:b:c" ":")) list)
'("a" ":" "b:c")
> (call-with-values (λ () (string-partition "abc" ":")) list)
'("abc" "" "")
> (call-with-values (λ () (string-partition "banana" "na")) list)
'("ba" "na" "na")
> (call-with-values (λ () (string-partition "banana" "na" -4 -1)) list)
'("" "na" "n")
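
Because the result is three separate values rather than a list, define-values is often more convenient than call-with-values. The sketch below models a first-match partition in plain Racket and binds its results; partition-model is a hypothetical name and bounds handling is omitted.

```racket
#lang racket/base

;; Hypothetical model of partitioning at the first occurrence of needle.
(define (partition-model s needle)
  (define n (string-length needle))
  (define len (string-length s))
  (let loop ([i 0])
    (cond
      ;; No match: the whole string comes first, then two empty strings.
      [(> (+ i n) len) (values s "" "")]
      [(string=? (substring s i (+ i n)) needle)
       (values (substring s 0 i) needle (substring s (+ i n)))]
      [else (loop (add1 i))])))

;; Bind the three results directly.
(define-values (key sep val) (partition-model "name=value=x" "="))
(list key sep val)
```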

procedure
(string-partition-right s needle [start end])
→ string? string? string?
  s : string?
  needle : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Searches for the last occurrence of needle from start to end (end not included), and returns three values:

the substring before the match
the matched separator
the substring after the match

Indices may be negative and are clamped to the string bounds.

If no match is found, the second and third values are empty strings, and the first value is the selected substring.

Examples:

> (call-with-values (λ () (string-partition-right "a:b:c" ":")) list)
'("a:b" ":" "c")
> (call-with-values (λ () (string-partition-right "abc" ":")) list)
'("abc" "" "")
> (call-with-values (λ () (string-partition-right "banana" "na")) list)
'("bana" "na" "")
> (call-with-values (λ () (string-partition-right "banana" "na" -4 -1)) list)
'("" "na" "n")

procedure
(string-between s left right [start end #:left-match left-match #:right-match right-match #:include-left? include-left? #:include-right? include-right?])
→ (or/c string? #f)
  s : string?
  left : (or/c char? string?)
  right : (or/c char? string?)
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  left-match : (or/c 'first 'last) = 'first
  right-match : (or/c 'first 'last) = 'first
  include-left? : boolean? = #f
  include-right? : boolean? = #f

Returns the substring between left and right, or #f if delimiters are not found with the selected matching options.

Delimiters may be strings or single characters.

left-match and right-match choose whether each delimiter uses its first or last match in the selected bounds. The include-left? and include-right? options control whether delimiters are included.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-between "a[b]c" "[" "]")
"b"
> (string-between "a[b]c[d]e" "[" "]" #:left-match 'last)
"d"
> (string-between "a[b]c[d]e" "[" "]" #:right-match 'last)
"b]c[d"
> (string-between "a[b]c" "[" "]" #:include-left? #t #:include-right? #t)
"[b]"
> (string-between "a[b]c" #\[ #\])
"b"

1.8 Prefix and Suffix Utilities

This subsection provides small prefix/suffix primitives for normalization and path/key shaping, including remove-if-present and ensure-if-missing forms.
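
The remove-if-present and ensure-if-missing forms can be expressed with string-prefix? and string-suffix? from racket/string. The sketch below is an illustration with hypothetical names (remove-prefix-model, ensure-suffix-model), not the package's implementation.

```racket
#lang racket/base
(require racket/string) ; string-prefix?, string-suffix?

;; Hypothetical model: drop the prefix only when it is present.
(define (remove-prefix-model s prefix)
  (if (string-prefix? s prefix)
      (substring s (string-length prefix))
      s))

;; Hypothetical model: append the suffix only when it is missing.
(define (ensure-suffix-model s suffix)
  (if (string-suffix? s suffix)
      s
      (string-append s suffix)))

(remove-prefix-model "foobar" "foo")
(ensure-suffix-model "foo" "bar")
```

Both operations are idempotent in this model when the prefix or suffix is non-empty and does not repeat: applying them a second time leaves the result unchanged in the examples shown.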

procedure
(string-remove-prefix s prefix) → string?
s : string?
prefix : string?

If s starts with prefix, returns s without that prefix. Otherwise returns s unchanged.

Examples:

> (string-remove-prefix "foobar" "foo")
"bar"
> (string-remove-prefix "foobar" "bar")
"foobar"

procedure
(string-remove-suffix s suffix) → string?
s : string?
suffix : string?

If s ends with suffix, returns s without that suffix. Otherwise returns s unchanged.

Examples:

> (string-remove-suffix "foobar" "bar")
"foo"
> (string-remove-suffix "foobar" "foo")
"foobar"

procedure
(string-ensure-prefix s prefix) → string?
s : string?
prefix : string?

If s does not start with prefix, returns a new string with prefix prepended. Otherwise returns s unchanged.

Examples:

> (string-ensure-prefix "bar" "foo")
"foobar"
> (string-ensure-prefix "foobar" "foo")
"foobar"

procedure
(string-ensure-suffix s suffix) → string?
s : string?
suffix : string?

If s does not end with suffix, returns a new string with suffix appended. Otherwise returns s unchanged.

Examples:

> (string-ensure-suffix "foo" "bar")
"foobar"
> (string-ensure-suffix "foobar" "bar")
"foobar"

procedure
(string-common-prefix a b) → string?
a : string?
b : string?

Returns the longest common prefix of a and b.

Examples:

> (string-common-prefix "foobar" "foobaz")
"fooba"
> (string-common-prefix "abc" "xyz")
""

procedure
(string-common-suffix a b) → string?
a : string?
b : string?

Returns the longest common suffix of a and b.

Examples:

> (string-common-suffix "foobar" "xxbar")
"bar"
> (string-common-suffix "abc" "xyz")
""

1.9 Lines

This subsection groups line-oriented utilities for text and file processing, including line splitting, counting, newline normalization, and display-column handling.

procedure
(string-lines s) → (listof string?)
s : string?

Splits s into lines.

Line separators recognized are #\newline, #\return, and the two-character sequence #\return followed by #\newline.

If s ends with a line separator, no extra trailing empty line is added.

Examples:

> (string-lines "")
'()
> (string-lines "a\nb")
'("a" "b")
> (string-lines "a\r\nb")
'("a" "b")
> (string-lines "a\rb")
'("a" "b")
> (string-lines "a\n")
'("a")

procedure
(string-count-lines s) → exact-positive-integer?
s : string?

Counts the number of lines in s.

Line boundaries follow #\newline, #\return, and #\return followed by #\newline. A #\return followed by #\newline counts as one line boundary.

Examples:

> (string-count-lines "")
1
> (string-count-lines "a\nb")
2
> (string-count-lines "a\r\nb")
2
> (string-count-lines "a\n")
2

procedure
(string-line-start-indices s)
→ (listof exact-nonnegative-integer?)
s : string?

Returns a list of character offsets for the start of each line in s.

Line boundaries follow the same rules as string-count-lines.

Examples:

> (string-line-start-indices "")
'(0)
> (string-line-start-indices "a\nb")
'(0 2)
> (string-line-start-indices "a\r\nb")
'(0 3)
> (string-line-start-indices "a\n")
'(0 2)

procedure
(string-normalize-newlines s) → string?
s : string?

Converts line endings in s so that every newline sequence becomes a single #\newline.

Both #\return and #\return followed by #\newline are normalized to #\newline.

Examples:

> (string-normalize-newlines "a\r\nb")
"a\nb"
> (string-normalize-newlines "a\rb")
"a\nb"
> (string-normalize-newlines "\r\n\rx\r")
"\n\nx\n"

procedure
(string-expand-tabs s [#:tab-width tab-width #:start-column start-column]) → string?
  s : string?
  tab-width : exact-positive-integer? = 8
  start-column : exact-nonnegative-integer? = 0

Replaces tab characters in s with spaces according to tab stops.

Tabs advance to the next tab stop determined by tab-width. Newline and return characters reset the running column to zero.
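
The tab-stop rule can be sketched as a running column: a tab emits enough spaces to reach the next multiple of tab-width, and a newline or return resets the column. expand-tabs-model is a hypothetical name; the code illustrates the rule and is not the package's implementation.

```racket
#lang racket/base

;; Hypothetical model of tab expansion with a running column.
(define (expand-tabs-model s
                           #:tab-width [tab-width 8]
                           #:start-column [start-column 0])
  (define out (open-output-string))
  (for/fold ([col start-column]) ([c (in-string s)])
    (cond
      [(char=? c #\tab)
       ;; Pad up to the next tab stop.
       (let ([pad (- tab-width (remainder col tab-width))])
         (write-string (make-string pad #\space) out)
         (+ col pad))]
      [(or (char=? c #\newline) (char=? c #\return))
       (write-char c out)
       0] ; a line break resets the column
      [else
       (write-char c out)
       (add1 col)]))
  (get-output-string out))

(expand-tabs-model "ab\tcd" #:tab-width 4) ; two spaces reach column 4
```

The same loop, returning the final column instead of the output text, gives one plausible reading of a display-width calculation.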

Examples:

> (string-expand-tabs "a\tb")
"a       b"
> (string-expand-tabs "ab\tcd" #:tab-width 4)
"ab  cd"
> (string-expand-tabs "\t" #:tab-width 4 #:start-column 2)
"  "

procedure
(string-display-width s [#:tab-width tab-width #:start-column start-column])
→ exact-nonnegative-integer?
  s : string?
  tab-width : exact-positive-integer? = 8
  start-column : exact-nonnegative-integer? = 0

Returns an approximation of the monospace display width of the final line of s.

ASCII printable characters count as width 1. Tabs advance to the next tab stop. Newline and return reset the running column to zero.

Examples:

> (string-display-width "a\tb")
9
> (string-display-width "a\nbc")
2
> (string-display-width "\t" #:tab-width 4 #:start-column 2)
4

procedure
(string-chomp s) → string?
s : string?

Removes one trailing newline from s when present. The removed newline may be either "\n" or "\r\n".

If no trailing newline is present, s is returned unchanged.

Examples:

> (string-chomp "abc")
"abc"
> (string-chomp "abc\n")
"abc"
> (string-chomp "abc\r\n")
"abc"
> (string-chomp "abc\n\n")
"abc\n"

procedure
(string-chop-newline s) → string?
s : string?

Alias of string-chomp.

1.10 String Construction and Transformation

This subsection collects string-building and transformation utilities, from repetition and case conversion to simple linguistic and mapping helpers.

procedure
(string-repeat s n) → string?
s : string?
n : exact-nonnegative-integer?

Returns a string consisting of s repeated n times.

Examples:

> (string-repeat "ab" 0)
""
> (string-repeat "ab" 3)
"ababab"

procedure
(string-reverse s) → string?
s : string?

Returns s with its characters in reverse order.

Examples:

> (string-reverse "")
""
> (string-reverse "abc")
"cba"

procedure
(string-capitalize s) → string?
s : string?

Returns a string where the first character is uppercased and the remaining characters are lowercased.

Use string-upcase when every character should be uppercased. Use string-titlecase for title-casing behavior across words.

Examples:

> (string-capitalize "")
""
> (string-capitalize "hello world")
"Hello world"
> (string-capitalize "hELLO WORLD")
"Hello world"

procedure
(string-swapcase s) → string?
s : string?

Returns a string where uppercase letters are converted to lowercase and lowercase letters are converted to uppercase. Non-letter characters are unchanged.

Examples:

> (string-swapcase "")
""
> (string-swapcase "AbC")
"aBc"
> (string-swapcase "hello WORLD")
"HELLO world"

🌐 See ROT13 on Wikipedia.

procedure
(string-rot13 s) → string?
s : string?

Applies ROT13 to ASCII letters in s.
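
ROT13 rotates each ASCII letter by 13 places within its own case and leaves every other character alone, so applying it twice restores the input. A minimal sketch in plain Racket, with the hypothetical names rot13-char and rot13-model:

```racket
#lang racket/base

;; Hypothetical helper: rotate one ASCII letter by 13 places.
(define (rot13-char c)
  (define (rotate base)
    (integer->char
     (+ base (modulo (+ 13 (- (char->integer c) base)) 26))))
  (cond
    [(char<=? #\a c #\z) (rotate (char->integer #\a))]
    [(char<=? #\A c #\Z) (rotate (char->integer #\A))]
    [else c])) ; non-letters and non-ASCII characters are unchanged

;; Hypothetical model of the whole-string transformation.
(define (rot13-model s)
  (list->string (map rot13-char (string->list s))))

(rot13-model "Hello, World!")
(rot13-model (rot13-model "Racket")) ; round trip
```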

Examples:

> (string-rot13 "Hello, World!")
"Uryyb, Jbeyq!"
> (string-rot13 (string-rot13 "Racket"))
"Racket"
> (string-rot13 (string-rot13 "uryyb"))
"uryyb"

procedure
(string-pluralize s) → string?
s : string?

Returns a simple English-ish plural form of s using lightweight heuristics.

Examples:

> (string-pluralize "cat")
"cats"
> (string-pluralize "box")
"boxes"
> (string-pluralize "city")
"cities"

procedure
(string-singularize s) → string?
s : string?

Returns a simple English-ish singular form of s using lightweight heuristics.

⚠️ Gotcha: This is heuristic, not full linguistic inflection.

Examples:

> (string-singularize "cats")
"cat"
> (string-singularize "boxes")
"box"
> (string-singularize "cities")
"city"

procedure
(string-ensure-ends-with-newline s) → string?
s : string?

Ensures that s ends with #\newline, adding one when needed.

Examples:

> (string-ensure-ends-with-newline "")
"\n"
> (string-ensure-ends-with-newline "abc")
"abc\n"
> (string-ensure-ends-with-newline "abc\n")
"abc\n"

procedure
(string-map proc s [start end]) → string?
  proc : (-> char? char?)
  s : string?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Applies proc to each character in s from start to end and returns the resulting string.

This procedure does not mutate s.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-map char-upcase "abc")
"ABC"
> (string-map char-upcase "abcdef" 1 4)
"aBCDef"
> (string-map char-upcase "abcdef" -4 -1)
"abCDEf"

procedure
(string-map! proc s [start end]) → void?
  proc : (-> char? char?)
  s : (and/c string? (not/c immutable?))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)

Applies proc to each character in s from start to end, mutating s in place.

Indices may be negative and are clamped to the string bounds.

Examples:

> (define m (string-copy "abcdef"))
> (string-map! char-upcase m -4 -1)
> m
"abCDEf"

procedure
(string-intersperse sep xs) → string?
sep : string?
xs : (listof string?)

Joins the strings in xs, inserting sep between consecutive elements.

Examples:

> (string-intersperse "," '())
""
> (string-intersperse "," '("a"))
"a"
> (string-intersperse "," '("a" "b" "c"))
"a,b,c"

1.11 Escaping and Cleaning

This subsection groups escaping and cleanup utilities for both human-visible text and machine-oriented string formats such as quoted literals and JSON string content.

procedure
(string-escape-visible s) → string?
s : string?

Returns a display-oriented escaped version of s, where common control characters are replaced by visible escape sequences.

Escapes include "\\n", "\\r", "\\t", "\\b", "\\f", and "\\\\". Other ASCII control characters are rendered as "\\xNN".
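
The escape table above can be sketched as a character-by-character rewrite. The code below is an illustration only: escape-visible-model is a hypothetical name, and treating exactly the code points below 32 plus 127 as the remaining ASCII control characters is an assumption of the sketch.

```racket
#lang racket/base

;; Hypothetical model of visible escaping.
(define (escape-visible-model s)
  (define out (open-output-string))
  (for ([c (in-string s)])
    (define n (char->integer c))
    (cond
      [(char=? c #\newline) (write-string "\\n" out)]
      [(char=? c #\return) (write-string "\\r" out)]
      [(char=? c #\tab) (write-string "\\t" out)]
      [(char=? c #\backspace) (write-string "\\b" out)]
      [(char=? c #\page) (write-string "\\f" out)] ; form feed
      [(char=? c #\\) (write-string "\\\\" out)]
      ;; Remaining ASCII control characters become \xNN (two hex digits).
      [(or (< n 32) (= n 127))
       (write-string "\\x" out)
       (when (< n 16) (write-string "0" out))
       (write-string (string-upcase (number->string n 16)) out)]
      [else (write-char c out)]))
  (get-output-string out))

(escape-visible-model "a\n\tb")
(escape-visible-model (string #\nul #\rubout))
```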

Examples:

> (string-escape-visible "\n\t\r")
"\\n\\t\\r"
> (string-escape-visible "\\x")
"\\\\x"
> (string-escape-visible (string #\nul #\rubout))
"\\x00\\x7F"
> (string-unescape-visible (string-escape-visible "a\n\tb"))
"a\n\tb"

procedure
(string-unescape-visible s) → string?
s : string?

Parses visible escape sequences in s and returns the corresponding string with escaped characters restored.

Recognized escapes include "\\n", "\\r", "\\t", "\\b", "\\f", "\\\\", "\\xNN", and "\\x...;".

Examples:

> (string-unescape-visible "\\n\\t")
"\n\t"
> (string-unescape-visible "\\x00\\x7F")
"\u0000\u007F"
> (string-unescape-visible "\\x3BB;")
"λ"
> (string-unescape-visible (string-escape-visible "a\n\tb"))
"a\n\tb"

Related: string-quote, string-unquote.

procedure
(string-quote s [#:quote-char quote-char]) → string?
s : string?
quote-char : char? = #\"

Wraps s in quotes and escapes embedded quote, backslash, and common control characters.

Examples:

> (string-quote "He said \"hi\"")
"\"He said \\\"hi\\\"\""
> (string-quote "a'b" #:quote-char #\')
"'a\\'b'"
> (string-unquote (string-quote "a\nb"))
"a\nb"

Related: string-unquote, string-escape-visible, string-escape-json.

procedure
(string-unquote s [#:quote-char quote-char]) → string?
s : string?
quote-char : char? = #\"

Removes matching outer quotes from s and unescapes the quoted content.

An exception is raised when outer quotes are missing or escapes are malformed.

Examples:

> (string-unquote "\"a\\nb\"")
"a\nb"
> (string-unquote "'a\\'b'" #:quote-char #\')
"a'b"
> (string-unquote (string-quote "He said \"hi\""))
"He said \"hi\""

Related: string-quote, string-unescape-visible, string-unescape-json.

procedure
(string-escape-regexp s) → string?
s : string?

Escapes s so it can be used as a literal regular-expression pattern.

Examples:

> (string-escape-regexp "a+b")
"a\\+b"
> (regexp-match? (regexp (string-escape-regexp "a+b")) "a+b")
#t

procedure
(string-escape-json s) → string?
s : string?

Escapes s as JSON string content, without adding outer quotes.

Examples:

> (string-escape-json "\"\\/\n")
"\\\"\\\\\\/\\n"
> (string-escape-json (string #\nul #\u001F))
"\\u0000\\u001F"
> (string-unescape-json (string-escape-json "hello\nλ"))
"hello\nλ"

Related: string-unescape-json, string-quote.

procedure
(string-unescape-json s) → string?
s : string?

Unescapes JSON string content, including "\\uXXXX" escapes and surrogate pairs.

Examples:

> (string-unescape-json "\\u0041\\u03BB")
"Aλ"
> (string-unescape-json "\\uD83D\\uDE00")
"😀"
> (string-unescape-json (string-escape-json "hello\nλ"))
"hello\nλ"

Related: string-escape-json, string-unquote.
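The surrogate-pair handling mentioned above follows the standard UTF-16 arithmetic. The sketch below shows that arithmetic in isolation; surrogates->char is a hypothetical helper for illustration and is not claimed to be how string-unescape-json is implemented.

```racket
#lang racket/base

;; Hypothetical helper, not part of string-tools: combine a UTF-16
;; high/low surrogate pair into a single Unicode character.
(define (surrogates->char hi lo)
  (unless (and (<= #xD800 hi #xDBFF) (<= #xDC00 lo #xDFFF))
    (error 'surrogates->char "not a surrogate pair: ~x ~x" hi lo))
  (integer->char
   (+ #x10000
      (arithmetic-shift (- hi #xD800) 10)  ; top 10 bits
      (- lo #xDC00))))                     ; bottom 10 bits

;; The pair from the "\\uD83D\\uDE00" example maps to U+1F600.
(surrogates->char #xD83D #xDE00)
```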

procedure
(string-strip-ansi s) → string?
s : string?

Removes common terminal ANSI/VT control sequences from s, including color/style control sequences and OSC metadata sequences.

Examples:

> (string-strip-ansi "\e[31mred\e[0m")
"red"
> (string-strip-ansi "a\e]0;title\ab")
"ab"

procedure
(string-squeeze s [to-squeeze]) → string?
  s : string?
  to-squeeze : (or/c char? char-set? string? (procedure-arity-includes/c 1))
             = char-whitespace?

Collapses consecutive matching characters in s into a single character.

If to-squeeze is a character, characters equal to it are squeezed. If it is a character set, characters in the set are squeezed. If it is a string, the string is treated as a character set. If it is a procedure, it is used as the character predicate.
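A minimal sketch of the squeezing idea, assuming the matcher has already been turned into a predicate and that each run collapses to its first character (as the whitespace example below suggests). The name squeeze is hypothetical and this is not the library's implementation.

```racket
#lang racket/base

;; Hypothetical sketch: collapse each run of characters satisfying
;; match? down to the first character of that run.
(define (squeeze s match?)
  (define out (open-output-string))
  (for/fold ([prev-matched? #f]) ([ch (in-string s)])
    (define m? (and (match? ch) #t))
    ;; Skip a matching character only when the previous one matched too.
    (unless (and m? prev-matched?)
      (write-char ch out))
    m?)
  (get-output-string out))

(squeeze "a\t \n\nb" char-whitespace?)
(squeeze "baaaana" (lambda (ch) (char=? ch #\a)))
```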

Examples:

> (string-squeeze "a  b   c" #\space)
"a b c"
> (string-squeeze "a\t \n\nb")
"a\tb"
> (string-squeeze "baaaana" "a")
"bana"

1.12 Tokenization and Scanning

This subsection groups parsing-oriented helpers that split text into tokens or fields and scan text for successive match ranges.

procedure
(string-tokenize s
                 [to-separate
                  start
                  end
                  #:quote quote
                  #:escape escape]) → (listof string?)
  s : string?
  to-separate : (or/c char? char-set? string? (procedure-arity-includes/c 1))
              = char-whitespace?
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  quote : (or/c #f char?) = #f
  escape : (or/c #f char?) = #\\

Splits s into non-empty tokens using to-separate as a character matcher.

If quote is provided, separators inside quoted text are ignored. If escape is provided, the escaped next character is treated literally.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-tokenize " a b c ")
'("a" "b" "c")
> (string-tokenize "a,b,c" #\,)
'("a" "b" "c")
> (string-tokenize "a,\"b,c\",d" #\, #:quote #\")
'("a" "b,c" "d")
> (string-tokenize "a,b\\,c,d" #\, #:escape #\\)
'("a" "b,c" "d")
> (string-tokenize "abc def ghi" char-whitespace? -7 -1)
'("def" "gh")

procedure
(string-fields s
               [to-separate
                start
                end
                #:quote quote
                #:escape escape
                #:widths widths
                #:include-rest? include-rest?])
 → (listof string?)
  s : string?
  to-separate : (or/c char? char-set? string? (procedure-arity-includes/c 1))
              = #\,
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  quote : (or/c #f char?) = #f
  escape : (or/c #f char?) = #\\
  widths : (or/c #f (listof exact-positive-integer?)) = #f
  include-rest? : boolean? = #f

Splits s into fields.

In delimiter mode, empty fields are preserved. In fixed-width mode (#:widths), the widths list defines field lengths from left to right. When #:include-rest? is true in fixed-width mode, one additional field contains any remaining substring.

Indices may be negative and are clamped to the string bounds.

Examples:

> (string-fields "a,b,,c," #\,)
'("a" "b" "" "c" "")
> (string-fields "a,\"b,c\",d" #\, #:quote #\")
'("a" "b,c" "d")
> (string-fields "abcdefgh" #\, #:widths '(2 3 2))
'("ab" "cde" "fg")
> (string-fields "abcdefgh" #\, #:widths '(2 3 2) #:include-rest? #t)
'("ab" "cde" "fg" "h")
> (string-fields "a,b,c,d" #\, -5 -1)
'("b" "c" "")

procedure
(string-scan s
             matcher
             [start
              end
              #:overlap? overlap?])
 → (-> (or/c (cons/c exact-nonnegative-integer?
                     exact-nonnegative-integer?)
             #f))
  s : string?
  matcher : (or/c string? char? char-set? (procedure-arity-includes/c 1))
  start : exact-integer? = 0
  end : exact-integer? = (string-length s)
  overlap? : boolean? = #f

Returns a generator: a procedure of no arguments that produces the next match range as a (cons start end) pair on each call, and #f once no matches remain.

If matcher is a string, it is treated as a substring needle. If matcher is a character, character set, or predicate, matching characters are returned as one-character ranges.

Indices may be negative and are clamped to the string bounds.

Examples:

> (define (collect-ranges g)
    (let loop ([acc '()])
      (define v (g))
      (if v (loop (cons v acc)) (reverse acc))))
> (collect-ranges (string-scan "banana" "na"))
'((2 . 4) (4 . 6))
> (collect-ranges (string-scan "aaaa" "aa" #:overlap? #t))
'((0 . 2) (1 . 3) (2 . 4))
> (collect-ranges (string-scan "abc123x" char-numeric?))
'((3 . 4) (4 . 5) (5 . 6))
> (collect-ranges (string-scan "banana" #\a -5 -1))
'((1 . 2) (3 . 4))

Related: string-find-all-needle, string-find-needle.
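To make the generator protocol and the effect of #:overlap? concrete, here is a minimal closure-based sketch for the substring-needle case only. The name scan-needle is hypothetical, this is not the library's implementation, and treating an empty needle as having no matches is an assumption.

```racket
#lang racket/base

;; Hypothetical sketch: return a thunk that yields successive
;; (start . end) ranges of needle in s, then #f.
(define (scan-needle s needle #:overlap? [overlap? #f])
  (define n (string-length needle))
  (define last-start (- (string-length s) n))
  (define pos 0) ; next index at which to look for a match
  (lambda ()
    (let loop ([i pos])
      (cond
        [(or (zero? n) (> i last-start))
         ;; Exhausted: park pos past the end so later calls return #f.
         (set! pos (add1 (string-length s)))
         #f]
        [(string=? (substring s i (+ i n)) needle)
         ;; Overlapping scans resume one character later; otherwise
         ;; they resume just past the match.
         (set! pos (if overlap? (add1 i) (+ i n)))
         (cons i (+ i n))]
        [else (loop (add1 i))]))))

(define g (scan-needle "aaaa" "aa" #:overlap? #t))
(list (g) (g) (g) (g)) ; three ranges, then #f
```

Because the state lives in the closure, each call does only as much searching as is needed to find the next match.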

1.13 Formatting and Layout

This subsection covers presentation-oriented text shaping, including wrapping, indentation normalization, and width-constrained truncation.

procedure
(string-wrap s
             width
             [#:mode mode
              #:preserve-words? preserve-words?]) → string?
  s : string?
  width : exact-positive-integer?
  mode : (or/c 'soft 'hard) = 'soft
  preserve-words? : boolean? = #t

Wraps s to the requested line width.

In 'soft mode, wrapping prefers whitespace boundaries. In 'hard mode, lines are split exactly at width characters.

When preserve-words? is true in soft mode, long words are kept intact instead of being split.

Examples:

> (string-wrap "alpha beta gamma" 10)
"alpha\nbeta gamma"
> (string-wrap "supercalifragilistic" 8)
"supercalifragilistic"
> (string-wrap "supercalifragilistic" 8 #:preserve-words? #f)
"supercal\nifragili\nstic"
> (string-wrap "abcdefghij" 4 #:mode 'hard)
"abcd\nefgh\nij"

procedure
(string-indent s n-or-prefix) → string?
s : string?
n-or-prefix : (or/c exact-nonnegative-integer? string?)

Indents every line in s.

If n-or-prefix is a nonnegative integer, that many spaces are used. If it is a string, that string is used as the line prefix.

Examples:

> (string-indent "a\nb" 2)
"  a\n  b"
> (string-indent "a\nb" "-> ")
"-> a\n-> b"

procedure
(string-dedent s) → string?
s : string?

Removes common leading indentation from all non-blank lines in s.

Indentation is measured using leading spaces and tabs.
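The description above can be sketched in plain Racket as follows. The names indent-width, blank-line?, and dedent are hypothetical, this is not the library's implementation, and counting a tab as one unit of indentation is an assumption.

```racket
#lang racket/base
(require racket/string)

;; Count leading spaces and tabs in one line (a tab counts as one unit).
(define (indent-width line)
  (let loop ([i 0])
    (if (and (< i (string-length line))
             (memv (string-ref line i) '(#\space #\tab)))
        (loop (add1 i))
        i)))

;; A line is blank when it contains only whitespace.
(define (blank-line? line)
  (for/and ([ch (in-string line)]) (char-whitespace? ch)))

;; Hypothetical sketch: strip the smallest indentation found on any
;; non-blank line from every line.
(define (dedent s)
  (define lines (string-split s "\n" #:trim? #f))
  (define widths
    (for/list ([l (in-list lines)] #:unless (blank-line? l))
      (indent-width l)))
  (define n (if (null? widths) 0 (apply min widths)))
  (string-join
   (for/list ([l (in-list lines)])
     ;; Blank lines may be shorter than n, so clamp the cut point.
     (substring l (min n (string-length l))))
   "\n"))

(dedent "    a\n      b")
(dedent "  a\n\n  b")
```

Ignoring blank lines when computing the minimum is what lets an empty separator line sit between indented paragraphs without forcing the common indentation to zero.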

Examples:

> (string-dedent "  a\n  b")
"a\nb"
> (string-dedent "    a\n      b")
"a\n  b"
> (string-dedent "  a\n\n  b")
"a\n\nb"

procedure
(string-elide s
              width
              [#:where where
               #:ellipsis ellipsis]) → string?
  s : string?
  width : exact-nonnegative-integer?
  where : (or/c 'left 'right 'middle) = 'right
  ellipsis : string? = "..."

Truncates s to at most width characters, replacing the removed part with ellipsis. A string that already fits within width is returned unchanged.

The where option chooses whether truncation happens on the left, right, or in the middle.

Examples:

> (string-elide "abcdef" 5)
"ab..."
> (string-elide "abcdef" 5 #:where 'left)
"...ef"
> (string-elide "abcdef" 5 #:where 'middle)
"a...f"
> (string-elide "abcdef" 6 #:ellipsis "..")
"abcdef"

1.14 Similarity and Distance

This subsection provides string similarity and distance metrics useful for ranking candidates, fuzzy matching, and suggestion-style diagnostics.

🌐 See Levenshtein distance on Wikipedia.

procedure
(string-levenshtein a b) → exact-nonnegative-integer?
a : string?
b : string?

Computes the Levenshtein edit distance between a and b. The result is the minimum number of insertions, deletions, and substitutions needed to transform one string into the other.

Time complexity is O((string-length a) * (string-length b)).

Examples:

> (string-levenshtein "kitten" "sitting")
3
> (string-levenshtein "flaw" "lawn")
2
> (string-levenshtein "abc" "abc")
0
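For readers who want to see where the quadratic cost comes from, here is a minimal sketch of the classic dynamic-programming formulation, keeping only two rows of the table. The name levenshtein is hypothetical and this is not claimed to be the library's implementation.

```racket
#lang racket/base

;; Hypothetical sketch: Levenshtein distance via the standard
;; dynamic-programming table, storing only two rows at a time.
(define (levenshtein a b)
  (define m (string-length a))
  (define n (string-length b))
  ;; prev[j] starts as the distance from "" to the first j chars of b.
  (define prev (build-vector (add1 n) values))
  (define curr (make-vector (add1 n) 0))
  (for ([i (in-range 1 (add1 m))])
    ;; Distance from the first i chars of a to "" is i deletions.
    (vector-set! curr 0 i)
    (for ([j (in-range 1 (add1 n))])
      (define cost
        (if (char=? (string-ref a (sub1 i)) (string-ref b (sub1 j))) 0 1))
      (vector-set! curr j
                   (min (add1 (vector-ref prev j))             ; deletion
                        (add1 (vector-ref curr (sub1 j)))      ; insertion
                        (+ cost (vector-ref prev (sub1 j)))))) ; substitution
    (vector-copy! prev 0 curr))
  (vector-ref prev n))

(levenshtein "kitten" "sitting")
(levenshtein "flaw" "lawn")
```

Each of the (string-length a) outer iterations fills one row of (string-length b) cells, which gives the stated time bound while using memory proportional to only one of the strings.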

🌐 See Jaro-Winkler distance on Wikipedia.

procedure
(string-jaro-winkler a
                     b
                     [#:prefix-scale prefix-scale])
 → inexact-real?
  a : string?
  b : string?
  prefix-scale : real? = 0.1

Computes the Jaro-Winkler similarity score between a and b. Scores are in the range from 0.0 to 1.0, where larger values indicate stronger similarity. The prefix-scale value must be between 0.0 and 0.25.

Time complexity is approximately O((string-length a) * (string-length b)) in the worst case.

Examples:

> (string-jaro-winkler "martha" "marhta")
0.9611111111111111
> (string-jaro-winkler "martha" "xyz")
0.0
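The following sketch spells out the textbook Jaro-Winkler computation so the score is easier to interpret. The names jaro and jaro-winkler are hypothetical, this is not the library's implementation, and the choices of 1.0 for two empty strings and a prefix cap of four characters are assumptions taken from the usual definition.

```racket
#lang racket/base

;; Hypothetical sketch of the Jaro similarity.
(define (jaro a b)
  (define la (string-length a))
  (define lb (string-length b))
  (cond
    [(and (zero? la) (zero? lb)) 1.0] ; assumption: empty strings are identical
    [(or (zero? la) (zero? lb)) 0.0]
    [else
     (define window (max 0 (sub1 (quotient (max la lb) 2))))
     (define a-hit (make-vector la #f))
     (define b-hit (make-vector lb #f))
     ;; Pass 1: pair each character of a with an unused equal character
     ;; of b that lies within the match window.
     (define matches
       (for/sum ([i (in-range la)])
         (define j
           (for/first ([k (in-range (max 0 (- i window))
                                    (min lb (+ i window 1)))]
                       #:when (and (not (vector-ref b-hit k))
                                   (char=? (string-ref a i) (string-ref b k))))
             k))
         (cond [j (vector-set! a-hit i #t)
                  (vector-set! b-hit j #t)
                  1]
               [else 0])))
     (cond
       [(zero? matches) 0.0]
       [else
        ;; Pass 2: matched characters that line up in a different order
        ;; count as half a transposition each.
        (define (matched s hit)
          (for/list ([ch (in-string s)] [h (in-vector hit)] #:when h) ch))
        (define out-of-order
          (for/sum ([x (in-list (matched a a-hit))]
                    [y (in-list (matched b b-hit))])
            (if (char=? x y) 0 1)))
        (define t (/ out-of-order 2))
        (exact->inexact
         (/ (+ (/ matches la) (/ matches lb) (/ (- matches t) matches)) 3))])]))

;; Winkler adjustment: reward a shared prefix of up to four characters.
(define (jaro-winkler a b #:prefix-scale [p 0.1])
  (define j (jaro a b))
  (define prefix
    (for/sum ([x (in-string a)] [y (in-string b)] [k (in-range 4)]
              #:break (not (char=? x y)))
      1))
  (+ j (* prefix p (- 1.0 j))))

(jaro-winkler "martha" "marhta")
```

For "martha" and "marhta", all six characters match with one transposition, so the Jaro score is 17/18; the three-character shared prefix then lifts it to roughly 0.961.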

procedure
(string-similarity a b) → inexact-real?
a : string?
b : string?

Returns a suggestion-oriented similarity score between a and b. This is currently an alias of string-jaro-winkler.

Example:

> (string-similarity "dixon" "dicksonx")
0.8133333333333332

1.15 Case Conversion and Predicates

This subsection provides lightweight whole-string predicates for whitespace, ASCII, and digit checks.

procedure
(string-blank? s) → boolean?
s : string?

Returns #t if every character in s is whitespace.

Examples:

> (string-blank? "")
#t
> (string-blank? " \t\n")
#t
> (string-blank? " a ")
#f

procedure
(string-ascii? s) → boolean?
s : string?

Returns #t if every character in s is an ASCII character.

Examples:

> (string-ascii? "")
#t
> (string-ascii? "ABC123!?")
#t
> (string-ascii? "café")
#f

procedure
(string-digit? s) → boolean?
s : string?

Returns #t if every character in s is an ASCII digit (#\0 through #\9).

Examples:

> (string-digit? "")
#t
> (string-digit? "0123456789")
#t
> (string-digit? "12a3")
#f

2 Character Sets

This section documents the character-set utilities used by string-count and available directly through string-tools/char-set.

Conceptually, a character set represents a collection of characters with membership operations and set operations such as union, intersection, and difference. It is useful when you want to classify characters efficiently and reuse that classification across multiple string-processing steps.

Character sets are represented with a hybrid structure: an ASCII bit mask for codepoints 0 through 127, plus a normalized collection of non-ASCII inclusive ranges. This representation gives fast membership tests for common ASCII text while keeping non-ASCII sets compact.

(require string-tools/char-set)

package: string-tools-lib
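To illustrate the hybrid representation described above, here is a minimal sketch with an ASCII bit mask and a list of inclusive non-ASCII ranges. The struct cset and the procedures cset-member? and ascii-cset are hypothetical names for this illustration; the library's actual field layout and range container may differ.

```racket
#lang racket/base

;; Hypothetical sketch, not the library's data structure:
;; mask is an exact integer whose bit k is set when codepoint k (0..127)
;; is a member; ranges is a list of inclusive (lo . hi) codepoint pairs
;; for non-ASCII members.
(struct cset (mask ranges))

(define (cset-member? cs ch)
  (define cp (char->integer ch))
  (if (< cp 128)
      ;; ASCII: a single bit test.
      (bitwise-bit-set? (cset-mask cs) cp)
      ;; Non-ASCII: look for an enclosing range.
      (for/or ([r (in-list (cset-ranges cs))])
        (<= (car r) cp (cdr r)))))

;; Build a set from ASCII characters only, to keep the sketch short.
(define (ascii-cset . chs)
  (cset (for/fold ([mask 0]) ([ch (in-list chs)])
          (define cp (char->integer ch))
          (unless (< cp 128)
            (error 'ascii-cset "not an ASCII character: ~a" ch))
          (bitwise-ior mask (arithmetic-shift 1 cp)))
        '()))

(cset-mask (ascii-cset #\a #\b #\a))                  ; bits 97 and 98 set
(cset-member? (cset 0 '((#x3B1 . #x3C9))) #\λ)        ; Greek lowercase range
```

Adding a duplicate character leaves the mask unchanged, which is why the set built from #\a #\b #\a has only two members.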

procedure
(char-set? v) → boolean?
v : any/c

Returns #t when v is a character set value.

Examples:

> (require string-tools/char-set)
> (char-set? (make-char-set #\a #\b))
#t
> (char-set? "ab")
#f

value
empty-char-set : char-set?

The empty character set.

Examples:

> (require string-tools/char-set)
> (char-set-size empty-char-set)
0
> (char-set-member? empty-char-set #\a)
#f

procedure
(make-char-set ch ...) → char-set?
ch : char?

Builds a character set containing the given characters.

Examples:

> (require string-tools/char-set)
> (make-char-set #\a #\b #\a)
(char-set 475368975085586025561263702016 '#())

procedure
(list->char-set xs) → char-set?
xs : (listof char?)

Builds a character set from a list of characters.

Examples:

> (require string-tools/char-set)
> (define cs (list->char-set (list #\a #\b #\a)))
> (char-set-size cs)
2
> (char-set-member? cs #\b)
#t

procedure
(string->char-set s) → char-set?
s : string?

Builds a character set from the distinct characters in s.

Examples:

> (require string-tools/char-set)
> (define cs (string->char-set "banana"))
> (char-set-size cs)
3
> (char-set-member? cs #\n)
#t

procedure
(char-set-add cs ch) → char-set?
cs : char-set?
ch : char?

Returns a character set containing all characters in cs and ch.

Examples:

> (require string-tools/char-set)
> (define cs (char-set-add empty-char-set #\x))
> (char-set-member? cs #\x)
#t

procedure
(char-set-add-range cs lo-ch hi-ch) → char-set?
  cs : char-set?
  lo-ch : char?
  hi-ch : char?

Returns a character set containing cs plus all characters from lo-ch to hi-ch, inclusive.

Examples:

> (require string-tools/char-set)
> (define letters (char-set-add-range empty-char-set #\a #\z))
> (char-set-member? letters #\m)
#t
> (char-set-member? letters #\A)
#f

procedure
(char-set-member? cs ch) → boolean?
cs : char-set?
ch : char?

Checks whether ch is in cs.

Examples:

> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (char-set-member? vowels #\e)
#t
> (char-set-member? vowels #\y)
#f

procedure
(char-set-union a b) → char-set?
a : char-set?
b : char-set?

Returns the union of a and b.

Examples:

> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (define y (make-char-set #\y))
> (char-set-member? (char-set-union vowels y) #\y)
#t

procedure
(char-set-intersection a b) → char-set?
a : char-set?
b : char-set?

Returns the intersection of a and b.

Examples:

> (require string-tools/char-set)
> (define a (make-char-set #\a #\b #\c))
> (define b (make-char-set #\b #\c #\d))
> (char-set-size (char-set-intersection a b))
2

procedure
(char-set-difference a b) → char-set?
a : char-set?
b : char-set?

Returns the set difference a - b.

Examples:

> (require string-tools/char-set)
> (define letters (char-set-add-range empty-char-set #\a #\f))
> (define vowels (make-char-set #\a #\e))
> (char-set-member? (char-set-difference letters vowels) #\b)
#t
> (char-set-member? (char-set-difference letters vowels) #\a)
#f

procedure
(char-set-size cs) → exact-nonnegative-integer?
cs : char-set?

Returns the number of characters in cs.

Examples:

> (require string-tools/char-set)
> (char-set-size (make-char-set #\a #\b #\a))
2

Examples:

> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (char-set-member? vowels #\e)
#t
> (char-set-size vowels)
5
> (define letters (char-set-add-range empty-char-set #\a #\z))
> (char-set-size (char-set-difference letters vowels))
21

3 Extended Examples

This section presents three end-to-end workflows: log analysis, CSV-like import cleaning, and configuration normalization with patching.

3.1 Logs

This extended example uses a small synthetic log and shows a full normalize-parse-analyze pipeline. Each log line uses the format:

ts level service=... request_id=... msg="..."

Here, ts is the time stamp.

Prepare a small synthetic log input.

> (define raw-log
    (string-append
     "2026-02-21T22:10:00Z INFO service=api request_id=abc123 msg=\"start\"\r\n"
     "\e[31m2026-02-21T22:10:01Z ERROR service=api request_id=abc123 msg=\"timeout\"\e[0m\r\n"
     "2026-02-21T22:10:02Z WARN service=worker request_id=def456 msg=\"retrying\"\n"
     "2026-02-21T22:10:03Z INFO service=api request_id=abc123 msg=\"done\"\r"))

Normalize line endings and remove ANSI terminal escapes.

> (define cleaned-log
    (string-strip-ansi (string-normalize-newlines raw-log)))

Turn text into non-empty log lines and inspect quick counts.

> (define lines
    (filter (λ (s) (not (string-blank? s)))
            (string-lines cleaned-log)))
> (length lines)
4
> (string-count-needle cleaned-log "ERROR")
1

Define a parser that turns one line into a record.

> (define (line->record line)
    (define fs         (string-fields line #\space))
    (define ts         (list-ref fs 0))
    (define level      (list-ref fs 1))
    (define service    (string-between line "service=" " "))
    (define request-id (string-between line "request_id=" " "))
    (define msg        (string-between line "msg=\"" "\"" #:right-match 'last))
    (list ts level service request-id msg))

Parse all lines and inspect the first parsed record.

> (define records (map line->record lines))
> (car records)
'("2026-02-21T22:10:00Z" "INFO" "api" "abc123" "start")

Select all ERROR records.

> (define error-records
    (filter (λ (r) (string=? (list-ref r 1) "ERROR")) records))
> error-records
'(("2026-02-21T22:10:01Z" "ERROR" "api" "abc123" "timeout"))

3.2 CSV-Like Import Cleaning

This example shows a small CSV-like import pipeline with quoted fields, whitespace cleanup, and row-level validation diagnostics.

Prepare a small CSV-like input and split it into rows.

> (define raw-csv
    (string-append
     "id,name,socre\r\n"
     "1,\"Alice\",98\r\n"
     "2,\" Bob  \",87\r\n"
     "x,\"Mallory\",91\r\n"
     "4,\"Eve\",9a\r\n"
     "5,\"\",100\r\n"))
> (define rows
    (string-lines (string-normalize-newlines raw-csv)))

Parse header and data rows.

> (define header (string-fields (car rows) #\, #:quote #\"))
> (define data-rows (cdr rows))

Validate header names and suggest likely intended names.

> (define expected-header '("id" "name" "score"))
> (define (best-column-suggestion col)
    (define-values (best-name best-score)
      (for/fold ([best-name #f] [best-score -1.0]) ([cand (in-list expected-header)])
        (define score (string-similarity col cand))
        (if (> score best-score)
            (values cand score)
            (values best-name best-score))))
    (if (and best-name (>= best-score 0.7))
        best-name
        #f))
> (define header-diagnostics
    (for/list ([col (in-list header)] #:unless (member col expected-header))
      (define suggestion (best-column-suggestion col))
      (if suggestion
          (string-append "unknown column " col "; did you mean " suggestion "?")
          (string-append "unknown column " col))))
> header-diagnostics
'("unknown column socre; did you mean score?")

Define a small field normalizer used during import.

> (define (clean-field s)
    (string-trim-both (string-squeeze s #\space) #\space))

Parse each row as CSV-like fields and inspect parsed rows.

> (define parsed
    (for/list ([row (in-list data-rows)])
      (for/list ([field (in-list (string-fields row #\, #:quote #\"))])
        (clean-field field))))
> parsed
'(("1" "Alice" "98")
  ("2" "Bob" "87")
  ("x" "Mallory" "91")
  ("4" "Eve" "9a")
  ("5" "" "100"))

Validate rows: id and score must be digits; name must be non-blank.

> (define (row-error fs)
    (define id    (list-ref fs 0))
    (define name  (list-ref fs 1))
    (define score (list-ref fs 2))
    (cond
      [(not (string-digit? id))    "invalid id"]
      [(string-blank? name)        "blank name"]
      [(not (string-digit? score)) "invalid score"]
      [else #f]))

Keep diagnostics for rows that fail validation.

> (define diagnostics
    (for/list ([row (in-list data-rows)]
               [fs  (in-list parsed)]
               #:when (row-error fs))
      (list (row-error fs)
            (string-escape-visible row))))
> header
'("id" "name" "socre")
> diagnostics
'(("invalid id" "x,\"Mallory\",91")
  ("invalid score" "4,\"Eve\",9a")
  ("blank name" "5,\"\",100"))

3.3 Config Normalization and Patching

This example parses an INI-like configuration text, validates keys, suggests fixes for unknown keys, patches one value in place, and emits normalized output with a final newline.

Prepare and normalize a small INI-like input.

> (define raw-config
    (string-append
     "; demo config\r\n"
     "host = example.org\r\n"
     "port = 8080\r\n"
     "timeout = 30\r\n"
     "retris = 2\r\n"
     "mode fast\r\n"))
> (define normalized (string-normalize-newlines raw-config))
> (define lines      (string-lines normalized))
> (define expected-keys '("host" "port" "timeout" "retries" "mode"))

Define helpers for comment detection and line parsing.

> (define (comment-line? t)
    (memv (string-at t 0 #f) '(#\# #\;)))
> (define (parse-config-line line)
    (define t (string-trim-both line))
    (cond
      [(string-blank? t)
       #f]
      [(comment-line? t)
       #f]
      [else
       (define-values (lhs sep rhs) (string-partition t "="))
       (if (string=? sep "")
           (list 'invalid (string-escape-visible line))
           (list (string-trim-both lhs)
                 (string-trim-both rhs)))]))

Parse all lines and inspect the intermediate representation.

> (define parsed-lines
    (filter (λ (x) x)
            (map parse-config-line lines)))
> parsed-lines
'(("host" "example.org")
  ("port" "8080")
  ("timeout" "30")
  ("retris" "2")
  (invalid "mode fast"))

Validate keys and produce diagnostics with similarity-based suggestions.

> (define (best-key-suggestion k)
    (define-values (best-name best-score)
      (for/fold ([best-name #f] [best-score -1.0]) ([cand (in-list expected-keys)])
        (define score (string-similarity k cand))
        (if (> score best-score)
            (values cand score)
            (values best-name best-score))))
    (if (and best-name (>= best-score 0.7))
        best-name
        #f))
> (define diagnostics
    (for/list ([entry (in-list parsed-lines)]
               #:when
               (or (eq? (car entry) 'invalid)
                   (and (string? (car entry))
                        (not (member (car entry) expected-keys)))))
      (cond
        [(eq? (car entry) 'invalid)
         (string-append "malformed line: " (cadr entry))]
        [else
         (define key (car entry))
         (define sug (best-key-suggestion key))
         (if sug
             (string-append "unknown key " key "; did you mean " sug "?")
             (string-append "unknown key " key))])))
> diagnostics
'("unknown key retris; did you mean retries?" "malformed line: mode fast")

Patch one setting in place and normalize the final output.

> (define old-timeout "timeout = 30")
> (define i           (string-find-needle normalized old-timeout))
> (define patched
    (if i
        (string-replace-range normalized
                              i
                              (+ i (string-length old-timeout))
                              "timeout = 45")
        normalized))
> (define final-config
    (string-ensure-ends-with-newline patched))
> (displayln final-config)
; demo config
host = example.org
port = 8080
timeout = 45
retris = 2
mode fast