2 Character Sets
This section documents the character-set utilities used by string-count and available directly through string-tools/char-set.
Conceptually, a character set represents a collection of characters with membership operations and set operations such as union, intersection, and difference. It is useful when you want to classify characters efficiently and reuse that classification across multiple string-processing steps.
Character sets are represented with a hybrid structure: an ASCII bit mask for codepoints 0 through 127, plus a normalized collection of non-ASCII inclusive ranges. This representation gives fast membership tests for common ASCII text while keeping non-ASCII sets compact.
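The hybrid layout described above can be illustrated with a small standalone sketch. This is not the library's implementation: the names `sketch-set`, `make-sketch-set`, and `sketch-member?` are hypothetical, and the sketch assumes that bit n of the mask stands for codepoint n and that the non-ASCII ranges are already sorted and non-overlapping.

```racket
#lang racket/base

;; Hypothetical sketch, not the library's code: an ASCII bit mask plus a
;; list of inclusive (lo . hi) codepoint ranges for codepoints >= 128.
(struct sketch-set (mask ranges))

;; Build a sketch set from ASCII characters and pre-normalized ranges.
;; Assumption: every element of ascii-chars has a codepoint below 128.
(define (make-sketch-set ascii-chars ranges)
  (sketch-set
   (for/fold ([mask 0]) ([ch (in-list ascii-chars)])
     (bitwise-ior mask (arithmetic-shift 1 (char->integer ch))))
   ranges))

;; ASCII membership is a single bit test; non-ASCII membership scans ranges.
(define (sketch-member? s ch)
  (define cp (char->integer ch))
  (if (< cp 128)
      (bitwise-bit-set? (sketch-set-mask s) cp)
      (for/or ([r (in-list (sketch-set-ranges s))])
        (<= (car r) cp (cdr r)))))

;; #\a and #\b in the mask, Greek lowercase alpha..omega as one range.
(define example (make-sketch-set (list #\a #\b) (list (cons #x3B1 #x3C9))))

(sketch-member? example #\a) ; → #t (ASCII path)
(sketch-member? example #\λ) ; → #t (range path, λ is U+03BB)
(sketch-member? example #\z) ; → #f
```

A linear scan over ranges keeps the sketch short; a real implementation with many ranges would more likely use binary search over a sorted vector.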
| (require string-tools/char-set) | package: string-tools-lib |
procedure
(char-set? v) → boolean?
v : any/c
> (require string-tools/char-set)
> (char-set? (make-char-set #\a #\b))
#t
> (char-set? "ab")
#f
value
empty-char-set : char-set?
> (require string-tools/char-set)
> (char-set-size empty-char-set)
0
> (char-set-member? empty-char-set #\a)
#f
procedure
(make-char-set ch ...) → char-set?
ch : char?
> (require string-tools/char-set)
> (make-char-set #\a #\b #\a)
(char-set 475368975085586025561263702016 '#())
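The large integer in the printed result is consistent with an ASCII mask in which bit n stands for codepoint n: `#\a` is codepoint 97 and `#\b` is codepoint 98, and 2^97 + 2^98 equals the value shown. The trailing `'#()` is presumably the empty collection of non-ASCII ranges. The following standalone check uses only `racket/base` and reproduces the number under that assumption; it does not call the library.

```racket
#lang racket/base

;; Assumption: bit n of the mask corresponds to codepoint n.
(define mask
  (bitwise-ior (arithmetic-shift 1 (char->integer #\a))   ; bit 97
               (arithmetic-shift 1 (char->integer #\b)))) ; bit 98

mask                                      ; → 475368975085586025561263702016
(= mask 475368975085586025561263702016)   ; → #t
```

The duplicate `#\a` in the call does not change the result, since setting a bit that is already set leaves the mask unchanged.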
procedure
(list->char-set xs) → char-set?
xs : (listof char?)
> (require string-tools/char-set)
> (define cs (list->char-set (list #\a #\b #\a)))
> (char-set-size cs)
2
> (char-set-member? cs #\b)
#t
procedure
(string->char-set s) → char-set?
s : string?
> (require string-tools/char-set)
> (define cs (string->char-set "banana"))
> (char-set-size cs)
3
> (char-set-member? cs #\n)
#t
procedure
(char-set-add cs ch) → char-set?
cs : char-set?
ch : char?
> (require string-tools/char-set)
> (define cs (char-set-add empty-char-set #\x))
> (char-set-member? cs #\x)
#t
procedure
(char-set-add-range cs lo-ch hi-ch) → char-set?
cs : char-set?
lo-ch : char?
hi-ch : char?
> (require string-tools/char-set)
> (define letters (char-set-add-range empty-char-set #\a #\z))
> (char-set-member? letters #\m)
#t
> (char-set-member? letters #\A)
#f
procedure
(char-set-member? cs ch) → boolean?
cs : char-set?
ch : char?
> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (char-set-member? vowels #\e)
#t
> (char-set-member? vowels #\y)
#f
procedure
(char-set-union a b) → char-set?
a : char-set?
b : char-set?
> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (define y (make-char-set #\y))
> (char-set-member? (char-set-union vowels y) #\y)
#t
procedure
(char-set-intersection a b) → char-set?
a : char-set?
b : char-set?
> (require string-tools/char-set)
> (define a (make-char-set #\a #\b #\c))
> (define b (make-char-set #\b #\c #\d))
> (char-set-size (char-set-intersection a b))
2
procedure
(char-set-difference a b) → char-set?
a : char-set?
b : char-set?
> (require string-tools/char-set)
> (define letters (char-set-add-range empty-char-set #\a #\f))
> (define vowels (make-char-set #\a #\e))
> (char-set-member? (char-set-difference letters vowels) #\b)
#t
> (char-set-member? (char-set-difference letters vowels) #\a)
#f
procedure
(char-set-size cs)
cs : char-set?
> (require string-tools/char-set)
> (char-set-size (make-char-set #\a #\b #\a))
2
> (require string-tools/char-set)
> (define vowels (make-char-set #\a #\e #\i #\o #\u))
> (char-set-member? vowels #\e)
#t
> (char-set-size vowels)
5
> (define letters (char-set-add-range empty-char-set #\a #\z))
> (char-set-size (char-set-difference letters vowels))
21