randstr: Random String Generator
(require randstr)    package: randstr
Version 0.1.1
A library for generating random strings based on regex-like patterns.
1 Features
Generate random strings from regex-like patterns
Support for character classes, ranges, and quantifiers
POSIX character classes ([:alpha:], [:digit:], etc.)
Unicode property support (\\p{L}, \\p{Script=Han}, etc.)
Normal distribution quantifiers for realistic length variation
Named groups and backreferences for pattern reuse
Command-line interface for quick generation
Fair distribution: duplicate characters in classes are deduplicated
2 Functions
(randstr "[a-z]{5}")
(randstr "[0-9][a-z]+")
(randstr "(abc|def)+")
(randstr* "[0-9]{3}" 5)
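A minimal usage sketch of the two calls above. It assumes, as the call shapes suggest but this page does not state, that randstr returns a single string and that randstr* returns a list of n strings. The names pin and pins are illustrative only.

```racket
#lang racket/base
(require randstr)

;; One string built from the pattern: three digits.
(define pin (randstr "[0-9]{3}"))

;; Assumed: a list of 5 independently generated strings.
(define pins (randstr* "[0-9]{3}" 5))

(printf "one: ~a\n" pin)
(for ([p pins])
  (printf "many: ~a\n" p))
```

The output differs on every run, so no expected values are shown.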
3 Pattern Syntax
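The following pattern syntax is supported. Backslash escapes are written as they appear inside a Racket string literal, with the backslash doubled (for example \\k<name>).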
{n} - Exactly n repetitions
{n+} - Normal distribution with mean n (2nd order)
{n++} - Normal distribution with mean n (3rd order, more concentrated)
{n1+n2} - Normal distribution in range n1..n2 (2nd order)
{n1++n2} - Normal distribution in range n1..n2 (3rd order)
{+n} - Shorthand for {0+n} (range 0..n)
{++n} - Shorthand for {0++n} (range 0..n, 3rd order)
(?<name>...) - Named group (captures pattern for later reference)
\\k<name> - Backreference to named group
[abc] - Choose randomly from characters a, b, or c
[a-z] - Choose randomly from lowercase letters a through z
(abc|def) - Choose randomly between "abc" or "def"
a* - Zero or more repetitions of the preceding element
a+ - One or more repetitions of the preceding element
a? - Zero or one occurrence of the preceding element
. - Any character
[:alpha:] - Alphabetic characters (POSIX classes are written inside a bracket expression, e.g. [[:alpha:]])
[:digit:] - Numeric characters
[:alnum:] - Alphanumeric characters (the POSIX standard name); [:alphanum:] is accepted as an alias
[:word:] - Word characters (alphanumeric plus underscore)
[:blank:] - Blank characters (space and tab)
[:space:] - Whitespace characters
[:upper:] - Uppercase letters
[:lower:] - Lowercase letters
[:ascii:] - ASCII characters
[:cntrl:] - Control characters
[:graph:] - Printable characters except space
[:print:] - Printable characters including space
[:punct:] - Punctuation characters
[:xdigit:] - Hexadecimal digits
\\p{L} - Unicode letters
\\p{N} - Unicode numbers
\\p{P} - Unicode punctuation
\\p{M} - Unicode marks
\\p{S} - Unicode symbols
\\p{Z} - Unicode separators
\\p{C} - Unicode other (such as control characters)
\\p{Lu} - Unicode uppercase letters
\\p{Ll} - Unicode lowercase letters
\\p{Nd} - Unicode decimal numbers
\\p{Letter} - Unicode letters (alias for \\p{L})
\\p{Number} - Unicode numbers (alias for \\p{N})
\\p{Punctuation} - Unicode punctuation (alias for \\p{P})
\\p{Script=Han} - Unicode characters from Han script
\\p{Script=Latin} - Unicode characters from Latin script
\\p{Block=Basic_Latin} - Unicode characters from Basic Latin block
\\p{Block=CJK_Unified_Ideographs} - Unicode characters from CJK Unified Ideographs block
\\p{Alphabetic} - Unicode alphabetic characters
\\p{Uppercase} - Unicode uppercase characters
\\p{Lowercase} - Unicode lowercase characters
\\p{White_Space} - Unicode whitespace characters
\\p{Cased} - Unicode characters with case distinctions
\\p{Dash} - Unicode dash characters
\\p{Emoji} - Unicode emoji characters
\\p{Emoji_Component} - Unicode emoji component characters
\\p{Emoji_Modifier} - Unicode emoji modifier characters
\\p{Emoji_Modifier_Base} - Unicode emoji modifier base characters
\\p{Emoji_Presentation} - Unicode emoji presentation characters
\\p{Extended_Pictographic} - Unicode extended pictographic characters
\\p{Hex_Digit} - Unicode hexadecimal digits
\\p{ID_Continue} - Unicode identifier continuation characters
\\p{ID_Start} - Unicode identifier start characters
\\p{Ideographic} - Unicode ideographic characters
\\p{Math} - Unicode mathematical symbols
\\p{Quotation_Mark} - Unicode quotation mark characters
4 Advanced Examples
Beyond the basic syntax, patterns can combine POSIX character classes and Unicode properties with ranges and quantifiers:
(randstr "[[:alpha:]]{5}")
(randstr "[[:digit:]]{3}")
(randstr "[[:alnum:]]{4}")
(randstr "[[:word:]]+")
(randstr "[[:upper:]0-9]+")
(randstr "[[:lower:]_]+")
(randstr "[[:alpha:]0-9]+")
(randstr "\\p{L}{5}")
(randstr "\\p{N}{3}")
(randstr "\\p{P}{2}")
(randstr "\\p{Lu}{3}\\p{Ll}{3}")
(randstr "\\p{Letter}{5}")
(randstr "\\p{Number}{3}")
(randstr "\\p{Script=Han}{2}")
(randstr "\\p{Block=Basic_Latin}{5}")
(randstr "\\p{Alphabetic}{4}")
(randstr "\\p{White_Space}{3}")
5 Normal Distribution Quantifiers
Generate strings with lengths following a normal distribution for more realistic random data:
(randstr "\\w{10+}")
(randstr "\\w{10++}")
(randstr "\\w{5+15}")
(randstr "\\w{5++15}")
(randstr "\\d{+10}")
(randstr "\\d{++10}")
A higher order (more + signs) concentrates the generated lengths more tightly around the mean, or around the center of the range when a range is given.
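This page does not say how the "order" is computed. One common way to get a bell-shaped length that tightens as the order grows is to average that many uniform draws. The sketch below illustrates only that idea and is an assumption about the behaviour, not the library's implementation. sample-length and spread are hypothetical helpers.

```racket
#lang racket/base

;; Hypothetical helper: pick a length in [lo, hi] by averaging `order`
;; uniform draws. Averaging more draws clusters results near the midpoint.
(define (sample-length lo hi order)
  (define total
    (for/sum ([i (in-range order)])
      (+ lo (* (random) (- hi lo)))))
  (inexact->exact (round (/ total order))))

;; Standard deviation of 10000 sampled lengths in the range 5..15.
(define (spread order)
  (define xs (for/list ([i (in-range 10000)]) (sample-length 5 15 order)))
  (define mean (/ (apply + xs) (length xs)))
  (sqrt (/ (for/sum ([x xs]) (expt (- x mean) 2)) (length xs))))

(printf "order 2 spread: ~a\n" (exact->inexact (spread 2)))
(printf "order 3 spread: ~a\n" (exact->inexact (spread 3)))
```

Under this assumption the order 3 spread comes out smaller than the order 2 spread, which matches the description of {5++15} as more concentrated than {5+15}.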
6 Named Groups and Backreferences
Capture generated content and reuse it later in the pattern:
(randstr "(?<word>\\w{4})-\\k<word>")
(randstr "(?<id>\\d{3}):\\k<id>")
(randstr "(?<a>[A-Z]{2})(?<b>\\d{2})-\\k<a>\\k<b>")
Named groups are defined with (?<name>...) and referenced with \\k<name>. The backreference will produce the exact same string that was generated by the named group.
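The guarantee described above can be checked with Racket's own regular expressions, which support numbered backreferences. A minimal sketch: repeats-group? is a hypothetical helper, and the literal strings stand in for values that randstr would generate.

```racket
#lang racket/base

;; Hypothetical checker: true when the text after the dash repeats the
;; text before it, the shape expected from "(?<word>\\w{4})-\\k<word>".
(define (repeats-group? s)
  (regexp-match? #px"^(\\w{4})-\\1$" s))

(repeats-group? "aB3_-aB3_") ; both halves are identical
(repeats-group? "aB3_-zzzz") ; the second half differs
```

With the package loaded, (repeats-group? (randstr "(?<word>\\w{4})-\\k<word>")) would be expected to hold, assuming \\w in randstr produces the same ASCII word characters that \\w matches in Racket's pregexp syntax.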
7 Character Class Duplicate Handling
When a character class contains duplicate elements, each unique character is treated equally regardless of how many times it appears in the class. For example:
[aaabbbccc] - Each of a, b, c has equal probability (1/3 each); repeating a character does not change its weight
[a-cb-e] - Each of a, b, c, d, e has equal probability (1/5 each)
[[:digit:]0-2] - Digits 0, 1, 2 appear in both the POSIX class and the range, but each digit still has equal probability
This ensures fair distribution of character selection in all character classes.
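The effect of deduplication can be sketched independently of the library. The code below is an illustration under stated assumptions, not the package's implementation: pick-fair is a hypothetical helper, and remove-duplicates is used for brevity in place of the hash-set approach mentioned in the changelog.

```racket
#lang racket/base
(require racket/list)

;; Hypothetical helper: choose uniformly among the distinct characters
;; of `chars`, so repeated entries do not add weight.
(define (pick-fair chars)
  (define unique (remove-duplicates chars))
  (list-ref unique (random (length unique))))

;; Tally 9000 picks from a class in which #\a is listed three times.
(define counts (make-hash))
(for ([i (in-range 9000)])
  (hash-update! counts (pick-fair (string->list "aaabc")) add1 0))

(for ([c (in-list '(#\a #\b #\c))])
  (printf "~a: ~a\n" c (hash-ref counts c 0)))
```

Each tally should land near 3000. Without the deduplication step, a would be chosen roughly three times as often as b or c.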
8 Changelog
8.1 Version 0.1.1
New: Normal distribution quantifiers ({n+}, {n++}, etc.) for realistic length variation
New: Range normal distribution ({n1+n2}, {n1++n2}, {+n}, {++n})
New: Named groups (?<name>...) for capturing generated content
New: Backreferences \\k<name> for reusing captured content
8.2 Version 0.1.0
Initial stable release
Fixed: \\W no longer incorrectly matches underscore
Performance: Optimized character class deduplication with O(1) hash-set lookups
Cleaned up internal code architecture
9 License
This project is licensed under the MIT License. See the "LICENSE" file for details.