On this page:
1.1 Module codepoint
*max-codepoint-value*
codepoint?
codepoint-non-character?
codepoint-utf16-surrogate?
codepoint-private-use?
codepoint-plane
codepoint-plane-name
codepoint->char
char->codepoint
codepoint->unicode-string
string->codepoint
assert-codepoint!
1.2 Module codepoint/range
codepoint-range
assert-codepoint-range!
pair->codepoint-range
codepoint->codepoint-range
codepoint-range-length
codepoint-range=?
codepoint-range<?
codepoint-range>?
codepoint-range-contains?
codepoint-range-contains-any?
codepoint-range-contains-all?
codepoint-range-intersects?
codepoint-range-any-intersects?
codepoint-range->inclusive-range
codepoint-range->in-inclusive-range
codepoint-range->unicode-string
1.3 Module codepoint/range-dict
range-dict?
make-range-dict
range-dict-count
range-dict-has-key?
range-dict-ref

1 Codepoint types

A codepoint value is simply an exact-nonnegative-integer? in the inclusive range zero to *max-codepoint-value*. Note that not all codepoints correspond to characters (see codepoint-non-character?, codepoint-utf16-surrogate?, and codepoint-private-use?).

1.1 Module codepoint

 (require codepoint) package: codepoint

*max-codepoint-value* is the currently defined maximum value for a codepoint.

procedure

(codepoint? v)  boolean?

  v : any/c
Returns #t if the provided value is a valid codepoint, that is, an exact-nonnegative-integer? in the inclusive range zero to *max-codepoint-value*.
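
The test described above can be sketched in plain Racket without the package installed. The names sketch-max-codepoint-value and sketch-codepoint? are hypothetical stand-ins, and the maximum is assumed to be the Unicode limit U+10FFFF; the package's own definitions may differ.

```racket
#lang racket/base
;; Hypothetical stand-ins; the real bindings come from (require codepoint).
;; Assumption: the maximum is the Unicode limit U+10FFFF.
(define sketch-max-codepoint-value #x10FFFF)

;; Accepts any value and answers whether it lies in 0..max as an exact integer.
(define (sketch-codepoint? v)
  (and (exact-nonnegative-integer? v)
       (<= v sketch-max-codepoint-value)))

(sketch-codepoint? 167)       ; #t
(sketch-codepoint? #x110000)  ; #f, one past the maximum
(sketch-codepoint? 1.0)       ; #f, not an exact integer
(sketch-codepoint? "a7")      ; #f, not a number at all
```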

procedure

(codepoint-non-character? c)  boolean?

  c : codepoint?
Returns #t if the codepoint is one of the Unicode non-character values.

procedure

(codepoint-utf16-surrogate? c)  boolean?

  c : codepoint?
Returns #t if the codepoint is one of the reserved Unicode UTF-16 surrogate values.

procedure

(codepoint-private-use? c)  boolean?

  c : codepoint?
Returns #t if the codepoint is one of the reserved Unicode private use values.
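
The three predicates above correspond to fixed ranges in the Unicode standard. The following is a minimal sketch in plain Racket; the sketch-* names are hypothetical, each function assumes its argument is already a valid codepoint, and the package's own boundaries may differ (for example in whether the non-characters at the end of planes 15 and 16 count as private use).

```racket
#lang racket/base
;; Hypothetical sketches built from the ranges fixed by the Unicode standard.
;; Each function assumes c is already a valid codepoint.

;; Non-characters: U+FDD0..U+FDEF plus the last two codepoints of every plane.
(define (sketch-non-character? c)
  (or (<= #xFDD0 c #xFDEF)
      (>= (bitwise-and c #xFFFF) #xFFFE)))

;; UTF-16 surrogates: U+D800..U+DFFF.
(define (sketch-utf16-surrogate? c)
  (<= #xD800 c #xDFFF))

;; Private use: U+E000..U+F8FF, and planes 15 and 16 without their
;; trailing non-characters (an assumption about where to draw the line).
(define (sketch-private-use? c)
  (or (<= #xE000 c #xF8FF)
      (<= #xF0000 c #xFFFFD)
      (<= #x100000 c #x10FFFD)))

(sketch-non-character? #x1FFFF)   ; #t
(sketch-utf16-surrogate? #xD800)  ; #t
(sketch-private-use? 167)         ; #f
```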

codepoint-plane returns the integer (0..16) that represents the plane within the Unicode codepoint set that contains the provided codepoint. Planes are described in the standard, chapter 2, section 2.8 Unicode Allocation.
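
The plane number follows directly from the codepoint value, because each plane spans 65,536 (#x10000) codepoints. A minimal sketch in plain Racket, where sketch-codepoint-plane is a hypothetical stand-in for codepoint-plane:

```racket
#lang racket/base
;; Hypothetical stand-in: each plane covers #x10000 codepoints, so integer
;; division by #x10000 yields the plane index.
(define (sketch-codepoint-plane c)
  (quotient c #x10000))

(sketch-codepoint-plane (char->integer #\§))  ; 0
(sketch-codepoint-plane #x1F600)              ; 1
(sketch-codepoint-plane #x10FFFF)             ; 16
```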

procedure

(codepoint-plane-name c)  (or/c symbol? #f)

  c : codepoint?
Returns the name of the plane within the Unicode codepoint set that contains the provided codepoint. If the plane is not named by the standard the response is #f.

Examples:
> (codepoint-plane-name (char->codepoint #\§))

'basic-multilingual-plane

> (codepoint-plane-name (char->codepoint #\😀))

'supplementary-multilingual-plane

procedure

(codepoint->char c)  char?

  c : codepoint?
Returns the character corresponding to the provided codepoint, provided that the codepoint is valid and is not a non-character, UTF-16 surrogate, or private-use value.

Example:
> (codepoint->char 167)

#\§

procedure

(char->codepoint v)  codepoint?

  v : char?
Return the codepoint for the provided character.

Example:
> (format "~x" (char->codepoint #\§))

"a7"

procedure

(codepoint->unicode-string c)  string?

  c : codepoint?
Return a string that formats the codepoint in the manner used in the Unicode specification.
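
The Unicode specification writes codepoints as U+ followed by uppercase hexadecimal digits, padded to at least four digits. A minimal sketch of that formatting in plain Racket follows; sketch-codepoint->unicode-string is a hypothetical name, and it is an assumption that the package pads in exactly this way.

```racket
#lang racket/base
;; Hypothetical stand-in for the formatting described above.
;; Assumption: uppercase hex, zero-padded to a minimum of four digits.
(define (sketch-codepoint->unicode-string c)
  (define hex (string-upcase (number->string c 16)))
  (define pad (max 0 (- 4 (string-length hex))))
  (string-append "U+" (make-string pad #\0) hex))

(sketch-codepoint->unicode-string 167)       ; "U+00A7"
(sketch-codepoint->unicode-string #x1F600)   ; "U+1F600"
```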

procedure

(string->codepoint str)  codepoint?

  str : string?
Converts a string to a codepoint. The string may use any Racket integer syntax, the C-style 0x prefix, or the Unicode U+ notation. As the first example shows, a bare digit string is read as decimal.

Examples:
> (string->codepoint "0304")

304

> (string->codepoint "#x0304")

772

> (string->codepoint "0x0304")

772

> (string->codepoint "U+0304")

772
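
The four input styles in the examples above can be handled with a small dispatch on the prefix. This is only a sketch in plain Racket: sketch-string->codepoint is a hypothetical name, and returning #f for unparseable or out-of-range input is an assumption, since the behavior of string->codepoint on bad input is not documented here.

```racket
#lang racket/base
;; Hypothetical sketch of prefix-based parsing.
;; Assumption: invalid or out-of-range input yields #f.
(define (sketch-string->codepoint str)
  ;; Parse everything after the first n characters as hexadecimal.
  (define (hex-after-prefix n)
    (string->number (substring str n) 16))
  (define n
    (cond
      [(regexp-match? #rx"^[Uu][+]" str) (hex-after-prefix 2)] ; "U+0304"
      [(regexp-match? #rx"^0[xX]" str) (hex-after-prefix 2)]   ; "0x0304"
      [else (string->number str)]))                            ; "0304", "#x0304"
  (and (exact-nonnegative-integer? n)
       (<= n #x10FFFF)
       n))

(sketch-string->codepoint "0304")    ; 304
(sketch-string->codepoint "#x0304")  ; 772
(sketch-string->codepoint "0x0304")  ; 772
(sketch-string->codepoint "U+0304")  ; 772
```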

procedure

(assert-codepoint! v [name])  void?

  v : any/c
  name : symbol? = 'v
Raises an argument error if the provided value is not a valid codepoint. The optional parameter name is used to override the name of the value v reported in the error. See raise-argument-error.

1.2 Module codepoint/range

 (require codepoint/range) package: codepoint

Many of the properties defined by the Unicode standard are assigned to ranges of codepoints rather than to individual values. The codepoint-range structure represents such a range as a typed pair of start and end codepoint values.

struct

(struct codepoint-range (start end)
    #:constructor-name make-codepoint-range
    #:prefab)
  start : codepoint?
  end : codepoint?
This structure represents the inclusive range start..end. It ensures that start and end are both codepoints and that start ≤ end. It can be used to test codepoint inclusion and as the basis for iteration.
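
One way to enforce the constraints on a range is a checking constructor wrapped around a plain prefab struct. The sketch below uses plain Racket only; sketch-range, make-sketch-range, and codepoint-ish? are hypothetical names, and how the package itself performs the validation is not shown in this documentation.

```racket
#lang racket/base
;; Hypothetical sketch: a prefab struct plus a validating constructor.
(struct sketch-range (start end) #:prefab)

;; Assumption: a codepoint is an exact integer in 0..#x10FFFF.
(define (codepoint-ish? v)
  (and (exact-nonnegative-integer? v) (<= v #x10FFFF)))

;; Checks both bounds and their order before building the struct.
(define (make-sketch-range start end)
  (unless (codepoint-ish? start)
    (raise-argument-error 'make-sketch-range "codepoint?" start))
  (unless (codepoint-ish? end)
    (raise-argument-error 'make-sketch-range "codepoint?" end))
  (unless (<= start end)
    (raise-arguments-error 'make-sketch-range
                           "start must not exceed end"
                           "start" start
                           "end" end))
  (sketch-range start end))

(make-sketch-range 97 122)  ; '#s(sketch-range 97 122)
```

Calling (make-sketch-range 122 97) in this sketch raises a contract error instead of building a reversed range.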

procedure

(assert-codepoint-range! v [name])  void?

  v : any/c
  name : symbol? = 'v
Raises an argument error if the provided value is not a valid codepoint-range. The optional parameter name is used to override the name of the value v reported in the error. See raise-argument-error.

pair->codepoint-range creates a new range from a pair of codepoint values, where the car is the start and the cdr is the end.

Example:
> (pair->codepoint-range '(0 . 127))

'#s(codepoint-range 0 127)

codepoint->codepoint-range creates a new range from a single codepoint value, used as both start and end.

Example:
> (codepoint->codepoint-range 0)

'#s(codepoint-range 0 0)

codepoint-range-length returns the number of codepoints within the range.
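
Because a range is inclusive at both ends, its length is end minus start plus one. A minimal sketch in plain Racket, modeling a range as a (start . end) pair; sketch-range-length is a hypothetical name:

```racket
#lang racket/base
;; Hypothetical sketch: a range is modeled as a (start . end) pair.
;; Both ends are included, hence the +1.
(define (sketch-range-length r)
  (+ 1 (- (cdr r) (car r))))

(sketch-range-length '(0 . 127))  ; 128
(sketch-range-length '(0 . 0))    ; 1
```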

procedure

(codepoint-range=? lhs rhs)  boolean?

  lhs : codepoint-range?
  rhs : codepoint-range?
Returns #t if the codepoint-range lhs is equal to the codepoint-range rhs.

procedure

(codepoint-range<? lhs rhs)  boolean?

  lhs : codepoint-range?
  rhs : codepoint-range?
Returns #t if the codepoint-range lhs is less than, and not overlapping, the codepoint-range rhs.

procedure

(codepoint-range>? lhs rhs)  boolean?

  lhs : codepoint-range?
  rhs : codepoint-range?
Returns #t if the codepoint-range lhs is greater than, and not overlapping, the codepoint-range rhs.

procedure

(codepoint-range-contains? cpr cp)  boolean?

  cpr : codepoint-range?
  cp : codepoint?
Returns #t if the codepoint cp is contained within the codepoint-range cpr.

procedure

(codepoint-range-contains-any? cpr cp ...)  boolean?

  cpr : codepoint-range?
  cp : codepoint?
Returns #t if any of the codepoint values in the list cp is contained within the codepoint-range cpr.

procedure

(codepoint-range-contains-all? cpr cp ...)  boolean?

  cpr : codepoint-range?
  cp : codepoint?
Returns #t if all of the codepoint values in the list cp are contained within the codepoint-range cpr.

procedure

(codepoint-range-intersects? lhs rhs)  boolean?

  lhs : codepoint-range?
  rhs : codepoint-range?
Returns #t if the codepoint-range lhs overlaps in any way with the codepoint-range rhs.
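
The ordering and intersection predicates above reduce to comparisons of the range endpoints. The sketch below models ranges as (start . end) pairs in plain Racket. The sketch-* names are hypothetical, and reading "less than, and not overlapping" as "ends before the other starts" is an assumption about the package's semantics.

```racket
#lang racket/base
;; Hypothetical sketches; ranges are (start . end) pairs with start <= end.

;; lhs lies entirely before rhs.
(define (sketch-range<? lhs rhs)
  (< (cdr lhs) (car rhs)))

;; lhs lies entirely after rhs.
(define (sketch-range>? lhs rhs)
  (> (car lhs) (cdr rhs)))

;; The ranges share at least one codepoint.
(define (sketch-range-intersects? lhs rhs)
  (and (<= (car lhs) (cdr rhs))
       (<= (car rhs) (cdr lhs))))

(sketch-range<? '(0 . 127) '(128 . 255))             ; #t
(sketch-range-intersects? '(0 . 127) '(128 . 255))   ; #f
(sketch-range-intersects? '(0 . 127) '(100 . 200))   ; #t
```

Under this reading, exactly one of the three predicates holds for any two ranges.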

procedure

(codepoint-range-any-intersects? cpr-list)  boolean?

  cpr-list : (listof codepoint-range?)
Returns #t if any codepoint-range within the list cpr-list overlaps in any way with any other.
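
A pairwise overlap check over a list does not need to compare every pair: after sorting by start, any overlap shows up between two neighbors. The sketch below illustrates this in plain Racket with ranges modeled as (start . end) pairs. sketch-any-intersects? is a hypothetical name, and the sort-based approach is an illustration, not necessarily what the package does.

```racket
#lang racket/base
;; Hypothetical sketch; ranges are (start . end) pairs with start <= end.
;; After sorting by start, two ranges overlap somewhere in the list exactly
;; when some range ends at or after the start of its successor.
(define (sketch-any-intersects? ranges)
  (define sorted (sort ranges < #:key car))
  (for/or ([a (in-list sorted)]
           [b (in-list (if (null? sorted) '() (cdr sorted)))])
    (>= (cdr a) (car b))))

(sketch-any-intersects? '((0 . 127) (128 . 255)))              ; #f
(sketch-any-intersects? '((256 . 383) (0 . 127) (100 . 200)))  ; #t
```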

procedure

(codepoint-range->inclusive-range cpr)  range?

  cpr : codepoint-range?
Similar to codepoint-range->in-inclusive-range, but returns a list of codepoints rather than a sequence.

procedure

(codepoint-range->in-inclusive-range cpr)  range?

  cpr : codepoint-range?
Returns a sequence (that is also a stream) whose elements are the codepoints from start to end, inclusive.

Examples:
> (define ascii-lowercase-letters
    (make-codepoint-range
      (char->codepoint #\a)
      (char->codepoint #\z)))
> (for ([letter (codepoint-range->in-inclusive-range ascii-lowercase-letters)])
    (display (codepoint->char letter)))

abcdefghijklmnopqrstuvwxyz

codepoint-range->unicode-string returns a string that formats the range in the manner used in the Unicode specification.

Examples:
> (define ascii-lowercase-letters
    (make-codepoint-range
      (char->codepoint #\a)
      (char->codepoint #\z)))
> (display (codepoint-range->unicode-string ascii-lowercase-letters))

U+0061..U+007A

1.3 Module codepoint/range-dict

 (require codepoint/range-dict) package: codepoint

Some of the Unicode character property files record a shared property once for a whole codepoint range, which takes less space both as data in the package and in memory at runtime. However, such data cannot be indexed directly by codepoint to find a property value. The range-dict structure provides basic dict? functions that take a codepoint as key and search through the ranges to find a match.
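
The lookup described above can be sketched as a binary search over a vector of range entries. This is an illustration in plain Racket under stated assumptions: entries are sorted by start and do not overlap, ranges are modeled as (start . end) pairs, and sketch-range-ref is a hypothetical name. The package may search differently.

```racket
#lang racket/base
;; Hypothetical sketch of a range-keyed lookup.
;; Each entry is (cons (cons start end) value); entries are assumed to be
;; sorted by start and non-overlapping.
(define (sketch-range-ref entries key [default #f])
  (let loop ([lo 0] [hi (sub1 (vector-length entries))])
    (if (> lo hi)
        default                                   ; no range contains key
        (let* ([mid (quotient (+ lo hi) 2)]
               [entry (vector-ref entries mid)]
               [range (car entry)])
          (cond
            [(< key (car range)) (loop lo (sub1 mid))]   ; key is left of mid
            [(> key (cdr range)) (loop (add1 mid) hi)]   ; key is right of mid
            [else (cdr entry)])))))                      ; key is inside mid

(define blocks
  (vector (cons '(0 . 127) "Basic Latin")
          (cons '(128 . 255) "Latin-1 Supplement")
          (cons '(256 . 383) "Latin Extended-A")
          (cons '(384 . 591) "Latin Extended-B")))

(sketch-range-ref blocks 167)  ; "Latin-1 Supplement"
(sketch-range-ref blocks 600)  ; #f
```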

procedure

(range-dict? v)  boolean?

  v : any/c
Returns #t if the provided value is a valid codepoint range-dict.

procedure

(make-range-dict data)  range-dict?

  data : (listof (cons/c codepoint-range? hash?))
Constructs a new range-dict from a list of pairs, where each pair maps a codepoint-range to a property hash.

Example:
> (make-range-dict
    (list
      (cons
        (make-codepoint-range 0  127)
        (make-hash '((block-name "Basic Latin"))))
      (cons
        (make-codepoint-range 128 255)
        (make-hash '((block-name "Latin-1 Supplement"))))
      (cons
        (make-codepoint-range 256 383)
        (make-hash '((block-name "Latin Extended-A"))))
      (cons
        (make-codepoint-range 384 591)
        (make-hash '((block-name "Latin Extended-B"))))))

'#s(rangedict
    #((#s(codepoint-range 0 127) . #hash((block-name . ("Basic Latin"))))
      (#s(codepoint-range 128 255)
       .
       #hash((block-name . ("Latin-1 Supplement"))))
      (#s(codepoint-range 256 383)
       .
       #hash((block-name . ("Latin Extended-A"))))
      (#s(codepoint-range 384 591)
       .
       #hash((block-name . ("Latin Extended-B"))))))

procedure

(range-dict-count dict)  exact-nonnegative-integer?

  dict : range-dict?
Returns the number of keys mapped by range-dict.

procedure

(range-dict-has-key? dict key)  boolean?

  dict : range-dict?
  key : codepoint-range?
Returns #t if dict contains a value for the given key, #f otherwise.

procedure

(range-dict-ref dict key failure-result)  hash?

  dict : range-dict?
  key : codepoint?
  failure-result : (lambda () (raise-arguments-error ...))
Returns the value for key in dict. If no value is found for key, then failure-result determines the result: