3 Representation of Hypersnippet Data

Punctaffy revolves around a few particular data structures.

A hypersnippet, or a snippet for short, is a region of code that’s bounded by lower-degree snippets. The degree of a snippet is typically a number representing its dimension in a geometric sense. For instance, a degree-3 snippet is bounded by degree-2 snippets, which are bounded by degree-1 snippets, which are bounded by degree-0 snippets, just as a 3D cube is bounded by 2D squares, which are bounded by 1D line segments, which are bounded by 0D points. One of the boundaries of a hypersnippet is the opening delimiter. The others are the closing delimiters, or the holes for short. This name comes from the idea that a degree-3 snippet is like an expression (degree-2 snippet) with expression-shaped holes.

While a degree-3 snippet primarily has degree-2 holes, it’s also important to note that its degree-2 opening delimiter has degree-1 holes, and the degree-1 opening delimiter of that opening delimiter has a degree-0 hole. Most Punctaffy operations traverse the holes of every dimension at once, largely just because we’ve found that to be a useful approach.

The idea of a hypersnippet is specific enough to suggest quite a few operations, but it leaves the actual content of the code inside the snippet unspecified. We could say that the content is some sequence of bytes or some Unicode text, but we have a lot of options there, and it’s worth generalizing over them so that we don’t have to implement a new library each time. So the basic operations of a hypersnippet are represented in Punctaffy as generic operations that multiple data structures might be able to implement.

Snippets don’t identify their own snippet nature. Instead, each hypersnippet operation takes a hypersnippet system (aka a snippet system) argument, and it uses that to look up the appropriate hypersnippet functionality.
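The idea of passing the snippet system explicitly can be illustrated with a minimal sketch. This is not Punctaffy's Racket API: the names `SnippetSys`, `hole_degrees`, `TUPLE_SYS`, and `DICT_SYS` are hypothetical, and the two toy representations record only a degree and a list of holes. The point is only that the generic operation never inspects the snippet directly; it asks the system it was given.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class SnippetSys:
    """Hypothetical record of the operations for one snippet representation."""
    degree: Callable[[Any], int]        # the degree of a snippet
    holes: Callable[[Any], List[Any]]   # the holes of a snippet


def hole_degrees(ss: SnippetSys, snippet: Any) -> List[int]:
    """Generic operation: looks up functionality in `ss`, not in `snippet`."""
    return sorted(ss.degree(h) for h in ss.holes(snippet))


# Two toy representations of the same shape (assumptions for illustration).
TUPLE_SYS = SnippetSys(degree=lambda s: s[0], holes=lambda s: s[1])
DICT_SYS = SnippetSys(degree=lambda s: s["degree"], holes=lambda s: s["holes"])

# A degree-2 snippet with a degree-0 hole and a degree-1 hole,
# where the degree-1 hole has a degree-0 hole of its own.
as_tuple = (2, [(0, []), (1, [(0, [])])])
as_dict = {
    "degree": 2,
    "holes": [
        {"degree": 0, "holes": []},
        {"degree": 1, "holes": [{"degree": 0, "holes": []}]},
    ],
}

print(hole_degrees(TUPLE_SYS, as_tuple))
print(hole_degrees(DICT_SYS, as_dict))
```

Because the data itself carries no indication of which system it belongs to, the same tuple could be read by a different system in a different way, which is the flexibility the explicit argument buys.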

A dimension system is a collection of implementations of the arithmetic operations we need on dimension numbers. (A dimension number is the "3" in the phrase "degree-3 hypersnippet." It generally represents the set of smaller dimension numbers that are allowed for a snippet’s holes.) For what we’re doing so far, it turns out we only need to compare dimension numbers and take their maximum. For some purposes, it may be useful to use dimension numbers that aren’t quite numbers in the usual sense, such as dimensions that are infinite or symbolic.
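As a rough illustration, a dimension system can be pictured as a small bundle of operations. The sketch below is not Punctaffy's actual interface: `DimSys`, `zero`, `less_than`, `max_of`, `NAT_DIM_SYS`, `EXT_DIM_SYS`, and `OMEGA` are all hypothetical names, and treating the maximum of no dimension numbers as a designated least element is an assumption made here. It shows comparison and maximum for ordinary natural numbers and for natural numbers extended with one symbolic infinite dimension.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class DimSys:
    """Hypothetical dimension system: only comparison and maximum."""
    zero: Any                             # assumed least dimension number
    less_than: Callable[[Any, Any], bool]

    def max_of(self, dims: Iterable[Any]) -> Any:
        # Fold with the system's own comparison instead of Python's max().
        return reduce(
            lambda a, b: b if self.less_than(a, b) else a, dims, self.zero
        )


# Natural-number dimensions.
NAT_DIM_SYS = DimSys(zero=0, less_than=lambda a, b: a < b)

# Natural numbers plus one symbolic dimension greater than all of them.
OMEGA = "omega"


def _ext_less_than(a: Any, b: Any) -> bool:
    if a == OMEGA:
        return False          # nothing is greater than omega
    if b == OMEGA:
        return True           # every finite dimension is below omega
    return a < b


EXT_DIM_SYS = DimSys(zero=0, less_than=_ext_less_than)

print(NAT_DIM_SYS.max_of([1, 3, 2]))
print(EXT_DIM_SYS.max_of([2, OMEGA, 5]))
```

Code written against `DimSys` never assumes that a dimension number is a Python integer, which is what lets the second system slot in unchanged.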

A hypersnippet system always has some specific dimension system it’s specialized for. We tend to find that notions of hypersnippet make sense independently of a specific dimension system, so we sometimes represent these notions abstractly as a kind of functor from a dimension system to a snippet system. In practical terms, a functor like this lets us convert between two snippet systems that vary only in their choice of dimension system, as long as we have some way to convert between the dimension systems in question.
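In practical terms, the conversion amounts to applying a dimension-number conversion everywhere a dimension number appears in a snippet. The following is a toy sketch under stated assumptions, not Punctaffy's implementation: snippets are modeled as `(degree, holes)` tuples, `convert_dims` and `nat_to_tagged` are hypothetical names, and the conversion function is assumed to preserve the ordering of dimension numbers.

```python
from typing import Any, Callable, List, Tuple

# Toy snippet representation (an assumption): a degree and a list of holes.
Snippet = Tuple[Any, List[Any]]


def convert_dims(convert: Callable[[Any], Any], snippet: Snippet) -> Snippet:
    """Rebuild `snippet` with every dimension number passed through `convert`."""
    degree, holes = snippet
    return (convert(degree), [convert_dims(convert, h) for h in holes])


def nat_to_tagged(n: int) -> Tuple[str, int]:
    """Hypothetical embedding of natural-number dimensions into a tagged form."""
    return ("finite", n)


snippet = (2, [(0, []), (1, [(0, [])])])
print(convert_dims(nat_to_tagged, snippet))
```

The structure of the snippet is untouched; only the dimension numbers change, which is the sense in which the two snippet systems differ only in their choice of dimension system.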

A hypertee is a kind of hypersnippet data structure that represents a region of code that doesn’t contain content of any sort at all. A hypertee may not have content, but it still has a boundary, and hypertees tend to arise as the description of the shape of a hypersnippet. For instance, when we try to graft a snippet into the hole of another snippet, it needs to have a shape that’s compatible with the shape of that hole.

A hypernest is a kind of hypersnippet data structure that generalizes some other kind of hypersnippet (typically hypertees) by adding bumps. Bumps are like holes that are already filled in, but with a seam showing. The filling, called the bump’s interior, is considered to be nested deeper than the surrounding content. A bump can contain other bumps. A bump can also contain holes of degree equal to or greater than the bump’s degree.

A hypernest is a generalization of an s-expression or other syntax tree. In an s-expression, the bumps are the pairs of parentheses and the atoms. In a syntax tree, the bumps are the nodes. Just as trees are rather effective representations of lots of different kinds of structured programs and structured data, so are hypernests, and that makes them Punctaffy’s primary hypersnippet data structure.
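The s-expression correspondence can be sketched in a few lines. This is a deliberately reduced model, not Punctaffy's hypernest representation: `Bump`, `sexp_to_bumps`, `count_bumps`, and `depth` are hypothetical names, s-expressions are modeled as nested Python lists of strings, and holes and higher degrees are omitted entirely. Each pair of parentheses and each atom becomes one bump, and a bump's interior holds the bumps nested inside it.

```python
from dataclasses import dataclass, field
from typing import List, Union

SExp = Union[str, List["SExp"]]


@dataclass
class Bump:
    """Hypothetical bump: a label plus the bumps nested in its interior."""
    label: str
    interior: List["Bump"] = field(default_factory=list)


def sexp_to_bumps(sexp: SExp) -> Bump:
    if isinstance(sexp, list):
        # A pair of parentheses is a bump whose interior holds its elements.
        return Bump("()", [sexp_to_bumps(x) for x in sexp])
    # An atom is a bump with an empty interior.
    return Bump(sexp)


def count_bumps(bump: Bump) -> int:
    return 1 + sum(count_bumps(b) for b in bump.interior)


def depth(bump: Bump) -> int:
    return 1 + max((depth(b) for b in bump.interior), default=0)


# The s-expression (a (b c)).
tree = sexp_to_bumps(["a", ["b", "c"]])
print(count_bumps(tree), depth(tree))
```

For `(a (b c))` this gives two parenthesis bumps and three atom bumps. What the sketch cannot express is a bump with holes in it, which is the part hypernests add beyond ordinary trees.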

    3.1 Dimension Systems

      3.1.1 Dimension Systems in General

      3.1.2 Category-Theoretic Dimension System Manipulations

      3.1.3 Commonly Used Dimension Systems

    3.2 Snippet Systems

      3.2.1 Snippet Systems in General

      3.2.2 Category-Theoretic Snippet System Manipulations

      3.2.3 Snippet Format Systems in General

      3.2.4 Category-Theoretic Snippet Format System Manipulations

    3.3 Hypertees: The Shape of a Hypersnippet of Text

      3.3.1 Hypertee Coils

      3.3.2 Hypertee Brackets

      3.3.3 Hypertee Constructors and Operations

    3.4 Hypernests: Nested Hypersnippets

      3.4.1 Hypernest Coils

      3.4.2 Hypernest Brackets

      3.4.3 Hypernest Constructors and Operations

    3.5 Hyperstacks: The State of a Text-to-Hypernest Parser