pprint-compact:   a non-greedy pretty printer
1 Getting Started
2 Documents
doc?
3 Best Practice for Document Construction
4 Library Documentation
4.1 Rendering Documents
pretty-print
pretty-format
4.2 Constructing Documents
text
flush
alt
h-append
h-concat
v-append
v-concat
hs-append
hs-concat
sep
fail
annotate
flat
flush-if
4.3 Useful Constants
empty-doc
lparen
rparen
space
4.4 Parameters
current-page-width
current-page-indent
4.5 Match Expanders
:  text
:  flush
:  concat
:  alternatives
:  annotate
:  fail
5 Annotation
6 Processing Documents
6.1 Memoization Library
memoize
6.2 Document Processing Library
doc-process
6.3 Tutorial:   Implementing flat
7 Debug
dag-size
tree-size
8 Design Notes
9 Acknowledgment
Bibliography
8.2

pprint-compact: a non-greedy pretty printer

Sorawee Porncharoenwase <sorawee.pwase@gmail.com>

 (require pprint-compact) package: pprint-compact

This library implements an optimal pretty printer inspired by Bernady (2017) and Podkopaev and Boulytchev (2014). The pretty printer generates the most optimal textual document from a tree structure.

The library is similar to another pretty printer library pprint, but pprint implements a greedy printer and doesn’t support the choice operator (alt). In practice, this means pprint will be more efficient, but it lacks expressivity that this library provides, and could produce a non-optimal layout.

Unlike the canonical Haskell implementation which removes the choice operator to improve performance, we implement all constructs described in Bernady (2017), with additional optimizations, constructs for practical printers, and a much better handling of overflow.

A part of this documentation and its structure are shamelessly copied/adapted from the PPrint library.

    1 Getting Started

    2 Documents

    3 Best Practice for Document Construction

    4 Library Documentation

      4.1 Rendering Documents

      4.2 Constructing Documents

      4.3 Useful Constants

      4.4 Parameters

      4.5 Match Expanders

    5 Annotation

    6 Processing Documents

      6.1 Memoization Library

      6.2 Document Processing Library

      6.3 Tutorial: Implementing flat

    7 Debug

    8 Design Notes

    9 Acknowledgment

    Bibliography

1 Getting Started

Here’s a simple example of pretty-printing a fragment of code.

Examples:
> (define doc
    (v-append
      (text "while (true) {")
      (h-append (text "    ")
                (v-append (text "f();")
                          (alt (hs-append (text "if (done())")
                                          (text "exit();"))
                               (v-append (text "if (done())")
                                         (h-append (text "    ")
                                                   (text "exit();"))))))
      (text "}")))
> (pretty-print doc)

while (true) {

    f();

    if (done()) exit();

}

With a different page width limit, the document could be printed differently:

Example:
> (pretty-print doc #:width 20)

while (true) {

    f();

    if (done())

        exit();

}

2 Documents

Formatting text involves creating an “abstract document” or doc, which encapsulates formatting information for the pretty printer. The library functions (see Constructing Documents) build and combine docs, which can then be rendered for pretty printing (see Rendering Documents).

procedure

(doc? x)  boolean?

  x : any/c
Determines whether x is a member of the doc datatype.

3 Best Practice for Document Construction

The arguments to alt should roughly have the same content, albeit with different formats. This means that the tree size of a doc containing alt tends to blow up exponentially. The time complexity of the algorithm used by this library however depends on the DAG size of the doc, so provided that sub-documents are sufficiently shared, the DAG size should be small enough to allow efficient pretty printing.

As an example, say we want to pretty print an S-expression with two possible layouts for each “list”: horizontal and vertical. That is,

(1 2 3)

could be rendered as itself or

(1
 2
 3)

We can construct a function to convert an S-expression to a doc:

> (define (pretty s)
    (match s
      [(list xs ...)
       ; Calculate all subdocuments first to share their references
       (define xs* (map pretty xs))
  
       (alt (h-append lparen (hs-concat xs*) rparen)
            (h-append lparen (v-concat xs*) rparen))]
      [_ (text s)]))

And then pretty print it:

> (define doc (pretty '("1" "2" "3")))
> (pretty-print doc)

(1 2 3)

> (pretty-print doc #:width 3)

(1

 2

 3)

The important point is that we reuse xs* across branches of alt. Had we call (map pretty xs) twice in branches of alt, both doc construction and pretty-print would be inefficient.

4 Library Documentation

4.1 Rendering Documents

procedure

(pretty-print d    
  [#:out out    
  #:width width    
  #:indent indent])  void?
  d : doc?
  out : output-port? = (current-output-port)
  width : (or/c +inf.0 natural-number/c) = (current-page-width)
  indent : natural-number/c = (current-page-indent)
Pretty prints the doc d to the output port out with a maximum page width of width and an initial indentation level for subsequent lines of indent.

Examples:
> (define prefix-s "value is: ")
> (begin
    (display prefix-s)
    (pretty-print (v-append (text "a") (text "b") (text "c"))
                  #:indent (string-length prefix-s)))

value is: a

          b

          c

procedure

(pretty-format d    
  [#:width width    
  #:indent indent])  string?
  d : doc?
  width : (or/c +inf.0 natural-number/c) = (current-page-width)
  indent : natural-number/c = (current-page-indent)
Pretty prints the doc d as a string with a maximum page width of width.

4.2 Constructing Documents

procedure

(text s)  doc?

  s : string?
Constructs a doc containing the fixed string s. s must not contain a newline character.

procedure

(flush d)  doc?

  d : doc?
Constructs a doc which is like d but with a newline at the end.

procedure

(alt x ...)  doc?

  x : doc?
Constructs a doc which is rendered to one of xs, whichever resulting in the prettiest layout for the whole document. If given no arguments, the resulting doc is fail.

See Best Practice for Document Construction for caveats of this construct.

procedure

(h-append x ...)  doc?

  x : doc?
Concatenates doc xs horizontally.

procedure

(h-concat xs)  doc?

  xs : (listof doc?)
Concatenates docs in xs horizontally.

procedure

(v-append x ...)  doc?

  x : doc?
Concatenates doc xs vertically.

procedure

(v-concat xs)  doc?

  xs : (listof doc?)
Concatenates docs in xs vertically.

procedure

(hs-append x ...)  doc?

  x : doc?
Concatenates doc xs horizontally with successive pairs separated by space.

procedure

(hs-concat x ...)  doc?

  x : doc?
Concatenates docs in xs horizontally with successive pairs separated by space.

procedure

(sep xs)  doc?

  xs : doc?
Concatenates docs in xs either (1) vertically or (2) horizontally with successive pairs separated by space.

value

fail : doc?

Constructs a doc that fails to render. This doc interacts with alt: only non-failing branches are considered for rendering.

procedure

(annotate d a)  doc?

  d : doc?
  a : any/c
Constructs a doc which is like d but with an annotation a.

procedure

(flat x)  doc?

  x : doc?
Constrains doc x to fit in one line. If x can’t fit in one line, it fails to render.

procedure

(flush-if b d)  doc?

  b : any/c
  d : doc?
Performs flush on d if b is true. Otherwise, simply returns d.

4.3 Useful Constants

value

empty-doc : doc?

Same as (text "")

value

lparen : doc?

Same as (text "(")

value

rparen : doc?

Same as (text ")")

value

space : doc?

Same as (text " ")

4.4 Parameters

parameter

(current-page-width)  (or/c natural-number/c +inf.0)

(current-page-width page-width)  void?
  page-width : (or/c natural-number/c +inf.0)
 = 80
A parameter that determines the page width for pretty printing.

parameter

(current-page-indent)  natural-number/c

(current-page-indent indent)  void?
  indent : natural-number/c
 = 0
A parameter that determines the initial indentation level for subsequent lines.

4.5 Match Expanders

Internally, a doc is either a :text, :flush, :concat, :alternatives, :annotate, or :fail. We provide these match expanders to allow doc processing (see Processing Documents. The match expanders are illegal outside of the pattern position of the match form. Keep in mind that this list is unstable and could change across versions of the library.

syntax

(:text s)

A match expander that recognizes text s of type string?.

syntax

(:flush d)

A match expander that recognizes a doc d with a newline at the end.

syntax

(:concat da db)

A match expander that recognizes a horizontal concatenation of docs da and db.

syntax

(:alternatives da db)

A match expander that recognizes two choices: docs da and db.

syntax

(:annotate d a)

A match expander that recognizes a doc d with an annotation value a.

syntax

(:fail)

A match expander that recognizes a failing doc.

5 Annotation

An annotation can be attached to a doc via annotate. An annotated doc is rendered like the corresponding unannotated doc, but various constructs can inspect an annotation for optimization or to change the rendering semantics. Unless you wish to write a function to process a doc (see Processing Documents), you can pretend that this feature doesn’t exist.

6 Processing Documents

If you wish to process a doc, it is highly recommended that your function that recurs over the doc structure is memoizing. This can be done by using memoize function in pprint-compact/memoize.

6.1 Memoization Library

 (require pprint-compact/memoize)
  package: pprint-compact

procedure

(memoize f [#:backend backend])  procedure?

  f : (-> any/c any/c)
  backend : procedure? = hasheq
Memoizes the function f based on the backend data structure.

6.2 Document Processing Library

 (require pprint-compact/process)
  package: pprint-compact

procedure

(doc-process f d)  doc?

  f : procedure?
  d : doc?
Calls f on the immediate subdocuments of d and reassembles the results back. The function attempts to avoid creating new objects as best as it can. Note that f should be memoized.

Prefer this function over manual matching against all match expanders, since the list of match expanders could change across versions of this library, making the code brittle to changes. Using this function on the other hand makes doc processing stable across versions.

6.3 Tutorial: Implementing flat

This section will demonstrate how to perform document processing by implementing flat.

The goal of flat is to constrain a document to have only one line. This can be done by replacing every :flush with fail.

(define (my-flat d)
  (match d
    [(:flush) fail]
    [(:text s) d]
    [(:fail) d]
    [(:concat a b) (h-append (my-flat a) (my-flat b))]
    [(:alternatives a b) (alt (my-flat a) (my-flat b))]
    [(:annotate d a) (annotate (my-flat d) a)]))

While this code is functional, it is inefficient: the time complexity is linear to the tree size of the document, even though the DAG size could be much smaller.

We can improve the function by using memoization:

(define (my-flat d)
  (define loop
    (memoize
      (λ (d)
        (match d
          [(:flush) fail]
          [(:text s) d]
          [(:fail) d]
          [(:concat a b) (h-append (loop a) (loop b))]
          [(:alternatives a b) (alt (loop a) (loop b))]
          [(:annotate d a) (annotate (loop d) a)]))))
  (loop d))

However, this function is still subtly inefficient because when it recurs, it unconditionally constructs new objects even though there is no change. Moreover, the code is brittle as the list of match expanders could change across versions of this library.

We can solve both problems at once by using doc-process.

(define (my-flat d)
  (define loop
    (memoize
      (λ (d)
        (match d
          [(:flush) fail]
          [_ (doc-process loop d)]))))
  (loop d))

We can further optimize the function by noticing that if a doc fragment is already flat, then there is no need to recur further. Therefore, we add an annotation that the output doc is flat and avoid computation if the doc is already flat:

(define (my-flat d)
  (define loop
    (memoize
      (λ (d)
        (match d
          [(:flush) fail]
          [(:annotate _ 'flat) d]
          [_ (doc-process loop d)]))))
  (annotate (loop d) 'flat))

7 Debug

 (require pprint-compact/debug) package: pprint-compact

The following functions can be used to measure document size.

procedure

(dag-size d)  natural-number/c

  d : doc?
Returns the DAG size of doc d.

procedure

(tree-size d)  natural-number/c

  d : doc?
Returns the tree size of doc d.

8 Design Notes

For the history of pretty printer in general, see History in the pprint library.

Two recent algorithms are proposed in Bernady (2017) and Podkopaev and Boulytchev (2014), which are more expressive and optimal at the cost of being less efficient, compared to other greedy pretty printers. The time complexity of the algorithm in Bernady (2017) is O(n W4 log W), where n is the tree size and W is the page width (when using an efficient Pareto frontier computation), though in practice it is fairly fast. Similarly, the time complexity of the algorithm in Podkopaev and Boulytchev (2014) is O(n W4), where n is the tree size and W is the page width, though in practice it is fairly slow.

This library implements a novel algorithm inspired by Bernady (2017) and Podkopaev and Boulytchev (2014). It uses various techniques to improve the time complexity to O(n W3 log W) where n is the DAG size, while making it fast in practice. It also adds features like fail and annotate, which are basis to implement constructs like flat efficiently.

9 Acknowledgment

I would like to give special thanks to Justin Pombrio for pointing me to Bernady (2017) and countless advice.

Bibliography

Jean-Philippe Bernady. A Pretty But Not Greedy Printer. 2017. https://jyp.github.io/pdf/Prettiest.pdf

Anton Podkopaev and Dmitri Boulytchev. Polynomial-Time Optimal Pretty-Printing Combinators with Choice. 2014. https://oops.math.spbu.ru/papers/printer-combinators.pdf