PEG
1 Introduction
2 Syntax Reference
2.1 define-peg
2.2 peg
3 Examples
3.1 Example 1
3.2 Example 2
3.3 Example 3
4 PEG Syntax
4.1 PEG Syntax Reference
7.0

PEG

This library implements a PEG parser generator.

1 Introduction

PEG can be thought of as an advance over regex. It can match more languages (for example balanced brackets) and can be paired with semantic actions to produce structured results from a parse.

The PEG language is implemented as a system of macros that compiles parser descriptions (rules) into scheme code. It is also provided with a custom syntax via "#lang peg".

The generated code parses text by interacting with the "PEG VM", which is a set of registers holding the input text, input position, control stack for backtracking and error reporting notes.

2 Syntax Reference

2.1 define-peg

syntax

(define-peg name rule)

 
<rule> = (epsilon) ; always succeeds
  | (char c) ; matches the character c
  | (any-char) ; matches any character
  | (range c1 c2) ; match any char between c1 and c2
  | (string str) ; matches string str
  | (and <rule> ...) ; sequence
  | (or <rule> ...) ; prioritized choice
  | (* <rule> ...) ; zero or more
  | (+ <rule> ...) ; one or more
  | (? <rule> ...) ; zero or one
  | (call name)
  | (capture name <rule>)
  | (! <rule> ...) ; negative lookahead
  | (& <rule>) ; positive lookahead
  | (drop <rule> ...) ; discard the semantic result on matching
Defines a new scheme function named peg-rule:name by compiling the peg rule into scheme code that interacts with the PEG VM.

syntax

(define-peg name rule action)

Same as above, but also performs a semantic action to produce its result. Semantic actions are regular scheme expressions, they can refer to variables named by a capture.

We also provide shorthands for some common semantic actions:

syntax

(define-peg/drop name rule)

= (define-peg rule-name (drop rule))

makes the parser produce no result.

syntax

(define-peg/bake name rule)

= (define-peg rule-name (name res rule) res)

transforms the peg-result into a scheme object.

syntax

(define-peg/tag name rule)

= (define-peg rule-name (name res exp) `(rule-name . ,res))

tags the result with the peg rule name. Useful for parsers that create an AST.

2.2 peg

syntax

(peg rule input-text)

Run a PEG parser. Attempt to parse the input-text string using the given rule. This is sets up the PEG VM registers into an initial state and then calls into the parser for rule.

3 Examples

3.1 Example 1

For a simple example, lets try splitting a sentence into words. We can describe a word as one or more of non-space characters, optionally followed by a space:

> (require peg)
> (define *sentence* "the quick brown fox jumps over the lazy dog")
> (define-peg non-space
    (and (! #\space) (any-char)))
> (define-peg/bake word
    (and (+ non-space)
         (drop (? #\space))))
> (peg word *sentence*)
"the"
> (peg (+ word) *sentence*)
'("the" "quick" "brown" "fox" "jumps" "over" "the" "lazy" "dog")

3.2 Example 2

Here is a simple calculator example that demonstrates semantic actions and recursive rules.

(define-peg number (name res (+ (range #\0 #\9)))
  (string->number res))
(define-peg sum
  (and (name v1 prod) (? (and #\+ (name v2 sum))))
  (if v2 (+ v1 v2) v1))
(define-peg prod
  (and (name v1 number) (? (and #\* (name v2 prod))))
  (if v2 (* v1 v2) v1))

this grammar(without semantic actions) is equivalenty to :

#lang peg
number <-- res:[0-9]+ ;
sum <-- v1:prod ('+' v2:sum)? ;
prod <-- v1:number ('*' v2:prod)? ;

Usage:

> (peg sum "2+3*4")
14
> (peg sum "2*3+4")
10
> (peg sum "7*2+3*4")
26

3.3 Example 3

Here is an example of parsing balanced paranthesis. It demonstrates a common idiom of using _ for skipping whitespace, and using "define-peg/bake" to produce a list rather than a sequence from a *.

(define-peg/drop _ (* (or #\space #\newline)))
 
(define-peg symbol
  (and (name res (+ (and (! #\( #\) #\space #\newline) (any-char)))) _)
  (string->symbol res))
 
(define-peg/bake sexp
  (or symbol
      (and (drop #\() (* sexp) (drop #\) _))))

Usage:

> (peg sexp "(foob (ar baz)quux)")
'(foob (ar baz) quux)
> (peg sexp "((())(()(())))")
'((()) (() (())))
> (peg sexp "(lambda (x) (list x (list (quote quote) x)))")
'(lambda (x) (list x (list 'quote x)))

4 PEG Syntax

This package also provides a "#lang peg" alternative, to allow you to make parsers in a more standard PEG syntax.

4.1 PEG Syntax Reference

The best way to understand the PEG syntax would be by reference to examples, there are many simple examples in the racket peg repo and the follow is the actual grammar used by racket-peg to implemet the peg lang:

#lang peg

 

import sexp-parser-expanded.rkt;

 

_ < ([ \t\n] / '//' [^\n]*)*;

SLASH < '/' _;

 

name <-- [a-zA-Z_] [a-zA-Z0-9\-_.]* _;

 

rule <-- name ('<--' / '<-' / '<') _ pattern ('->' _ s-exp _)? ';' _;

pattern <-- alternative (SLASH alternative)*;

alternative <-- expression+;

expression <-- (name ~':' _)? ([!&~] _)? primary ([*+?] _)?;

primary <-- '(' _ pattern ')' _ / '.' _ / literal / charclass / name;

 

literal <-- ~['] (~[\\] ['\\] / !['\\] .)* ~['] _;

 

charclass <-- ~'[' '^'? (cc-range / cc-escape / cc-single)+ ~']' _;

cc-range <-- cc-char ~'-' cc-char;

cc-escape <-- ~[\\] .;

cc-single <-- cc-char;

cc-char <- !cc-escape-char . / 'n' / 't';

cc-escape-char <- '[' / ']' / '-' / '^' / '\\' / 'n' / 't';

 

peg <-- _ import* rule+;

import <-- 'import' _ name ';' _;