GTP measure

Ben Greenman

For benchmarking.

1 Command-line: raco gtp-measure

The gtp-measure raco command is a tool for measuring the performance of a set of gtp-measure targets according to a set of configuration options.

See also: GTP targets

To see all accepted flags: raco gtp-measure --help

To measure performance and print status messages:

PLTSTDERR="error info@gtp-measure" raco gtp-measure ....

1.1 Stages of measurement

After gtp-measure is invoked on the command line, it operates in five stages:

1.2 Configuration and Data Files

The gtp-measure library uses the basedir library to obtain configuration and data files.

User-level configuration settings are stored in the file:

(writable-config-file "config.rktd" #:program "gtp-measure")

Each task gets a data directory, stored under:

(writable-data-dir #:program "gtp-measure")

Together, the data files and command-line arguments build a gtp-measure configuration value. See Configuration Fallback for details on how these data sources work together.
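As a rough illustration of the lookups above, the following sketch asks basedir for the two locations and reads the user-level configuration if one exists. It assumes the basedir package is installed and that the configuration file holds a single readable hash (as described in Configuration Fallback). The helper read-user-config is hypothetical and not part of gtp-measure.

```racket
#lang racket/base
(require basedir racket/file)

;; The two locations named in this section.
(define config-path
  (writable-config-file "config.rktd" #:program "gtp-measure"))
(define data-dir
  (writable-data-dir #:program "gtp-measure"))

;; read-user-config : -> any/c
;; Hypothetical helper: returns the value stored in the configuration
;; file, or an empty hash when the user has not written one.
(define (read-user-config)
  (if (file-exists? config-path)
      (file->value config-path)
      (hash)))

(printf "config file: ~a\n" config-path)
(printf "data dir:    ~a\n" data-dir)
(printf "user config: ~s\n" (read-user-config))
```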

2 GTP targets

A gtp-measure target is either:
  • a file containing a Racket module and exactly one call to time-apply (possibly via time);

  • a directory containing: (1) a "typed" directory, (2) an "untyped" directory, (3) optionally a "base" directory, and (4) optionally a "both" directory.
    • The "typed" directory must contain a few typed/racket modules.

    • The "untyped" directory must contain matching Racket modules. These modules must have the same name as the modules in the "typed" directory, and should have the same code as the typed modules — just missing type annotations and type casts.

    • The optional "base" directory may contain data files that the "typed" and "untyped" modules may reference via a relative path (e.g. "../base/foo.rkt").

    • The optional "both" directory may contain modules that the "typed" and "untyped" modules may reference as if they were in the same directory (e.g. "foo.rkt"). If so, the "typed" and "untyped" modules will not compile unless the "both" modules are copied into their directory. This is by design.

  • a file containing a gtp-measure/manifest module.

To measure a file target, gtp-measure compiles the file once, then repeatedly runs the file and parses the output of time-apply. See GTP configuration for details on how gtp-measure compiles and runs Racket modules.
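To make the file-target protocol concrete, here is a minimal sketch of the kind of output a file target produces and one way a harness might parse it. The workload fib and the helper parse-time-output are hypothetical. The regular expression assumes the one-line format that Racket's time form prints, and is not taken from the gtp-measure source.

```racket
#lang racket/base
(require racket/port)

;; A stand-in workload (hypothetical).
(define (fib n)
  (if (< n 2) n (+ (fib (- n 1)) (fib (- n 2)))))

;; `time` prints one line of the form
;;   cpu time: <ms> real time: <ms> gc time: <ms>
;; Capture that line as a string instead of printing it.
(define output
  (with-output-to-string (lambda () (time (fib 20)))))

;; parse-time-output : string? -> (or/c #f (list/c cpu-ms real-ms gc-ms))
;; Returns #f when the string contains no timing line.
(define (parse-time-output str)
  (define m
    (regexp-match
     #rx"cpu time: ([0-9]+) real time: ([0-9]+) gc time: ([0-9]+)"
     str))
  (and m (map string->number (cdr m))))

(parse-time-output output)
```

A real file target would only contain the module and the single call to time; the capture and parsing steps belong to whatever tool runs it.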

To measure a typed/untyped target, gtp-measure chooses a sequence of typed/untyped configurations and, for each one, copies the configuration to a directory and runs the program’s entry module as a file target. The sequence of configurations is either exhaustive or approximate.

To measure a manifest target, gtp-measure runs the targets listed in the manifest.

2.1 Typed/Untyped Configuration

A typed/untyped configuration for a typed/untyped target with M modules is a working program with M modules — some typed (maybe none), some untyped.

The gtp-measure library encodes such a configuration with a string of length M where each character is either #\0 or #\1. If the character at position i is #\0, the configuration uses the i-th module in the "untyped" directory and ignores the i-th module in the "typed" directory. If the character at position i is #\1, the configuration uses the i-th "typed" module and ignores the "untyped" module. Modules are ordered by filename-sort.
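The encoding can be sketched as a small decoder that maps a configuration string to the module files it selects. The function configuration->paths is hypothetical, and the example orders module names with string<? as a stand-in for filename-sort, which may order some names differently.

```racket
#lang racket/base

;; configuration->paths : string? (listof string?) -> (listof path?)
;; `module-names` must already be in the order used for the encoding.
(define (configuration->paths bits module-names)
  (unless (= (string-length bits) (length module-names))
    (raise-arguments-error 'configuration->paths
                           "expected one character per module"
                           "bits" bits
                           "modules" module-names))
  (for/list ([bit (in-string bits)]
             [name (in-list module-names)])
    (case bit
      [(#\0) (build-path "untyped" name)] ; use the untyped module
      [(#\1) (build-path "typed" name)]   ; use the typed module
      [else (raise-argument-error 'configuration->paths
                                  "a string of #\\0 and #\\1 characters"
                                  bits)])))

;; Assumption: string<? stands in for filename-sort.
(define modules (sort '("main.rkt" "data.rkt" "util.rkt") string<?))

;; "010" selects untyped data.rkt, typed main.rkt, untyped util.rkt.
(configuration->paths "010" modules)
```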

2.2 Exhaustive vs. Approximate evaluation

An exhaustive evaluation of a typed/untyped target with M modules measures the performance of all 2**M configurations. This is a lot of measuring, and will probably take a very long time if M is 15 or more.

An R-S-approximate evaluation measures R * S * M randomly-selected configurations; more precisely, R sequences containing S*M configurations each. This number, RSM, is probably less than 2**M. (If it’s not, just do an exhaustive evaluation.) See GTP configuration for how to set R and S, and how to switch from an exhaustive evaluation to an approximate one.
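A minimal sketch of the sampling step follows. It draws each configuration uniformly at random and with replacement, which is an assumption; gtp-measure may select configurations differently. Both function names are hypothetical.

```racket
#lang racket/base

;; random-configuration : exact-nonnegative-integer? -> string?
;; Builds a string of M characters, each #\0 or #\1.
(define (random-configuration M)
  (build-string M (lambda (_i) (if (zero? (random 2)) #\0 #\1))))

;; approximate-samples : R S M -> (listof (listof string?))
;; Returns R sequences, each with S*M configurations.
;; Assumption: sampling is uniform and with replacement.
(define (approximate-samples R S M)
  (for/list ([_r (in-range R)])
    (for/list ([_k (in-range (* S M))])
      (random-configuration M))))

(define samples (approximate-samples 2 3 4))
(length samples)       ; R = 2 sequences
(length (car samples)) ; S*M = 12 configurations per sequence
```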

The idea of an approximate evaluation comes from our work on Typed Racket. Greenman and Migeed (PEPM 2018) give a more precise definition, and apply the idea to Reticulated Python. Note that gtp-measure uses a different definition of S than the PEPM paper.

2.3 Design: typed/untyped directory

The point of a typed/untyped directory is to describe an exponentially-large set of programs in “less than exponential” space. The set is all ways of taking a Typed Racket program and removing some of its types — specifically, removing types from some of the modules in the program. So given a typed/untyped directory, gtp-measure needs to be able to generate and run each program.

The "typed" and "untyped" directories are a first step to reduce space. Instead of storing all 2**M programs for a program with M modules, we store 2M modules. The reason we store 2M instead of just M typed modules is that we do not have a way to automatically remove types from a Typed Racket program (to remove types, we sometimes want to translate type casts to Racket).

The "base" directory is a second way to save space. If a program depends on data or libraries, they belong in the "base" directory so that all configurations can reference one copy.

The "both" directory helps us automatically generate configurations by solving a technical problem. The problem is that if an untyped module defines a struct and two typed modules import it, both typed modules need to reference a canonical require/typed for the struct’s type definitions. We solve this by putting a type adaptor module with the require/typed in the "both" directory. An adaptor can require "typed" or "untyped" modules, and typed modules can require the adaptor.

3 GTP configuration

 (require gtp-measure/configure) package: gtp-measure

The gtp-measure library is parameterized by a set of key/value pairs. This section documents the available keys and the type of values each key expects.


gtp-measure-config/c : flat-contract?

Contract for a gtp-measure configuration; that is, an immutable hash whose keys are a subset of those documented below and whose values match the descriptions below.



key:bin

Value must be a string that represents a path to a directory. The directory must contain executables named raco and racket.

Used to compile and run Racket programs.

In particular, if <BIN> is the value of key:bin, then the command to compile the target <FILE> is:

<BIN>/raco make -v <FILE>

and the command to run <FILE> is:

<BIN>/racket <FILE>

key:iterations

Value must be an exact-positive-integer?.

Determines the number of times to run a file target and collect data.

key:jit-warmup

Value must be an exact-nonnegative-integer?.

Determines the number of times (if any) to run a file target and ignore the output BEFORE collecting data.
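The interaction between the iteration count and the warmup count can be read as the loop below. This is a sketch under the assumption that warmup runs happen first and are simply discarded; measure and fake-run are hypothetical names, and fake-run stands in for "run the file and parse its timing output".

```racket
#lang racket/base

;; measure : (-> any/c) exact-positive-integer? exact-nonnegative-integer?
;;           -> list?
;; Runs `run-once` jit-warmup times and ignores the results, then runs
;; it `iterations` more times and collects the results in order.
(define (measure run-once iterations jit-warmup)
  (for ([_i (in-range jit-warmup)])
    (run-once))
  (for/list ([_i (in-range iterations)])
    (run-once)))

;; Stand-in for one timed run: returns how many times it has been called.
(define counter 0)
(define (fake-run)
  (set! counter (+ counter 1))
  counter)

(measure fake-run 3 2) ; => '(3 4 5), the two warmup results are dropped
```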

key:num-samples

Value must be an exact-positive-integer?.

Determines R, the number of samples for any approximate evaluations.

key:sample-factor

Value must be an exact-positive-integer?.

Determines the size of each sample in any approximate evaluations. The size is S*M, where S is the value associated with this key and M is the number of modules in the typed/untyped target.

key:cutoff

Value must be an exact-nonnegative-integer?.

Determines whether to run an exhaustive or approximate evaluation for a typed/untyped target. Let M be the number of modules in the target and let C be the value associated with this key. If (<= M C), then gtp-measure runs an exhaustive evaluation; otherwise, it runs an approximate evaluation.
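The cutoff rule, together with the sample keys above, can be summarized in a few lines. The function plan-evaluation is a hypothetical illustration of the rule as stated here, not the library's implementation.

```racket
#lang racket/base

;; plan-evaluation : M cutoff R S -> (list/c symbol? exact-integer?)
;; M      = number of modules in the typed/untyped target
;; cutoff = value of the cutoff key
;; R, S   = values of the num-samples and sample-factor keys
;; Returns the kind of evaluation and how many configurations it measures.
(define (plan-evaluation M cutoff R S)
  (if (<= M cutoff)
      (list 'exhaustive (expt 2 M))
      (list 'approximate (* R S M))))

(plan-evaluation 4 5 10 10)  ; => '(exhaustive 16)
(plan-evaluation 20 5 10 10) ; => '(approximate 2000)
```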

key:entry-point

Value must be a string that represents a filename.

Determines the entry module of all typed/untyped targets. This module is treated as a file target for each configuration in the typed/untyped evaluation.

key:start-time

Value must be a real number.

By default, this is the value of current-inexact-milliseconds when gtp-measure was invoked. You should probably not override this default.



key:argv

Value must be a list of strings.

By default, this is the value of (vector->list (current-command-line-arguments)) when gtp-measure was invoked. You should probably not override this default.



key:working-directory

Value must be a string that represents an absolute path.

All intermediate files and all results are saved in the given directory.

3.1 Configuration Fallback

The gtp-measure library defines a default value for each configuration key. Users can override this default by writing a hashtable with relevant keys (a subset of the keys listed above) to their configuration file. Users can override both the defaults and their global configuration by supplying a command-line flag. Run raco gtp-measure --help to see available flags.
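The fallback order can be pictured as layering three hashes, where later layers win. The sketch below is only an illustration: merge-config is a hypothetical helper, and every value in the three example hashes is invented for the example rather than taken from gtp-measure.

```racket
#lang racket/base

;; merge-config : hash? ... -> hash?
;; Combines the layers left to right; a key in a later layer replaces
;; the same key from an earlier layer.
(define (merge-config . layers)
  (for*/fold ([acc (hash)])
             ([layer (in-list layers)]
              [(k v) (in-hash layer)])
    (hash-set acc k v)))

;; Hypothetical values, for illustration only.
(define library-defaults (hash 'iterations 8 'jit-warmup 1 'cutoff 12))
(define user-config-file (hash 'iterations 5))
(define command-line     (hash 'cutoff 10))

(define merged
  (merge-config library-defaults user-config-file command-line))

(hash-ref merged 'iterations) ; => 5, from the user configuration file
(hash-ref merged 'cutoff)     ; => 10, from the command line
(hash-ref merged 'jit-warmup) ; => 1, from the library default
```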

The defaults for the machine that rendered this document are the following:

4 GTP measuring task

A task describes a sequence of targets to measure.

4.1 GTP task setup

Before measuring the targets in a task, the gtp-measure library allocates a directory for the task and writes files that describe what is to be run. If the task is interrupted, gtp-measure may be able to resume the task; run raco gtp-measure --help for instructions.

4.2 GTP sub-task

A sub-task is one unit of a task. This concept is not well-defined. The idea is to divide measuring tasks into small pieces so there is little to recompute if a task is interrupted.

More later.

5 Data Description Languages

The gtp-measure library includes a few small languages to describe data formats.

A manifest contains an optional hash with configuration options and a sequence of target descriptors.
The configuration options must be prefixed by the keyword #:config and must be a hash literal that matches the gtp-measure-config/c contract. If present, the options specified in the hash override any defaults.
A target descriptor is either a string representing a file or directory, or a pair of such a string and a target kind. In the first case, the target kind is inferred at runtime. In the second case, the target kind is checked at runtime.

#lang gtp-measure/manifest
#:config #hash((iterations . 10))
("file-2.rkt" . file)
(typed-untyped-dir-1 . typed-untyped)

There is an internal syntax class for these “target descriptors” that should be made public.