See also: GTP targets
See also: GTP configuration
To see all accepted flags: raco gtp-measure --help
To measure performance and print status messages:
PLTSTDERR="error info@gtp-measure" raco gtp-measure ....
After gtp-measure is invoked on the command line, it operates in four stages:
- resolve the command-line targets to actual files / directories;
- resolve the command-line configuration options to a configuration;
- divide the measuring task into sub-tasks;
- collect data, write to the task’s data directory.
User-level configuration settings are stored in the file:
(writable-config-file "config.rktd" #:program "gtp-measure")
Each task gets a data directory, stored under:
(writable-data-dir #:program "gtp-measure")
- gtp typed/untyped target : a directory containing: (1) a "typed" directory, (2) an "untyped" directory, (3) optionally a "base" directory, and (4) optionally a "both" directory.
The "typed" directory must contain a few typed/racket modules.
The "untyped" directory must contain matching Racket modules. These modules must have the same names as the modules in the "typed" directory, and should have the same code as the typed modules, just without the type annotations and type casts.
The optional "base" directory may contain data files that the "typed" and "untyped" modules may reference via a relative path (e.g. "../base/file.rkt").
The optional "both" directory may contain modules that the "typed" and "untyped" modules may reference as if they were in the same directory (e.g. "file.rkt"). If so, the "typed" and "untyped" modules will not compile unless the "both" modules are copied into their directory. This is by design.
- gtp manifest target : a file containing a gtp-measure/manifest module.
To measure a file target, gtp-measure compiles the file once, then repeatedly runs the file and parses the output of time-apply. See GTP configuration for details on how gtp-measure compiles and runs Racket modules.
To measure a typed/untyped target, gtp-measure chooses a sequence of typed/untyped configurations and, for each: copies the configuration to a directory, and runs this program’s entry module as a file target. The sequence of configurations is either exhaustive or approximate.
To measure a manifest target, gtp-measure runs the targets listed in the manifest.
A typed/untyped configuration for a typed/untyped target with M modules is a working program with M modules, each taken from either the "typed" directory or the "untyped" directory.
The gtp-measure library encodes such a configuration with a string of length M where each character is either #\0 or #\1. If the character at position i is #\0, the configuration uses the i-th module in the "untyped" directory and ignores the i-th module in the "typed" directory. If the character at position i is #\1, the configuration uses the i-th "typed" module and ignores the "untyped" module. Modules are ordered by filename-sort.
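The encoding above can be illustrated with a small sketch. The function configuration->paths is a hypothetical helper written for this example, not part of the gtp-measure API, and it assumes the list of module names is already sorted in the order gtp-measure uses.

```racket
#lang racket/base

;; Hypothetical helper (not part of gtp-measure): pair each module name with
;; the directory that the configuration string selects for it.
;; Assumes `module-names` is already in filename-sorted order.
(define (configuration->paths config module-names)
  (for/list ([bit (in-string config)]
             [name (in-list module-names)])
    ;; #\1 selects the typed module, #\0 selects the untyped module
    (define dir (if (char=? bit #\1) "typed" "untyped"))
    (string-append dir "/" name)))

(configuration->paths "010" '("a.rkt" "b.rkt" "c.rkt"))
;; => '("untyped/a.rkt" "typed/b.rkt" "untyped/c.rkt")
```

With three modules, the string "010" picks the typed version of the second module only.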
An exhaustive evaluation of a typed/untyped target with M modules measures the performance of all 2^M configurations. This is a lot of measuring, and will probably take a very long time if M is 15 or more.
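To make the growth concrete, here is a minimal sketch that lists every configuration string for a given M. The name all-configurations is made up for this example, and the sketch assumes M is at least 1.

```racket
#lang racket/base

;; Hypothetical helper: every configuration string for M modules (M >= 1),
;; from all-untyped ("00...0") to all-typed ("11...1").
(define (all-configurations M)
  (for/list ([n (in-range (expt 2 M))])
    (define bits (number->string n 2))
    ;; left-pad with #\0 so that every string has length M
    (string-append (make-string (- M (string-length bits)) #\0) bits)))

(all-configurations 2)           ; => '("00" "01" "10" "11")
(length (all-configurations 15)) ; => 32768
```

At M = 15 there are already 32768 configurations, each of which would be run as a file target.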
An R-S-approximate evaluation measures R * S * M randomly-selected configurations; more precisely, R sequences containing S*M configurations each. This number, R*S*M, is probably less than 2^M. (If it’s not, just do an exhaustive evaluation.) See GTP configuration for how to set R and S, and how to switch from an exhaustive evaluation to an approximate one.
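The shape of an approximate evaluation can be sketched as follows. Both function names are hypothetical, and the sketch assumes configurations are drawn independently and uniformly, so duplicates are possible; the text above does not say how gtp-measure itself draws its samples.

```racket
#lang racket/base

;; Hypothetical helper: one random configuration string of length M.
(define (random-configuration M)
  (build-string M (lambda (_i) (if (zero? (random 2)) #\0 #\1))))

;; R sequences, each containing (* S M) random configurations.
;; Assumption: independent uniform draws, so a sequence may repeat a configuration.
(define (sample-configurations R S M)
  (for/list ([_r (in-range R)])
    (for/list ([_s (in-range (* S M))])
      (random-configuration M))))

;; 2 sequences of 3 configurations each, for a 3-module program
(sample-configurations 2 1 3)
```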
The idea of an approximate evaluation comes from our work on Typed Racket. Greenman and Migeed (PEPM 2018) give a more precise definition, and apply the idea to Reticulated Python. Note that gtp-measure uses a different definition of S than the PEPM paper.
The point of a typed/untyped directory is to describe an exponentially-large set of programs in “less than exponential” space. The set is all ways of taking a Typed Racket program and removing some of its types.
The "typed" and "untyped" directories are a first step to reduce space. Instead of storing all 2^M programs for a program with M modules, we store 2*M modules. The reason we store 2*M modules instead of just M typed modules is that we do not have a way to automatically remove types from a Typed Racket program (to remove types, we sometimes want to translate type casts to Racket).
The "base" directory is a second way to save space. If a program depends on data or libraries, they belong in the "base" directory so that all configurations can reference one copy.
The "both" directory helps us automatically generate configurations by solving a technical problem. The problem is that if an untyped module defines a struct and two typed modules import it, both typed modules need to reference a canonical require/typed for the struct’s type definitions. We solve this by putting a type adaptor module with the require/typed in the "both" directory. An adaptor can require "typed" or "untyped" modules, and typed modules can require the adaptor.
(require gtp-measure/configure)    package: gtp-measure
The gtp-measure library is parameterized by a set of key/value pairs. This section documents the available keys and the type of values each key expects.
Used to compile and run Racket programs.
In particular, if <BIN> is the value of key:bin then the command to compile the target <FILE> is:
<BIN>/raco make -v <FILE>
and the command to run <FILE> is:
Since this package was originally created to measure the GTP benchmarks, which depend on the require-typed-check package, invoking raco gtp-measure ensures that the package is installed for the current value of key:bin. If the package is missing, <BIN>/raco pkg installs it.
Changed in version 0.3 of package gtp-measure: Automatically install require-typed-check if missing.
Determines the number of times to run a file target and collect data.
Determines the number of times (if any) to run a file target and ignore the output BEFORE collecting data.
Determines R, the number of samples for any approximate evaluations.
Determines whether to run an exhaustive or approximate evaluation for a typed/untyped target. Let M be the number of modules in the target and let C be the value associated with this key. If (<= M C), then gtp-measure runs an exhaustive evaluation; otherwise, it runs an approximate evaluation.
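The cutoff rule can be written out as a short sketch. The function names are hypothetical, and R and S stand for the number of samples and the sample factor described under GTP targets.

```racket
#lang racket/base

;; Hypothetical helper mirroring the rule above: exhaustive when M <= C.
(define (evaluation-kind M C)
  (if (<= M C) 'exhaustive 'approximate))

;; Number of configurations measured. R and S only matter for approximate runs.
(define (num-configurations M C R S)
  (if (eq? (evaluation-kind M C) 'exhaustive)
      (expt 2 M)
      (* R S M)))

(num-configurations 9 9 10 10)  ; => 512
(num-configurations 10 9 10 10) ; => 1000
```

With a cutoff of 9 and R = S = 10, a 10-module target is measured approximately with 1000 configurations, slightly fewer than the 1024 an exhaustive run would need.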
Sets a time limit for the total time to run a configuration. If the value is #false then there is no time limit. Otherwise, the value is the time limit in seconds.
The total time includes all the warmup iterations and all the collecting iterations.
See also Time Limit Parsing.
Added in version 0.3 of package gtp-measure.
All intermediate files and all results are saved in the given directory.
The gtp-measure library defines a default value for each configuration key. Users can override this default by writing a hashtable with relevant keys (a subset of the keys listed above) to their configuration file. Users can override both the defaults and their global configuration by supplying a command-line flag. Run raco gtp-measure --help to see available flags.
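The precedence described above (defaults, then the user's configuration file, then command-line flags) behaves like a layered merge of hashtables. The sketch below is illustrative only: merge-config is a made-up name, the values are examples, and keying the tables by plain symbols is an assumption.

```racket
#lang racket/base

;; Illustrative values only. Using plain symbols as keys is an assumption.
(define defaults  (hash 'iterations 8 'jit-warmup 1 'cutoff 9))
(define user-file (hash 'iterations 10))
(define cmdline   (hash 'cutoff 12))

;; Hypothetical helper: later tables win, so the order is
;; defaults < user configuration file < command line.
(define (merge-config . tables)
  (for*/fold ([acc (hash)])
             ([t (in-list tables)]
              [(k v) (in-hash t)])
    (hash-set acc k v)))

(define config (merge-config defaults user-file cmdline))
(hash-ref config 'iterations) ; => 10 (from the user file)
(hash-ref config 'cutoff)     ; => 12 (from the command line)
(hash-ref config 'jit-warmup) ; => 1  (default, never overridden)
```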
The defaults for the machine that rendered this document are the following:
key:bin = "/home/root/racket/bin/"
key:iterations = 8
key:jit-warmup = 1
key:num-samples = 10
key:sample-factor = 10
key:cutoff = 9
key:entry-point = "main.rkt"
key:start-time = 0
key:time-limit = #f
key:argv = ()
A task describes a sequence of targets to measure.
Before measuring the targets in a task, the gtp-measure library allocates a directory for the task and writes files that describe what is to be run. If the task is interrupted, gtp-measure may be able to resume the task; run raco gtp-measure --help for instructions.
A sub-task is one unit of a task. This concept is not well-defined. The idea is to divide measuring tasks into small pieces so there is little to recompute if a task is interrupted.
The gtp-measure library includes a few small languages to describe data formats.
#lang gtp-measure/manifest

#:config #hash((iterations . 10))
file-0.rkt
typed-untyped-dir-0
"file-1.rkt"
("file-2.rkt" . file)
(typed-untyped-dir-1 . typed-untyped)
There is an internal syntax class for these “target descriptors” that should be made public.
successful time output, containing the CPU time, real time, and GC time;
a Racket runtime error message;
or a timeout notice ("timeout N").
#lang gtp-measure/output/typed-untyped
("00000" ("cpu time: 566 real time: 567 gc time: 62" "cpu time: 577 real time: 578 gc time: 62"))
("00001" ("cpu time: 820 real time: 822 gc time: 46" "cpu time: 793 real time: 795 gc time: 44"))
("00010" ("cpu time: 561 real time: 562 gc time: 46" "cpu time: 565 real time: 566 gc time: 44"))
("00011" ("cpu time: 805 real time: 807 gc time: 47" "cpu time: 813 real time: 815 gc time: 45"))
....
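The time strings in the sample output above have a regular shape, so one way to pull the numbers out is a regular expression. This is a sketch for readers who want to post-process such data; parse-time-line is a hypothetical name and not a function provided by gtp-measure.

```racket
#lang racket/base

;; Hypothetical parser for one line of successful time output.
;; Returns (list cpu real gc) in milliseconds, or #f if the line does not
;; match (for example an error message or a timeout notice).
(define (parse-time-line str)
  (define m
    (regexp-match
     #rx"^cpu time: ([0-9]+) real time: ([0-9]+) gc time: ([0-9]+)$"
     str))
  ;; (cdr m) drops the full match and keeps the three captured numbers
  (and m (map string->number (cdr m))))

(parse-time-line "cpu time: 566 real time: 567 gc time: 62") ; => '(566 567 62)
(parse-time-line "timeout 10")                               ; => #f
```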
> (string->time-limit "1")
> (string->time-limit "1s")
> (string->time-limit "1m")
> (string->time-limit "1h")
(hours->seconds h) → exact-nonnegative-integer?
h : exact-nonnegative-integer?
(minutes->seconds m) → exact-nonnegative-integer?
m : exact-nonnegative-integer?
> (hours->seconds 1)
> (minutes->seconds 1)
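The two conversions are plain arithmetic. The stand-in definitions below match the signatures above; they carry a my- prefix to make clear that they are written for this example and are not the library's own code.

```racket
#lang racket/base

;; Stand-in definitions for illustration, not the library's implementation.
(define (my-minutes->seconds m) (* 60 m))
(define (my-hours->seconds h) (my-minutes->seconds (* 60 h)))

(my-minutes->seconds 1) ; => 60
(my-hours->seconds 1)   ; => 3600
```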