Finger Trees
Finger trees are a general purpose data structure that can be used to implement many other data structures including multiple variations of sequences, deques, and queues. Finger trees support amortized constant time access to both ends as well as logarithmic time concatenation and splitting.
Finger trees achieve the log append and split bounds by caching a "measurement" in each node. the actual measurement computed is determined by the specific data structure being represented. For example, caching the size of the subtree at each node implements a random-access sequence.
Based on Hinze and Paterson’s JFP 2006 [HP06] paper.
This implementation currently does not utilize laziness, because preliminary benchmarking indicates that it is not beneficial to do so. In other words, the benefits of laziness do not outweigh the practical overhead that laziness incurs. Thus, the theoretical bounds from the Hinze and Paterson’s paper may not hold when the data structures are used persistently. However, the data structures should still perform well in practice.
1 Finger Trees
(require ftree) | package: ftree |
procedure
∅ : any/c elem-sz : (-> any/c any/c) ⊕ : (-> any/c any/c any/c)
procedure
(ftree? x) → boolean?
x : any/c
value
gen:ft : any/c
gen:mk : (-> any/c any/c (-> any/c any/c) any/c any/c any/c) : Builds a new finger tree based on an existing finger tree. Takes five arguments: existing-ftree ∅ sz-fn ⊕ internal-ftree.
2 Random-Access Sequences
(require raseq) |
procedure
(raseq? x) → boolean?
x : any/c
procedure
(ra-splitat i ras) →
raseq? raseq? i : exact-nonnegative-integer? ras : raseq?
3 Priority Queues
(require pqueue) |
procedure
(pqueue? x) → boolean?
x : any/c
procedure
(pq-top+rest pq) →
any/c pqueue? pq : pqueue?
4 Ordered Sequences
(require orderedseq) |
From Hinze and Paterson’s paper: "ordered sequences [can be seen as an optimization] of, and subsume, priority queues, as we have immediate access to the smallest and the greatest element, and search trees, as we can partition ordered sequences in logarithmic time."
Ordered sequences store elements in sorted order, so the ft-hd and ft-tl functions can grab both min and max elements in constant time, but inserting elements requires a special os-insert function.
Priority queues store elements in arbitrary order, using the "measure" function, ie sz, to determine the order, so ft-cons can be used to insert.
procedure
(oseq? x) → boolean?
x : any/c
procedure
(os-partition x os) →
oseq? oseq? x : any/c os : oseq?
procedure
(os-delete-all x os) → oseq?
x : any/c os : oseq?
procedure
(os-remove-top os) → oseq?
os : oseq?
procedure
(os-remove-bot os) → oseq?
os : oseq?
5 Interval Trees
(require intervaltree) |
Represents sets of intervals where an interval is a pair of real numbers, lo and hi, with lo <= hi. Intervals are sorted in ascending order of the lo’s, when added via it-insert (but not with any of the ft-cons functions). The cached measure is an interval where the lo is the lo of the rightmost element and the hi is the maximum of the element hi’s.
procedure
(itree? x) → boolean?
x : any/c
procedure
(interval? x) → boolean?
x : any/c
procedure
(interval-low i) → number?
i : interval?
procedure
(interval-high i) → number?
i : interval?
procedure
(interval-intersect? i j) → boolean?
i : interval? j : interval?
Bibliography
[HP06] | Ralf Hinze and Ross Paterson, “Finger trees: a simple general-purpose data structure,” Journal of Functional Programming, 2006. |