dist-lib
1 Distances
distance?
2 Hierarchical clustering
2.1 Dendogram Trees
tree/  c
tree-elem-list
tree-elem-count
tree-depth
tree-level-left
tree-level-right
tree-equal?
tree-image-factory/  table%
new
get-image
tree-image-factory/  tree%
new
get-image
2.2 Distance Matrices
dist-matrix<%>
get-depth
get-elem-count
get-elem-dist
get-elem-set
3 Edit distance
9.0

dist-lib🔗ℹ

Jörgen Brandt

 (require dist-lib) package: dist-lib

dist-lib is a library for distance-related algorithms.

    1 Distances

    2 Hierarchical clustering

      2.1 Dendogram Trees

      2.2 Distance Matrices

    3 Edit distance

1 Distances🔗ℹ

procedure

(distance? x)  boolean?

  x : any/c
A distance is a non-negative, inexact rational number. This predicate returns #t exactly if all three criteria are fulfilled.

2 Hierarchical clustering🔗ℹ

A hierarchical clustering consumes a distance matrix and produces a dendogram tree. Such trees can then be visualized as a bitmap.

2.1 Dendogram Trees🔗ℹ

The end result of a hierarchical clustering is a dendogram tree.

value

tree/c : contract?

A tree is the racket representation of a dendrogram representing a hierarchical clustering.

procedure

(tree-elem-list tree)  (listof string?)

  tree : tree/c
Return the tree’s leaf labels in a list.

procedure

(tree-elem-count tree)  natural?

  tree : tree/c
Return the number of leaves in the tree.

procedure

(tree-depth tree)  distance?

  tree : tree/c
Return the depth of the tree, i.e., the distance covered, walking from the root to any leaf.

procedure

(tree-level-left tree)  natural?

  tree : tree/c
Return the number of branches traversed if always turning left.

procedure

(tree-level-right tree)  natural?

  tree : tree/c
Return the number of branches traversed if always turning right.

procedure

(tree-equal? tree)  boolean?

  tree : tree/c
Predicate comparing two trees. Two trees are equal if they have the same structure, the same leaf labels, and the same depth annotation. Left and right subtrees are interchangeable.

class

tree-image-factory/table% : class?

  superclass: abstract-tree-image-factory%

constructor

(new tree-image-factory/table% 
    [tree tree] 
    [[draw-labels draw-labels] 
    [depth depth]]) 
  (is-a?/c tree-image-factory/table%)
  tree : tree/c
  draw-labels : boolean? = #t
  depth : natural? = 10
Construct an instance of image-factory<%> from a dendogram tree to be plotted in table form. If draw-labels is #t then labels are drawn on the leaves of the tree. The drawing algorithm normalizes the tree’s depth to the value of depth.

method

(send a-tree-image-factory/table get-image)

  (is-a?/c image<%>)
Return the image constructed from the dendogram tree.

class

tree-image-factory/tree% : class?

  superclass: abstract-tree-image-factory%

constructor

(new tree-image-factory/tree% 
    [tree tree] 
    [[draw-labels draw-labels] 
    [depth depth]]) 
  (is-a?/c tree-image-factory/tree%)
  tree : tree/c
  draw-labels : boolean? = #t
  depth : natural? = 10
Construct an instance of image-factory<%> from a dendogram tree to be plotted in tree form. If draw-labels is #t then labels are drawn on the leaves of the tree. The drawing algorithm normalizes the tree’s depth to the value of depth.

method

(send a-tree-image-factory/tree get-image)

  (is-a?/c image<%>)
Return the image constructed from the dendogram tree.

2.2 Distance Matrices🔗ℹ

The starting point of a hierarchical clustering is a distance matrix represented in form of a dist-matrix<%> instance.

Interface representing distance matrix objects.

method

(send a-dist-matrix get-depth a)  distance?

  a : string?
Returns the distance of a given entry a from the leaf layer of the clustering.

method

(send a-dist-matrix get-elem-count)  natural?

Returns the number of entries at the current layer of the clustering.

method

(send a-dist-matrix get-elem-dist a b)  distance?

  a : string?
  b : string?
Returns the distance between the entries labeled a and b at the current layer of the clustering.

method

(send a-dist-matrix get-elem-set)  (set/c string?)

Returns the labels of all entries at the current layer of the clustering.

3 Edit distance🔗ℹ

Edit distance