6 Creating and modifying columns

The operations in this section add new columns or change existing ones while preserving the rest of the data frame. create accepts both vectorized (whole-column) and regular (element-wise) computations.


(create df [new-column (binder ...) body ...] ...)

binder = bound-column
  | [bound-column : type]
type = element
  | vector
  df : (or/c data-frame? grouped-data-frame?)
Returns df, except with a derived column, or multiple derived columns. If the given column is already present in the data-frame, it will be (immutably) overridden, and otherwise it will be created.

Each new column is specified by a single clause. The column created will have the name new-column, and be specified by the expressions in body.

The bound variables in body are specified by binder. Each bound variable either has the type element, which binds a single element of the given column and maps over it, or vector, which binds the entire column. If a type for a bound variable is not specified, it defaults to element.
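The difference between the two binding types can be sketched in plain Racket, without the library. This is only an illustration of the semantics described above, not how create is implemented; the vectors adult and juv are hypothetical stand-ins for columns.

```racket
#lang racket/base
(require racket/vector)

;; Hypothetical columns standing in for a data frame.
(define adult (vector 1 2 3))
(define juv   (vector 10 20 30))

;; Sum of all elements of a vector.
(define (sum vec)
  (for/sum ([v (in-vector vec)])
    v))

;; Sketch of [total (adult juv) (+ adult juv)]:
;; both variables have type element, so body sees one value per row.
(define total
  (vector-map (λ (adult juv) (+ adult juv)) adult juv))

;; Sketch of [freq ([juv : vector]) ...]:
;; juv has type vector, so body sees the whole column at once
;; and must itself return a vector of the same length.
(define freq
  (let ([s (sum juv)])
    (vector-map (λ (v) (/ v s)) juv)))
```

Here total holds 11, 22, 33 and freq holds the exact rationals 1/6, 1/3, 1/2. An element binding suits row-by-row arithmetic, while a vector binding is needed whenever the result depends on the column as a whole.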

The binding structure of create is like let*: columns can depend on those coming in previous clauses, but not the other way around.
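As a rough analogy in plain Racket, two dependent clauses behave like two let* bindings over vectors. The column names total and doubled and the input vectors are assumptions made for this sketch; the create form in the comment only illustrates the clause syntax given above.

```racket
#lang racket/base
(require racket/vector)

;; Hypothetical columns standing in for a data frame.
(define adult (vector 1 2 3))
(define juv   (vector 10 20 30))

;; Sketch of:
;;   (create df
;;           [total   (adult juv) (+ adult juv)]
;;           [doubled (total)     (* 2 total)])
;; The second clause reads the column produced by the first,
;; just as a later let* binding can read an earlier one.
(define doubled
  (let* ([total   (vector-map + adult juv)]
         [doubled (vector-map (λ (t) (* 2 t)) total)])
    doubled))
```

Swapping the two clauses would not work, because doubled would then refer to a column that does not exist yet.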

If every bound variable in a given column specification is of type vector, it is expected that body produces a vector of the same length as all other columns. Otherwise, it is expected that body produces some quantity, and it will be mapped over every column specified by variables of type element.
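The mixed case, with one element variable and one vector variable in the same clause, can be sketched as follows. Again this is plain Racket that mimics the described behavior under the assumption of two hypothetical columns; the column name ratio is made up for the illustration.

```racket
#lang racket/base
(require racket/vector)

;; Hypothetical columns standing in for a data frame.
(define adult (vector 1 2 3))
(define juv   (vector 10 20 30))

;; Sum of all elements of a vector.
(define (sum vec)
  (for/sum ([v (in-vector vec)])
    v))

;; Sketch of [ratio (adult [juv : vector]) (/ adult (sum juv))]:
;; body is mapped over the element-typed column adult only,
;; while juv stays available as the whole column in every row.
(define ratio
  (vector-map (λ (adult) (/ adult (sum juv))) adult))
```

Each row of ratio divides that row's adult value by the sum of the entire juv column, giving 1/60, 1/30 and 1/20.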

> (define (v/ vec c) (vector-map (λ (v) (/ v c)) vec))
> (define (sum vec)
    (for/sum ([v (in-vector vec)])
      v))
> (~> example-df
      (create [total (adult juv) (+ adult juv)]
              [grp (grp) (string-append "blerg" grp)]
              [freq ([juv : vector]) (v/ juv (sum juv))]))

data-frame: 5 rows x 6 columns
│grp   │freq│trt│adult│juv│total│
│blerga│1/15│b  │1    │10 │11   │
│blerga│2/15│b  │2    │20 │22   │
│blergb│1/5 │a  │3    │30 │33   │
│blergb│4/15│b  │4    │40 │44   │
│blergb│1/3 │b  │5    │50 │55   │



(rename df from to ...) → (or/c data-frame? grouped-data-frame?)

  df : (or/c data-frame? grouped-data-frame?)
  from : string?
  to : string?
Returns df, except that each column named from is renamed to the corresponding to. Arguments after df are given as from/to pairs.

> (~> example-df
      (rename "grp" "waldo"
              "trt" "warbly"))

data-frame: 5 rows x 4 columns
│adult│waldo│juv│warbly│
│1    │a    │10 │b     │
│2    │a    │20 │b     │
│3    │b    │30 │a     │
│4    │b    │40 │b     │
│5    │b    │50 │b     │