7 Summarizing

Summarizing reduces a data frame to a smaller one by collapsing whole columns into single values, such as summary statistics, using operations on vectors.


(aggregate df [new-column (bound-column ...) body ...] ...)

  df : (or/c data-frame? grouped-data-frame?)
Creates a new data frame containing the grouping columns of df (if any) and the new aggregated columns, and nothing else. You will usually want to call group-with first; on an ungrouped frame, every new column collapses to a single value, so the result has exactly one row.

Each clause specifies one new column: the column is named new-column, and its value is computed by the expressions in body.

The variables available in body are those named by bound-column. Unlike in create, each bound variable holds the entire column as a vector (for a grouped frame, the part of that column belonging to the current group). body is expected to produce a single value, the "aggregation" of that vector.
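The examples in this section call sum, which is not defined here. As a minimal sketch, assuming sum is an ordinary helper that adds up a vector of numbers, it could be written in plain Racket as follows. Both sum and mean are hypothetical helpers for illustration, not part of the library, and the vectors are made-up data.

```racket
#lang racket/base

;; Hypothetical helper (an assumption, not part of the library):
;; reduces a whole column, given as a vector of numbers, to one value.
(define (sum vec)
  (for/sum ([v (in-vector vec)])
    v))

;; Another reduction of the same shape: the arithmetic mean of a column.
(define (mean vec)
  (/ (sum vec) (vector-length vec)))

;; Each call takes an entire vector and returns a single number,
;; which is what a body in an aggregate clause is expected to do.
(sum (vector 1 2 3 4 5))
(mean (vector 10 20 30 40 50))
```

Any function from a vector to a single value fits the same slot, which is why the first example in this section can use vector-length directly as its body.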

The binding structure of aggregate is like let: every bound-column refers to a column of df, so a clause cannot refer to a column created by another clause in the same aggregate.

If the input is a grouped data frame, the last layer of grouping is implicitly removed after aggregating. For example, a frame grouped by "grp" and then "trt" is still grouped by grp afterward.
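To make the effect of grouping on the number of result rows concrete, here is a rough model in plain Racket. It is not how the library is implemented: aggregate-by is a hypothetical helper, rows are modelled as plain lists of (grp trt adult), and the data is illustrative rather than the example-df used in this section.

```racket
#lang racket/base
(require racket/list)

;; Illustrative rows (made-up data): each row is (grp trt adult).
(define rows
  '(("a" "x" 1)
    ("a" "y" 2)
    ("b" "x" 3)
    ("b" "y" 4)
    ("b" "y" 5)))

;; Hypothetical helper modelling grouping followed by aggregation:
;; split the rows by key, then reduce each group's adult values,
;; taken as one vector, to a single number.
(define (aggregate-by key rows)
  (for/list ([group (in-list (group-by key rows))])
    (define adult-column (list->vector (map third group)))
    (list (key (car group))
          (for/sum ([v (in-vector adult-column)]) v))))

;; No grouping: the whole column collapses into one row.
(aggregate-by (lambda (row) 'all) rows)

;; Grouped by grp: one row per distinct grp value.
(aggregate-by first rows)

;; Grouped by grp and trt: one row per distinct (grp trt) pair.
(aggregate-by (lambda (row) (list (first row) (second row))) rows)
```

With this data the three calls produce one, two, and four result rows respectively, mirroring why an ungrouped aggregate yields only a single value per new column.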

> (~> example-df
      (aggregate [sum (adult) (vector-length adult)])
      show)

data-frame: 1 rows x 1 columns

> (~> example-df
      (group-with "grp")
      (aggregate [adult-sum (adult) (sum adult)]
                 [juv-sum (juv) (sum juv)])
      show)

data-frame: 2 rows x 3 columns
│30     │3        │a  │
│120    │12       │b  │


> (~> example-df
      (group-with "grp" "trt")
      (aggregate [adult-sum (adult) (sum adult)]
                 [juv-sum (juv) (sum juv)])
      show)

data-frame: 3 rows x 4 columns
groups: (grp)
│a  │3        │30     │b  │
│b  │3        │30     │a  │
│b  │9        │90     │b  │