Aggregate functions
This page describes the available functions to assist with performing aggregate calculations.
avg#
avg(value) calculates simple average of values ignoring missing data (e.g
null values).
Arguments:
valueis any numeric value.
Return value:
Return value type is double.
Examples:
| avg |
|---|
| 22.4 |
| cash_or_card | avg |
|---|---|
| cash | 22.1 |
| card | 27.4 |
| null | 18.02 |
count#
count() or count(*) - counts rows irrespective of underlying data.
Arguments:
countdoes not require arguments.
Return value:
Return value type is long.
Examples:
- Count of rows in the transactions table.
| count |
|---|
| 100 |
- Count of rows in the transactions table aggregated by
payment_typevalue.
| cash_or_card | count |
|---|---|
| cash | 25 |
| card | 70 |
| null | 5 |
note
null values are aggregated with count().
count_distinct#
count_distinct(column_name) - counts distinct values in string, symbol,
long256, long, or int columns.
Return value:
Return value type is long.
Examples:
- Count of distinct sides in the transactions table. Side column can either be
BUYorSELLornull
| count_distinct |
|---|
| 2 |
- Count of distinct counterparties in the transactions table aggregated by
payment_typevalue.
| cash_or_card | count_distinct |
|---|---|
| cash | 3 |
| card | 23 |
| null | 5 |
note
null values are not counted in the count_distinct function.
first/last#
first(column_name)- returns the first value of a column.last(column_name)- returns the last value of a column.
Supported column datatype: double, float, integer, character, short,
byte, timestamp, date, long, geohash.
If a table has a designated timestamp,
then the first row is always the row with the lowest timestamp (oldest) and the
last row is always the one with the highest (latest) timestamp. For a table
without a designated timestamp column, first returns the first row and last
returns the last inserted row, regardless of any timestamp column.
Return value:
Return value type is string.
Examples:
Given a table sensors, which has a designated timestamp column:
| device_id | temperature | ts |
|---|---|---|
| arduino-01 | 12 | 2021-06-02T14:33:19.970258Z |
| arduino-02 | 10 | 2021-06-02T14:33:21.703934Z |
| arduino-03 | 18 | 2021-06-02T14:33:23.707013Z |
The following query returns oldest value for the device_id column:
| first |
|---|
| arduino-01 |
The following query returns the latest symbol value for the device_id column:
| last |
|---|
| arduino-03 |
Without selecting a designated timestamp column, the table may be unordered and
the query may return different result. Given an unordered table
sensors_unordered:
| device_id | temperature | ts |
|---|---|---|
| arduino-01 | 12 | 2021-06-02T14:33:19.970258Z |
| arduino-03 | 18 | 2021-06-02T14:33:23.707013Z |
| arduino-02 | 10 | 2021-06-02T14:33:21.703934Z |
The following query returns the first record for the device_id column:
| first |
|---|
| arduino-01 |
The following query returns the last record for the device_id column:
| last |
|---|
| arduino-02 |
haversine_dist_deg#
haversine_dist_deg(lat, lon, ts) - calculates the traveled distance for a
series of latitude and longitude points.
Arguments:
latis the latitude expressed as degrees in decimal format (double)lonis the longitude expressed as degrees in decimal format (double)tsis thetimestampfor the data point
Return value:
Return value type is double.
Examples:
ksum#
ksum(value) - adds values ignoring missing data (e.g null values). Values
are added using the
Kahan compensated sum algorithm.
This is only beneficial for floating-point values such as float or double.
Arguments:
valueis any numeric value.
Return value:
Return value type is the same as the type of the argument.
Examples:
| ksum |
|---|
| 52.79143968514029 |
max#
max(value) - returns the highest value ignoring missing data (e.g null
values).
Arguments:
valueis any numeric value
Return value:
Return value type is the same as the type of the argument.
Examples:
| max |
|---|
| 55.3 |
| cash_or_card | amount |
|---|---|
| cash | 31.5 |
| card | 55.3 |
| null | 29.2 |
min#
min(value) - returns the lowest value ignoring missing data (e.g null
values).
Arguments:
valueis any numeric value
Return value:
Return value type is the same as the type of the argument.
Examples:
| min |
|---|
| 12.5 |
| cash_or_card | min |
|---|---|
| cash | 12.5 |
| card | 15.3 |
| null | 22.2 |
nsum#
nsum(value) - adds values ignoring missing data (e.g null values). Values
are added using the
Neumaier sum algorithm.
This is only beneficial for floating-point values such as float or double.
Arguments:
valueis any numeric value.
Return value:
Return value type is double.
Examples:
| nsum |
|---|
| 49.5442334742831 |
stddev_samp#
stddev_samp(value) - calculates the sample standard deviation of values
ignoring missing data (e.g null values).
Arguments:
valueis any numeric value.
Return value:
Return value type is double.
Examples:
| stddev_samp |
|---|
| 29.011491975882 |
sum#
sum(value) - adds values ignoring missing data (e.g null values).
Arguments:
valueis any numeric value.
Return value:
Return value type is the same as the type of the argument.
Examples:
| sum |
|---|
| 100 |
| item | count |
|---|---|
| apple | 53 |
| orange | 47 |
Overflow#
sum does not perform overflow check. To avoid overflow, you can cast the
argument to wider type.