JOIN keyword
QuestDB supports the following types of joins: INNER, LEFT (OUTER), CROSS,
ASOF and SPLICE. FULL joins are not yet implemented and are on our
roadmap. All supported join types can be combined in a single SQL statement;
QuestDB SQL's optimizer determines the best execution order and algorithms.
There are no known limitations on size of tables or sub-queries used in joins and there are no limitations on number of joins either.
Syntax#
Columns from joined tables are combined in single row. Columns with the same name originating from different tables will be automatically aliased to create a unique column namespace of the resulting set.
Though it is usually preferable to explicitly specify join conditions, QuestDB
will analyze WHERE clauses for implicit join conditions and will derive
transient join conditions where necessary.
tip
When tables are joined on column that has the same name in both tables you can
use the ON (column) shorthand.
Execution order#
Join operations are performed in order of their appearance in a SQL query. The following query performs a join on a table with one million rows based on a column from a smaller table with one hundred rows:
The performance of this query can be improved by rewriting the query as follows:
Implicit joins#
It is possible to join two tables using the following syntax:
The type of join as well as the column will be inferred from the WHERE clause,
and may be either an INNER or CROSS join. For the example above, the
equivalent explicit statement would be:
(INNER) JOIN#
(INNER) JOIN is used to return rows from 2 tables where the records on the
compared column have matching values in both tables. JOIN is interpreted as
INNER JOIN by default, making the INNER keyword implicit.
The following query will return the movieId and the average rating from table
ratings. It will also add a column for the title from table movies. The
corresponding title will be identified based on the movieId in the ratings
table matching an id in the movies table.
LEFT (OUTER) JOIN#
LEFT OUTER JOIN or simply LEFT JOIN will return all records from the
left table, and if matched, the records of the right table. When there is no
match for the right table, it will return NULL values in right table fields.
The general syntax is as follows:
An OUTER JOIN query can also be used to select all rows in the left table that
do not exist in the right table.
CROSS JOIN#
CROSS JOIN will return the Cartesian product of the two tables being joined
and can be used to create a table with all possible combinations of columns. The
following query will return all possible combinations of starters and
deserts:
note
CROSS JOIN does not have an ON clause.
ASOF JOIN#
ASOF joins are used on time series data to join two tables based on timestamp
where timestamps do not exactly match. For a given record at a given timestamp,
it will return the corresponding record in the other table at the closest
timestamp prior to the timestamp in the first table.
note
To be able to leverage ASOF JOIN, both joined table must have a designated
timestamp column. To designate a column as timestamp, please refer to the
CREATE TABLE section.
ASOF join is performed on tables or result sets that are ordered by time. When
table is created as ordered by time order of records is enforced and timestamp
column name is in table metadata. ASOF join will use timestamp column from
metadata.
Given the following tables:
| ts | ask |
|---|---|
| 2019-10-17T00:00:00.000000Z | 100 |
| 2019-10-17T00:00:00.200000Z | 101 |
| 2019-10-17T00:00:00.400000Z | 102 |
| ts | bid |
|---|---|
| 2019-10-17T00:00:00.100000Z | 101 |
| 2019-10-17T00:00:00.300000Z | 102 |
| 2019-10-17T00:00:00.500000Z | 103 |
An ASOF JOIN query can look like the following:
The above query returns these results:
| timebid | bid | ask |
|---|---|---|
| 2019-10-17T00:00:00.100000Z | 101 | 100 |
| 2019-10-17T00:00:00.300000Z | 102 | 101 |
| 2019-10-17T00:00:00.500000Z | 103 | 102 |
Note that there is no ASKS at timestamp 2019-10-17T00:00:00.100000Z. The
ASOF JOIN will look for the value in the BIDS table that has the closest
timestamp prior to or equal to the target timestamp.
In case tables do not have designated timestamp column, but data is in chronological order, timestamp columns can be specified at runtime:
The query above assumes that there is only one instrument in BIDS and ASKS
tables and therefore does not use the optional ON clause. If both tables store
data for multiple instruments ON clause will allow you to find bids for asks
with matching instrument value:
caution
ASOF join does not check timestamp order, if data is not in chronological
order, the join result is non-deterministic.
LT JOIN#
LT join is very similar to ASOF, except that it searches for the last row
from the right table strictly before the row from the left table. There will be
one or no rows joined from the right table per each row from the left table.
Consider the following tables:
| ts | ask |
|---|---|
| 2019-10-17T00:00:00.000000Z | 100 |
| 2019-10-17T00:00:00.300000Z | 101 |
| 2019-10-17T00:00:00.400000Z | 102 |
| ts | bid |
|---|---|
| 2019-10-17T00:00:00.000000Z | 101 |
| 2019-10-17T00:00:00.300000Z | 102 |
| 2019-10-17T00:00:00.500000Z | 103 |
An LT JOIN can be built using the following query:
The query above returns the following results:
| timebid | timeask | bid | ask |
|---|---|---|---|
| 2019-10-17T00:00:00.000000Z | NULL | 101 | NULL |
| 2019-10-17T00:00:00.300000Z | 2019-10-17T00:00:00.000000Z | 102 | 100 |
| 2019-10-17T00:00:00.500000Z | 2019-10-17T00:00:00.400000Z | 103 | 102 |
note
LT join is often useful to join a table to itself in order to get preceding
values for every row.
SPLICE JOIN#
SPLICE JOIN is a full ASOF JOIN. It will return all the records from both
tables. For each record from left table splice join will find prevailing record
from right table and for each record from right table - prevailing record from
left table.
Considering the following tables.
| ts | ask |
|---|---|
| 2019-10-17T00:00:00.000000Z | 100 |
| 2019-10-17T00:00:00.200000Z | 101 |
| 2019-10-17T00:00:00.400000Z | 102 |
| ts | bid |
|---|---|
| 2019-10-17T00:00:00.100000Z | 101 |
| 2019-10-17T00:00:00.300000Z | 102 |
| 2019-10-17T00:00:00.500000Z | 103 |
A SPLICE JOIN can be built as follows:
This query returns the following results:
| timebid | bid | ask |
|---|---|---|
| null | null | 100 |
| 2019-10-17T00:00:00.100000Z | 101 | 100 |
| 2019-10-17T00:00:00.100000Z | 101 | 101 |
| 2019-10-17T00:00:00.300000Z | 102 | 101 |
| 2019-10-17T00:00:00.300000Z | 102 | 102 |
| 2019-10-17T00:00:00.500000Z | 103 | 102 |
Note that the above query does not use the optional ON clause. In case you
need additional filtering on the two tables, the ON clause can be used as
follows: