ClickHouse configurations
Models
Type | Supported? | Details |
---|---|---|
view materialization | YES | Creates a view. |
table materialization | YES | Creates a table. See below for the list of supported engines. |
incremental materialization | YES | Creates a table if it doesn't exist, and then writes only updates to it. |
ephemeral materialization | YES | Creates an ephemeral/CTE materialization. The model is internal to dbt and does not create any database objects. |
Experimental models
The following are experimental features in ClickHouse:
Type | Supported? | Details |
---|---|---|
Materialized View materialization | YES, Experimental | Creates a materialized view. |
Distributed table materialization | YES, Experimental | Creates a distributed table. |
Distributed incremental materialization | YES, Experimental | Incremental model based on the same idea as the distributed table. Note that not all strategies are supported; see the dbt-clickhouse documentation for more info. |
Dictionary materialization | YES, Experimental | Creates a dictionary. |
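These experimental types are configured the same way as the stable materializations. As a minimal sketch (the model name and source are hypothetical, and the materialized_view identifier is an assumption, not confirmed by this page):

```sql
-- models/daily_visits_mv.sql  (hypothetical model)
-- "materialized_view" is an assumed identifier for the experimental materialization
{{ config(materialized = "materialized_view") }}

select
    toDate(visit_time) as visit_date,
    count() as visits
from {{ source('raw', 'page_views') }}  -- hypothetical source
group by visit_date
```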
View materialization
A dbt model can be created as a ClickHouse view and configured using the following syntax:
- Project file

```yaml
models:
  <resource-path>:
    +materialized: view
```

- Config block

```sql
{{ config(materialized = "view") }}
```
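For example, a complete view model might look like this (model name, source, and columns are hypothetical):

```sql
-- models/active_users.sql
{{ config(materialized = "view") }}

select id, name, signup_date
from {{ source('raw', 'users') }}  -- hypothetical source
where active = 1
```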
Table materialization
A dbt model can be created as a ClickHouse table and configured using the following syntax:
- Project file

```yaml
models:
  <resource-path>:
    +materialized: table
    +order_by: [ <column-name>, ... ]
    +engine: <engine-type>
    +partition_by: [ <column-name>, ... ]
```

- Config block

```sql
{{ config(
    materialized = "table",
    engine = "<engine-type>",
    order_by = [ "<column-name>", ... ],
    partition_by = [ "<column-name>", ... ],
    ...
) }}
```
Table configuration
Option | Description | Required? |
---|---|---|
materialized | How the model will be materialized into ClickHouse. Must be table to create a table model. | Required |
engine | The table engine to use when creating tables. See list of supported engines below. | Optional (default: MergeTree() ) |
order_by | A tuple of column names or arbitrary expressions. This allows you to create a small sparse index that helps find data faster. | Optional (default: tuple() ) |
partition_by | A partition is a logical combination of records in a table by a specified criterion. The partition key can be any expression from the table columns. | Optional |
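Putting these options together, a table model might look like the following sketch (model name, source, and columns are hypothetical):

```sql
-- models/events_by_day.sql
{{ config(
    materialized = "table",
    engine = "MergeTree()",
    order_by = ["event_date", "event_type"],
    partition_by = ["toYYYYMM(event_date)"]  -- one partition per month
) }}

select
    toDate(event_time) as event_date,
    event_type,
    count() as event_count
from {{ source('raw', 'events') }}  -- hypothetical source
group by event_date, event_type
```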
Incremental materialization
A table model is rebuilt from scratch on every dbt run. This can be infeasible and extremely costly for larger result sets or complex transformations. To address this challenge and reduce build time, a dbt model can instead be created as an incremental ClickHouse table, configured using the following syntax:
- Project file

```yaml
models:
  <resource-path>:
    +materialized: incremental
    +order_by: [ <column-name>, ... ]
    +engine: <engine-type>
    +partition_by: [ <column-name>, ... ]
    +unique_key: [ <column-name>, ... ]
    +inserts_only: [ True|False ]
```

- Config block

```sql
{{ config(
    materialized = "incremental",
    engine = "<engine-type>",
    order_by = [ "<column-name>", ... ],
    partition_by = [ "<column-name>", ... ],
    unique_key = [ "<column-name>", ... ],
    inserts_only = [ True|False ],
    ...
) }}
```
Incremental table configuration
Option | Description | Required? |
---|---|---|
materialized | How the model will be materialized into ClickHouse. Must be incremental to create an incremental table model. | Required |
unique_key | A tuple of column names that uniquely identify rows. For more details on uniqueness constraints, see here. | Required. If not provided, altered rows will be added twice to the incremental table. |
engine | The table engine to use when creating tables. See list of supported engines below. | Optional (default: MergeTree() ) |
order_by | A tuple of column names or arbitrary expressions. This allows you to create a small sparse index that helps find data faster. | Optional (default: tuple() ) |
partition_by | A partition is a logical combination of records in a table by a specified criterion. The partition key can be any expression from the table columns. | Optional |
inserts_only | (Deprecated, see the append materialization strategy). If True, incremental updates will be inserted directly to the target incremental table without creating an intermediate table. | Optional (default: False ) |
incremental_strategy | The strategy to use for incremental materialization. delete+insert , append and insert_overwrite (experimental) are supported. For additional details on strategies, see here | Optional (default: 'default') |
incremental_predicates | Incremental predicate clause to be applied to delete+insert materializations | Optional |
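Combining these options, an incremental model typically uses dbt's standard is_incremental() macro so that incremental runs only scan new rows. A sketch, with hypothetical model, source, and column names:

```sql
-- models/orders_incremental.sql
{{ config(
    materialized = "incremental",
    engine = "MergeTree()",
    order_by = ["id"],
    unique_key = "id",
    incremental_strategy = "delete+insert"
) }}

select id, status, updated_at
from {{ source('raw', 'orders') }}  -- hypothetical source
{% if is_incremental() %}
  -- on incremental runs, only process rows newer than the current max in the target
  where updated_at > (select max(updated_at) from {{ this }})
{% endif %}
```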
Snapshot
dbt snapshots allow a record to be made of changes to a mutable model over time. This in turn allows point-in-time queries on models, where analysts can “look back in time” at the previous state of a model. This functionality is supported by the ClickHouse connector and is configured using the following syntax:
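A minimal sketch using the standard dbt snapshot block (the snapshot, source, and column names are hypothetical; depending on your dbt version, additional configuration such as target_schema may also be required):

```sql
-- snapshots/orders_snapshot.sql
{% snapshot orders_snapshot %}

{{
    config(
        unique_key = 'id',
        strategy = 'timestamp',
        updated_at = 'updated_at',
    )
}}

select * from {{ source('raw', 'orders') }}  -- hypothetical source

{% endsnapshot %}
```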
For more information on configuration, check out the snapshot configs reference page.
Supported table engines
Experimental supported table engines
Type | Details |
---|---|
Distributed Table | https://clickhouse.com/docs/en/engines/table-engines/special/distributed |
Dictionary | https://clickhouse.com/docs/en/engines/table-engines/special/dictionary |
If you encounter issues connecting to ClickHouse from dbt with one of the above engines, please report an issue here.
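As a sketch of how the experimental Distributed engine support might be used (the distributed_table identifier and the sharding_key parameter are assumptions, as are the model and source names):

```sql
-- models/events_distributed.sql
{{ config(
    materialized = "distributed_table",  -- assumed materialization identifier
    order_by = ["id"],
    sharding_key = "cityHash64(id)"      -- assumed parameter: sharding expression for the Distributed engine
) }}

select id, event_type, event_time
from {{ source('raw', 'events') }}  -- hypothetical source
```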
Cross database macro support
dbt-clickhouse supports most of the cross database macros now included in dbt-core, with the following exceptions:
- The
split_part
SQL function is implemented in ClickHouse using the splitByChar function. This function requires using a constant string for the "split" delimiter, so thedelimeter
parameter used for this macro will be interpreted as a string, not a column name - Similarly, the
replace
SQL function in ClickHouse requires constant strings for theold_chars
andnew_chars
parameters, so those parameters will be interpreted as strings rather than column names when invoking this macro.
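To illustrate the constant-string constraint, here is a hedged example invoking both macros with dbt-core's standard cross-database macro signatures (column and source names are hypothetical):

```sql
-- the delimiter/old_chars/new_chars arguments must be string literals,
-- while the first argument may still reference a column
select
    {{ dbt.split_part(string_text='full_name', delimiter_text="' '", part_number=1) }} as first_name,
    {{ dbt.replace(field='phone', old_chars="'-'", new_chars="''") }} as phone_digits
from {{ source('raw', 'users') }}  -- hypothetical source
```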
Setting quote_columns
To prevent a warning, make sure to explicitly set a value for quote_columns in your dbt_project.yml. See the doc on quote_columns for more information.
```yaml
seeds:
  +quote_columns: false  # or `true` if you have CSV column headers with spaces
```