Event data modeling in ClickHouse requires understanding its unique architecture. Unlike traditional row-based databases, ClickHouse is a columnar OLAP database optimized for analytical queries on large datasets. Getting your table design right from the start will determine whether your queries return in milliseconds or minutes.
This guide covers the essential concepts for modeling event data in ClickHouse, from table engines to query optimization strategies.
Understanding ClickHouse Table Engines
ClickHouse offers multiple table engines, but for event analytics, the MergeTree family is essential. Each engine variant serves different use cases.
MergeTree
The foundational engine for most analytical workloads. It provides:
- Primary key indexing: Sparse index that stores pointers to granules (blocks of 8,192 rows by default) for fast data location
- Data partitioning: Physical separation of data by partition key
- Data compression: Column-level compression with various codecs
- Background merges: Automatic optimization of data parts
CREATE TABLE events (
event_id UUID,
event_time DateTime,
event_date Date,
user_id String,
event_type LowCardinality(String),
properties String,
processed_at DateTime DEFAULT now()
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, user_id, event_time)
TTL event_date + INTERVAL 2 YEAR;
ReplacingMergeTree
Use when you need to deduplicate data based on the sorting key. Important caveats to understand:
- Deduplication occurs during background merges, not at insert time
- Until merges complete, duplicate rows may appear in query results
- Use the FINAL modifier on SELECT to get deduplicated results at query time
- Deduplication does not occur across different partitions
- A version column helps determine which row to keep (highest version wins)
CREATE TABLE user_profiles (
user_id String,
updated_at DateTime,
email String,
plan LowCardinality(String),
properties String
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;
-- Query with FINAL to ensure deduplication
SELECT * FROM user_profiles FINAL WHERE user_id = 'user_123';
AggregatingMergeTree
Pre-aggregates data during merges for faster analytical queries. Requires special handling:
- Use AggregateFunction data types to store intermediate aggregation states
- Insert data using -State aggregate function variants (e.g., sumState, uniqState)
- Query data using -Merge function variants (e.g., sumMerge, uniqMerge)
- Best when it reduces row count by orders of magnitude
- Commonly used with materialized views for real-time dashboards
-- Target table for aggregated data
CREATE TABLE events_hourly (
hour DateTime,
event_type LowCardinality(String),
event_count SimpleAggregateFunction(sum, UInt64),
unique_users AggregateFunction(uniq, String)
)
ENGINE = AggregatingMergeTree()
ORDER BY (event_type, hour);
-- Materialized view to populate it
CREATE MATERIALIZED VIEW events_hourly_mv TO events_hourly AS
SELECT
toStartOfHour(event_time) AS hour,
event_type,
count() AS event_count,
uniqState(user_id) AS unique_users
FROM events
GROUP BY hour, event_type;
-- Query using -Merge functions
SELECT hour, event_type, sum(event_count), uniqMerge(unique_users)
FROM events_hourly
GROUP BY hour, event_type;
Designing Your Primary Key and Sorting Order
The ORDER BY clause defines the physical sort order of data on disk. By default, it also becomes the primary key (sparse index). You can optionally define a separate PRIMARY KEY that must be a prefix of the ORDER BY expression.
Key Selection Principles
- Put low-cardinality columns first: Columns with fewer unique values should come before high-cardinality ones, as this maximizes index efficiency
- Match your query patterns: The leftmost columns in ORDER BY should be your most common filter conditions
- Use 3-5 columns: A good ORDER BY typically has 3-5 columns; more rarely improves performance
- Timestamp placement: Two common patterns exist:
  - (..., timestamp) – timestamp last, for queries that filter by other dimensions first
  - (..., toStartOfDay(timestamp), ..., timestamp) – rounded time in the middle, for queries that access small time ranges within larger partitions
- Consider compression: For clickstream data, ordering by session_id can improve compression even though it is higher cardinality, because related events cluster together
Example: Event Analytics Table
-- Good: Low cardinality first, matches common queries
ORDER BY (event_type, user_id, event_time)
-- Query that benefits from this order:
SELECT count(*) FROM events
WHERE event_type = 'Page Viewed'
AND user_id = 'user_123'
AND event_time >= '2025-01-01';
-- Separate PRIMARY KEY for smaller index (advanced use case)
CREATE TABLE events (...)
ENGINE = MergeTree()
ORDER BY (event_type, user_id, event_time)
PRIMARY KEY (event_type, user_id);
Common Anti-Patterns
- UUID first: Putting high-cardinality UUID columns first destroys index efficiency
- Timestamp only: ORDER BY event_time alone limits filtering options for non-time queries
- Too many columns: More than 5 columns rarely improves performance and increases index size
- Skipping leftmost columns: Queries that filter on columns not at the start of ORDER BY cannot efficiently use the sparse index
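You can check whether a query actually uses the sparse index with EXPLAIN. A minimal sketch against the events table defined above (exact output format varies by ClickHouse version):

```sql
-- Shows how many parts/granules the primary key prunes for this query
EXPLAIN indexes = 1
SELECT count()
FROM events
WHERE event_type = 'Page Viewed'
  AND user_id = 'user_123';

-- Skipping the leftmost ORDER BY column forces a much wider scan
EXPLAIN indexes = 1
SELECT count()
FROM events
WHERE user_id = 'user_123';
```

Comparing the granule counts in the two plans makes the cost of skipping leftmost columns concrete.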
Partitioning Strategies
Partitioning physically separates data into independent parts. Important: Think of partitioning primarily as a data management tool, not a query optimization feature. Parts within different partitions are never merged together.
Time-Based Partitioning
Most event data benefits from monthly partitions:
-- Monthly partitions (recommended for most cases)
PARTITION BY toYYYYMM(event_date)
-- Daily partitions (high-volume scenarios, short retention)
PARTITION BY event_date
-- Weekly partitions
PARTITION BY toMonday(event_date)
-- Yearly partitions (long retention, lower volume)
PARTITION BY toYear(event_date)
Partition Size Guidelines
- Target size: 1-300 GB per partition (not row count)
- Partition count: Keep in the dozens to hundreds, not thousands
- Too many partitions: Causes "Too many parts" errors since parts don't merge across partitions
- Small tables: Tables under 10 GB typically don't need partitioning at all
- Query access: A SELECT query should not touch more than a few dozen partitions
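To check whether your partitions fall inside these guidelines, inspect system.parts. A sketch, assuming the events table above:

```sql
-- Per-partition part counts and on-disk size for the events table
SELECT
    partition,
    count() AS parts,
    formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active AND table = 'events'
GROUP BY partition
ORDER BY partition;
```

Skewed sizes or partitions with large part counts are early warnings before "Too many parts" errors appear.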
Multi-Dimensional Partitioning
For multi-tenant scenarios, combine time with tenant (use with caution to avoid too many partitions):
-- Only if workspace_id has low cardinality (tens, not thousands)
PARTITION BY (workspace_id, toYYYYMM(event_date))
Compression and Data Types
Choosing the right data types and compression codecs significantly impacts storage and query performance.
Optimal Data Types
- LowCardinality(String): For columns with fewer than ~10,000 unique values (event types, countries, status codes). Performance may degrade with over 100,000 distinct values
- Enum8/Enum16: For fixed, known sets of values that rarely change. More storage-efficient than LowCardinality but requires schema changes to add values
- FixedString(N): For fixed-length values like UUIDs (36 chars) or ISO country codes
- DateTime vs DateTime64: Use DateTime64 only when you need sub-second precision
- Nullable: Avoid when possible; use default values instead. Nullable columns require an additional column to track null states, impacting performance
- Smallest integer type: Use UInt8, UInt16, etc. based on actual value ranges, not convenience
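Applying these guidelines to a hypothetical sessions table might look like this (column names and value ranges are illustrative):

```sql
CREATE TABLE sessions (
    session_id UUID,
    started_at DateTime,                -- second precision is enough here
    country FixedString(2),             -- ISO 3166-1 alpha-2 codes
    device_type LowCardinality(String), -- handful of distinct values
    http_status UInt16,                 -- covers all status codes
    page_views UInt16,                  -- realistic upper bound
    referrer String DEFAULT ''          -- default value instead of Nullable
)
ENGINE = MergeTree()
ORDER BY (device_type, country, started_at);
```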
Compression Codecs
ClickHouse uses LZ4 by default (ZSTD is the default in ClickHouse Cloud). Specialized codecs can be combined with general-purpose compression:
CREATE TABLE events (
event_id UUID CODEC(ZSTD(1)),
event_time DateTime CODEC(Delta, ZSTD(1)),
sequence_id UInt64 CODEC(DoubleDelta, LZ4),
user_id String CODEC(ZSTD(3)),
event_type LowCardinality(String),
metrics Float64 CODEC(Gorilla, ZSTD(1)),
flags UInt8 CODEC(T64, ZSTD(1))
);
Codec recommendations:
- Timestamps: Delta + ZSTD for excellent compression on sequential times
- Monotonic integers: DoubleDelta + LZ4 for counters or sequential IDs
- Floats/metrics: Gorilla + ZSTD for time-series gauge data with small changes
- Strings: ZSTD with compression level 1-3 (higher levels rarely provide significant gains)
- Small integers: T64 + ZSTD when values use only a portion of the type's range
- Unknown patterns: ZSTD alone is a strong general-purpose default
Materialized Views for Pre-Aggregation
ClickHouse supports two types of materialized views: incremental (real-time, triggered on INSERT) and refreshable (scheduled periodic refresh). Incremental views are preferred for most use cases.
Creating an Incremental Materialized View
-- Source table receives raw events
CREATE TABLE events_raw (
event_time DateTime,
user_id String,
event_type LowCardinality(String),
page_path String
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_type, event_time);
-- Explicitly create the target table first (recommended)
CREATE TABLE events_daily (
date Date,
event_type LowCardinality(String),
event_count UInt64,
unique_users UInt64
) ENGINE = SummingMergeTree()
ORDER BY (date, event_type);
-- Materialized view with TO clause
CREATE MATERIALIZED VIEW events_daily_mv
TO events_daily
AS SELECT
toDate(event_time) AS date,
event_type,
count() AS event_count,
-- Caution: SummingMergeTree sums unique_users across merges, which can
-- overcount users appearing in multiple insert blocks; for exact uniques
-- use AggregatingMergeTree with uniqState instead
uniqExact(user_id) AS unique_users
FROM events_raw
GROUP BY date, event_type;
Refreshable Materialized Views
For complex queries, joins across multiple tables, or when slight staleness is acceptable:
CREATE MATERIALIZED VIEW top_pages_mv
REFRESH EVERY 1 HOUR
TO top_pages
AS SELECT
page_path,
count() AS views
FROM events_raw
WHERE event_time >= now() - INTERVAL 24 HOUR
GROUP BY page_path
ORDER BY views DESC
LIMIT 100;
Best Practices
- Use the TO clause to specify an explicit target table for better control and maintainability
- Avoid POPULATE when creating views on large tables – it can miss rows inserted during population. Instead, backfill manually after creation
- Keep transformations simple to avoid ingestion bottlenecks
- The view triggers only on INSERTs to the leftmost table in the SELECT – it doesn't react to updates or deletes
- Use SummingMergeTree for sum/count aggregations, AggregatingMergeTree for complex aggregations
- Monitor materialized view lag during high-volume periods via system.query_views_log
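For that monitoring, system.query_views_log records one row per view execution per triggering insert (it must be enabled via the query_views_log server setting). A sketch; column availability varies by version:

```sql
-- Slowest materialized view executions today
SELECT
    view_name,
    status,
    view_duration_ms,
    written_rows
FROM system.query_views_log
WHERE event_date = today()
ORDER BY view_duration_ms DESC
LIMIT 10;
```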
Data Skipping Indexes
Beyond the primary key, ClickHouse supports secondary data skipping indexes that help filter granules for columns not in the ORDER BY:
CREATE TABLE events (
event_id UUID,
event_time DateTime,
user_id String,
event_type LowCardinality(String),
error_code Nullable(UInt16),
request_id String,
INDEX idx_error_code error_code TYPE minmax GRANULARITY 4,
INDEX idx_request_id request_id TYPE bloom_filter(0.01) GRANULARITY 4
) ENGINE = MergeTree()
ORDER BY (event_type, event_time);
Index types:
- minmax: Stores min/max values per granule group. Good for range queries
- set(N): Stores up to N unique values per granule group
- bloom_filter: Probabilistic filter for point lookups on high-cardinality columns
- ngrambf_v1: N-gram bloom filter for LIKE queries on strings
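Skipping indexes added with ALTER TABLE apply only to newly written parts; existing data must be rebuilt explicitly:

```sql
-- Add a bloom filter index to an existing table
ALTER TABLE events
ADD INDEX idx_request_id request_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- Rebuild the index for parts written before the ALTER (runs as a mutation)
ALTER TABLE events MATERIALIZE INDEX idx_request_id;
```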
Query Optimization Techniques
Even with good table design, query optimization matters. Here are key techniques for analytical queries.
Use PREWHERE for Heavy Filters
PREWHERE filters data before reading other columns, reducing I/O. ClickHouse often applies this automatically, but you can make it explicit:
-- PREWHERE filters before reading other columns
SELECT user_id, properties
FROM events
PREWHERE event_type = 'Purchase Completed'
WHERE event_time >= '2025-01-01';
Use FINAL for ReplacingMergeTree
When querying ReplacingMergeTree tables, use FINAL to ensure deduplicated results:
-- Without FINAL: may return duplicate rows
SELECT * FROM user_profiles WHERE user_id = 'user_123';
-- With FINAL: guaranteed deduplication (slower)
SELECT * FROM user_profiles FINAL WHERE user_id = 'user_123';
Sampling for Approximate Results
Note that SAMPLE only works on tables that declare a SAMPLE BY clause (e.g., SAMPLE BY cityHash64(user_id), with that expression included in the primary key):
-- Fast approximate counts on large datasets
SELECT event_type, count() * 10 AS estimated_count
FROM events SAMPLE 0.1
GROUP BY event_type;
Approximate vs Exact Functions
-- Exact (slower, more memory)
SELECT uniqExact(user_id) FROM events;
-- Approximate (10-100x faster, ~2% error margin)
SELECT uniq(user_id) FROM events;
Avoiding Common Performance Issues
- SELECT *: Always specify only needed columns
- DISTINCT on large sets: Use uniq() or uniqExact() instead
- ORDER BY without LIMIT: Always pair sorting with limits
- Functions on indexed columns: WHERE toDate(event_time) = '2025-01-01' prevents index use; prefer WHERE event_time >= '2025-01-01' AND event_time < '2025-01-02'
- Nullable columns in GROUP BY: Avoid when possible; it impacts aggregation performance
Efficient Joins
-- Use dictionaries for dimension lookups (fastest)
SELECT
e.event_type,
dictGet('users_dict', 'email', e.user_id) AS user_email
FROM events e
WHERE e.event_time >= today() - 7;
-- For large joins, put the smaller table on the right
SELECT e.*, u.plan
FROM events e
INNER JOIN users u ON e.user_id = u.user_id;
-- Consider denormalization for read-heavy workloads
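The users_dict dictionary referenced above has to be created first. One possible definition, assuming a users table as the source (since the key is a String, a complex-key layout is required):

```sql
CREATE DICTIONARY users_dict (
    user_id String,
    email String
)
PRIMARY KEY user_id
SOURCE(CLICKHOUSE(TABLE 'users'))
LAYOUT(COMPLEX_KEY_HASHED())
LIFETIME(MIN 300 MAX 600);
```

The LIFETIME range controls how often ClickHouse refreshes the dictionary from its source table.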
Projections as an Alternative
Projections store pre-computed query results within the same table, automatically selected by the query optimizer:
CREATE TABLE events (
event_time DateTime,
user_id String,
event_type LowCardinality(String),
page_path String,
-- Define a projection for queries grouped by user
PROJECTION events_by_user (
SELECT * ORDER BY user_id, event_time
)
) ENGINE = MergeTree()
ORDER BY (event_type, event_time);
Projections vs. Materialized Views:
- Projections are stored within the same table and automatically maintained
- The optimizer automatically chooses projections when beneficial
- Materialized views offer more flexibility and can transform data
- Projections add storage overhead proportional to the data they store
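Projections can also be added to an existing table; like skipping indexes, they cover only new parts until materialized:

```sql
-- Add a projection sorted by user for per-user queries
ALTER TABLE events ADD PROJECTION events_by_user (
    SELECT * ORDER BY user_id, event_time
);

-- Build the projection for existing parts (runs as a mutation)
ALTER TABLE events MATERIALIZE PROJECTION events_by_user;
```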
Schema Evolution Strategies
Event schemas evolve over time. Plan for changes from the start.
Adding Columns
-- Add new column with default value
ALTER TABLE events
ADD COLUMN campaign_id String DEFAULT '';
-- Backfill if needed (mutation - use sparingly)
ALTER TABLE events
UPDATE campaign_id = extractURLParameter(page_url, 'utm_campaign')
WHERE campaign_id = '';
Flexible Properties Column
Store variable properties in a JSON column for flexibility:
CREATE TABLE events (
event_id UUID,
event_time DateTime,
event_type LowCardinality(String),
-- Structured fields for common queries
user_id String,
page_path String,
-- Flexible JSON for variable properties
properties String
) ENGINE = MergeTree()
ORDER BY (event_type, event_time);
-- Query JSON properties
SELECT JSONExtractString(properties, 'browser') AS browser
FROM events
WHERE event_type = 'Page Viewed';
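If a JSON property becomes a frequent filter, it can be promoted to a real column without changing the ingestion pipeline. A sketch, assuming the properties format above:

```sql
-- New column is computed from properties at insert time
ALTER TABLE events
ADD COLUMN browser LowCardinality(String)
MATERIALIZED JSONExtractString(properties, 'browser');

-- Backfill existing parts (mutation - use sparingly)
ALTER TABLE events MATERIALIZE COLUMN browser;
```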
Monitoring and Maintenance
Keep your ClickHouse tables healthy with regular monitoring.
Key Metrics to Watch
- Active parts count: SELECT count() FROM system.parts WHERE active AND table = 'events'
- Partition sizes: Monitor for skewed distributions
- Query performance: Use system.query_log for slow query analysis
- Merge health: Ensure background merges keep up with inserts
- Compression ratios: Check via system.parts_columns
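Per-column compression ratios can be computed from system.parts_columns; a sketch for the events table:

```sql
-- Compressed vs uncompressed size per column, largest columns first
SELECT
    column,
    formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
    formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed,
    round(sum(column_data_uncompressed_bytes) / sum(column_data_compressed_bytes), 2) AS ratio
FROM system.parts_columns
WHERE active AND table = 'events'
GROUP BY column
ORDER BY sum(column_data_compressed_bytes) DESC;
```

Columns with poor ratios are the first candidates for a specialized codec from the table above.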
Maintenance Tasks
-- Force merge of small parts (use sparingly, resource-intensive)
OPTIMIZE TABLE events FINAL;
-- Drop old partitions (fast operation)
ALTER TABLE events DROP PARTITION '202301';
-- Check table health
SELECT
table,
partition,
count() AS parts,
sum(rows) AS rows,
formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active AND database = 'default'
GROUP BY table, partition
ORDER BY parts DESC;
Summary
Effective event data modeling in ClickHouse requires:
- Choosing the right MergeTree engine variant (MergeTree for raw data, ReplacingMergeTree for deduplication, AggregatingMergeTree for pre-aggregation)
- Designing ORDER BY with 3-5 columns, low cardinality first, matching your query patterns
- Partitioning by time for data management (dozens to hundreds of partitions, not thousands)
- Using optimal data types (LowCardinality, smallest integer types, avoiding Nullable)
- Applying appropriate compression codecs (Delta for timestamps, ZSTD as general-purpose)
- Leveraging materialized views for pre-aggregation, with explicit TO tables
- Understanding that ReplacingMergeTree requires FINAL for guaranteed deduplication
- Applying query optimization techniques (PREWHERE, approximate functions, proper column selection)
Invest time in upfront design, test with realistic data volumes, and iterate based on actual query patterns. A well-modeled ClickHouse database will handle billions of events while returning results in milliseconds.