Event data modeling in ClickHouse requires understanding its unique architecture. Unlike traditional row-based databases, ClickHouse is a columnar OLAP database optimized for analytical queries on large datasets. Getting your table design right from the start will determine whether your queries return in milliseconds or minutes.
This guide covers the essential concepts for modeling event data in ClickHouse, from table engines to query optimization strategies.
Understanding ClickHouse Table Engines
ClickHouse offers multiple table engines, but for event analytics, the MergeTree family is essential. Each engine variant serves different use cases.
MergeTree
The foundational engine for most analytical workloads. It provides:
- Primary key indexing: Sparse index that stores pointers to granules (blocks of 8,192 rows by default) for fast data location
- Data partitioning: Physical separation of data by partition key
- Data compression: Column-level compression with various codecs
- Background merges: Automatic optimization of data parts
CREATE TABLE events (
event_id UUID,
event_time DateTime,
event_date Date,
user_id String,
event_type LowCardinality(String),
properties String,
processed_at DateTime DEFAULT now()
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_type, user_id, event_time)
TTL event_date + INTERVAL 2 YEAR;
ReplacingMergeTree
Use when you need to deduplicate data based on the sorting key. Important caveats to understand:
- Deduplication occurs during background merges, not at insert time
- Until merges complete, duplicate rows may appear in query results
- Use the FINAL modifier on SELECT to get deduplicated results at query time
- Deduplication does not occur across different partitions
- A version column helps determine which row to keep (highest version wins)
CREATE TABLE user_profiles (
user_id String,
updated_at DateTime,
email String,
plan LowCardinality(String),
properties String
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY user_id;
-- Query with FINAL to ensure deduplication
SELECT * FROM user_profiles FINAL WHERE user_id = 'user_123';
AggregatingMergeTree
Pre-aggregates data during merges for faster analytical queries. Requires special handling:
- Use AggregateFunction data types to store intermediate aggregation states
- Insert data using -State aggregate function variants (e.g., sumState, uniqState)
- Query data using -Merge function variants (e.g., sumMerge, uniqMerge)
- Best when it reduces row count by orders of magnitude
- Commonly used with materialized views for real-time dashboards
-- Target table for aggregated data
CREATE TABLE events_hourly (
hour DateTime,
event_type LowCardinality(String),
event_count SimpleAggregateFunction(sum, UInt64),
unique_users AggregateFunction(uniq, String)
)
ENGINE = AggregatingMergeTree()
ORDER BY (event_type, hour);
-- Materialized view to populate it
CREATE MATERIALIZED VIEW events_hourly_mv TO events_hourly AS
SELECT
toStartOfHour(event_time) AS hour,
event_type,
count() AS event_count,
uniqState(user_id) AS unique_users
FROM events
GROUP BY hour, event_type;
-- Query using -Merge functions
SELECT hour, event_type, sum(event_count), uniqMerge(unique_users)
FROM events_hourly
GROUP BY hour, event_type;
Designing Your Primary Key and Sorting Order
The ORDER BY clause defines the physical sort order of data on disk. By default, it also becomes the primary key (sparse index). You can optionally define a separate PRIMARY KEY that must be a prefix of the ORDER BY expression.
Key Selection Principles
- Put low-cardinality columns first: Columns with fewer unique values should come before high-cardinality ones, as this maximizes index efficiency
- Match your query patterns: The leftmost columns in ORDER BY should be your most common filter conditions
- Use 3-5 columns: A good ORDER BY typically has 3-5 columns; more rarely improves performance
- Timestamp placement: Two common patterns exist:
  - (..., timestamp) – timestamp last, for queries that filter by other dimensions first
  - (..., toStartOfDay(timestamp), ..., timestamp) – rounded time in the middle, for queries that access small time ranges within larger partitions
- Consider compression: For clickstream data, ordering by session_id can improve compression even though it is higher cardinality, because related events cluster together
Example: Event Analytics Table
-- Good: Low cardinality first, matches common queries
ORDER BY (event_type, user_id, event_time)
-- Query that benefits from this order:
SELECT count(*) FROM events
WHERE event_type = 'Page Viewed'
AND user_id = 'user_123'
AND event_time >= '2025-01-01';
-- Separate PRIMARY KEY for smaller index (advanced use case)
CREATE TABLE events (...)
ENGINE = MergeTree()
ORDER BY (event_type, user_id, event_time)
PRIMARY KEY (event_type, user_id);
Common Anti-Patterns
- UUID first: Putting high-cardinality UUID columns first destroys index efficiency
- Timestamp only: ORDER BY event_time alone limits filtering options for non-time queries
- Too many columns: More than 5 columns rarely improves performance and increases index size
- Skipping leftmost columns: Queries that filter on columns not at the start of ORDER BY cannot efficiently use the sparse index
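You can check whether a query actually uses the sparse index with EXPLAIN. A minimal sketch against the events table defined above (exact output format varies by ClickHouse version):

```sql
-- Shows how many parts/granules the primary key prunes for this query
EXPLAIN indexes = 1
SELECT count()
FROM events
WHERE event_type = 'Page Viewed'
  AND user_id = 'user_123';

-- Skipping the leftmost ORDER BY column forces a much wider scan
EXPLAIN indexes = 1
SELECT count()
FROM events
WHERE user_id = 'user_123';
```

Comparing the granule counts in the two plans makes the cost of skipping leftmost columns concrete.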
Partitioning Strategies
Partitioning physically separates data into independent parts. Important: Think of partitioning primarily as a data management tool, not a query optimization feature. Parts within different partitions are never merged together.
Time-Based Partitioning
Most event data benefits from monthly partitions:
-- Monthly partitions (recommended for most cases)
PARTITION BY toYYYYMM(event_date)
-- Daily partitions (high-volume scenarios, short retention)
PARTITION BY event_date
-- Weekly partitions
PARTITION BY toMonday(event_date)
-- Yearly partitions (long retention, lower volume)
PARTITION BY toYear(event_date)
Partition Size Guidelines
- Target size: 1-300 GB per partition (not row count)
- Partition count: Keep in the dozens to hundreds, not thousands
- Too many partitions: Causes "Too many parts" errors since parts don't merge across partitions
- Small tables: Tables under 10 GB typically don't need partitioning at all
- Query access: A SELECT query should not touch more than a few dozen partitions
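To check whether your partitions fall inside these guidelines, inspect system.parts. A sketch, assuming the events table above:

```sql
-- Per-partition part counts and on-disk size for the events table
SELECT
    partition,
    count() AS parts,
    formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active AND table = 'events'
GROUP BY partition
ORDER BY partition;
```

Skewed sizes or partitions with large part counts are early warnings before "Too many parts" errors appear.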
Multi-Dimensional Partitioning
For multi-tenant scenarios, combine time with tenant (use with caution to avoid too many partitions):
-- Only if workspace_id has low cardinality (tens, not thousands)
PARTITION BY (workspace_id, toYYYYMM(event_date))
Compression and Data Types
Choosing the right data types and compression codecs significantly impacts storage and query performance.
Optimal Data Types
- LowCardinality(String): For columns with fewer than ~10,000 unique values (event types, countries, status codes). Performance may degrade with over 100,000 distinct values
- Enum8/Enum16: For fixed, known sets of values that rarely change. More storage-efficient than LowCardinality but requires schema changes to add values
- FixedString(N): For fixed-length values like UUIDs (36 chars) or ISO country codes
- DateTime vs DateTime64: Use DateTime64 only when you need sub-second precision
- Nullable: Avoid when possible; use default values instead. Nullable columns require an additional column to track null states, impacting performance
- Smallest integer type: Use UInt8, UInt16, etc. based on actual value ranges, not convenience
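Applying these guidelines to a hypothetical sessions table might look like this (column names and value ranges are illustrative):

```sql
CREATE TABLE sessions (
    session_id UUID,
    started_at DateTime,                -- second precision is enough here
    country FixedString(2),             -- ISO 3166-1 alpha-2 codes
    device_type LowCardinality(String), -- handful of distinct values
    http_status UInt16,                 -- covers all status codes
    page_views UInt16,                  -- realistic upper bound
    referrer String DEFAULT ''          -- default value instead of Nullable
)
ENGINE = MergeTree()
ORDER BY (device_type, country, started_at);
```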
Compression Codecs
ClickHouse uses LZ4 by default (ZSTD is the default in ClickHouse Cloud). Specialized codecs can be combined with general-purpose compression:
CREATE TABLE events (
event_id UUID CODEC(ZSTD(1)),
event_time DateTime CODEC(Delta, ZSTD(1)),
sequence_id UInt64 CODEC(DoubleDelta, LZ4),
user_id String CODEC(ZSTD(3)),
event_type LowCardinality(String),
metrics Float64 CODEC(Gorilla, ZSTD(1)),
flags UInt8 CODEC(T64, ZSTD(1))
);
Codec recommendations:
- Timestamps: Delta + ZSTD for excellent compression on sequential times
- Monotonic integers: DoubleDelta + LZ4 for counters or sequential IDs
- Floats/metrics: Gorilla + ZSTD for time-series gauge data with small changes
- Strings: ZSTD with compression level 1-3 (higher levels rarely provide significant gains)
- Small integers: T64 + ZSTD when values use only a portion of the type's range
- Unknown patterns: ZSTD alone is a strong general-purpose default
Materialized Views for Pre-Aggregation
ClickHouse supports two types of materialized views: incremental (real-time, triggered on INSERT) and refreshable (scheduled periodic refresh). Incremental views are preferred for most use cases.
Creating an Incremental Materialized View
-- Source table receives raw events
CREATE TABLE events_raw (
event_time DateTime,
user_id String,
event_type LowCardinality(String),
page_path String
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (event_type, event_time);
-- Explicitly create the target table first (recommended)
CREATE TABLE events_daily (
date Date,
event_type LowCardinality(String),
event_count UInt64,
unique_users UInt64
) ENGINE = SummingMergeTree()
ORDER BY (date, event_type);
-- Materialized view with TO clause
CREATE MATERIALIZED VIEW events_daily_mv
TO events_daily
AS SELECT
toDate(event_time) AS date,
event_type,
count() AS event_count,
-- Caution: SummingMergeTree sums unique_users across merges, which can
-- overcount users appearing in multiple insert blocks; for exact uniques
-- use AggregatingMergeTree with uniqState instead
uniqExact(user_id) AS unique_users
FROM events_raw
GROUP BY date, event_type;
Refreshable Materialized Views
For complex queries, joins across multiple tables, or when slight staleness is acceptable:
CREATE MATERIALIZED VIEW top_pages_mv
REFRESH EVERY 1 HOUR
TO top_pages
AS SELECT
page_path,
count() AS views
FROM events_raw
WHERE event_time >= now() - INTERVAL 24 HOUR
GROUP BY page_path
ORDER BY views DESC
LIMIT 100;
Best Practices
- Use the TO clause to specify an explicit target table for better control and maintainability
- Avoid POPULATE when creating views on large tables – it can miss rows inserted during population. Instead, backfill manually after creation
- Keep transformations simple to avoid ingestion bottlenecks
- The view triggers only on INSERTs to the leftmost table in the SELECT – it doesn't react to updates or deletes
- Use SummingMergeTree for sum/count aggregations, AggregatingMergeTree for complex aggregations
- Monitor materialized view lag during high-volume periods via system.query_views_log
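For that monitoring, system.query_views_log records one row per view execution per triggering insert (it must be enabled via the query_views_log server setting). A sketch; column availability varies by version:

```sql
-- Slowest materialized view executions today
SELECT
    view_name,
    status,
    view_duration_ms,
    written_rows
FROM system.query_views_log
WHERE event_date = today()
ORDER BY view_duration_ms DESC
LIMIT 10;
```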
Data Skipping Indexes
Beyond the primary key, ClickHouse supports secondary data skipping indexes that help filter granules for columns not in the ORDER BY:
CREATE TABLE events (
event_id UUID,
event_time DateTime,
user_id String,
event_type LowCardinality(String),
error_code Nullable(UInt16),
request_id String,
INDEX idx_error_code error_code TYPE minmax GRANULARITY 4,
INDEX idx_request_id request_id TYPE bloom_filter(0.01) GRANULARITY 4
) ENGINE = MergeTree()
ORDER BY (event_type, event_time);
Index types:
- minmax: Stores min/max values per granule group. Good for range queries
- set(N): Stores up to N unique values per granule group
- bloom_filter: Probabilistic filter for point lookups on high-cardinality columns
- ngrambf_v1: N-gram bloom filter for LIKE queries on strings
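Skipping indexes added with ALTER TABLE apply only to newly written parts; existing data must be rebuilt explicitly:

```sql
-- Add a bloom filter index to an existing table
ALTER TABLE events
ADD INDEX idx_request_id request_id TYPE bloom_filter(0.01) GRANULARITY 4;

-- Rebuild the index for parts written before the ALTER (runs as a mutation)
ALTER TABLE events MATERIALIZE INDEX idx_request_id;
```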
Query Optimization Techniques
Even with good table design, query optimization matters. Here are key techniques for analytical queries.
Use PREWHERE for Heavy Filters
PREWHERE filters data before reading other columns, reducing I/O. ClickHouse often applies this automatically, but you can make it explicit:
-- PREWHERE filters before reading other columns
SELECT user_id, properties
FROM events
PREWHERE event_type = 'Purchase Completed'
WHERE event_time >= '2025-01-01';
Use FINAL for ReplacingMergeTree
When querying ReplacingMergeTree tables, use FINAL to ensure deduplicated results:
-- Without FINAL: may return duplicate rows
SELECT * FROM user_profiles WHERE user_id = 'user_123';
-- With FINAL: guaranteed deduplication (slower)
SELECT * FROM user_profiles FINAL WHERE user_id = 'user_123';
Sampling for Approximate Results
Note that SAMPLE only works on tables that declare a SAMPLE BY clause (e.g., SAMPLE BY cityHash64(user_id), with that expression included in the primary key):
-- Fast approximate counts on large datasets
SELECT event_type, count() * 10 AS estimated_count
FROM events SAMPLE 0.1
GROUP BY event_type;
Approximate vs Exact Functions
-- Exact (slower, more memory)
SELECT uniqExact(user_id) FROM events;
-- Approximate (10-100x faster, ~2% error margin)
SELECT uniq(user_id) FROM events;
Avoiding Common Performance Issues
- SELECT *: Always specify only needed columns
- DISTINCT on large sets: Use uniq() or uniqExact() instead
- ORDER BY without LIMIT: Always pair sorting with limits
- Functions on indexed columns: WHERE toDate(event_time) = '2025-01-01' prevents index use; prefer WHERE event_time >= '2025-01-01' AND event_time < '2025-01-02'
- Nullable columns in GROUP BY: Avoid when possible; it impacts aggregation performance
Efficient Joins
-- Use dictionaries for dimension lookups (fastest)
SELECT
e.event_type,
dictGet('users_dict', 'email', e.user_id) AS user_email
FROM events e
WHERE e.event_time >= today() - 7;
-- For large joins, put the smaller table on the right
SELECT e.*, u.plan
FROM events e
INNER JOIN users u ON e.user_id = u.user_id;
-- Consider denormalization for read-heavy workloads
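The users_dict dictionary referenced above has to be created first. One possible definition, assuming a users table as the source (since the key is a String, a complex-key layout is required):

```sql
CREATE DICTIONARY users_dict (
    user_id String,
    email String
)
PRIMARY KEY user_id
SOURCE(CLICKHOUSE(TABLE 'users'))
LAYOUT(COMPLEX_KEY_HASHED())
LIFETIME(MIN 300 MAX 600);
```

The LIFETIME range controls how often ClickHouse refreshes the dictionary from its source table.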
Projections as an Alternative
Projections store pre-computed query results within the same table, automatically selected by the query optimizer:
CREATE TABLE events (
event_time DateTime,
user_id String,
event_type LowCardinality(String),
page_path String,
-- Define a projection for queries grouped by user
PROJECTION events_by_user (
SELECT * ORDER BY user_id, event_time
)
) ENGINE = MergeTree()
ORDER BY (event_type, event_time);
Projections vs. Materialized Views:
- Projections are stored within the same table and automatically maintained
- The optimizer automatically chooses projections when beneficial
- Materialized views offer more flexibility and can transform data
- Projections add storage overhead proportional to the data they store
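Projections can also be added to an existing table; like skipping indexes, they cover only new parts until materialized:

```sql
-- Add a projection sorted by user for per-user queries
ALTER TABLE events ADD PROJECTION events_by_user (
    SELECT * ORDER BY user_id, event_time
);

-- Build the projection for existing parts (runs as a mutation)
ALTER TABLE events MATERIALIZE PROJECTION events_by_user;
```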
Schema Evolution Strategies
Event schemas evolve over time. Plan for changes from the start.
Adding Columns
-- Add new column with default value
ALTER TABLE events
ADD COLUMN campaign_id String DEFAULT '';
-- Backfill if needed (mutation - use sparingly)
ALTER TABLE events
UPDATE campaign_id = extractURLParameter(page_url, 'utm_campaign')
WHERE campaign_id = '';
Flexible Properties Column
Store variable properties in a JSON column for flexibility:
CREATE TABLE events (
event_id UUID,
event_time DateTime,
event_type LowCardinality(String),
-- Structured fields for common queries
user_id String,
page_path String,
-- Flexible JSON for variable properties
properties String
) ENGINE = MergeTree()
ORDER BY (event_type, event_time);
-- Query JSON properties
SELECT JSONExtractString(properties, 'browser') AS browser
FROM events
WHERE event_type = 'Page Viewed';
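If a JSON property becomes a frequent filter, it can be promoted to a real column without changing the ingestion pipeline. A sketch, assuming the properties format above:

```sql
-- New column is computed from properties at insert time
ALTER TABLE events
ADD COLUMN browser LowCardinality(String)
MATERIALIZED JSONExtractString(properties, 'browser');

-- Backfill existing parts (mutation - use sparingly)
ALTER TABLE events MATERIALIZE COLUMN browser;
```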
Monitoring and Maintenance
Keep your ClickHouse tables healthy with regular monitoring.
Key Metrics to Watch
- Active parts count: SELECT count() FROM system.parts WHERE active AND table = 'events'
- Partition sizes: Monitor for skewed distributions
- Query performance: Use system.query_log for slow query analysis
- Merge health: Ensure background merges keep up with inserts
- Compression ratios: Check via system.parts_columns
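Per-column compression ratios can be computed from system.parts_columns; a sketch for the events table:

```sql
-- Compressed vs uncompressed size per column, largest columns first
SELECT
    column,
    formatReadableSize(sum(column_data_compressed_bytes)) AS compressed,
    formatReadableSize(sum(column_data_uncompressed_bytes)) AS uncompressed,
    round(sum(column_data_uncompressed_bytes) / sum(column_data_compressed_bytes), 2) AS ratio
FROM system.parts_columns
WHERE active AND table = 'events'
GROUP BY column
ORDER BY sum(column_data_compressed_bytes) DESC;
```

Columns with poor ratios are the first candidates for a specialized codec from the table above.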
Maintenance Tasks
-- Force merge of small parts (use sparingly, resource-intensive)
OPTIMIZE TABLE events FINAL;
-- Drop old partitions (fast operation)
ALTER TABLE events DROP PARTITION '202301';
-- Check table health
SELECT
table,
partition,
count() AS parts,
sum(rows) AS rows,
formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE active AND database = 'default'
GROUP BY table, partition
ORDER BY parts DESC;
Summary
Effective event data modeling in ClickHouse requires:
- Choosing the right MergeTree engine variant (MergeTree for raw data, ReplacingMergeTree for deduplication, AggregatingMergeTree for pre-aggregation)
- Designing ORDER BY with 3-5 columns, low cardinality first, matching your query patterns
- Partitioning by time for data management (dozens to hundreds of partitions, not thousands)
- Using optimal data types (LowCardinality, smallest integer types, avoiding Nullable)
- Applying appropriate compression codecs (Delta for timestamps, ZSTD as general-purpose)
- Leveraging materialized views for pre-aggregation, with explicit TO tables
- Understanding that ReplacingMergeTree requires FINAL for guaranteed deduplication
- Applying query optimization techniques (PREWHERE, approximate functions, proper column selection)
Invest time in upfront design, test with realistic data volumes, and iterate based on actual query patterns. A well-modeled ClickHouse database will handle billions of events while returning results in milliseconds.