Skip to content
Learning Path: Data Warehousing

Data Warehousing with ClickHouse: Level 3


Course

About

Level 3 Overview

Level 3 puts the full analytical power of ClickHouse front and center. You'll explore ClickHouse's rich library of over 1,000 built-in functions — spanning regular, aggregate, and table functions — and discover how to apply them against large-scale OLAP datasets to surface meaningful insights with sub-second query times. From there, you'll learn how the AggregatingMergeTree table engine and incremental materialized views work in concert to shift expensive computation from query time to insert time, making real-time analytics dashboards not just possible, but instantaneous. The level closes by connecting everything you've built to  a BI tool so your data is always fresh, accurate, and immediately actionable.

 

 

Module 8:  Analyzing Data

ClickHouse ships with a comprehensive function library covering three primary categories: regular functions that apply row-level transformations, aggregate functions that compute results across multiple rows, and table functions for creating and querying structured datasets. With well over 1,000 built-in functions — including rich datetime, string manipulation, search, and array utilities — ClickHouse gives you expressive, flexible SQL tooling purpose-built for OLAP workloads. You'll explore string functions for manipulation, search, and replacement, as well as datetime functions that support granularity as fine as microseconds, making it straightforward to build time-series analytics at any scale.

 

What You'll Learn:

  • The three main types of ClickHouse functions: regular, aggregate, and table functions
  • How to use datetime functions and string functions
  • How to apply advanced aggregate functions How to use aggregate combinators like if, any, and argMax to filter and retrieve aggregated values across different user segments
  • How to use arrayJoin to unfold array columns into individual rows for downstream analysis

 

 

Module 9: Building Dashboards 

Module 9 reveals how ClickHouse achieves blazing-fast dashboard performance through the AggregatingMergeTree table engine and incremental materialized views. Rather than scanning billions of rows at query time, the AggregatingMergeTree stores pre-computed intermediate aggregate states — grouped by the sort key — so that rows sharing the same key collapse into a single row on merge. This means your dashboards query a dramatically smaller, pre-aggregated dataset, enabling sub-second response times even at petabyte scale. Paired with incremental materialized views — which act as real-time insert triggers rather than full-table refresh scans — this pattern shifts the cost of computation from query time to insert time, keeping your data continuously fresh without sacrificing performance.

 

The module also covers system.query_log for identifying and optimizing slow queries, and walks through connecting ClickHouse Cloud to a BI tool for real-time analytics dashboards. ClickHouse integrates natively with all major BI tools — including Power BI, Tableau, and Looker — so you can surface insights through whatever visualization layer your organization already uses. You'll see exactly how the AggregateFunction and SimpleAggregateFunction data types, together with the State and Merge function combinators, are the core mechanisms that make this entire pre-aggregation pipeline work correctly and efficiently.

 

What You'll Learn:

  • How the AggregatingMergeTree table engine stores partial aggregate states and collapses rows at merge time to minimize table size
  • The difference between the AggregateFunction and SimpleAggregateFunction data types and when to use each
  • How to use the State and Merge combinators to write to and read from AggregatingMergeTree tables correctly
  • How to use system.query_log to monitor query execution time, memory usage, and identify bottlenecks
  • How to connect ClickHouse Cloud to a BI tool

 

 

Module 10: Connecting ClickHouse to an MCP Server

This module covers how to integrate the ClickHouse MCP Server with LibreChat, a chat interface for interacting with AI models. It walks you through cloning the LibreChat repository, configuring environment variables, setting up Docker with a custom docker-compose.override.yml file containing their ClickHouse Cloud credentials, and adjusting librechat.yaml settings to enable MCP server support. Once running, you can use the LibreChat interface to select the ClickHouse MCP server and your AI model of choice so you can ask questions about the data in their ClickHouse Cloud instance.

 

What you’ll learn:

  • How to integrate LibreChat with your ClickHouse Cloud instance
  • How to ask gain insights into the data in your ClickHouse Cloud instance, no SQL required

 

Pre-requisites:

  • It is recommended that you take Data Warehousing with ClickHouse: level 1 and Data Warehousing with ClickHouse: level 2 prior to this training
  • During the training, you will access your course materials in ClickHouse Academy. It is recommended that you create an account before the event HERE
  • ClickHouse Cloud account to access the labs.
  • You will also need to spin up a ClickHouse Cloud Service. If you have already used up your free trial, please contact us at least one day before the training.