TVL Managed Superset

Connect ClickHouse to Apache Superset: Real-time BI 2026

A tutorial on connecting ClickHouse to Apache Superset: driver, performance, materialized views, and async queries.

ClickHouse is a leading open source data warehouse for real-time BI and log/event analysis. Combined with Apache Superset, it can power dashboards that render in under a second over billions of rows, provided the tables are modeled for it. This guide covers the connection and its optimization in 2026.

1. Why ClickHouse + Superset?

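  • Performance: aggregations that can run up to 100x faster than Postgres on large volumes;
  • Compression: ratios around 10:1 are typical on columnar data, depending on the data;
  • Familiar SQL: a dialect close to standard SQL, so the learning curve for analysts is minimal;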
  • Open source: self-hosted or managed (ClickHouse Cloud, Altinity).

If you want a Superset instance that is ready to connect to ClickHouse, TVL Managed Superset includes the ClickHouse drivers by default.

2. Prerequisites

  • A Superset instance (see hosting guide);
  • A ClickHouse cluster (self-hosted or managed);
  • A ClickHouse user with read-only access;
  • The clickhouse-connect driver installed.

3. Install the driver

uv pip install clickhouse-connect

Add this command to a derived Dockerfile or to the bootstrap script in your Helm values.

4. Build the URI

ClickHouse + clickhouse-connect format:

clickhousedb+connect://<user>:<password>@<host>:8443/<database>?secure=true

Example:

clickhousedb+connect://superset_reader:XXX@clickhouse.example.com:8443/analytics?secure=true
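
If the password contains reserved characters such as @, : or /, it has to be percent-encoded before it goes into the URI. The following is a minimal Python sketch using only the standard library. The helper name build_clickhouse_uri and the sample password are illustrative assumptions, not part of Superset or the driver.

```python
from urllib.parse import quote_plus


def build_clickhouse_uri(user, password, host, database, port=8443, secure=True):
    """Assemble a clickhousedb+connect URI, percent-encoding the credentials."""
    uri = (
        f"clickhousedb+connect://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
    # secure=true asks the driver to use HTTPS (port 8443 in this guide).
    if secure:
        uri += "?secure=true"
    return uri


# Hypothetical credentials, for illustration only.
example_uri = build_clickhouse_uri(
    "superset_reader", "p@ss:word/1", "clickhouse.example.com", "analytics"
)
print(example_uri)
```

The printed value can be pasted as-is into the Superset connection form.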

5. Add to Superset

  1. UI → Settings → Database Connections → + Database;
  2. Type: ClickHouse Connect (Superset);
  3. Paste the URI;
  4. Test → Save.

6. Modeling for performance

ClickHouse rewards well-thought-out models:

  • ORDER BY on frequent filter columns (sparse index);
  • PARTITION BY by day, week, or month depending on volume;
  • Materialized views for common aggregations (rollups, AggregatingMergeTree);
  • LowCardinality(String) for categorical columns (segment, country).
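
The points above can be sketched as a small schema. Everything named here (the analytics.events table, its columns, the events_daily rollup) is a hypothetical example, not a schema from this guide. The DDL is held in Python strings, and apply_schema assumes a clickhouse-connect client object (for instance one returned by clickhouse_connect.get_client) opened with an account that is allowed to create tables, not the read-only Superset user.

```python
# Raw events: ORDER BY starts with the columns dashboards filter on most,
# PARTITION BY is monthly, categorical columns use LowCardinality(String).
EVENTS_TABLE = """
CREATE TABLE IF NOT EXISTS analytics.events
(
    event_time DateTime,
    tenant_id  UInt32,
    country    LowCardinality(String),
    segment    LowCardinality(String),
    event_name LowCardinality(String),
    user_id    UInt64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_time)
ORDER BY (tenant_id, event_name, event_time)
"""

# Daily rollup: partial aggregate states merged in the background.
DAILY_ROLLUP_TABLE = """
CREATE TABLE IF NOT EXISTS analytics.events_daily
(
    day       Date,
    tenant_id UInt32,
    country   LowCardinality(String),
    events    SimpleAggregateFunction(sum, UInt64),
    users     AggregateFunction(uniq, UInt64)
)
ENGINE = AggregatingMergeTree()
PARTITION BY toYYYYMM(day)
ORDER BY (tenant_id, country, day)
"""

# Materialized view: fills the rollup on every insert into the raw table.
DAILY_ROLLUP_MV = """
CREATE MATERIALIZED VIEW IF NOT EXISTS analytics.events_daily_mv
TO analytics.events_daily AS
SELECT
    toDate(event_time) AS day,
    tenant_id,
    country,
    count() AS events,
    uniqState(user_id) AS users
FROM analytics.events
GROUP BY day, tenant_id, country
"""

# Query a Superset dataset could use: it reads the rollup, not the raw events.
DASHBOARD_QUERY = """
SELECT day, country, sum(events) AS events, uniqMerge(users) AS users
FROM analytics.events_daily
WHERE day >= today() - 30
GROUP BY day, country
"""


def apply_schema(client):
    """Run the DDL in dependency order; `client` must expose command(sql)."""
    for statement in (EVENTS_TABLE, DAILY_ROLLUP_TABLE, DAILY_ROLLUP_MV):
        client.command(statement)
```

Pointing dashboards at the rollup table keeps the scanned volume roughly proportional to the number of days, tenants, and countries rather than to the number of raw events.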

7. Optimize Superset on the ClickHouse side

  • SET max_execution_time = 30 (seconds) in Superset engine parameters;
  • SET max_memory_usage = 10000000000 (about 10 GB; the setting takes a value in bytes) to limit consumption per query;
  • Async queries enabled in Superset for long queries;
  • Aggressive Superset Redis cache on executive dashboards (cf. Redis cache).
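
On the Superset side, the cache and async points could look like the following superset_config.py sketch. The Redis hostname, database numbers, key prefix, and the one-hour timeout are placeholder assumptions to adapt to your deployment.

```python
# superset_config.py (sketch): hostnames and timeouts are placeholder assumptions.
REDIS_URL = "redis://redis:6379"

# Cache chart query results so repeated dashboard loads do not hit ClickHouse.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 60 * 60,  # one hour; lower it for fresher dashboards
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": f"{REDIS_URL}/1",
}


class CeleryConfig:
    """Celery workers execute long-running SQL Lab queries in the background."""

    broker_url = f"{REDIS_URL}/0"
    result_backend = f"{REDIS_URL}/0"
    imports = ("superset.sql_lab",)


CELERY_CONFIG = CeleryConfig
```

Async execution in SQL Lab additionally needs a results backend (RESULTS_BACKEND) and running Celery workers, and chart-level async queries (the GLOBAL_ASYNC_QUERIES feature flag) require further settings; check the Superset documentation for the version you run.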

This configuration is applied by default on TVL Managed Superset, which follows community best practices.

8. Typical use cases

  • Product analytics: real-time application events;
  • Infra monitoring: aggregated logs and metrics (alternative to Loki/Elastic);
  • Marketing analytics: multi-touch attribution logs;
  • Embedded SaaS: customer-facing dashboards across thousands of tenants.

9. Security

  • Mandatory HTTPS: use port 8443 and secure=true;
  • Read-only user on ClickHouse via profiles;
  • Network policy: restrict access to Superset IP;
  • ClickHouse quotas to limit the impact of an abusive Superset user.
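
As a rough illustration of the read-only profile, IP restriction, and quota, the ClickHouse statements could look like this. The profile and quota names, the IP address, the limits, and the password placeholder are all hypothetical. The statements are kept as Python strings so they can be reviewed or passed one by one to an admin session.

```python
# Hypothetical address standing in for the Superset host (documentation range).
SUPERSET_IP = "203.0.113.10"

SECURITY_STATEMENTS = [
    # Settings profile: read-only access plus per-query limits.
    # With readonly = 1 the client cannot override these settings;
    # readonly = 2 would still block writes but allow changing settings.
    """CREATE SETTINGS PROFILE IF NOT EXISTS superset_profile
       SETTINGS readonly = 1, max_execution_time = 30,
                max_memory_usage = 10000000000""",
    # User reachable only from the Superset IP and bound to the profile.
    f"""CREATE USER IF NOT EXISTS superset_reader
       IDENTIFIED WITH sha256_password BY '<password>'
       HOST IP '{SUPERSET_IP}'
       SETTINGS PROFILE 'superset_profile'""",
    # Read access limited to the analytics database.
    "GRANT SELECT ON analytics.* TO superset_reader",
    # Quota: cap the number of queries and total execution time per hour.
    """CREATE QUOTA IF NOT EXISTS superset_quota
       FOR INTERVAL 1 hour MAX queries = 1000, execution_time = 600
       TO superset_reader""",
]

for statement in SECURITY_STATEMENTS:
    print(statement + ";\n")
```

Keeping the limits in the profile means they apply even if a Superset user removes them from the connection settings.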

10. Common pitfalls

  • Legacy driver (clickhouse-driver) instead of clickhouse-connect: the latter is the official driver and is usually faster;
  • No PARTITION BY: time-range queries cannot prune partitions and may end up scanning the entire table;
  • Heavy JOINs: ClickHouse handles JOINs between multi-million-row tables poorly, so prefer denormalizing;
  • SELECT * on a columnar table: every column is read, which can mean 10x more bytes than the query needs.

11. Conclusion

ClickHouse + Apache Superset is probably the most performant open source combination in 2026 for real-time BI. The learning curve is short and the performance is impressive. For a data team handling more than 100 million rows, it is a technical investment that pays off well.

Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).

For more: connect Snowflake, connect BigQuery, connect DuckDB.