Connect DuckDB to Apache Superset: local analytics 2026

Tutorial to connect DuckDB to Apache Superset: embedded SQL, Parquet files, MotherDuck, performance.

9 May 2026 · Tuvalu Tech OÜ

DuckDB is the outsider of the 2026 data stack: an embedded columnar SQL engine that is very fast on local files (CSV, Parquet, JSON) and is now also available in the cloud through MotherDuck. Connecting Apache Superset to DuckDB turns your laptop or your Superset instance into a personal data warehouse.

1. Why DuckDB?

Performance: columnar, vectorized execution, often comparable to ClickHouse on datasets of a few gigabytes;
Zero-config: a single .duckdb file, no server to operate;
Native reading of Parquet, CSV, JSON, Arrow without prior import;
Standard SQL with analytical extensions (window functions, CTEs);
MotherDuck: hosted DuckDB in the cloud, with collaboration features.

If you want Superset ready to connect to DuckDB, TVL Managed Superset includes the DuckDB driver by default.

2. Prerequisites

An accessible Superset instance;
A local DuckDB file or MotherDuck account;
The duckdb-engine driver installed.

3. Install the driver

uv pip install duckdb-engine
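
The driver has to be installed in the same Python environment that runs Superset, otherwise the DuckDB dialect will not be found. A minimal sketch of a check, using only the standard library: `missing_modules` is a hypothetical helper name, and the module names `duckdb` and `duckdb_engine` are assumed to be what the duckdb-engine package and its dependency install.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names: duckdb-engine provides `duckdb_engine`
    # and depends on the core `duckdb` package.
    missing = missing_modules(["duckdb", "duckdb_engine"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("DuckDB driver found")
```

Run it with the interpreter used by Superset (for example inside the Superset container) rather than with your system Python.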

4. Local DuckDB URI

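For a .duckdb file mounted into the Superset container, use an absolute path. The URI then has four slashes: three from the duckdb:/// prefix and one from the leading slash of the path: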

duckdb:////data/analytics.duckdb

In memory (volatile, useful for testing):

duckdb:///:memory:
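
The number of slashes is easy to get wrong by hand. The following sketch builds both URI forms from a path; `duckdb_uri` is a hypothetical helper, and rejecting relative paths is a choice made here for container setups, not a DuckDB requirement.

```python
from pathlib import PurePosixPath


def duckdb_uri(path: str) -> str:
    """Build a SQLAlchemy URI for a DuckDB file or for an in-memory database."""
    if path == ":memory:":
        return "duckdb:///:memory:"
    db_path = PurePosixPath(path)
    if not db_path.is_absolute():
        raise ValueError(f"expected an absolute path, got {path!r}")
    # "duckdb:///" plus the leading "/" of the absolute path gives four slashes.
    return f"duckdb:///{db_path}"


print(duckdb_uri("/data/analytics.duckdb"))  # duckdb:////data/analytics.duckdb
print(duckdb_uri(":memory:"))                # duckdb:///:memory:
```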

5. MotherDuck URI (cloud DuckDB)

duckdb:///md:<db_name>?motherduck_token=<token>

MotherDuck brings the benefits of DuckDB to a collaborative cloud setting, with S3 storage under the hood.
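
Because the token ends up inside a URI, it is safer to read it from the environment and percent-encode it than to paste it by hand. A minimal sketch, assuming a hypothetical helper `motherduck_uri` and an environment variable named `MOTHERDUCK_TOKEN` (both are illustrative choices, not part of the original setup):

```python
import os
from typing import Optional
from urllib.parse import quote


def motherduck_uri(db_name: str, token: Optional[str] = None) -> str:
    """Build the MotherDuck URI, reading the token from the environment if not given."""
    # MOTHERDUCK_TOKEN is an assumed variable name for this sketch.
    token = token or os.environ.get("MOTHERDUCK_TOKEN")
    if not token:
        raise ValueError("no MotherDuck token provided")
    # Percent-encode the token so characters such as "+" or "/" survive URL parsing.
    return f"duckdb:///md:{db_name}?motherduck_token={quote(token, safe='')}"


print(motherduck_uri("my_db", token="example+token/123"))
# duckdb:///md:my_db?motherduck_token=example%2Btoken%2F123
```

Keep in mind that anyone who can read the saved connection string can read the token, so treat the resulting URI as a secret.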

6. Add to Superset

In the UI, open Settings → Database Connections → + Database;
Select DuckDB as the database type;
Paste the SQLAlchemy URI;
Test the connection, then save.

7. Typical use cases

Exploratory analysis of Parquet files on S3 without import;
Local dashboards for a data analyst working on their laptop;
Fast pre-aggregation of events before pushing them to a warehouse;
Analytics POC without cloud infra;
Education: teach SQL without provisioning a server.

This configuration is applied by default on TVL Managed Superset, which follows community best practices.

8. Direct Parquet/CSV reading

DuckDB can query files directly, without importing them first. Reading s3:// paths relies on DuckDB's httpfs extension:

SELECT *
FROM 's3://my-bucket/events/year=2026/month=05/*.parquet'
WHERE event_type = 'purchase';
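
When the same query is reused for several months, the partition path can be generated instead of edited by hand. The sketch below assumes the Hive-style layout shown in the query (year=YYYY/month=MM); `parquet_glob` and `purchases_query` are hypothetical helper names.

```python
def parquet_glob(bucket: str, prefix: str, year: int, month: int) -> str:
    """Return the S3 glob for one month of Hive-style partitioned Parquet files."""
    if not 1 <= month <= 12:
        raise ValueError(f"month must be between 1 and 12, got {month}")
    # Zero-pad the month so it matches folder names such as month=05.
    return f"s3://{bucket}/{prefix}/year={year}/month={month:02d}/*.parquet"


def purchases_query(path: str) -> str:
    """Render the example query for a given glob path."""
    return f"SELECT *\nFROM '{path}'\nWHERE event_type = 'purchase';"


print(purchases_query(parquet_glob("my-bucket", "events", 2026, 5)))
```

The printed SQL can be pasted into SQL Lab or used as the definition of a virtual dataset.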

Configure S3 credentials:

SET s3_access_key_id='AKIA...';
SET s3_secret_access_key='...';
SET s3_region='eu-west-1';
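
Hard-coding keys in saved queries is risky. One option is to render these statements from environment variables when the session is prepared. A minimal sketch: `s3_settings` is a hypothetical helper, and the AWS-style variable names are assumptions about how your credentials are exposed.

```python
import os


def s3_settings(env=None):
    """Render DuckDB SET statements for S3 from AWS-style environment variables."""
    env = os.environ if env is None else env
    # Mapping from DuckDB setting name to the (assumed) environment variable.
    mapping = {
        "s3_access_key_id": "AWS_ACCESS_KEY_ID",
        "s3_secret_access_key": "AWS_SECRET_ACCESS_KEY",
        "s3_region": "AWS_REGION",
    }
    statements = []
    for setting, variable in mapping.items():
        value = env[variable]  # raises KeyError if the variable is missing
        escaped = value.replace("'", "''")  # double single quotes for SQL
        statements.append(f"SET {setting}='{escaped}';")
    return statements


# Placeholder values for illustration only.
example_env = {
    "AWS_ACCESS_KEY_ID": "AKIAEXAMPLE",
    "AWS_SECRET_ACCESS_KEY": "not-a-real-secret",
    "AWS_REGION": "eu-west-1",
}
print("\n".join(s3_settings(example_env)))
```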

9. DuckDB limits

Single writer per file (lock file): several processes cannot write to the same file concurrently;
No native replication: for HA, use MotherDuck;
Not suited to write-heavy workloads: DuckDB excels at reads and aggregations;
Memory limits: DuckDB can spill to disk, but it remains bounded by the resources of a single machine.
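
Two of these limits can be softened through connection options: opening the file read-only avoids taking the write lock, and DuckDB's memory_limit and temp_directory settings control memory use and where spilled data goes. The sketch below produces JSON that could be pasted into the engine parameters field of the Superset connection. It assumes that duckdb-engine forwards `connect_args` (including `read_only` and `config`) to DuckDB, and all values are illustrative; `engine_params` is a hypothetical helper.

```python
import json


def engine_params(read_only=True, memory_limit="4GB", temp_directory="/tmp/duckdb_spill"):
    """Build the JSON for Superset's engine parameters field (values are illustrative)."""
    params = {
        "connect_args": {
            # Open the file read-only so Superset does not take the write lock.
            "read_only": read_only,
            "config": {
                # DuckDB settings: cap memory and choose where spilled data goes.
                "memory_limit": memory_limit,
                "temp_directory": temp_directory,
            },
        }
    }
    return json.dumps(params, indent=2)


print(engine_params())
```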

10. Common pitfalls

File not accessible to the Superset pod: mount a shared volume or use MotherDuck;
Lock file: another process has the file open in write mode; close it;
Wrong package (duckdb without -engine): the duckdb package alone does not provide a SQLAlchemy dialect, so install duckdb-engine;
Slow dashboards on large tables: DuckDB is very fast but not instantaneous on 10+ million rows, so an aggressive Superset cache is recommended.
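
For the caching point, here is a sketch of a chart-data cache in superset_config.py. It assumes a Redis instance is available; the Redis address, key prefix, and one-hour timeout are illustrative values to adapt to how fresh your dashboards need to be.

```python
# superset_config.py (fragment)
# Cache for chart query results; all values below are illustrative.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",                 # Flask-Caching backend
    "CACHE_DEFAULT_TIMEOUT": 60 * 60,           # keep results for one hour
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_REDIS_URL": "redis://redis:6379/1",  # hypothetical Redis address
}
```

A longer timeout reduces load on the DuckDB file but delays the appearance of new data in dashboards.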

11. Conclusion

DuckDB + Apache Superset is a distinctive combination for exploratory and local BI. For multi-user production loads, consider MotherDuck or a classic cloud data warehouse. For data analysts who want a lightweight, fast setup, though, it is hard to beat.

Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).
