In 2026, Databricks has become the reference lakehouse platform, particularly among data-mature organizations. Connecting Apache Superset to a Databricks SQL Warehouse is now straightforward with the official driver. This guide walks through the procedure and the main optimizations.
1. Prerequisites
- An accessible Superset instance;
- A Databricks workspace with an active SQL Warehouse;
- A Databricks personal access token (or service principal);
- The databricks-sql-connector driver installed.
If you want to get started quickly, TVL Managed Superset includes Databricks drivers by default.
2. Retrieve the hostname and HTTP path
In the Databricks UI:
- SQL → SQL Warehouses → select your warehouse;
- Connection Details tab;
- Copy Server hostname, HTTP path, Port (443).
3. Build the URI
databricks+connector://token:<personal_access_token>@<hostname>:443?http_path=<http_path>&catalog=<catalog>&schema=<schema>
Example:
databricks+connector://token:dapi***@adb-1234567890.10.azuredatabricks.net:443?http_path=/sql/1.0/warehouses/abc123&catalog=main&schema=analytics
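Assembling the URI by hand is error-prone, so here is a minimal Python sketch that builds it in the format shown above. The function name `build_databricks_uri` and the placeholder token are illustrative assumptions, not part of Superset or the Databricks driver. The token is percent-encoded in case it contains URL-reserved characters.

```python
from urllib.parse import quote, urlencode


def build_databricks_uri(token, hostname, http_path, catalog, schema, port=443):
    """Assemble the connection URI in the format used in this guide."""
    # Percent-encode the token so reserved characters cannot break the URI.
    user_info = f"token:{quote(token, safe='')}"
    # Keep "/" readable in http_path; other reserved characters are encoded.
    query = urlencode(
        {"http_path": http_path, "catalog": catalog, "schema": schema},
        safe="/",
    )
    return f"databricks+connector://{user_info}@{hostname}:{port}?{query}"


uri = build_databricks_uri(
    token="dapiEXAMPLE",  # placeholder, not a real token
    hostname="adb-1234567890.10.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abc123",
    catalog="main",
    schema="analytics",
)
print(uri)
```

In practice the token would more likely come from an environment variable or a secret store than from a literal in the source.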
4. Add to Superset
- UI → Settings → Database Connections → + Database;
- Type: Databricks;
- Paste the URI;
- Test → Save.
5. Unity Catalog
With Unity Catalog (Databricks' centralized governance layer), specify the catalog in the URI or choose it in Superset during dataset creation. This lets you isolate schemas per environment (dev / prod).
This configuration is applied by default on TVL Managed Superset, which follows community best practices.
6. Optimization
- Serverless SQL Warehouse: startup in seconds vs minutes;
- Auto-stop enabled (5-10 min) to save costs;
- Photon engine enabled, for queries up to 2-5x faster depending on the workload;
- Aggressive Superset cache (24h) on stable dashboards;
- Databricks materialized views for frequent aggregations.
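The 24-hour cache from the list above could be expressed in `superset_config.py` roughly as follows. This is a sketch that assumes a Redis instance reachable at `localhost:6379`; the Redis URL and the key prefix are placeholder values to adapt to your deployment.

```python
# superset_config.py (fragment)
from datetime import timedelta

# 24 hours expressed in seconds, the unit expected by the cache timeout.
ONE_DAY_SECONDS = int(timedelta(hours=24).total_seconds())

# Cache for chart query results.
DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": ONE_DAY_SECONDS,
    "CACHE_KEY_PREFIX": "superset_data_",  # assumed prefix
    "CACHE_REDIS_URL": "redis://localhost:6379/1",  # assumed Redis location
}
```

A timeout this long only suits dashboards whose underlying data changes rarely. A shorter timeout can be set on individual datasets or charts that refresh more often.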
7. Security
- Service principal instead of PAT for production;
- Databricks IP allowlist;
- Unity Catalog grants at table/column level;
- Audit logs enabled.
8. Common pitfalls
- Expired token: PATs have a limited lifetime and must be rotated;
- Incorrect HTTP path: a very common mistake, so check that it was copied exactly;
- Paused warehouse: the first query can take 30-60 s while the warehouse starts;
- Subquery latency: on very large volumes, Databricks performs better with simple queries than with deeply nested subqueries.
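The first two pitfalls can be caught before the URI ever reaches Superset. The sketch below is a hypothetical pre-flight check: the helper names are illustrative, and the path pattern is an assumption derived from the example in this guide (`/sql/1.0/warehouses/<id>`), so other compute types may use a different path shape and would be flagged.

```python
import re
from datetime import date

# Assumed shape of a SQL Warehouse HTTP path, based on this guide's example.
WAREHOUSE_PATH = re.compile(r"/sql/1\.0/warehouses/[0-9a-zA-Z]+")


def check_http_path(http_path):
    """Return a list of problems found in the HTTP path (empty if none)."""
    problems = []
    # Stray spaces or newlines are a typical copy-paste artifact.
    if http_path != http_path.strip():
        problems.append("leading or trailing whitespace")
    if not WAREHOUSE_PATH.fullmatch(http_path.strip()):
        problems.append("does not look like /sql/1.0/warehouses/<id>")
    return problems


def days_until_expiry(expires_on, today):
    """Days left before a token expires (negative if already expired)."""
    return (expires_on - today).days


print(check_http_path(" /sql/1.0/warehouses/abc123 "))
print(days_until_expiry(date(2026, 3, 10), date(2026, 3, 1)))
```

The expiry date has to be recorded when the token is created, since this sketch does not query Databricks for it.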
9. Conclusion
Databricks + Apache Superset offers a powerful combination for organizations that have already invested in the lakehouse. The Serverless SQL Warehouse makes the experience close to a classic cloud data warehouse, without the overhead.
Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).
For more: connect Snowflake, connect BigQuery, connect ClickHouse.