PostgreSQL is the most-used database behind Apache Superset. Whether it's your application directly, a read replica, or a Postgres data warehouse, the connection is established in 5 minutes. This tutorial details the exact procedure, common pitfalls, and production best practices in 2026.
1. Prerequisites
- An accessible Superset instance (Docker, K8s, or managed — see hosting guide);
- A reachable PostgreSQL 11+ from the instance;
- A read-only database account;
- The
psycopg2-binarydriver installed (built into official images).
If you want a Superset instance ready to connect without prior configuration, TVL Managed Superset includes psycopg2 by default.
2. Create a read-only Postgres account
CREATE ROLE superset_reader WITH LOGIN PASSWORD 'XXX';
GRANT CONNECT ON DATABASE prod TO superset_reader;
GRANT USAGE ON SCHEMA public TO superset_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO superset_reader;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
GRANT SELECT ON TABLES TO superset_reader;
A read-only account protects your database: a SQL Lab user cannot DROP TABLE or DELETE.
3. Build the SQLAlchemy URI
Standard format:
postgresql+psycopg2://<user>:<password>@<host>:<port>/<database>
Example:
postgresql+psycopg2://superset_reader:XXX@db.example.com:5432/prod
With forced SSL (production):
postgresql+psycopg2://superset_reader:XXX@db.example.com:5432/prod?sslmode=require
4. Add the connection in Superset
- Superset UI → Settings → Database Connections → + Database;
- Choose PostgreSQL from the list;
- Paste the URI in SQLAlchemy URI;
- Click Test connection;
- Save.
5. Best configuration practices
In the Advanced tab of the connection:
- Allow CREATE TABLE AS:
disabledin read-only; - Allow DML:
disabled; - Async queries:
enabledif Celery is in place; - SQL Lab cost estimate:
enabled, requiresEXPLAINgrants; - Engine parameters:
{"connect_args": {"connect_timeout": 10}}.
This configuration is applied by default on TVL Managed Superset, which follows community best practices.
6. Create a first dataset
- UI → Datasets → + Dataset;
- Database = the connection created;
- Schema =
public(or yours); - Table = select a table (e.g.,
customers); - Save.
7. Create a first chart
- UI → Charts → + Chart;
- Select the
customersdataset; - Choose a visualization type (Big Number, Bar Chart…);
- Configure metrics (e.g.,
COUNT(*)) and dimensions; - Run query, save.
8. Common pitfalls
- "could not connect to server": firewall or IP allowlist on Postgres side. Check
pg_hba.confand cloud security group; - "no module named psycopg2": driver missing from Superset venv (image 5.0 case). Install via
uv pip install psycopg2-binary; - SSL handshake error: add
?sslmode=requireor?sslmode=verify-fullwithsslrootcert; - Insufficient permissions: forgetting
ALTER DEFAULT PRIVILEGES→ new tables not accessible; - Timezone: if your data is in UTC but dashboards display in local time, configure
SQLALCHEMY_DATABASE_URI?timezone=UTC.
9. Performance and optimization
- Use a read-only replica not to load the application primary;
- Create indexes on columns filtered by dashboards;
- Enable Superset connection pooling:
SQLALCHEMY_ENGINE_OPTIONS = {"pool_size": 10, "pool_recycle": 3600}; - Pre-aggregate fact tables via dbt into physical tables.
10. Conclusion
Connecting PostgreSQL to Apache Superset literally takes 5 minutes once credentials are ready. The real value lies in the data model quality downstream (dbt, views, indexes) and account security (read-only, SSL).
Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).
For more: create your first dashboard, Postgres tuning, SQL Lab.