Centralizing Apache Superset logs is essential in production: it enables fast debugging, auditing, and proactive monitoring. This guide compares the main options (Loki, ELK, Datadog, OpenObserve, Splunk) and details a Loki-based configuration in 2026.
1. Which logs to collect?
- Superset application: Flask, Gunicorn, SQL Lab queries;
- Celery workers: tasks, errors;
- PostgreSQL / Redis: connections, slow queries;
- NGINX ingress: access logs, 4xx/5xx responses;
- FAB (Flask-AppBuilder) audit: authentication, modifications.
If you want centralized logs without setup, TVL Managed Superset integrates Loki by default on Pro+ instances.
2. Solution comparison
| Solution | Type | Cost | Strength |
|---|---|---|---|
| Loki + Grafana | OSS | Low | Lightweight, integrated Grafana |
| ELK / OpenSearch | OSS | Medium | Powerful, full-text |
| Datadog Logs | SaaS | High | Turnkey, ML |
| OpenObserve | Recent OSS | Low | Logs + metrics + traces |
| Splunk | Enterprise | Very high | Banking/defense standard |
3. Loki + Promtail setup (recommended)
On Kubernetes:
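# Add the Grafana chart repository and refresh the local index
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# Install Loki, Promtail, and Grafana in a dedicated namespace
helm install loki grafana/loki-stack \
  --namespace observability --create-namespace \
  --set grafana.enabled=true \
  --set promtail.enabled=true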
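Promtail runs as a DaemonSet, tails the logs of every pod on each node (Superset pods included), and pushes them to Loki. Note that Grafana has deprecated Promtail in favor of Grafana Alloy, so check the chart's current status before adopting it for a new deployment.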
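Once the release is up, a quick smoke test is to push one log line to Loki by hand. The sketch below builds the JSON body for Loki's `/loki/api/v1/push` endpoint using only the standard library. The base URL, the `job` label, and the helper names are assumptions for illustration; the service name and port depend on your release.

```python
import json
import time
import urllib.request


def build_push_payload(line, labels, ts_ns=None):
    """Build the JSON body for Loki's push endpoint (one stream, one line)."""
    if ts_ns is None:
        ts_ns = time.time_ns()
    # Loki expects the timestamp as a string of nanoseconds since the epoch.
    return {"streams": [{"stream": labels, "values": [[str(ts_ns), line]]}]}


def push_line(base_url, line, labels):
    """POST one log line to Loki and return the HTTP status code."""
    body = json.dumps(build_push_payload(line, labels)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/loki/api/v1/push",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Hypothetical usage, assuming Loki is reachable on localhost:3100
# (for example through a kubectl port-forward):
# push_line("http://localhost:3100", "smoke test", {"job": "smoke-test"})
```

If the call returns a 2xx status, the line should then be visible in Grafana with the selector `{job="smoke-test"}`.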
4. Superset configuration for structured logs
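Configure Superset to emit structured JSON from superset_config.py (this requires the json_log_formatter package):
import logging
import json_log_formatter

# Send every log record to stderr as a single JSON object
formatter = json_log_formatter.JSONFormatter()
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logging.getLogger().addHandler(handler)

# The JSON formatter replaces the text LOG_FORMAT string, so it is not set here
SUPERSET_LOG_LEVEL = "INFO"
LOG_LEVEL = "INFO"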
Advantage: easy parsing by Loki/ELK, structured queries.
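If you would rather avoid an extra dependency, the same idea can be sketched with the standard library alone. The formatter below is a minimal illustration, not Superset's own implementation; the class name and the optional `response_ms` and `endpoint` fields are assumptions chosen to match the queries in the next section.

```python
import json
import logging


class StdlibJsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    # Hypothetical extra fields, copied into the output when present.
    EXTRA_FIELDS = ("response_ms", "endpoint")

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for key in self.EXTRA_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        if record.exc_info:
            payload["exc_info"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_json_handler():
    """Return a stderr handler that emits JSON lines."""
    handler = logging.StreamHandler()
    handler.setFormatter(StdlibJsonFormatter())
    return handler
```

A call such as `logger.info("request done", extra={"response_ms": 120, "endpoint": "/api/v1/chart"})` would then produce a line that Loki's `| json` stage can parse into fields.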
5. Useful queries in Loki/Grafana
# All Superset logs
{namespace="superset"}
# Errors only
{namespace="superset"} |= "ERROR"
# Requests slower than 5 s (response_ms > 5000)
{namespace="superset", app="superset-web"}
| json
| __error__ = ""
| response_ms > 5000
This configuration is applied by default on TVL Managed Superset, which follows community best practices.
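The last query above filters on requests slower than 5000 ms; an actual 95th percentile needs a LogQL metric query with `unwrap` and `quantile_over_time`. The sketch below builds such a query and the matching URL for Loki's `/loki/api/v1/query_range` endpoint. It assumes the JSON logs carry a numeric `response_ms` field and an `endpoint` field; the latter name and the helper function are hypothetical.

```python
from urllib.parse import urlencode

# p95 of response_ms over 5-minute windows, grouped by the (assumed) endpoint field.
P95_QUERY = (
    "quantile_over_time(0.95, "
    '{namespace="superset", app="superset-web"} '
    '| json | __error__="" | unwrap response_ms [5m]) by (endpoint)'
)


def query_range_url(base_url, query, start_ns, end_ns, step="60s"):
    """Build a query_range URL; start and end are nanosecond Unix timestamps."""
    params = {"query": query, "start": start_ns, "end": end_ns, "step": step}
    return f"{base_url}/loki/api/v1/query_range?{urlencode(params)}"
```

The same query string can be pasted directly into Grafana's Explore view against the Loki data source.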
6. Retention
| Log type | Retention |
|---|---|
| Application | 30 days (debug) |
| Audit trail | 12 months (common GDPR-driven policy) or 7 years (common SOC 2-driven policy); neither framework mandates a fixed period |
| Postgres slow queries | 90 days |
| Nginx access logs | 30 days |
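Loki expresses retention as durations such as `720h`, so the table has to be converted before it can go into the configuration. The sketch below does the arithmetic and shapes the result like Loki's per-stream retention overrides (`selector`, `priority`, `period`). The `log_type` label and its values are assumptions: they only work if your streams actually carry such a label, and the audit value takes the 12-month option as 365 days.

```python
# Retention per log type, in days, taken from the table above.
RETENTION_DAYS = {
    "application": 30,
    "audit": 365,
    "postgres_slow_queries": 90,
    "nginx_access": 30,
}


def to_loki_period(days):
    """Express a retention in days as an hour-based duration string."""
    return f"{days * 24}h"


def retention_streams(label="log_type"):
    """Build per-stream retention entries keyed on a hypothetical stream label."""
    return [
        {
            "selector": f'{{{label}="{name}"}}',
            "priority": 1,
            "period": to_loki_period(days),
        }
        for name, days in RETENTION_DAYS.items()
    ]
```

The resulting list can be serialized to YAML or JSON and reviewed against the Loki retention documentation before being applied.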
7. Alerting on logs
Configure Grafana or Loki alerts on:
- 5xx error rate > 5% over 5 min;
- Login failures > 10/h for a single user;
- Slow query > 30s;
- OOM kill;
- Disk usage > 85%.
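The first two thresholds above boil down to simple arithmetic over a time window. The following sketch shows that logic in plain Python, assuming the status codes and failed-login usernames for the window have already been extracted from the logs; the function names are illustrative, and in practice the same conditions would be written as Grafana or Loki alert rules.

```python
from collections import Counter


def error_rate_5xx(status_codes):
    """Fraction of responses in the window with a 5xx status."""
    if not status_codes:
        return 0.0
    errors = sum(1 for code in status_codes if 500 <= code <= 599)
    return errors / len(status_codes)


def should_alert_on_5xx(status_codes, threshold=0.05):
    """True when the 5xx rate over the window exceeds the threshold."""
    return error_rate_5xx(status_codes) > threshold


def users_over_login_failure_limit(failed_usernames, limit=10):
    """Users with more than `limit` failed logins in the window."""
    counts = Counter(failed_usernames)
    return sorted(user for user, n in counts.items() if n > limit)
```

Guarding the empty window matters: without it, a quiet period would raise a division error instead of reporting a zero error rate.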
8. Estimated cost
| Solution | Typical monthly cost |
|---|---|
| Loki self-hosted (10 GB logs) | ~€30 (S3 storage) |
| ELK self-hosted | ~€100 (3-node cluster) |
| Datadog Logs | ~€50-200 (depends on ingest) |
| OpenObserve cloud | ~€30-100 |
9. Common pitfalls
- Plain text logs: difficult to query, prefer JSON;
- PII in clear text in logs: GDPR risk;
- No local rotation: full disk;
- No alerting: incident discovered the next day;
- Loki on local disk instead of S3: not durable.
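The PII pitfall can be reduced at the source with a logging filter that masks sensitive values before they leave the process. The sketch below redacts e-mail addresses only; the regular expression is deliberately simple, and the class name and placeholder text are assumptions. Other identifiers (names, IP addresses, tokens) would need their own rules.

```python
import logging
import re

# Simplified e-mail pattern, good enough for masking but not for validation.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class RedactEmailFilter(logging.Filter):
    """Replace e-mail addresses in the final log message with a placeholder."""

    def filter(self, record):
        # Interpolate first so addresses passed as arguments are masked too.
        message = record.getMessage()
        record.msg = EMAIL_RE.sub("[redacted-email]", message)
        record.args = ()
        return True  # keep the record, only its content changes
```

Attaching it with `handler.addFilter(RedactEmailFilter())` applies the masking to every record that goes through that handler, whichever logger emitted it.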
10. Conclusion
Centralizing Apache Superset logs is a critical step before going to production. Loki is an excellent choice for cost-conscious organizations with a Kubernetes stack, Datadog suits teams that want a turnkey service, and ELK covers advanced full-text needs. Whichever you choose, have centralized logging in place before go-live.