Without monitoring, you discover incidents through users. With Prometheus + Grafana, you see degradations before they become visible. This guide explains how to instrument Apache Superset, expose metrics, build dashboards, and configure alerts in 2026.
1. Why Prometheus + Grafana?
Prometheus and Grafana are the reference open source pair for observability: Prometheus collects and stores time-series metrics, while Grafana visualizes them and triggers alerts. The stack is free, performant, and fits naturally alongside the rest of a modern Kubernetes deployment.
If you want this monitoring without the setup, TVL Managed Superset integrates Prometheus + Grafana by default on Pro+ instances.
2. Metrics to expose
Three layers to instrument:
- Superset application: request latency, error rate, Celery queue, dashboards rendered;
- Infrastructure: CPU, memory, disk, network;
- Databases: Postgres and Redis exporters.
3. Enable Superset metrics
In superset_config.py:
```python
from werkzeug.middleware.dispatcher import DispatcherMiddleware
from prometheus_client import make_wsgi_app
from superset.stats_logger import StatsdStatsLogger

# Ship Superset's internal counters/timers to StatsD
# (pair with statsd_exporter to translate them into Prometheus metrics)
STATS_LOGGER = StatsdStatsLogger(host="localhost", port=8125, prefix="superset")

# Expose a /metrics endpoint on the Superset web app itself
def FLASK_APP_MUTATOR(app):
    app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {
        "/metrics": make_wsgi_app()
    })
```
The /metrics endpoint then returns metrics in Prometheus format.
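To sanity-check what the endpoint returns, it helps to know the text exposition format: `# HELP`/`# TYPE` comment lines followed by one `name{labels} value` sample per line. This minimal sketch parses a small illustrative payload (the metric names and values below are made-up examples; in practice, fetch the live payload and prefer `prometheus_client.parser.text_string_to_metric_families`):

```python
# Illustrative /metrics payload (sample values are made up)
sample = """\
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 8.0
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 3.1e+07
"""

def parse_metrics(text):
    """Crude parser for the Prometheus text exposition format."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE metadata and blank lines
        name, value = line.rsplit(" ", 1)
        metrics[name] = float(value)
    return metrics

print(parse_metrics(sample)["process_open_fds"])  # 8.0
```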
4. Complementary exporters
| Exporter | Metrics | Image |
|---|---|---|
| node_exporter | CPU, RAM, disk | quay.io/prometheus/node-exporter |
| postgres_exporter | Connections, latency, replication lag | quay.io/prometheuscommunity/postgres-exporter |
| redis_exporter | Hit ratio, memory, evictions | oliver006/redis_exporter |
| kube-state-metrics | State of pods, deployments, jobs | registry.k8s.io/kube-state-metrics/kube-state-metrics |
| blackbox_exporter | HTTP synthetic checks | prom/blackbox-exporter |
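As a sketch, the host and database exporters from the table can run next to Superset with a minimal Compose file. The service names, credentials, and connection strings below are assumptions to adapt to your environment:

```yaml
services:
  node-exporter:
    image: quay.io/prometheus/node-exporter
    ports: ["9100:9100"]
  postgres-exporter:
    image: quay.io/prometheuscommunity/postgres-exporter
    environment:
      DATA_SOURCE_NAME: "postgresql://monitor:secret@postgres:5432/superset?sslmode=disable"
    ports: ["9187:9187"]
  redis-exporter:
    image: oliver006/redis_exporter
    environment:
      REDIS_ADDR: "redis://redis:6379"
    ports: ["9121:9121"]
```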
5. Prometheus scrape configuration
```yaml
scrape_configs:
  - job_name: superset
    metrics_path: /metrics
    static_configs:
      - targets: ['superset:8088']
    scrape_interval: 30s
  - job_name: postgres
    static_configs:
      - targets: ['postgres-exporter:9187']
  - job_name: redis
    static_configs:
      - targets: ['redis-exporter:9121']
  - job_name: blackbox
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets:
          - https://superset.example.com/health
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox-exporter:9115
```
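The three blackbox relabeling rules can be traced step by step. This small Python sketch is a simplified model of what Prometheus does to each target's label set, not Prometheus internals:

```python
# Simplified model of the relabel_configs above: copy the target URL into
# the probe's ?target= parameter, keep it as the "instance" label, then
# redirect the actual scrape to the blackbox exporter.
def relabel(labels):
    labels = dict(labels)
    labels["__param_target"] = labels["__address__"]  # rule 1
    labels["instance"] = labels["__param_target"]     # rule 2
    labels["__address__"] = "blackbox-exporter:9115"  # rule 3
    return labels

result = relabel({"__address__": "https://superset.example.com/health"})
print(result["__address__"])  # blackbox-exporter:9115
print(result["instance"])     # https://superset.example.com/health
```

Net effect: Prometheus scrapes `blackbox-exporter:9115/probe?target=https://superset.example.com/health` while the resulting series keep the original URL as their `instance` label.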
6. Essential Grafana dashboards
Dashboard 1 — Superset overview
- Requests per second;
- HTTP 5xx error rate (alert if >5% over 5 min);
- Latency p50, p95, p99;
- Celery queue (pending tasks);
- Pods Up/Down and restarts.
Dashboard 2 — Databases
- Active Postgres connections;
- Query latency (top 10);
- Redis cache hit ratio (target >90%);
- Replication lag;
- Disk space used.
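The cache hit ratio panel is usually derived from Redis's `keyspace_hits` and `keyspace_misses` counters (exposed by redis_exporter). A sketch of the computation, with made-up sample numbers:

```python
def hit_ratio(keyspace_hits, keyspace_misses):
    """Fraction of reads served from cache; 0.0 when there is no traffic."""
    total = keyspace_hits + keyspace_misses
    return keyspace_hits / total if total else 0.0

ratio = hit_ratio(keyspace_hits=9_300, keyspace_misses=700)
print(f"{ratio:.1%}")  # 93.0% -- above the 90% target
```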
Dashboard 3 — Infrastructure
- CPU and memory per node / pod;
- Network in/out;
- Disk I/O;
- OOM kills.
This configuration is applied by default on TVL Managed Superset, which follows community best practices.
7. Critical alerts to configure
```yaml
groups:
  - name: superset
    rules:
      - alert: SupersetDown
        expr: up{job="superset"} == 0
        for: 2m
        labels: { severity: critical }
        annotations:
          summary: "Superset {{ $labels.instance }} is down"
      - alert: SupersetHighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
          / sum(rate(http_requests_total[5m])) > 0.05
        for: 5m
        labels: { severity: warning }
      - alert: SupersetHighLatency
        expr: >
          histogram_quantile(0.95,
            sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 5
        for: 10m
        labels: { severity: warning }
      - alert: PostgresConnectionsHigh
        expr: sum(pg_stat_activity_count) > 150
        for: 5m
        labels: { severity: warning }
      - alert: RedisMemoryHigh
        expr: redis_memory_used_bytes / redis_memory_max_bytes > 0.9
        for: 10m
        labels: { severity: warning }
```
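The `for:` clause means the expression must stay true continuously for the whole window before the alert fires; a short spike that recovers never pages anyone. A toy evaluation loop (a simplified model, not Prometheus internals) makes the semantics concrete:

```python
def fires(samples, threshold, for_minutes):
    """samples: one value per minute. The alert fires only once the
    condition has held for `for_minutes` consecutive evaluations."""
    streak = 0
    for value in samples:
        streak = streak + 1 if value > threshold else 0
        if streak >= for_minutes:
            return True
    return False

# Error rate spikes to 8% but recovers after 3 minutes: no alert.
print(fires([0.08, 0.08, 0.08, 0.01], threshold=0.05, for_minutes=5))  # False
# Sustained 6% for 5 minutes: SupersetHighErrorRate fires.
print(fires([0.06] * 5, threshold=0.05, for_minutes=5))  # True
```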
8. Notification
Alertmanager routes alerts to:
- Slack for teams;
- PagerDuty / Opsgenie for on-call;
- Email as backup;
- Webhook for custom integration.
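This routing can be expressed in `alertmanager.yml`; as a hedged sketch, warnings go to Slack by default and only `severity=critical` alerts page on-call. The receiver names, channel, and integration key below are placeholders to adapt:

```yaml
route:
  receiver: slack-default
  routes:
    - matchers: ['severity="critical"']
      receiver: pagerduty-oncall
receivers:
  - name: slack-default
    slack_configs:
      - channel: "#superset-alerts"
  - name: pagerduty-oncall
    pagerduty_configs:
      - routing_key: "<pagerduty-integration-key>"
```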
9. SLO and error budget
Define measurable, continuously-tracked SLOs:
| SLO | Target | Monthly error budget |
|---|---|---|
| Availability | 99.9% | 43 min |
| Latency p95 < 2s | 99% of requests | 1% of requests |
| 5xx error rate | < 0.1% | 0.1% |
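The availability budget in the table follows directly from the SLO target; a quick sketch of the arithmetic (assuming a 30-day month, which gives the ~43 minutes above):

```python
def error_budget_minutes(slo, days=30):
    """Monthly downtime budget implied by an availability SLO."""
    return (1 - slo) * days * 24 * 60

print(round(error_budget_minutes(0.999), 1))  # 43.2
print(round(error_budget_minutes(0.99), 1))   # 432.0
```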
10. Alternative tools
- Datadog: commercial but turnkey, expensive at scale;
- New Relic: powerful APM;
- Elastic Stack: log-oriented but with metrics;
- OpenObserve: open source, lighter.
11. Conclusion
A well-monitored Superset is a Superset you can stop worrying about. The initial investment of one to two engineer-days pays for itself within the first week by catching an incident that would otherwise have been detected too late. For a serious production deployment, monitoring is non-negotiable.
Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France), monitoring included.
For more: centralized logs, high availability, production checklist.