TVL Managed Superset

Apache Superset High Availability: 2026 Architecture

Apache Superset HA architecture: multi-AZ, load balancer, replicas, replicated Postgres, Redis cluster, and automatic failover.

Making Apache Superset highly available means eliminating every single point of failure: web server, metadata database, cache, async workers, and reverse proxy. A well-designed HA architecture can sustain an SLA above 99.9% and survive the loss of an availability zone. This guide details the target architecture, the critical components, and the trade-offs to accept in 2026.

1. Why aim for high availability?

For occasional internal use, a single-node Superset is largely sufficient. But once it becomes an operational steering tool — executive dashboards, real-time monitoring, customer access via embedded — every interruption costs: delayed decisions, lost trust, cascading incidents. HA isn't a luxury, it's insurance.

If you want HA infrastructure benefits without the implementation complexity, TVL Managed Superset offers a multi-zone dedicated instance in less than 10 minutes.

2. The six components to duplicate

A production Superset instance combines six building blocks. Each must be redundant to reach real HA:

  1. Reverse proxy / load balancer in front (Nginx, HAProxy, Traefik, ALB);
  2. Superset web pods/containers (Gunicorn) in at least 3 replicas;
  3. Celery workers (reports, alerts, async queries);
  4. Celery Beat (scheduler) — singleton, must have a failover strategy;
  5. Postgres metadata database replicated (primary + 1-2 replicas);
  6. Redis cache in Sentinel or Cluster mode.

3. Target architecture (multi-AZ Kubernetes)

The most common architecture in 2026 relies on managed multi-zone Kubernetes:

  • Ingress controller (ingress-nginx or Traefik) in 2+ replicas spread across nodes in different zones;
  • Superset Deployment 3 replicas with topologySpreadConstraints to spread across zones;
  • Celery worker Deployment 2-4 replicas depending on load;
  • Celery Beat StatefulSet with 1 replica + leader election (Redis-based);
  • Multi-AZ managed Postgres (RDS Multi-AZ, Cloud SQL HA, OVHcloud Postgres HA);
  • Managed Redis in cluster or Sentinel (ElastiCache, OVHcloud Valkey).
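The zone-spreading constraint mentioned above can be sketched as the Python-dict equivalent of the Kubernetes pod spec fragment one would render to YAML. The label values are examples; adapt them to your chart's labels:

```python
# Python-dict form of a topologySpreadConstraints entry for the Superset
# web Deployment. Field names are real Kubernetes spec fields; the
# "app"/"component" labels are examples.
topology_spread = {
    "topologySpreadConstraints": [
        {
            "maxSkew": 1,  # at most 1 pod of difference between zones
            "topologyKey": "topology.kubernetes.io/zone",
            "whenUnsatisfiable": "DoNotSchedule",
            "labelSelector": {
                "matchLabels": {"app": "superset", "component": "web"}
            },
        }
    ]
}
```

With maxSkew: 1 and DoNotSchedule, the scheduler refuses to place a third replica in a zone that already holds two while another zone holds none.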

4. Configure shared sessions

By default, Superset relies on Flask's client-side signed-cookie sessions, so every replica must share the same SECRET_KEY; any mismatch means a user logged into pod A loses their session when routed to pod B. The robust solution is to externalize sessions in Redis via Flask-Session:

# superset_config.py -- requires the Flask-Session and redis-py packages
import os

import redis

SESSION_TYPE = "redis"
SESSION_REDIS = redis.Redis(
    host=os.environ["REDIS_HOST"],
    port=6379,
    db=1,
)
SESSION_PERMANENT = False
SESSION_USE_SIGNER = True

The load balancer must also run in round-robin or least-connections mode, never with sticky sessions: stickiness masks real problems during failover.

5. Replicated Postgres: primary + replicas

Superset's metadata database stores dashboards, datasets, connections, and permissions. Losing it means losing the entire configuration. Recommendations:

  • Managed Postgres with automatic failover (RTO < 60s);
  • A read replica that Superset itself does not query (Superset has no read-only mode) but that is useful for logical backups;
  • WAL archiving + PITR (Point-In-Time Recovery) for second-precision restoration;
  • Encrypted backups stored in another region.

6. Redis: Sentinel or Cluster?

Redis plays three roles in Superset: query cache, Celery broker, session store. Two HA options:

  • Redis Sentinel: 3 sentinels watch a primary and its replicas, automatic failover. Sufficient in most cases.
  • Redis Cluster: sharding + replication. For very high loads (>100k ops/s).

Configure Superset to point to the Sentinel DNS, not the primary IP. The redis-py library handles new primary discovery after failover. This configuration is applied by default on TVL Managed Superset, which follows community best practices.
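As a sketch, assuming the redis-py package and a Sentinel service named mymaster (the host names and service name are examples), the session store from section 4 can be pointed at Sentinel instead of a fixed host:

```python
# superset_config.py fragment -- requires the redis-py package.
# Sentinel host names and the "mymaster" service name are examples.
from redis.sentinel import Sentinel

sentinel = Sentinel(
    [("sentinel-0.redis", 26379),
     ("sentinel-1.redis", 26379),
     ("sentinel-2.redis", 26379)],
    socket_timeout=0.5,
)
# master_for returns a client that re-discovers the primary after a failover
SESSION_REDIS = sentinel.master_for("mymaster", db=1)
```

The same pattern applies to the Celery broker and query-cache connections: always resolve through Sentinel, never through a hard-coded primary address.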

7. Celery Beat: the singleton trap

Celery Beat is the scheduler that triggers scheduled reports and alerts. It must run as a single instance, otherwise every job fires twice. Solutions:

  • StatefulSet 1 replica + auto-restart on crash (acceptable for most uses);
  • celery-redbeat with Redis-based leader election: allows multiple Beat replicas on standby; only the lock holder schedules jobs.
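A minimal configuration sketch of the second option, assuming the celery-redbeat package (the setting names are real RedBeat/Celery options; the Redis URL is an example):

```python
# Celery configuration fragment for RedBeat-based Beat failover.
# The Redis URL is an example; point it at your Sentinel-backed service.
beat_scheduler = "redbeat.RedBeatScheduler"
redbeat_redis_url = "redis://redis-superset:6379/2"
redbeat_lock_timeout = 25  # seconds; a standby Beat takes over once the lock expires
```

With the lock held in Redis, you can run two or more Beat replicas; if the active one dies, a standby acquires the lock within the timeout and scheduling resumes.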

8. Front-end load balancing

The load balancer must handle:

  • TLS termination (HTTP/2 + HTTP/3);
  • Health checks on Superset's /health;
  • WebSocket pass-through (for SQL Lab);
  • Basic WAF (rate limiting, aggressive bot blocking);
  • Centralized access logs.
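As an illustration, the kind of probe the load balancer runs against Superset's /health endpoint can be sketched with the standard library (the function name and the 2-second timeout are assumptions, not a Superset API):

```python
import urllib.request

def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if Superset's /health endpoint answers HTTP 200.

    Hypothetical helper; mirrors what an LB health check does.
    """
    try:
        url = base_url.rstrip("/") + "/health"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False
```

A pod that fails this probe should be pulled from rotation before users notice; the LB's check interval and failure threshold determine how fast that happens.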

9. Achievable SLA table

Architecture                         Target SLA   Typical monthly cost
Single-node Docker                   ~95%         €50-150
Single-AZ Kubernetes, 3 replicas     ~99%         €200-500
Multi-AZ Kubernetes + HA Postgres    99.9%        €500-1,500
Multi-region active-passive          99.95%       €1,500-4,000
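To make these SLA figures concrete, the allowed downtime per month follows directly from the percentage:

```python
def monthly_downtime_minutes(sla: float, days: int = 30) -> float:
    """Allowed downtime per month for a given SLA expressed as a fraction.

    Example: 0.999 (99.9%) over a 30-day month.
    """
    return (1 - sla) * days * 24 * 60

# 99.9% allows about 43.2 minutes of downtime per 30-day month;
# 99.95% allows about 21.6 minutes.
```

In other words, moving one row down the table roughly halves your monthly downtime budget, which is why each tier costs noticeably more.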

10. Chaos tests

An untested HA architecture is only theoretical. Test regularly:

  • Random kill of a Superset pod;
  • Manual Postgres primary failover;
  • Availability zone failure (cordon the AZ's nodes);
  • Loss of Redis Sentinel primary;
  • Full Postgres backup restoration.
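The first test is easy to script. A sketch assuming kubectl access and example pod names (the runner parameter is a hypothetical seam that lets you dry-run the command):

```python
import random
import subprocess

def kill_random_pod(pods, namespace="superset", runner=subprocess.run):
    """Chaos test: delete one randomly chosen Superset pod.

    `pods` is a list of pod names (e.g. from `kubectl get pods`);
    `runner` defaults to subprocess.run but can be swapped for a
    dry-run function in tests.
    """
    victim = random.choice(pods)
    cmd = ["kubectl", "delete", "pod", victim,
           "-n", namespace, "--wait=false"]
    return runner(cmd, check=True)
```

After each kill, verify that the load balancer stopped routing to the dead pod and that no user-visible errors appeared; that observation, not the kill itself, is the actual test.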

11. Conclusion

Apache Superset high availability is achievable but requires rigor across six distinct components. Each building block needs its own failover strategy, tests, and metrics. For most organizations, offloading this complexity to a managed service is more economical than building and maintaining a dedicated SRE team.

Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France), with multi-AZ available on the dedicated plan.

For more, also read Kubernetes deployment, load balancing, and disaster recovery.