Putting a load balancer in front of Apache Superset is the key step in moving from a single instance to a resilient, scalable architecture. This guide covers strategies, tool choices (Nginx, HAProxy, Traefik), session management, WebSockets, and health checks for 2026.
1. Why a load balancer for Superset?
On a single instance, a crash, an update, or a load spike causes immediate downtime. With a load balancer in front of 2 or 3 Superset pods, you get high availability, horizontal scaling, rolling updates without interruption, and centralized TLS termination: the backbone of any serious production deployment.
If you want this resilience without configuration complexity, TVL Managed Superset integrates a multi-zone load balancer by default on dedicated instances.
2. Is the Superset pod really stateless?
Almost. Three elements must be externalized for clean load balancing:
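- Flask sessions: stored in signed cookies by default, so every pod must share the same SECRET_KEY; if you enable server-side sessions, they must live in Redis rather than in each pod;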
- Cache: already in Redis if configured;
- Temporary uploads: ensure a shared volume (NFS, EFS) or object storage.
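# superset_config.py
import redis
SESSION_SERVER_SIDE = True  # store sessions server-side instead of in cookies
SESSION_TYPE = "redis"
SESSION_REDIS = redis.Redis(host="redis", port=6379, db=1)  # db 1 for sessions, db 0 for cache
CACHE_CONFIG = {"CACHE_TYPE": "RedisCache", "CACHE_REDIS_URL": "redis://redis:6379/0"}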
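To see why per-pod state breaks behind a balancer, the following sketch simulates two instances receiving requests in round-robin order. Everything in it is hypothetical and for illustration only: the `Instance` class, its `login` and `whoami` methods, and the plain dict standing in for Redis are not Superset code.

```python
from itertools import cycle


class Instance:
    """Toy application instance; `store` maps session id -> username."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, session_id, user):
        self.store[session_id] = user

    def whoami(self, session_id):
        # None means this instance has never seen the session.
        return self.store.get(session_id)


def lost_requests(instances, n_requests=6):
    """Log in once, then send follow-up requests round-robin.

    Returns how many follow-up requests reached an instance that does
    not know the session (the user would be sent back to the login page).
    """
    rotation = cycle(instances)
    next(rotation).login("sid-1", "alice")
    return sum(
        1 for _ in range(n_requests)
        if next(rotation).whoami("sid-1") is None
    )


# Each instance keeps its own private store.
local = [Instance("a", {}), Instance("b", {})]

# Both instances point at one shared store (a dict standing in for Redis).
shared_store = {}
shared = [Instance("a", shared_store), Instance("b", shared_store)]

print(lost_requests(local))   # 3: every request routed to "b" is lost
print(lost_requests(shared))  # 0
```

With private stores, half of the follow-up requests land on an instance that has no record of the login; with a shared store, none do.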
3. Tool choice
| Tool | Strengths | Best for |
|---|---|---|
| ingress-nginx | Kubernetes standard, rich ecosystem | K8s cluster |
| HAProxy | Performance, observability, layer 4 + 7 | Bare-metal, very high load |
| Traefik | Docker/K8s auto-discovery, dashboard | Dynamic hybrid setup |
| Caddy | Automatic HTTPS, simple config | Small deployments |
| AWS ALB / GCP LB | Cloud integration, managed | Managed public cloud |
4. Distribution algorithm
For Superset, two algorithms are relevant:
- Round-robin: equal distribution, simple, the recommended default;
- Least-connection: better balance under uneven load (long vs short queries).
Avoid sticky sessions: they mask non-shared session bugs and prevent rebalancing on failover. This configuration is applied by default on TVL Managed Superset, which follows community best practices.
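The difference between the two algorithms shows up under uneven load. The sketch below is a toy simulation, not a model of any real balancer: one request arrives per tick, and the workload (every third query runs for 30 ticks, the others for 1) is an assumption chosen to make the effect visible. The function name `peak_connections` is hypothetical.

```python
def peak_connections(durations, n_backends, policy):
    """Return the highest number of concurrent requests seen on any backend.

    One request arrives per tick; `durations[i]` is how many ticks
    request i stays open.
    """
    active = [[] for _ in range(n_backends)]  # end ticks of open requests
    peak = 0
    for tick, duration in enumerate(durations):
        # Drop requests that have finished by this tick.
        active = [[end for end in ends if end > tick] for ends in active]
        if policy == "round_robin":
            target = tick % n_backends
        elif policy == "least_conn":
            target = min(range(n_backends), key=lambda b: len(active[b]))
        else:
            raise ValueError(f"unknown policy: {policy}")
        active[target].append(tick + duration)
        peak = max(peak, len(active[target]))
    return peak


# Assumed workload: a long query followed by two short ones, ten times over.
workload = [30, 1, 1] * 10

print(peak_connections(workload, 3, "round_robin"))
print(peak_connections(workload, 3, "least_conn"))
```

With this workload, round-robin stacks all ten long queries on the same backend, while least-connection never puts more than four requests on one backend. Real traffic is rarely this regular, so the gap is usually smaller.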
5. Typical Nginx configuration
upstream superset {
    least_conn;
    server superset-1:8088 max_fails=3 fail_timeout=30s;
    server superset-2:8088 max_fails=3 fail_timeout=30s;
    server superset-3:8088 max_fails=3 fail_timeout=30s;
}
server {
    listen 443 ssl http2;
    server_name superset.example.com;
    ssl_certificate /etc/letsencrypt/live/superset.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/superset.example.com/privkey.pem;
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";
    proxy_read_timeout 300s;
    proxy_send_timeout 300s;
    client_max_body_size 100M;
    location / {
        proxy_pass http://superset;
        proxy_http_version 1.1;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        # WebSocket pass-through (async SQL Lab)
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
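On the Superset side, the forwarded headers set in this configuration are only honored when proxy handling is switched on. Here is a minimal superset_config.py fragment, assuming exactly one trusted proxy hop (the Nginx instance) sits in front of Superset; adjust the counts if your topology differs.

```python
# superset_config.py (fragment)

# Wrap the app in Werkzeug's ProxyFix so X-Forwarded-* headers are trusted.
ENABLE_PROXY_FIX = True

# Number of proxy hops trusted for each header. The value 1 assumes a single
# load balancer directly in front of Superset (an assumption of this sketch).
PROXY_FIX_CONFIG = {
    "x_for": 1,     # X-Forwarded-For
    "x_proto": 1,   # X-Forwarded-Proto
    "x_host": 1,    # X-Forwarded-Host
    "x_port": 1,    # X-Forwarded-Port
    "x_prefix": 1,  # X-Forwarded-Prefix
}
```

Without this, Superset sees every request as plain HTTP coming from the proxy's address, which is one common source of redirect loops.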
6. Health checks
Superset exposes a /health endpoint. Configure it as a probe:
# Kubernetes
livenessProbe:
  httpGet: { path: /health, port: 8088 }
  initialDelaySeconds: 60
  periodSeconds: 30
readinessProbe:
  httpGet: { path: /health, port: 8088 }
  initialDelaySeconds: 30
  periodSeconds: 10
For HAProxy: option httpchk GET /health.
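A probe should tolerate an occasional slow or failed response before taking a backend out of rotation. The sketch below shows one way to express that tolerance. The `probe` helper and the `HealthTracker` class are hypothetical names, and the `fall` and `rise` thresholds are borrowed from HAProxy's server options of the same name; the default values are assumptions, not recommendations from the Superset project.

```python
import http.client
import urllib.request


def probe(url, timeout=2.0):
    """Return True if `url` answers with HTTP 200 within `timeout` seconds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, HTTP error statuses
        # and malformed responses.
        return False


class HealthTracker:
    """Flip state only after `fall` straight failures or `rise` straight successes."""

    def __init__(self, fall=3, rise=2):
        self.fall = fall
        self.rise = rise
        self.healthy = True
        self._streak = 0  # consecutive results contradicting the current state

    def record(self, ok):
        """Feed one probe result and return the resulting health state."""
        if ok == self.healthy:
            self._streak = 0
        else:
            self._streak += 1
            threshold = self.rise if ok else self.fall
            if self._streak >= threshold:
                self.healthy = ok
                self._streak = 0
        return self.healthy


# Example wiring (hostname taken from the Nginx upstream, not executed here):
# tracker = HealthTracker()
# tracker.record(probe("http://superset-1:8088/health"))
```

A single failed probe leaves the backend in rotation; only three failures in a row mark it unhealthy, and two successes in a row bring it back.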
7. WebSocket and long-polling
SQL Lab uses long-polling for async queries. Without proper pass-through, long requests fail. Verify the LB:
- routes Upgrade: websocket;
- has proxy_read_timeout at 300s minimum;
- doesn't close idle connections too early.
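The first two points of this checklist can be checked mechanically against an Nginx configuration. The helper below is a rough sketch: `check_proxy_config` is a hypothetical name, it matches directives with regular expressions rather than parsing the file, and it only understands timeouts expressed in ms, s, m, or h.

```python
import re

# Multipliers to convert the supported Nginx time units to seconds.
UNIT_SECONDS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600}


def check_proxy_config(text, min_read_timeout=300):
    """Return a list of problems found in an Nginx config snippet."""
    problems = []

    match = re.search(r"proxy_read_timeout\s+(\d+)(ms|s|m|h)?\s*;", text)
    if match is None:
        problems.append("proxy_read_timeout is not set (Nginx defaults to 60s)")
    else:
        # A bare number means seconds in Nginx.
        seconds = int(match.group(1)) * UNIT_SECONDS[match.group(2) or "s"]
        if seconds < min_read_timeout:
            problems.append(
                f"proxy_read_timeout is {seconds}s, below {min_read_timeout}s"
            )

    if not re.search(r"proxy_set_header\s+Upgrade\s+\$http_upgrade\s*;", text):
        problems.append("Upgrade header is not forwarded")

    if not re.search(r"proxy_http_version\s+1\.1\s*;", text):
        problems.append("proxy_http_version 1.1 is missing (needed for WebSocket upgrades)")

    return problems


sample = """
proxy_http_version 1.1;
proxy_read_timeout 300s;
proxy_set_header Upgrade $http_upgrade;
"""
print(check_proxy_config(sample))  # []
```

Idle-connection behavior depends on the balancer and any cloud load balancer in front of it, so that last point still needs a manual check.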
8. TLS and HTTPS
The LB is the ideal place to terminate TLS, freeing Superset of that load:
- cert-manager + Let's Encrypt on Kubernetes;
- Caddy's automatic HTTPS on bare-metal;
- wildcard certificates for multi-tenant.
9. Rate limiting and WAF
To limit abuse:
# ingress-nginx annotations
nginx.ingress.kubernetes.io/limit-rps: "20"
nginx.ingress.kubernetes.io/limit-connections: "10"
To go further: Cloudflare in front of ingress, AWS WAF, or ModSecurity.
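To make the idea of a requests-per-second limit concrete, here is a small token-bucket sketch. It illustrates the general technique only; ingress-nginx enforces its limits through Nginx's own rate-limiting modules, not through this code. The `TokenBucket` class is hypothetical, the rate of 20 mirrors the limit-rps annotation, and the burst of 5 is an assumption.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = float(burst)  # start with a full bucket
        self.last = clock()

    def allow(self):
        """Return True if the request may pass, False if it should be rejected."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate=20, burst=5)
decisions = [bucket.allow() for _ in range(5)]
print(decisions)  # the first five requests fit in the initial burst
```

In practice you would keep one bucket per client IP, which is roughly what a per-client limit such as limit-rps expresses.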
10. Common pitfalls
- Non-shared sessions: user disconnected on each differently-routed request;
- Timeout too short: long exports interrupted;
- Sticky sessions disabled while session cookies are misconfigured: redirect loops;
- Partial HTTPS: Superset on HTTP behind HTTPS without X-Forwarded-Proto = redirect loop;
- Health check too strict: pods marked unhealthy on every Python GC.
11. Conclusion
Apache Superset load balancing isn't rocket science but requires following a few rules: external sessions, tolerant health checks, WebSocket pass-through, generous timeouts. Once in place, you gain HA, scaling, and rolling updates for a few hours of configuration.
Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).