TVL Managed Superset

Scale Apache Superset to 10,000+ Users (2026)

Architecture to scale Apache Superset to 10,000+ users: high availability, sharding, caching, autoscaling, and performance tuning.

Moving an Apache Superset instance from 100 to 10,000+ users requires a deliberate architecture: horizontal pod autoscaling, aggressive caching, a properly sized metadata database, and fine-grained monitoring. This guide details the scaling patterns for 2026.

1. Default limits

A single-node Docker Superset instance handles ~100 concurrent users. Beyond that, several bottlenecks appear:

  • Saturated gunicorn workers;
  • Limited Postgres connections;
  • Saturated Redis cache;
  • Frontend assets served slowly.

If you want a scale-ready instance, TVL Managed Superset Pro+ offers multi-AZ Kubernetes architectures.

2. Target architecture for 10,000 users

  • Multi-AZ Kubernetes;
  • Web pods 5-15 replicas + HPA;
  • Celery workers 10-30 replicas + HPA;
  • Postgres HA primary + 2 read replicas;
  • Redis Cluster 6 nodes minimum;
  • CDN in front for static assets;
  • Load balancer with health checks and session sharing.

3. Optimize gunicorn workers

# Superset env vars
SUPERSET_GUNICORN_WORKERS=8        # per pod
SUPERSET_GUNICORN_THREADS=4        # per worker
SUPERSET_WEBSERVER_TIMEOUT=300
SUPERSET_WEBSERVER_WORKER_TIMEOUT=120

With 5 pods × 8 workers, you get 40 worker processes. Active users only issue requests intermittently, so each process can serve roughly 250 of them, which covers the 10,000-user target.
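If you run gunicorn directly rather than through the image's environment variables, the same sizing can be expressed in a gunicorn.conf.py. A minimal sketch; the bind port and worker-recycling values are assumptions to adapt:

# gunicorn.conf.py -- sized to match the env vars above
bind = "0.0.0.0:8088"        # default Superset port
workers = 8                  # per pod
threads = 4                  # per worker
worker_class = "gthread"     # threaded workers suit Superset's I/O-bound requests
timeout = 120                # kill requests stuck longer than 2 minutes
keepalive = 5                # reuse connections behind the load balancer
max_requests = 1000          # recycle workers to contain memory growth
max_requests_jitter = 100    # stagger restarts so a pod never loses all workers at once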

4. Postgres HA and read replicas

Configure the metadata database:

  • HA primary (RDS Multi-AZ, OVH HA, Cloud SQL HA);
  • 2 read replicas for heavy SELECTs;
  • PgBouncer in front for connection pooling (see the sketch after this list);
  • See Postgres tuning.
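On the Superset side, point the metadata connection at PgBouncer rather than at Postgres directly, and keep SQLAlchemy's own pool small since PgBouncer already multiplexes connections. A minimal superset_config.py sketch; the hostname, credentials and pool sizes are placeholders:

# superset_config.py -- metadata DB behind PgBouncer (hostname/credentials are examples)
SQLALCHEMY_DATABASE_URI = (
    "postgresql+psycopg2://superset:change-me@pgbouncer.internal:6432/superset"
)

# Flask-SQLAlchemy engine options: small local pool, PgBouncer does the real pooling
SQLALCHEMY_ENGINE_OPTIONS = {
    "pool_size": 10,
    "max_overflow": 5,
    "pool_pre_ping": True,   # detect connections dropped during a failover
    "pool_recycle": 1800,    # recycle connections every 30 minutes
}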

5. Redis Cluster

At 10,000 users, a single Redis instance behind Sentinel is no longer enough. Move to Redis Cluster with sharding:

  • 6 nodes (3 primary + 3 replicas) minimum;
  • Total memory 16-32 GB;
  • Automatic hash slot-based sharding;
  • A cluster-aware Redis client on the Superset side (redis-py ≥ 4.1 ships cluster support; the standalone redis-py-cluster package is deprecated), as shown below.
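On the Superset side, the Flask-Caching backend must be cluster-aware. A minimal superset_config.py sketch, assuming a recent Flask-Caching with its RedisClusterCache backend; node addresses and TTLs are examples:

# superset_config.py -- metadata and chart caches on Redis Cluster
# (requires redis-py >= 4.1; older Flask-Caching releases used redis-py-cluster)
CACHE_CONFIG = {
    "CACHE_TYPE": "RedisClusterCache",
    "CACHE_REDIS_CLUSTER": "redis-0.internal:6379,redis-1.internal:6379,redis-2.internal:6379",
    "CACHE_KEY_PREFIX": "superset_metadata_",
    "CACHE_DEFAULT_TIMEOUT": 300,      # 5 minutes for metadata
}

DATA_CACHE_CONFIG = {
    **CACHE_CONFIG,
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_DEFAULT_TIMEOUT": 3600,     # 1 hour for chart data, tuned per dashboard
}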

This configuration is applied by default on TVL Managed Superset, which follows community best practices.

6. Autoscaling

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: superset-web
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: superset
  minReplicas: 5
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }
    - type: Resource
      resource:
        name: memory
        target: { type: Utilization, averageUtilization: 80 }

7. CDN and frontend

  • Cloudflare or BunnyCDN in front of the ingress;
  • Static JS/CSS cache (1 year with hash in the name);
  • Brotli + gzip for compression;
  • HTTP/3 if supported.

8. Strategic cache

  • 24 h TTL on executive dashboards;
  • Nightly cache warming (see cache strategy);
  • Async queries enabled so long-running charts don't block web workers (configuration sketch below);
  • Pre-aggregation on the warehouse side.
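In superset_config.py, this typically means a long data-cache TTL plus the async-queries feature flag. A sketch; the node addresses and the JWT secret are placeholders:

# superset_config.py -- 24 h data cache + async queries (values are examples)
from datetime import timedelta

DATA_CACHE_CONFIG = {
    "CACHE_TYPE": "RedisClusterCache",
    "CACHE_REDIS_CLUSTER": "redis-0.internal:6379,redis-1.internal:6379",
    "CACHE_KEY_PREFIX": "superset_data_",
    "CACHE_DEFAULT_TIMEOUT": int(timedelta(hours=24).total_seconds()),
}

# Run chart queries through Celery/Redis instead of blocking a gunicorn worker;
# async queries also need their own Redis backend configured (see the Superset docs)
FEATURE_FLAGS = {"GLOBAL_ASYNC_QUERIES": True}
GLOBAL_ASYNC_QUERIES_JWT_SECRET = "change-me-32-characters-minimum!!"  # placeholder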

9. Fine-grained monitoring

  • p95 latency per endpoint;
  • Concurrent users (active sessions);
  • Celery queue backlog;
  • Redis cache hit ratio;
  • Active Postgres connections.
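Superset can push its internal counters and timings (query duration, cache hits, errors) to StatsD, which your Prometheus or Datadog stack can then pick up; request-level p95 latency usually comes from the ingress or load balancer instead. A minimal sketch, assuming a StatsD agent sidecar on localhost:

# superset_config.py -- emit Superset metrics to a StatsD sidecar
# (host, port and prefix are examples; requires the statsd Python package)
from superset.stats_logger import StatsdStatsLogger

STATS_LOGGER = StatsdStatsLogger(host="localhost", port=8125, prefix="superset")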

10. Common pitfalls

  • Sticky sessions enabled by mistake → uneven load balancing;
  • Flask sessions stored in memory → users logged out whenever pods scale (see the session sketch after this list);
  • Postgres connection pool too small;
  • Systematic cache misses on heavy dashboards;
  • HPA scaling on the wrong metric (CPU for an I/O-bound load).
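For the session pitfall in particular, a common fix is to store sessions server-side in Redis so any pod can serve any user. A sketch assuming the Flask-Session package and an example Redis endpoint:

# superset_config.py -- server-side sessions in Redis, so scaling pods up or down
# never logs users out (the Redis endpoint is an example)
import redis
from flask_session import Session  # requires the Flask-Session package

SESSION_TYPE = "redis"
SESSION_REDIS = redis.Redis(host="redis-sessions.internal", port=6379, db=0)

def FLASK_APP_MUTATOR(app):
    # Superset calls this hook with the Flask app at startup
    Session(app)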

11. Conclusion

Scaling Apache Superset to 10,000+ users is entirely possible but requires a thought-out architecture. Count on 1-2 experienced SREs and €200-500/month in additional infra costs per 1000 active users. At this level, outsourcing to a managed service (TVL Managed Superset Enterprise) is often more economical than maintaining infrastructure in-house.

Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).

For more: high availability, scale data, cache strategy.