Retention is the most important KPI of a recurring product. Good retention compensates for modest CAC; bad retention sabotages the best acquisition strategy. Apache Superset lets you build robust retention analyses with some well-thought-out SQL. This guide details the methodology for 2026.
1. Three types of retention
- N-day retention: user returned exactly on D+N (binary);
- Rolling retention: user returned at least once between D and D+N (cumulative);
- Bracket retention: returned in a defined interval, e.g., [D+7, D+30].
For B2B SaaS products, monthly rolling retention is the most telling. For mobile apps, N-day retention.
If you want to start quickly, TVL Managed Superset offers a pre-configured retention dashboard on Pro+ instances.
2. Minimal data model
CREATE TABLE fct_user_active_daily (
user_id INT,
active_day DATE,
PRIMARY KEY (user_id, active_day)
);
CREATE TABLE dim_users (
user_id INT PRIMARY KEY,
signup_at DATE,
signup_channel VARCHAR
);
3. SQL retention curve query
WITH cohort AS (
SELECT user_id, DATE_TRUNC('week', signup_at) AS cohort_week
FROM dim_users
WHERE signup_at >= CURRENT_DATE - INTERVAL '180 days'
),
activity AS (
SELECT
a.user_id,
c.cohort_week,
DATE_DIFF('week', c.cohort_week, a.active_day) AS weeks_since_signup
FROM fct_user_active_daily a
JOIN cohort c ON c.user_id = a.user_id
WHERE a.active_day >= c.cohort_week
)
SELECT
cohort_week,
weeks_since_signup,
COUNT(DISTINCT user_id) AS active_users,
COUNT(DISTINCT user_id) * 1.0 /
FIRST_VALUE(COUNT(DISTINCT user_id))
OVER (PARTITION BY cohort_week
ORDER BY weeks_since_signup) AS retention_rate
FROM activity
GROUP BY 1, 2
ORDER BY 1, 2;
4. Visualization
- Heatmap: cohort in row, weeks_since_signup in column;
- Line chart: one line per cohort, X axis = weeks_since_signup;
- Stacked area: segmentation by retention range.
This configuration is applied by default on TVL Managed Superset, which follows community best practices.
5. Sector benchmarks
| Sector | D1 | D7 | D30 |
|---|---|---|---|
| Mobile gaming app | 30-40% | 10-15% | 3-5% |
| Productivity app | 40-50% | 20-30% | 10-20% |
| SaaS B2B | 70-80% | 50-60% | 30-40% |
| DTC e-commerce | n/a | n/a | 15-25% (30d repeat purchase) |
6. Useful segmentation
- By acquisition channel: organic vs paid;
- By temporal cohort: see if retention improves;
- By first event: retention of users having performed action X;
- By segment: SMB, Mid-Market, Enterprise.
7. Improvement patterns
- Smiling curve: retention that rises after a dip. Very positive signal (reactivation, virality, growing value);
- Flat curve long-term: loyal core users. Excellent for LTV;
- Cliff drop at D7 or D30: onboarding or pricing problem to fix.
8. Common pitfalls
- Cohorts too small: variance too strong;
- No observation window: M+5 cohort over 6 months biases the entire table;
- Active = visit only: prefer a value event (key action, payment);
- Ignoring seasonality: November vs January cohort, different behaviors.
9. Conclusion
Retention analysis in Apache Superset requires some modeling but then transforms how a product or marketing team operates. It's the indicator that separates products that scale from those that struggle.
Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).
For more: cohort analysis, SaaS metrics, funnel analysis.