TVL Managed Superset

Apache Superset and GDPR: 2026 Compliance Guide

How to make Apache Superset GDPR-compliant: EU hosting, encryption, audit, user rights, processing register, and best practices.

Apache Superset is a powerful Business Intelligence tool, but its compliance with the General Data Protection Regulation (GDPR) depends entirely on how you host and configure it. This article reviews GDPR requirements applicable to a Superset instance, the technical and organizational measures to put in place, and the documentation needed to demonstrate compliance.

1. Why GDPR applies to Apache Superset

GDPR applies to the processing of personal data by organizations established in the European Union and, in many cases, to the processing of data about individuals located there (Article 3). As soon as a Superset dashboard displays a customer ID, an email, an IP address, an order number or any attribute that allows direct or indirect identification of a person, you are processing personal data within the meaning of Article 4(1).

If you want to avoid the regulatory and technical friction of a custom deployment, TVL Managed Superset deploys a Superset instance hosted in France (OVHcloud, Roubaix), with a GDPR Data Processing Agreement ready to sign, in less than 3 minutes.

Your organization's role in the processing chain determines your obligations. You are a data controller within the meaning of Article 4(7) when you decide on the purposes and means of processing (e.g., analyzing customer behavior). If you use a managed service to host Superset, the service provider is generally a processor within the meaning of Article 4(8), and Article 28 then requires a written Data Processing Agreement (DPA).

2. Mapping personal data in Superset

Before anything else, identify precisely which personal data flows through your Superset instance. Three categories should be distinguished.

2.1 Superset user data

Superset accounts themselves contain personal data: name, email, hashed password, last login date, roles, activity logs. This information is stored in the metadata database (Postgres or MySQL). Session data lives in a signed browser cookie by default, or in Redis when server-side sessions are configured.

2.2 Business data exposed in dashboards

Datasets connected to Superset may contain personal data: customers, prospects, employees, end users. Superset does not keep a permanent copy of this data (it queries the source on the fly, apart from any cached query results), but it accesses and displays it, which constitutes a processing operation.

2.3 Technical data

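Technical data covers application logs, Nginx access logs, Superset audit events and performance traces. Even an IP address can be personal data: in the Breyer ruling (2016), the CJEU held that a dynamic IP address is personal data when the party holding it has legal means to identify the person behind it, and data protection authorities generally treat IP addresses as personal data.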

3. Choosing the right legal basis

Article 6 of GDPR requires every processing activity to rely on one of six legal bases. In a Superset context, the most common are:

  • Legitimate interest (Art. 6.1.f): internal analysis to steer the business, subject to a balancing test;
  • Performance of a contract (Art. 6.1.b): usage analysis of a B2B product to fulfill contractual features;
  • Legal obligation (Art. 6.1.c): financial reporting, anti-fraud, accounting retention;
  • Consent (Art. 6.1.a): fine-grained marketing analysis, profiling for advertising.

Document the legal basis of each dashboard containing personal data in your processing register (Article 30 GDPR).

4. Technical compliance measures

4.1 EU hosting

Since the Schrems II ruling (July 2020), which invalidated the EU-US Privacy Shield, transfers to the United States are tightly regulated. To avoid complications, host Superset in the EU with a provider subject to European law. OVHcloud, Scaleway, Hetzner and Outscale are established European options; check which certifications each offering holds before you commit. TVL Managed Superset runs entirely on OVHcloud infrastructure in Roubaix.

4.2 Encryption in transit and at rest

Any communication with Superset must use TLS 1.2 minimum (1.3 recommended). On the storage side, encrypt Postgres/Redis volumes and enable the at-rest encryption offered by your hosting provider. Backups must be encrypted as well.
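On the application side, a few cookie and proxy settings complement TLS. The fragment below is a minimal sketch of a superset_config.py, assuming TLS is terminated by a reverse proxy such as Nginx in front of Superset; the minimum TLS version itself is enforced in the proxy, not in this file.

```python
# superset_config.py (fragment)
# Assumption: a reverse proxy terminates TLS and forwards requests to Superset.

# Trust the X-Forwarded-* headers set by the proxy so that Superset
# knows the original request arrived over HTTPS.
ENABLE_PROXY_FIX = True

# Send the session cookie over HTTPS only.
SESSION_COOKIE_SECURE = True

# Hide the session cookie from JavaScript running in the page.
SESSION_COOKIE_HTTPONLY = True

# Limit when the browser attaches the cookie to cross-site requests.
SESSION_COOKIE_SAMESITE = "Lax"
```

Adjust these values if your deployment embeds dashboards in another site, since stricter cookie rules can interfere with embedding.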

4.3 Strong authentication and SSO

Enable SSO via OIDC or SAML to benefit from your IdP's authentication policies (MFA, session expiration, audit). See our guide to SSO with OIDC on Apache Superset. Disable self-registration of local accounts in production.

4.4 Row Level Security (RLS)

Superset's Row Level Security dynamically filters rows displayed based on the connected user's role. This is the central mechanism to apply the minimization principle required by Article 5.1.c GDPR: a salesperson should only see their own customers, a manager only their team.
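The effect of an RLS rule can be pictured as an extra WHERE clause applied to the dataset query. The sketch below simulates that idea with an in-memory SQLite table; it is not Superset's implementation, and the role names, the customers table and the apply_rls helper are hypothetical. Denying access to roles without a rule is a choice made in this sketch.

```python
import sqlite3

# Hypothetical role-to-clause mapping, similar in spirit to the filter
# clause an administrator types into an RLS rule.
RLS_CLAUSES = {
    "sales_alice": "sales_rep = 'alice'",
    "sales_bob": "sales_rep = 'bob'",
}


def apply_rls(base_query: str, role: str) -> str:
    """Wrap the dataset query and filter it with the clause for this role."""
    # Roles without a rule see nothing (deny by default, a sketch choice).
    clause = RLS_CLAUSES.get(role, "1 = 0")
    return f"SELECT * FROM ({base_query}) AS dataset WHERE {clause}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, sales_rep TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "alice"), (2, "Globex", "bob"), (3, "Initech", "alice")],
)


def visible_customers(role: str) -> list:
    """Return the customer names a given role would see."""
    query = apply_rls("SELECT id, name, sales_rep FROM customers", role)
    return [row[1] for row in conn.execute(query + " ORDER BY id")]
```

Calling visible_customers with each role shows that the same dashboard query returns a different subset of rows per user, which is what makes RLS a minimization control rather than a display setting.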

4.5 Pseudonymization and anonymization

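When analysis does not require identity, pseudonymize datasets: replace emails with a keyed hash and names with an internal identifier. A plain unsalted hash of an email is easy to reverse by hashing candidate addresses, so keep the key outside the analytics environment. Pseudonymized data is still personal data under GDPR, but pseudonymization greatly reduces the impact of a breach. Only truly anonymized data, where re-identification is no longer reasonably possible, falls outside GDPR scope; techniques such as k-anonymity help but do not guarantee that result on their own.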
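One way to replace emails before they reach a Superset dataset is a keyed hash (HMAC-SHA256), applied in the pipeline that loads the data. This is a minimal sketch using only the Python standard library; the pseudonymize function and the placeholder key are hypothetical, and in practice the key would come from a secret manager.

```python
import hashlib
import hmac


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for a value using HMAC-SHA256."""
    # Normalize so that "Jane@Example.com " and "jane@example.com"
    # map to the same pseudonym.
    normalized = value.strip().lower()
    digest = hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Placeholder key for illustration only; load the real one from a secret store.
SECRET_KEY = b"replace-with-a-key-from-your-secret-manager"
token = pseudonymize("Jane.Doe@example.com", SECRET_KEY)
```

The same input always yields the same token, so joins and distinct counts keep working in dashboards, while anyone without the key cannot recompute tokens from a list of known addresses.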

4.6 Logging and audit trails

Apache Superset includes native audit logging: it records user actions such as dashboard modifications in its metadata database and keeps a history of SQL Lab queries; login events are typically also available from your IdP when SSO is enabled. Centralize these logs in a separate backend (Loki, Elasticsearch, OpenObserve) with a retention policy aligned with your processing register (typically 6 to 12 months).
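A retention policy only counts if something enforces it. The sketch below shows a purge job against a simplified stand-in for an audit log table, using in-memory SQLite; the table layout, the purge_old_logs name and the 365-day value are assumptions for illustration, and timestamps are assumed to be ISO 8601 strings in UTC.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def purge_old_logs(conn, retention_days, now=None):
    """Delete log rows older than the retention period; return the count."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=retention_days)).isoformat()
    # ISO 8601 strings with the same UTC offset sort chronologically.
    cur = conn.execute("DELETE FROM logs WHERE dttm < ?", (cutoff,))
    conn.commit()
    return cur.rowcount


# Demo data: three events of different ages relative to a fixed "now".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs (id INTEGER PRIMARY KEY, action TEXT, dttm TEXT)"
)
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
events = [
    ("login", now - timedelta(days=400)),
    ("query", now - timedelta(days=200)),
    ("query", now - timedelta(days=10)),
]
conn.executemany(
    "INSERT INTO logs (action, dttm) VALUES (?, ?)",
    [(action, ts.isoformat()) for action, ts in events],
)

deleted = purge_old_logs(conn, retention_days=365, now=now)
```

Run a job like this on a schedule, and apply the same retention period in the central log backend so that purged entries do not survive elsewhere.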

4.7 Backups and recovery

Article 32 of GDPR requires the ability to restore the availability of, and access to, personal data in a timely manner after an incident. Set up automatic daily backups of the Superset metadata database and test restoration regularly. This configuration is applied by default on TVL Managed Superset, which performs a daily snapshot of the OVHcloud-managed Postgres and Redis.

5. Organizational measures

5.1 Processing register (Article 30)

Maintain a register listing each dashboard or category of dashboards processing personal data, with: purpose, legal basis, data categories, categories of data subjects, recipients, retention period, security measures.
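If the register is versioned in Git, each entry can be a small structured record that a script validates before merge. The following is a sketch under that assumption; the RegisterEntry class, the field names and the legal basis labels are hypothetical choices that mirror the list above and the six bases of Article 6(1).

```python
from dataclasses import dataclass, fields
from typing import List, Tuple

# Short labels for the six legal bases of Article 6(1); naming is illustrative.
LEGAL_BASES = {
    "consent",
    "contract",
    "legal_obligation",
    "vital_interests",
    "public_task",
    "legitimate_interest",
}


@dataclass(frozen=True)
class RegisterEntry:
    dashboard: str
    purpose: str
    legal_basis: str
    data_categories: Tuple[str, ...]
    data_subjects: Tuple[str, ...]
    recipients: Tuple[str, ...]
    retention_period: str
    security_measures: Tuple[str, ...]


def validate(entry: RegisterEntry) -> List[str]:
    """Return a list of problems; an empty list means the entry is complete."""
    problems = [
        f"missing: {f.name}" for f in fields(entry) if not getattr(entry, f.name)
    ]
    if entry.legal_basis and entry.legal_basis not in LEGAL_BASES:
        problems.append(f"unknown legal basis: {entry.legal_basis}")
    return problems


entry = RegisterEntry(
    dashboard="Customer churn overview",
    purpose="Steer retention campaigns",
    legal_basis="legitimate_interest",
    data_categories=("customer ID", "subscription dates"),
    data_subjects=("customers",),
    recipients=("sales managers",),
    retention_period="24 months",
    security_measures=("RLS", "SSO with MFA"),
)
problems = validate(entry)
```

A check like this can run in continuous integration, so that a dashboard handling personal data cannot be registered with an empty purpose or an unrecognized legal basis.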

5.2 Data Protection Impact Assessment (DPIA)

You must perform a DPIA (Article 35) when processing is likely to result in a high risk for data subjects, for example if your Superset instance processes sensitive data (health, political opinions) at large scale or supports systematic profiling. The CNIL provides a free tool, PIA, to structure this analysis.

5.3 Data Processing Agreement (DPA)

Sign a written processing agreement with your Superset host, compliant with Article 28 GDPR. It must list any sub-processors (e.g., the infrastructure host) and allow you to audit the processor, for example once a year.

5.4 Data subject rights policy

Prepare an internal procedure to handle access, rectification, erasure and portability requests (Articles 15 to 20). For Superset, this means being able to identify quickly, in the metadata database and connected datasets, all data associated with a user identifier.
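For the metadata database, an access request usually starts with a lookup by email. The sketch below runs against simplified stand-ins for the user and log tables in an in-memory SQLite database; the real Superset schema has more columns and varies by version, and the find_user_data helper is hypothetical.

```python
import sqlite3

USER_COLUMNS = ("id", "username", "first_name", "last_name", "email", "last_login")


def find_user_data(conn, email):
    """Collect what the metadata database holds about one email address."""
    row = conn.execute(
        f"SELECT {', '.join(USER_COLUMNS)} FROM ab_user "
        "WHERE lower(email) = lower(?)",
        (email,),
    ).fetchone()
    if row is None:
        return None
    record = dict(zip(USER_COLUMNS, row))
    # Count activity log entries linked to this account.
    record["log_entries"] = conn.execute(
        "SELECT COUNT(*) FROM logs WHERE user_id = ?", (record["id"],)
    ).fetchone()[0]
    return record


# Simplified demo schema and data (assumption, not the full Superset schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ab_user (id INTEGER PRIMARY KEY, username TEXT, "
    "first_name TEXT, last_name TEXT, email TEXT, last_login TEXT)"
)
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, user_id INTEGER, action TEXT)")
conn.execute(
    "INSERT INTO ab_user VALUES (1, 'jdoe', 'Jane', 'Doe', "
    "'jane@example.com', '2026-01-10T08:00:00+00:00')"
)
conn.executemany(
    "INSERT INTO logs (user_id, action) VALUES (?, ?)",
    [(1, "dashboard_view"), (1, "sql_query"), (2, "dashboard_view")],
)

report = find_user_data(conn, "JANE@example.com")
```

The same identifier then has to be searched in the connected business datasets, which this sketch does not cover because their structure is specific to each organization.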

6. Controls summary by GDPR requirement

| GDPR requirement | Article | Superset control |
|---|---|---|
| Lawfulness | Art. 6 | Document legal basis per dashboard |
| Minimization | Art. 5.1.c | RLS, filtered virtual datasets, pseudonymization |
| Security | Art. 32 | TLS, at-rest encryption, MFA, audit log |
| Breach notification | Art. 33-34 | Prometheus alerting, 72-hour incident runbook |
| Sub-processing | Art. 28 | DPA signed with the host |
| Non-EU transfers | Art. 44+ | EU host, EU sub-processors |
| Processing register | Art. 30 | Inventory of risky dashboards |
| Data subject rights | Art. 15-22 | Documented procedure, template SQL queries |
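The 72-hour window for breach notification runs from the moment the controller becomes aware of the breach, so an incident runbook benefits from computing the deadline explicitly. This is a minimal sketch; the function names are hypothetical and the timestamps are examples.

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Return the latest time to notify the supervisory authority."""
    if aware_at.tzinfo is None:
        # Naive timestamps are ambiguous during an incident; refuse them.
        raise ValueError("aware_at must be timezone-aware")
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Return the hours left before the deadline (negative if overdue)."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


aware_at = datetime(2026, 3, 2, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware_at)
```

Recording the awareness time in UTC at the start of the runbook avoids disputes later about when the clock started.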

7. Common mistakes to avoid

  • Granting full production access to all Superset users: missing RLS is a common source of internal leaks.
  • Storing unencrypted exports on laptops: govern CSV/Excel exports with a usage policy.
  • Using a non-EU host without standard contractual clauses or transfer impact assessment (TIA).
  • Keeping audit logs indefinitely: set a retention period and purge automatically.
  • Forgetting old accounts: a script disabling accounts inactive for 90 days is essential.
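The last point above can be sketched as a small scheduled job. The example runs against a simplified stand-in for the user table in an in-memory SQLite database; the column names, the disable_inactive_accounts helper and the decision to leave never-used accounts alone are assumptions of this sketch, and timestamps are assumed to be ISO 8601 strings in UTC.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def disable_inactive_accounts(conn, max_idle_days=90, now=None):
    """Deactivate accounts whose last login is older than max_idle_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=max_idle_days)).isoformat()
    # Accounts with no recorded login (NULL) are not matched by this
    # comparison and should be reviewed separately.
    cur = conn.execute(
        "UPDATE ab_user SET active = 0 WHERE active = 1 AND last_login < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


# Demo data relative to a fixed "now".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ab_user (id INTEGER PRIMARY KEY, username TEXT, "
    "active INTEGER, last_login TEXT)"
)
now = datetime(2026, 1, 1, tzinfo=timezone.utc)
conn.executemany(
    "INSERT INTO ab_user (username, active, last_login) VALUES (?, ?, ?)",
    [
        ("alice", 1, (now - timedelta(days=120)).isoformat()),
        ("bob", 1, (now - timedelta(days=5)).isoformat()),
        ("carol", 1, None),
    ],
)

disabled = disable_inactive_accounts(conn, max_idle_days=90, now=now)
```

Deactivating rather than deleting keeps ownership of dashboards and charts intact while removing the ability to log in.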

8. Documenting compliance over time

GDPR is built on the accountability principle (Article 5.2): you must not only be compliant but also be able to demonstrate it. Keep the following evidence:

  1. up-to-date processing register (versioned in Git or a GRC tool);
  2. DPIA and updates;
  3. DPA signed with the host and amendments;
  4. information security policy covering Superset;
  5. backup restoration test evidence;
  6. incident log and any notifications sent to the supervisory authority;
  7. Superset administrator training records.

9. Conclusion

Bringing Apache Superset into GDPR compliance is a realistic project, but it spans technical (encryption, RLS, EU hosting), legal (DPA, register, DPIA) and organizational (procedures, training) areas. Most controls can be automated and industrialized on a well-architected platform.

Want the benefits of Apache Superset without the friction of installation, maintenance and GDPR compliance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France), with a ready-to-sign GDPR Data Processing Agreement.

To go further, see our Superset hardening guide and our managed vs self-hosted comparison.