TVL Managed Superset

Connect MongoDB to Apache Superset 2026

A tutorial on connecting MongoDB to Apache Superset, covering the BI Connector, schema mapping, and performance.

MongoDB is widely used in modern applications (e-commerce, content, IoT). Connecting Apache Superset to it requires a workaround, because MongoDB does not speak SQL natively. This guide details the options for 2026.

1. Why is it more complex?

Superset talks to databases through SQLAlchemy, which emits SQL. MongoDB stores JSON-like documents and is queried with the MongoDB Query Language, not SQL. There are three ways to bridge the gap:

  1. MongoDB BI Connector (Atlas): exposes a SQL endpoint;
  2. Postgres FDW: foreign data wrapper that queries Mongo;
  3. ETL to warehouse: Airbyte/Fivetran pushes Mongo → BigQuery/Postgres.

If you want an instance with MongoDB connectors, TVL Managed Superset includes the necessary drivers on Pro+.

2. Option A — MongoDB BI Connector (Atlas)

Available with MongoDB Atlas (managed):

  1. Atlas UI → Cluster → BI Connector → Enable;
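  2. Retrieve the hostname and port from the connection details (3307 is the default for a self-managed BI Connector; use whatever port Atlas displays);
  3. Create a dedicated database user with read-only privileges;
  4. Superset URI: mysql+pymysql://<user>:<password>@<host>:<port>/<db>.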

The BI Connector speaks the MySQL wire protocol (hence the mysql+pymysql driver) and translates SQL into MongoDB Query Language on the fly.
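
Passwords containing characters such as @, : or / break the URI from step 4 unless they are percent-encoded. The following is a minimal sketch of building that URI safely with the Python standard library. The function name bi_connector_uri, the user superset_ro, and the host bi.example.com are illustrative assumptions, not part of Superset or Atlas.

```python
from urllib.parse import quote


def bi_connector_uri(user: str, password: str, host: str, port: int, db: str) -> str:
    """Build a SQLAlchemy URI for a MySQL-protocol endpoint (hypothetical helper)."""
    # Percent-encode the credentials so '@', ':' and '/' cannot be
    # mistaken for URI delimiters.
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"mysql+pymysql://{safe_user}:{safe_password}@{host}:{port}/{db}"


# Example values are placeholders; use the host and port shown for your cluster.
uri = bi_connector_uri("superset_ro", "p@ss:word/1", "bi.example.com", 3307, "shop")
print(uri)
```

The resulting string can be pasted into the SQLAlchemy URI field when adding a database in Superset.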

3. Option B — Postgres FDW

For self-hosted MongoDB:

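-- On Postgres (the mongo_fdw extension must be installed on the server)
CREATE EXTENSION mongo_fdw;

-- Point Postgres at the MongoDB instance
CREATE SERVER mongo_server
  FOREIGN DATA WRAPPER mongo_fdw
  OPTIONS (address 'mongo.example.com', port '27017');

-- Credentials used to log in to MongoDB (placeholders).
-- Create the mapping for the role Superset connects as.
CREATE USER MAPPING FOR CURRENT_USER
  SERVER mongo_server
  OPTIONS (username '<mongo_user>', password '<mongo_password>');

-- mongo_fdw expects the first column to be _id of type NAME
CREATE FOREIGN TABLE products (
  _id NAME,
  name VARCHAR,
  price NUMERIC,
  category VARCHAR
)
SERVER mongo_server
OPTIONS (database 'shop', collection 'products');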

Superset queries Postgres, which relays to Mongo.

4. Option C — ETL to warehouse

Recommended for heavy analytical workloads:

  1. Airbyte connector MongoDB → BigQuery / Postgres;
  2. Incremental sync with Mongo change streams;
  3. dbt transformation on the warehouse;
  4. Superset connected to the warehouse.
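
To make step 2 concrete, here is a simplified sketch of what an incremental sync does with change events: each event is applied as an upsert or a delete to a table keyed by _id. The event fields (operationType, documentKey, fullDocument) mirror the shape of MongoDB change stream events, but the in-memory warehouse dict and the apply_change function are illustrative assumptions. A real connector such as Airbyte handles this internally.

```python
from typing import Any

Row = dict[str, Any]


def apply_change(table: dict[str, Row], event: dict[str, Any]) -> None:
    """Apply one change event to an in-memory table keyed by _id (sketch)."""
    key = str(event["documentKey"]["_id"])
    op = event["operationType"]
    if op in ("insert", "replace", "update"):
        # Assumes the event carries the full document. For updates this
        # requires opening the stream with the "updateLookup" option.
        table[key] = dict(event["fullDocument"])
    elif op == "delete":
        table.pop(key, None)
    # Other event types (drop, rename, invalidate, ...) are ignored here.


warehouse: dict[str, Row] = {}
events = [
    {"operationType": "insert", "documentKey": {"_id": "p1"},
     "fullDocument": {"_id": "p1", "name": "Mug", "price": 9.5}},
    {"operationType": "update", "documentKey": {"_id": "p1"},
     "fullDocument": {"_id": "p1", "name": "Mug", "price": 11.0}},
    {"operationType": "insert", "documentKey": {"_id": "p2"},
     "fullDocument": {"_id": "p2", "name": "Lamp", "price": 30.0}},
    {"operationType": "delete", "documentKey": {"_id": "p2"}},
]
for event in events:
    apply_change(warehouse, event)

print(warehouse)
```

Because every event is applied by key, replaying the same events twice leaves the table in the same state, which is what makes resuming after an interruption safe.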

This configuration is applied by default on TVL Managed Superset, which follows community best practices.

5. Schema mapping

MongoDB is schemaless, but Superset needs an explicit schema of typed columns. There are three ways to get one:

  • the BI Connector infers the schema by sampling documents;
  • you define the schema manually in the FDW foreign table;
  • dbt projects the Mongo JSON into typed columns.
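
Sample-based inference can be illustrated in a few lines: scan a handful of documents, record the type seen for each top-level field, and widen the type when documents disagree. This is a sketch of the general idea only, not the BI Connector's actual algorithm. The SQL_TYPES mapping and the widening rules are assumptions chosen for the example.

```python
from typing import Any

# Hypothetical mapping from Python value types to SQL column types.
SQL_TYPES = {bool: "BOOLEAN", int: "BIGINT", float: "NUMERIC", str: "VARCHAR"}


def infer_schema(docs: list[dict[str, Any]]) -> dict[str, str]:
    """Return {field: sql_type} for the top-level fields seen in the sample."""
    schema: dict[str, str] = {}
    for doc in docs:
        for field, value in doc.items():
            if value is None:
                continue  # a null says nothing about the column type
            # Nested documents and arrays fall back to a JSON column.
            sql_type = SQL_TYPES.get(type(value), "JSON")
            if field not in schema:
                schema[field] = sql_type
            elif schema[field] != sql_type:
                # Conflicting types: integers and floats widen to NUMERIC,
                # anything else falls back to VARCHAR.
                pair = {schema[field], sql_type}
                schema[field] = "NUMERIC" if pair == {"BIGINT", "NUMERIC"} else "VARCHAR"
    return schema


sample = [
    {"_id": "p1", "name": "Mug", "price": 9.5, "category": "kitchen"},
    {"_id": "p2", "name": "Lamp", "price": 30, "tags": ["home"]},
    {"_id": "p3", "name": "Pen", "price": None},
]
schema = infer_schema(sample)
print(schema)
```

A field that appears only in documents outside the sample is missed entirely, which is why sampled schemas should be reviewed before building dashboards on them.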

6. Performance

  • BI Connector: acceptable up to a few million documents;
  • FDW: very slow beyond roughly 100k documents;
  • ETL to warehouse: excellent query performance, but the data lags behind MongoDB by the sync interval.

7. Typical use cases

  • E-commerce: products, orders, carts;
  • Content: articles, comments, likes;
  • IoT: device events;
  • Application logs stored in JSON.

8. Common pitfalls

  • Variable schema between documents → missing fields;
  • Poor JOIN performance (MongoDB is not built for joins);
  • Nested arrays are poorly handled by the BI Connector;
  • Atlas cost: the BI Connector adds a surcharge to the pricing tier;
  • Latency: Superset queries can take 5-30 s without caching.
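
The first and third pitfalls are usually handled before the data reaches Superset, by flattening nested arrays into one row per element and filling missing fields with nulls. The sketch below shows the idea on a hypothetical orders collection. The field names (items, sku, qty, customer) and the flatten_orders function are assumptions made for illustration.

```python
from typing import Any


def flatten_orders(
    orders: list[dict[str, Any]],
    item_fields: tuple[str, ...] = ("sku", "qty"),
) -> list[dict[str, Any]]:
    """Explode each order's 'items' array into one flat row per item."""
    rows: list[dict[str, Any]] = []
    for order in orders:
        # A missing or null 'items' field is treated as an empty array.
        for item in order.get("items") or []:
            row = {"order_id": order.get("_id"), "customer": order.get("customer")}
            for field in item_fields:
                # dict.get returns None for absent fields, so every row
                # ends up with the same set of columns.
                row[field] = item.get(field)
            rows.append(row)
    return rows


orders = [
    {"_id": "o1", "customer": "alice", "items": [{"sku": "A", "qty": 2}, {"sku": "B"}]},
    {"_id": "o2", "items": [{"sku": "C", "qty": 1}]},
    {"_id": "o3", "customer": "bob"},
]
rows = flatten_orders(orders)
for row in rows:
    print(row)
```

Note that orders without any items produce no rows in this sketch, so counts of orders should be computed from the parent collection rather than from the flattened table.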

9. Recommendation

For analytical workloads, prefer option C (ETL to warehouse). MongoDB remains excellent for transactional workloads, but is poorly suited to large BI aggregations. The warehouse offers a much better Superset experience.

10. Conclusion

Connecting MongoDB to Apache Superset is possible through several options, each with trade-offs. For a proof of concept, the Atlas BI Connector is quick to set up. For production, ETL to a columnar warehouse is strongly preferable.

Want the benefits of Apache Superset without the friction of installation and maintenance? Deploy your instance in 3 clicks with TVL Managed Superset, hosted in Europe (OVHcloud, Roubaix, France).
