Explore Developer Center's New Chatbot! MongoDB AI Chatbot can be accessed at the top of your navigation to answer all your MongoDB questions.

Join us at AWS re:Invent 2024! Learn how to use MongoDB for AI use cases.
MongoDB Developer
MongoDB
plus
Sign in to follow topics
MongoDB Developer Centerchevron-right
Developer Topicschevron-right
Productschevron-right
MongoDBchevron-right

Atlas Query Federation SQL to Form Powerful Data Interactions

Pavel Duchovny3 min read • Published Jun 20, 2022 • Updated Aug 13, 2024
SQLMongoDBData Visualization
Facebook Icontwitter iconlinkedin icon
Rate this article
star-empty
star-empty
star-empty
star-empty
star-empty
Modern platforms have a wide variety of data sources. As businesses grow, they have to constantly evolve their data management and have sophisticated, scalable, and convenient tools to analyse data from all sources to produce business insights.
MongoDB has developed a rich and powerful query language, including a very robust aggregation framework.
These were mainly done to optimize the way developers work with data and provide great tools to manipulate and query MongoDB documents.
Having said that, many developers, analysts, and tools still prefer the legacy SQL language to interact with the data sources. SQL has a strong foundation around joining data as this was a core concept of the legacy relational databases normalization model.
This makes SQL have a convenient syntax when it comes to describing joins.
Providing MongoDB users the ability to leverage SQL to analyse multi-source documents while having a flexible schema and data store is a compelling solution for businesses.

Data Sources and the Challenge

Consider a requirement to create a single view to analyze data from operative different systems. For example:
  • Customer data is managed in the user administration systems (REST API).
  • Financial data is managed in a financial cluster (Atlas cluster).
  • End-to-end transactions are stored in files on cold storage gathered from various external providers (cloud object storage - Amazon S3 or Microsoft Azure Blob Storage store).
How can we combine and best join this data?
MongoDB Atlas Query Federation connects multiple data sources using the different data store types. Once the data sources are mapped, we can create collections consuming this data. Those collections can have SQL schema generated, allowing us to perform sophisticated joins and do JDBC queries from various BI tools.
In this article, we will showcase the extreme power hidden in Atlas SQL Query.

Setting Up My Federated Database Instance

In the following view, I have created three main data stores:
  • S3 Transaction Store (S3 sample data).
  • Accounts from my Atlas clusters (Sample data sample_analytics.accounts).
  • Customer data from a secure https source.
Query Federation Sources
I mapped the stores into three collections under FinTech database:
  • Transactions
  • Accounts
  • CustomerDL
Now, I can see them through a Query Federation connection as MongoDB collections.
Let's grab our Query Federation instance connection string from the Atlas UI.
Connection
This connection string can be used with our BI tools or client applications to run SQL queries.

Connecting and Using $sql and db.sql

Once we connect to the Query Federation instancee via a mongosh shell, we can generate a SQL schema for our collections. This is optional for the JDBC or $sql operators to recognise collections as SQL “tables” as this step is done automatically for newly created collections, however, its always good to be familiar with the available commands.

Generate SQL schema for each collection:

1use admin;
2db.runCommand({sqlGenerateSchema: 1, sampleNamespaces: ["FinTech.customersDL"], sampleSize: 1000, setSchemas: true})
3{
4 ok: 1,
5 schemas: [ { databaseName: 'FinTech', namespaces: [Array] } ]
6}
7db.runCommand({sqlGenerateSchema: 1, sampleNamespaces: ["FinTech.accounts"], sampleSize: 1000, setSchemas: true})
8{
9 ok: 1,
10 schemas: [ { databaseName: 'FinTech', namespaces: [Array] } ]
11}
12db.runCommand({sqlGenerateSchema: 1, sampleNamespaces: ["FinTech.transactions"], sampleSize: 1000, setSchemas: true})
13{
14 ok: 1,
15 schemas: [ { databaseName: 'FinTech', namespaces: [Array] } ]
16}

Running SQL queries and joins using $sql stage:

1use FinTech;
2db.aggregate([{
3 $sql: {
4 statement: "SELECT a.* , t.transaction_count FROM accounts a, transactions t where a.account_id = t.account_id SORT BY t.transaction_count DESC limit 2",
5 format: "jdbc",
6 formatVersion: 2,
7 dialect: "mysql",
8 }
9}])
10
11// Equivalent command
12db.sql("SELECT a.* , t.transaction_count FROM accounts a, transactions t where a.account_id = t.account_id SORT BY t.transaction_count DESC limit 2");
The above query will prompt account information and the transaction counts of each account.

Connecting Via JDBC

Let’s connect a powerful BI tool like Tableau with the JDBC driver.
Latest Driver

Connect to Tableau

You have 2 main options to connect, via "MongoDB Atlas" connector or via a JDBC general connector. Please follow the relevant instructions and prerequisites on this documentation page.
Connector "MongoDB Atlas by MongoDB"
Search and click the “MongoDB Atlas by MongoDB” connector and provide the information pointing to our Query Federation URI. See the following example:
Tablue Connector
"JDBC" Connector
Setting connection.properties file.
1user=root
2password=*******
3authSource=admin
4database=FinTech
5ssl=true
6compressors=zlib
Click the “Other Databases (JDBC)” connector, copy JDBC connection format, and load the connection.properties file.
Tableau Connector Selection
Once the data is read successfully, the collections will appear on the right side.

Setting and Joining Data

Setting Tables
We can drag and drop collections from different sources and link them together.
Joining Tables with Drag & Drop
In my case, I connected Transactions => Accounts based on the Account Id field, and accounts and users based on the Account Id to Accounts field.
Unified View of Accounts, Transactions and Users data
In this view, we will see a unified table for all accounts with usernames and their transactions start quarter.

Summary

MongoDB has all the tools to read, transform, and analyse your documents for almost any use-case.
Whether your data is in an Atlas operational cluster, in a service, or on cold storage like cloud object storage, Atlas Query Federation will provide you with the ability to join the data in real time. With the option to use powerful join SQL syntax and SQL-based BI tools like Tableau, you can get value out of the data in no time.
Try Atlas Query Federation with your BI tools and SQL today.

Facebook Icontwitter iconlinkedin icon
Rate this article
star-empty
star-empty
star-empty
star-empty
star-empty
Related
Tutorial

How to Use Custom Aggregation Expressions in MongoDB 4.4


Sep 23, 2022 | 11 min read
News & Announcements

MongoDB's New Time Series Collections


Sep 09, 2024 | 8 min read
Quickstart

Getting Started with Aggregation Pipelines in Python


Oct 01, 2024 | 14 min read
Quickstart

Introduction to Multi-Document ACID Transactions in Python


Sep 11, 2024 | 10 min read
Table of Contents