MongoDB Data Federation Setup

Ken W. Alger, Anaiya Raisinghani • 5 min read • Published Feb 07, 2022 • Updated Jan 23, 2024
As an avid traveler, you love staying at Airbnbs and have kept detailed notes about each one you’ve stayed in over the years. These notes are spread across different storage locations, like MongoDB Atlas and AWS S3, which makes it hard to search for a specific Airbnb with the amenities your girlfriend wants for your upcoming Valentine’s Day trip. Luckily, there is a way to make this much easier. With MongoDB Atlas Data Federation, you can combine all your data into one logical view and search for the perfect Airbnb without worrying about where the data is stored, instead of hunting through separate databases and storage locations.
Don’t know how to use Data Federation? This tutorial walks you through combining your Airbnb data so it is easier to query.

Tutorial Necessities

Before we jump in, here is what you need to follow along. This tutorial requires:
  • MongoDB Atlas.
  • An Amazon Web Services (AWS) account.
  • Access to the AWS Management Console.
  • AWS CLI.
  • MongoDB Compass.
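
Of the items above, only the AWS CLI is used from the terminal in this tutorial. The following is a minimal sketch for checking that it is on your PATH before you reach the IAM role steps. The helper name `check_tool` is made up for this example.

```shell
#!/usr/bin/env bash
# check_tool is a hypothetical helper: it reports whether a command is on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found:   $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# The AWS CLI is the only command-line prerequisite in this tutorial.
check_tool aws || echo "Install the AWS CLI before the IAM role steps."

# If the CLI is present, this shows which AWS identity it is configured to use.
# Uncomment to run it; it needs valid credentials and network access.
# aws sts get-caller-identity
```

If `aws sts get-caller-identity` returns an account and user, the CLI is configured and ready for the role commands later on.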

Importing our sample data

Our first step is to import our Airbnb data into our Atlas cluster and our S3 bucket, so we have data to work with throughout this tutorial. Make sure to import the dataset into both of these storage locations.

Importing via MongoDB Atlas

Step 1: Create a free tier shared cluster.
✅ Already have an AWS account? Atlas supports paying for usage via the AWS Marketplace (AWS MP) without any upfront commitment. Simply sign up for MongoDB Atlas via AWS Marketplace.
Step 2: Once your cluster is set up, click the ellipsis (“...”) button and click “Load Sample Dataset.”
Step 3: When the green confirmation message appears, your sample dataset (the Airbnb notes) has been loaded into your cluster.

Importing via AWS S3

Step 1: We will be using this sample data set. Please download it locally. It contains the sample data we are working with along with the S3 bucket structure necessary for this demo.
Step 2: Once the data set is downloaded, open your AWS Management Console and navigate to the S3 service.
Step 3: Click “Create Bucket” and follow the instructions to create your bucket.
Step 4: Unzip sampledata.zip locally, then upload the extracted folders (not the zip file itself) into the bucket.
Step 5: Once your data is loaded into the bucket, you will see several folders, each with a different data type.
Step 6: Follow the path Amazon S3 > Buckets > atlas-data-federation-demo > json/ > airbnb/ to view your Airbnb notes.
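
If you prefer the terminal to the console, the upload steps above can be sketched with the AWS CLI. This is an illustration, not part of the original walkthrough. It assumes the archive unpacks with the `json/` folder at its top level, and it reuses the bucket name from Step 6. S3 bucket names are globally unique, so you may need a different one. The `run` and `upload_sample_data` helpers and the `DRY_RUN` switch are made up for this example. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Hypothetical names: adjust both to match your setup.
BUCKET="${BUCKET:-atlas-data-federation-demo}"   # bucket name from Step 6
LOCAL_DIR="${LOCAL_DIR:-./sampledata}"           # assumed unzip target

# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

upload_sample_data() {
  run unzip -o sampledata.zip -d "$LOCAL_DIR"              # unzip before uploading
  run aws s3 mb "s3://$BUCKET"                             # create the bucket
  run aws s3 cp "$LOCAL_DIR" "s3://$BUCKET/" --recursive   # upload the folders
  run aws s3 ls "s3://$BUCKET/json/airbnb/"                # confirm the Airbnb notes
}

upload_sample_data
```

Run the script with `DRY_RUN=0` to execute the commands for real once the printed versions look right.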
Congratulations! You have successfully uploaded your extensive Airbnb notes to not one but two storage locations. Now, let’s see how to retrieve this information in one location using Data Federation so we can find the perfect Airbnb. In order to do so, we need to get comfortable with the MongoDB Atlas Data Federation console.

Connecting MongoDB Atlas to S3

Inside the MongoDB Atlas console, click “Data Federation” on the left side.
In the top right corner of the UI, open the “create new federated database” dropdown and click “set up manually.” This leads to a page where we can add our data sources. You can give your Federated Database Instance any name you like, but once you save it, you will not be able to change the name.
Let’s add in our data sources from our cluster and our bucket!

Adding in data source via AWS S3 bucket

Step 1: Click on “Add Data Source.”
Step 2: Select the “Amazon S3” button and hit “Next.”
Step 3: Click “Next” on the “Authorize an AWS IAM Role” screen.
Step 4: Click “Create New Role with the AWS CLI.”
Step 5: Make sure the AWS CLI is installed and configured on your machine.
Step 6: In your terminal, run the commands listed under “Create New Role with the AWS CLI.” The command that creates the role looks like this:
    aws iam create-role \
      --role-name datafederation \
      --assume-role-policy-document file://role-trust-policy.json
Step 7: The role’s ARN appears in the command output in your terminal. Copy it into Atlas. It should look like this:
    arn:aws:iam::7***************:role/datafederation
Step 8: Enter the name of the bucket containing your Airbnb notes.
Step 9: Follow the instructions in Atlas and save your role policy.
Step 10: Copy the CLI commands listed on the screen and paste them into your terminal like so:
    aws iam put-role-policy \
      --role-name datafederation \
      --policy-name datafederation-policy \
      --policy-document file://adl-s3-policy.json
Step 11: Open your AWS Console, locate the listingsAndReviews.json file in your S3 bucket, and copy its S3 URI.
Step 12: Paste the URI into the “Define ‘Data Sources’ Using Paths Inside Your S3” screen in Atlas and change each step of the tree to “static.”
Step 13: Drag your file from the left side of the screen to the middle, where it says, “Drag the dataset to your Federated Database.” If these steps worked, the dataset now appears under your Federated Database.
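
The two `aws iam` commands in Steps 6 and 10 read their policies from `role-trust-policy.json` and `adl-s3-policy.json`. Atlas generates the real contents of both files, so copy them from the Atlas screens. As a rough sketch of what such files typically contain, the script below writes placeholder versions into a separate `policy-sketch/` folder so it cannot overwrite the real ones. The principal ARN, external ID, folder name, and exact list of S3 actions are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Placeholders only: the real values come from the Atlas "Authorize an AWS IAM Role" screen.
ATLAS_AWS_ACCOUNT_ARN="${ATLAS_AWS_ACCOUNT_ARN:-<atlas-aws-account-arn>}"
ATLAS_EXTERNAL_ID="${ATLAS_EXTERNAL_ID:-<atlas-external-id>}"
BUCKET="${BUCKET:-atlas-data-federation-demo}"

# Write into a scratch folder so real policy files are never clobbered.
mkdir -p policy-sketch

# Trust policy (Step 6): lets the Atlas-side principal assume the role.
cat > policy-sketch/role-trust-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "$ATLAS_AWS_ACCOUNT_ARN" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "$ATLAS_EXTERNAL_ID" }
      }
    }
  ]
}
EOF

# Access policy (Step 10): read-only access to the bucket and its objects.
cat > policy-sketch/adl-s3-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetObject", "s3:GetObjectVersion", "s3:GetBucketLocation"],
      "Resource": ["arn:aws:s3:::$BUCKET", "arn:aws:s3:::$BUCKET/*"]
    }
  ]
}
EOF

# If the ARN from Step 7 has scrolled away, this prints it again (needs credentials):
# aws iam get-role --role-name datafederation --query 'Role.Arn' --output text
```

The trust policy controls who may assume the `datafederation` role, while the access policy controls what the role may do once assumed. In this sketch that is limited to listing and reading from the one bucket.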
You have successfully added in your Airbnb notes from your S3 bucket. Nice job. Let's do the same thing for the notes saved in our Atlas cluster.

Adding in data source via MongoDB Atlas cluster

Step 1: Click “Add Data Sources.”
Step 2: Select “MongoDB Atlas Cluster” and provide the cluster name along with our sample_airbnb collection. These are your Atlas Airbnb notes.
Step 3: Click “Next” and your sample_airbnb.listingsAndReviews collection will appear on the left-hand side of the console.
Step 4: Drag it directly under your Airbnb notes from your S3 bucket and hit “Save.” When you are done, both data sources should sit under the same virtual collection.
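
Behind the drag-and-drop interface, a federated database is described by a storage configuration that lists the stores and maps them to virtual databases and collections. The sketch below writes out roughly what the setup from this tutorial corresponds to. The store names `s3store` and `atlasstore`, the output file name, and every value in angle brackets are placeholders. The S3 path assumes the file sits under json/airbnb/ as in the bucket structure above. The configuration Atlas actually saves may differ in detail.

```shell
#!/usr/bin/env bash
# Illustrative only: store names and <...> values are placeholders, not real settings.
cat > federation-storage-sketch.json <<'EOF'
{
  "stores": [
    {
      "name": "s3store",
      "provider": "s3",
      "bucket": "atlas-data-federation-demo",
      "region": "<bucket-region>",
      "delimiter": "/"
    },
    {
      "name": "atlasstore",
      "provider": "atlas",
      "clusterName": "<your-cluster-name>",
      "projectId": "<your-project-id>"
    }
  ],
  "databases": [
    {
      "name": "VirtualDatabase0",
      "collections": [
        {
          "name": "VirtualCollection0",
          "dataSources": [
            { "storeName": "s3store", "path": "/json/airbnb/listingsAndReviews.json" },
            { "storeName": "atlasstore", "database": "sample_airbnb", "collection": "listingsAndReviews" }
          ]
        }
      ]
    }
  ]
}
EOF

echo "Wrote federation-storage-sketch.json"
```

The key point is that `VirtualCollection0` has two entries under `dataSources`, one per store, which is why a single query later returns documents from both S3 and the Atlas cluster.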
Great job. You have successfully imported your Airbnb notes from both your S3 bucket and your Atlas cluster into one location. Let’s connect to our Federated Database and see our data combined in one easily query-able location.

Connect to your federated database

We are going to connect to our Federated Database using MongoDB Compass.
Step 1: Click the green “Connect” button and then select “Connect using MongoDB Compass.”
Step 2: Copy in the connection string, making sure to switch out the user and password for your own. This user must have admin access in order to access the data.
Step 3: Once you’re connected to Compass, click on “VirtualDatabase0” and once more on “VirtualCollection0.”
Amazing job. You can now look at all your Airbnb notes in one location!
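
With everything in one virtual collection, searching by amenity becomes a single query. Here is a minimal sketch that builds such a query and, if `mongosh` is installed and a connection string is supplied, runs it. `mongosh` is not one of this tutorial's prerequisites. The `FEDERATED_URI` variable, the `amenity_query` helper, and the amenity names are assumptions for illustration. The `amenities` and `name` fields come from the sample_airbnb dataset.

```shell
#!/usr/bin/env bash
# FEDERATED_URI is an assumption: paste the connection string from the Atlas "Connect" dialog.
FEDERATED_URI="${FEDERATED_URI:-}"

# amenity_query is a hypothetical helper. It builds a mongosh expression that
# finds listings offering every amenity passed as an argument.
# Amenity names must not contain double quotes.
amenity_query() {
  list=""
  for amenity in "$@"; do
    list="${list:+$list, }\"$amenity\""
  done
  printf 'db.getSiblingDB("VirtualDatabase0").VirtualCollection0.find({ amenities: { $all: [%s] } }, { name: 1, "address.country": 1 }).limit(5)\n' "$list"
}

QUERY="$(amenity_query Wifi Kitchen)"
echo "$QUERY"

# Run the query only when a connection string is set and mongosh is available.
if [ -n "$FEDERATED_URI" ] && command -v mongosh >/dev/null 2>&1; then
  mongosh "$FEDERATED_URI" --quiet --eval "$QUERY"
fi
```

The same filter, `{ amenities: { $all: ["Wifi", "Kitchen"] } }`, can be pasted into the Compass filter bar. Because this tutorial loads the same dataset into both S3 and the Atlas cluster, expect each matching listing to show up twice.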

Conclusion

In this tutorial, we stored our Airbnb data in two storage locations, combined these separate data sets into one virtual collection with Data Federation, and accessed the combined data through MongoDB Compass. Now you can look for and book the perfect Airbnb for your trip in a fraction of the time.
