How to Use PyMongo to Connect MongoDB Atlas with AWS Lambda

Anaiya Raisinghani • 6 min read • Published Apr 02, 2024 • Updated Apr 02, 2024

AWS Serverless Python Atlas

Picture a developer’s paradise: a world where instead of fussing over hardware complexities, we are free to focus entirely on running and executing our applications. With the combination of AWS Lambda and MongoDB Atlas, this vision becomes a reality.

Armed with AWS Lambda’s pay-per-execution structure and MongoDB Atlas’ unparalleled scalability, developers will truly understand what it means for their applications to thrive without the hardware limitations they might be used to.

This tutorial will take you through how to set up an Atlas cluster, connect it to AWS Lambda using MongoDB’s Python driver (PyMongo), write an aggregation pipeline on our data, and return the information we want. Let’s get started.

Prerequisites for success

MongoDB Atlas Account
AWS Account; Lambda access is necessary
GitHub repository
Python 3.8+

Create an Atlas Cluster

Our first step is to create an Atlas cluster. Log into the Atlas UI and follow the steps to set it up. For this tutorial, the free tier is recommended, but any tier will work!

Please ensure that the cloud provider picked is AWS. It’s also necessary to pick a secure username and password so that we will have the proper authorization later on in this tutorial, along with proper IP address access.
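If the username or password contains characters such as "@", ":", or "/", they need to be percent-encoded before they go into the connection string. Here is a minimal sketch of that step. The helper name build_atlas_uri, the host cluster0.example.mongodb.net, and the credentials are all placeholders for illustration, not values from this tutorial:

```python
from urllib.parse import quote_plus


def build_atlas_uri(username: str, password: str, host: str) -> str:
    """Build a mongodb+srv connection string with percent-encoded credentials."""
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb+srv://{user}:{pwd}@{host}/?retryWrites=true&w=majority"


# Placeholder values for illustration only; substitute your own cluster host
# and database user.
uri = build_atlas_uri("demo_user", "p@ss:word/1", "cluster0.example.mongodb.net")
print(uri)
```

In practice, the connection string copied from the Atlas UI already has the right shape, so a helper like this is only useful when the credentials are assembled in code.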

Already have an AWS account? Atlas supports paying for usage via the AWS Marketplace (AWS MP) without any upfront commitment.

Once your cluster is up and running, click the ellipsis (...) next to the Browse Collections button and load the sample dataset. Your finished cluster will look like this:

Once our cluster is provisioned, let’s set up our AWS Lambda function.

Creating an AWS Lambda function

Sign into your AWS account and search for “Lambda” in the search bar. Hit the orange “Create function” button at the top right side of the screen, and you’ll be taken to the screen shown below. Here, make sure to first select the “Author from scratch” option. Then, we want to select a name for our function (AWSLambdaDemo), the runtime (Python 3.8), and our architecture (x86_64).

Hit the orange “Create function” button on the bottom right to continue. Once your function is created, you’ll see a page with your function overview above and your code source right below.

Now, we are ready to set up our connection from AWS Lambda to our MongoDB cluster.

Because we are going to be using PyMongo, which is an external dependency, we will write our code in Visual Studio Code instead of editing directly in the Lambda code source. AWS Lambda comes with a limited set of pre-installed libraries, so to use PyMongo, we need to package it together with our code. Because of this “workaround,” this will not be a typical tutorial with testing at every step. Instead of relying on a typical requirements.txt file, we will first download our dependencies and upload our code to Lambda, and only then check that our code works. More on that below.

AWS Lambda and MongoDB cluster connection

Now we are ready to establish a connection between AWS Lambda and our MongoDB cluster!

Create a new directory on your local machine and name it awslambda-demo.

Let’s install PyMongo. As mentioned above, Lambda doesn’t include every library, so we need to download PyMongo into our project and ship it to Lambda in a .zip file archive. In the terminal, enter our awslambda-demo directory:

cd awslambda-demo

Create a new directory where your dependencies will live:

mkdir dependencies

Install pymongo directly in your dependencies directory:

pip install --target ./dependencies pymongo

Open Visual Studio Code, open the awslambda-demo directory, and create a new Python file named lambda_function.py. This is where the heart of our connection will be.

Insert the code below in our lambda_function.py. Here, we are setting up our code to check that we are able to connect to our Atlas cluster. Please keep in mind that since we are adding our environment variables in a later step, you will not be able to connect just yet. We have copied the lambda_handler definition from our Lambda code source and have edited it to insert one document containing my full name into a new “test” database and “test” collection. It is best practice to create our MongoClient outside of our lambda_handler because establishing a connection and performing authentication is relatively expensive, and Lambda will reuse this instance across invocations.

import os
from pymongo import MongoClient

# Create the client once, outside the handler, so warm invocations reuse it
client = MongoClient(host=os.environ.get("ATLAS_URI"))


def lambda_handler(event, context):
    # Name of database
    db = client.test

    # Name of collection
    collection = db.test

    # Document to add inside
    document = {"first name": "Anaiya", "last name": "Raisinghani"}

    # Insert document
    result = collection.insert_one(document)

    if result.inserted_id:
        return "Document inserted successfully"
    else:
        return "Failed to insert document"

If this is properly inserted in AWS Lambda, we will see “Document inserted successfully” and in MongoDB Atlas, we will see the creation of our “test” database and collection along with the single document holding the name “Anaiya Raisinghani.” Please keep in mind we will not be seeing this yet since we haven’t configured our environment variables and will be doing this a couple steps down.
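To make the reuse point concrete: Lambda keeps module-level state alive between invocations that land on the same warm execution environment, so the client only has to be built once. The sketch below is a lazy variation on the code above, not the tutorial's own method. The get_client helper and its factory parameter are hypothetical, and the parameter exists only so the pattern can be tried without a real cluster:

```python
import os

# Module-level cache: survives across invocations in a warm execution environment
_client = None


def get_client(factory=None):
    """Return a cached client, creating it on first use only."""
    global _client
    if _client is None:
        if factory is None:
            # Imported lazily so this sketch can be exercised without PyMongo installed
            from pymongo import MongoClient
            factory = MongoClient
        _client = factory(os.environ["ATLAS_URI"])
    return _client


def lambda_handler(event, context):
    # Every invocation after the first reuses the same client object
    client = get_client()
    result = client.test.test.insert_one({"first name": "Anaiya", "last name": "Raisinghani"})
    return "Document inserted successfully" if result.inserted_id else "Failed to insert document"
```

Calling get_client() twice returns the same object, which is the behavior the module-level MongoClient gives us implicitly.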

Now, we need to create a .zip file so we can upload it to our Lambda function and execute our code. Create a .zip file at the root:

cd dependencies
zip -r ../deployment.zip *

This creates a deployment.zip file in your project directory.

Now, we need to add in our lambda_function.py file to the root of our .zip file:

cd ..
zip deployment.zip lambda_function.py
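If the zip command is not available (for example, on Windows), the same archive layout can be produced with Python’s standard zipfile module. This is a sketch under the assumptions of this tutorial’s directory layout. The function name build_deployment_zip is hypothetical:

```python
import zipfile
from pathlib import Path


def build_deployment_zip(project_dir: str, zip_name: str = "deployment.zip") -> Path:
    """Package dependencies/ and lambda_function.py into one archive."""
    project = Path(project_dir)
    deps = project / "dependencies"
    zip_path = project / zip_name

    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Dependencies are stored at the archive root, not under "dependencies/"
        for path in sorted(deps.rglob("*")):
            if path.is_file():
                archive.write(path, path.relative_to(deps).as_posix())

        # The handler file also sits at the archive root
        archive.write(project / "lambda_function.py", "lambda_function.py")

    return zip_path
```

Running build_deployment_zip(".") from inside the awslambda-demo directory should leave a deployment.zip next to lambda_function.py, with the installed packages and the handler side by side at the top level of the archive.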

Once you have your .zip file, access your AWS Lambda function screen, click the “Upload from” button, and select “.zip file” on the right hand side of the page:

Upload your .zip file and you should see the code from your lambda_function.py in your “Code Source”:

Let’s configure our environment variables. Select the “Configuration” tab and then select the “Environment Variables” tab. Here, put in your “ATLAS_URI” string. To access your connection string, please follow the instructions in our docs.
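One thing to watch for: os.environ.get returns None when the variable is missing, and MongoClient then falls back to its default host of localhost, so a forgotten or misspelled variable tends to show up later as a connection timeout rather than a clear configuration error. A small guard can make that failure obvious. This is a sketch, and require_env is a hypothetical helper name:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this in place, the client could be created as MongoClient(host=require_env("ATLAS_URI")), which fails immediately if the variable was never configured.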

Once you have your Environment Variables in place, we are ready to run our code and see if our connection works. Hit the “Test” button. If it’s the first time you’re hitting it, you’ll need to name your event. Keep everything else on the default settings. You should see this page with our “Execution results.” Our document has been inserted!

When we double-check in Atlas, we can see that our new database “test” and collection “test” have been created, along with our document with “Anaiya Raisinghani.”

This means our connection works and we are capable of inserting documents from AWS Lambda to our MongoDB cluster. Now, we can take things a step further and input a simple aggregation pipeline!

Aggregation pipeline example

For our pipeline, let’s change our code to connect to our sample_restaurants database and restaurants collection. We are going to be incorporating our aggregation pipeline to find a sample size of five American cuisine restaurants that are located in Brooklyn, New York. Let’s dive right in!

Since we have our pymongo dependency downloaded, we can directly incorporate our aggregation pipeline into our code source. Change your lambda_function.py to look like this:

import os
from pymongo import MongoClient

connect = MongoClient(host=os.environ.get("ATLAS_URI"))


def lambda_handler(event, context):
    # Choose our "sample_restaurants" database and our "restaurants" collection
    database = connect.sample_restaurants
    collection = database.restaurants

    # This is our aggregation pipeline
    pipeline = [
        # We are finding American restaurants in Brooklyn
        {"$match": {"borough": "Brooklyn", "cuisine": "American"}},

        # We only want 5 out of our 20k+ documents
        {"$limit": 5},

        # We don't want all the details, so project only what we need
        {"$project": {"_id": 0, "name": 1, "borough": 1, "cuisine": 1}},
    ]

    # Run the pipeline and collect the results in a list
    result = list(collection.aggregate(pipeline))

    # Print the result
    for restaurant in result:
        print(restaurant)

Here, we are using $match to find all the American cuisine restaurants located in Brooklyn. We are then using $limit to keep only five of the matching documents. Next, we are using $project to show only the fields we want: “borough”, “cuisine”, and the “name” of the restaurant. Finally, we are executing our pipeline and printing out our results.
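To see what each stage contributes, here is a plain-Python imitation of the three stages running over a handful of in-memory documents. It is only an illustration of the logic and is not how MongoDB executes the pipeline. The function name and the sample documents are made up for this sketch:

```python
def run_pipeline_locally(documents, borough, cuisine, limit):
    """Imitate $match, $limit, and $project on a list of dictionaries."""
    # $match: keep only documents whose borough and cuisine both match
    matched = [
        doc for doc in documents
        if doc.get("borough") == borough and doc.get("cuisine") == cuisine
    ]

    # $limit: keep at most `limit` of the matched documents
    limited = matched[:limit]

    # $project: keep only name, borough, and cuisine (dropping _id and the rest)
    wanted = ("name", "borough", "cuisine")
    return [{key: doc[key] for key in wanted if key in doc} for doc in limited]


# Made-up documents, shaped loosely like the sample data
sample = [
    {"_id": 1, "name": "Diner A", "borough": "Brooklyn", "cuisine": "American", "grades": []},
    {"_id": 2, "name": "Cafe B", "borough": "Queens", "cuisine": "American", "grades": []},
    {"_id": 3, "name": "Grill C", "borough": "Brooklyn", "cuisine": "American", "grades": []},
    {"_id": 4, "name": "Trattoria D", "borough": "Brooklyn", "cuisine": "Italian", "grades": []},
]

local_result = run_pipeline_locally(sample, "Brooklyn", "American", 1)
print(local_result)
```

The order matters: matching first and limiting second means the limit applies to Brooklyn American restaurants only, not to the first five documents in the collection.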

Click on “Deploy” to ensure our changes have been deployed to the code environment. After the changes are deployed, hit “Test.” We will get a sample size of five Brooklyn American restaurants as the result in our console:

Our aggregation pipeline was successful!
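The handler above prints its results, so they appear in the function’s log output while the handler itself returns nothing. If a caller needs the data, one option is to return it instead. Because the pipeline excludes “_id” and the remaining projected fields are plain strings, the documents can be serialized as JSON. The sketch below assumes a response shape with statusCode and body keys, a convention commonly used with API Gateway. Both that shape and the helper name to_response are assumptions, not part of this tutorial:

```python
import json


def to_response(restaurants):
    """Wrap a list of projected restaurant documents in a JSON response."""
    return {
        "statusCode": 200,
        "body": json.dumps(restaurants),
    }
```

Inside lambda_handler, this would replace the print loop with return to_response(result).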

Conclusion

This tutorial provided you with hands-on experience connecting a MongoDB Atlas database to AWS Lambda. We also got an inside look at how to write to a cluster from Lambda, how to read back information with an aggregation pipeline, and how to properly package our dependencies when using Lambda. Hopefully, you are now ready to take advantage of AWS Lambda and MongoDB to create the best applications without worrying about external infrastructure.

If you enjoyed this tutorial and would like to learn more, please check out our MongoDB Developer Center and YouTube channel.
