Getting Started with Atlas and the MongoDB Query API
Depending on where you are in your development career or the technologies
you've already become familiar with, MongoDB can seem quite
intimidating. Maybe you're coming from years of experience with
relational database management systems (RDBMS), or maybe you're new to
the topic of data persistence in general.
The good news is that MongoDB isn't as scary as you might think, and it
is definitely a lot easier when paired with the correct tooling.
In this tutorial, we're going to see how to get started with MongoDB
Atlas for hosting our database
cluster and the MongoDB Query Language (MQL) for interacting with our
data. We won't be exploring any particular programming technology, but
everything we see can be easily translated over to the language and driver of your choice.
There are a few ways to get started with MongoDB. You could install a single instance or a cluster of instances on your own hardware, which you'd manage yourself in terms of updates, scaling, and security. Or you could make use of MongoDB Atlas, a database as a service (DBaaS) that makes life quite a bit easier and, in many cases, cheaper or even free.
We're going to be working with an M0 sized Atlas cluster, which is part
of the free tier that MongoDB offers. There's no expiration to this
cluster and there's no credit card required in order to deploy it.
Before we can use MongoDB in our applications, we need to deploy a
cluster. Create a MongoDB Cloud account and sign into it.
Choose to Create a New Cluster if not immediately presented with the
option, and start selecting the features of your cluster.
You'll be able to choose between AWS, Google Cloud, and Azure for
hosting your cluster. It's important to note that these cloud providers
are for location only. You won't ever have to sign into the cloud
provider or manage MongoDB through them. The location is important for
latency reasons in case you have your applications hosted on a
particular cloud provider.
If you want to take advantage of a free cluster, make sure to choose M0
for the cluster size.
It may take a few minutes to finish creating your cluster.
With the cluster created, you won't be able to access it from outside of
the web dashboard by default. This is a good thing because you don't
want random people on the internet attempting to gain unauthorized
access to your cluster.
To be able to access your cluster from the CLI, a web application, or
Visual Studio Code, which we'll be using later, you'll need to set up a
network rule that allows access from a particular IP address.
You have a few options when it comes to adding an IP address to the
allow list. You could add your current IP address which would be useful
for accessing from your local network. You could provide a specific IP
address which is useful for applications you host in the cloud
somewhere. You can also supply 0.0.0.0/0 which would allow full
network access to anyone, anywhere.
I'd strongly recommend not adding 0.0.0.0/0 as a network rule to
keep your cluster safe.
With IP addresses on the allow list, the final step is to create an
application user.
It is a good idea to create role-based access accounts for your MongoDB Atlas cluster. This means that instead of creating one super user like the administrator account, you create user accounts based on what each user should be doing.

For example, maybe you create a user that has access to your accounting databases and another user that has access to your employee database.
Within Atlas, choose the Database Access tab and click Add New
Database User to add a new user.
While you can give a user access to every database, current and future,
it is best if you create users that have more refined permissions.
It's up to you how you want to create your users, but the more specific the permissions, the less likely it is that your cluster will be compromised by malicious activity.
Need some more guidance around creating an Atlas cluster? Check out
this
tutorial
by Maxime Beugnet on the subject.
With the cluster deployed, the network rules in place for your IP
address, and a user created, we can focus on some of the basics behind
the MongoDB Query Language (MQL).
To get the most out of MongoDB, you're going to need to become familiar
with the MongoDB Query Language (MQL). No, it is not like the SQL you may be familiar with from relational database management systems (RDBMS), but it
isn't any more difficult. MQL can be used from the CLI, Visual Studio
Code, the development drivers, and more. You'll get the same experience
no matter where you're trying to write your queries.
In this section, we're going to focus on Visual Studio Code and the
MongoDB
Playground
extension for managing our data. We're doing this because Visual Studio
Code is common developer tooling and it makes for an easy-to-use experience.
While we could write our queries out of the box with Visual Studio Code,
we won't be able to interact with MongoDB in a meaningful way until we
install the MongoDB
Playground
extension.
Within Visual Studio Code, bring up the extensions explorer and search
for MongoDB.
Install the official extension with MongoDB as the publisher.
With the extension installed, we'll need to interact with it from within
Visual Studio Code. There are a few ways to do this, but we're going to
use the command palette.
Open the command palette (cmd + shift + p, if you're on macOS), and
enter MongoDB: Connect into the input box.
You'll be able to enter the information for your particular MongoDB
cluster. Once connected, we can proceed to creating a new Playground. If
you've already saved your information into the Visual Studio Code
extension and need to connect later, you can always enter Show
MongoDB in the command palette and connect.
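For reference, an Atlas connection string generally looks like the following, where the username, password, and cluster host are placeholders you'd replace with your own values. You can copy the real string from the Connect dialog for your cluster in Atlas, and paste it directly into the extension's connection prompt.

mongodb+srv://<username>:<password>@<cluster-host>/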
Assuming we're connected, enter Create MongoDB Playground in the
command palette to create a new file with boilerplate MQL.
Rather than just creating random queries that may or may not be helpful
or any different from what you'd find in the documentation, we're going to
come up with a data model to work with and then interact with that data
model.
I'm passionate about gaming, so our example will be centered around some
game data that might look like this:
{
    "_id": "nraboy",
    "name": "Nic Raboy",
    "stats": {
        "wins": 5,
        "losses": 10,
        "xp": 300
    },
    "achievements": [
        { "name": "Massive XP", "timestamp": 1598961600000 },
        { "name": "Instant Loss", "timestamp": 1598896800000 }
    ]
}
The above document is just one of endless possible data models for a document in any given collection. To make the example more exciting, the above document has a nested object and a nested array of objects, something that demonstrates the power of JSON without sacrificing how easy it is to work with in MongoDB.
The document above is often referred to as a user profile document in
game development. You can learn more about user profile stores in game
development through a previous Twitch
stream on the subject.
As of right now, it's alright if your cluster has no databases,
collections, or even documents that look like the above document. We're
going to get to that next.
When working with MongoDB, you're going to get quite familiar with the
create, read, update, and delete (CRUD) operations necessary when
working with data. To reiterate, we'll be using Visual Studio Code to do
all this, but any CRUD operation you do in Visual Studio Code can be
taken into your application code, scripts, and similar.
Earlier, you created a new MongoDB Playground in Visual
Studio Code. Open it, remove all the boilerplate MQL, and add the
following:
1 use("gamedev"); 2 3 db.profiles.insertOne({ 4 "_id": "nraboy", 5 "name": "Nic Raboy", 6 "stats": { 7 "wins": 5, 8 "losses": 10, 9 "xp": 300 10 }, 11 "achievements": [ 12 { "name": "Massive XP", "timestamp": 1598961600000 }, 13 { "name": "Instant Loss", "timestamp": 1598896800000 } 14 ] 15 });
In the above code we are declaring that we want to use a gamedev
database in our queries that follow. It's alright if such a database
doesn't already exist because it will be created at runtime.
Next we're using the
insertOne
operation in MongoDB to create a single
document. The db
object references the gamedev database that we've
chosen to use. The profiles object references a collection that we
want to insert our document into.

The profiles collection does not need to exist prior to inserting our first document.
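If you ever need to create more than one document in a single operation, there's also an insertMany operation that accepts an array of documents. Here's a quick sketch with two made-up profiles that aren't referenced again in this tutorial:

use("gamedev");

// Two hypothetical profiles, purely to illustrate inserting several documents at once.
db.profiles.insertMany([
    {
        "_id": "player2",
        "name": "Player Two",
        "stats": { "wins": 3, "losses": 15, "xp": 150 },
        "achievements": []
    },
    {
        "_id": "player3",
        "name": "Player Three",
        "stats": { "wins": 10, "losses": 12, "xp": 450 },
        "achievements": []
    }
]);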
It does not matter what we choose to call our database or our collection, as long as the names make sense to you and the use-case that you're trying to fulfill.
Within Visual Studio Code, you can highlight the above MQL and choose
Run Selected Lines From Playground or use the command palette to
run the entire playground. After running the MQL, check out your MongoDB
Atlas cluster and you should see the database, collection, and document
created.
If you'd rather verify the document was created without actually
navigating through MongoDB Atlas, we can move onto the next stage of the
CRUD operation journey.
Within the playground, add the following:
1 use("gamedev"); 2 3 db.profiles.find({});
The above
find
operation will return all documents in the profiles
collection. If you wanted to narrow the result-set, you could provide
filter criteria instead of providing an empty object. For example, try
executing the following instead:

use("gamedev");

db.profiles.find({ "name": "Nic Raboy" });
The above
find
operation will only return documents where the name field matches exactly "Nic Raboy". We can do better, though. What about finding documents that sit within a certain range for certain fields?

Take the following for example:
1 use("gamedev"); 2 3 db.profiles.find( 4 { 5 "stats.wins": { 6 "$gt": 6 7 }, 8 "stats.losses": { 9 "$lt": 11 10 } 11 } 12 );
The above
find
operation says that we only want documents that have
more than six wins and less than eleven losses. If we were running the
above query with the current dataset shown earlier, no results would be
returned because nothing satisfies the conditions.So we've got at least one document in our collection and have seen the
insertOne
and find
operators. Now we need to take a look at the
update and delete parts of CRUD.

Let's say that we finished a game and the
stats.wins
field needs to be
updated. We could do something like this:

use("gamedev")

db.profiles.update(
    { "_id": "nraboy" },
    { "$inc": { "stats.wins": 1 } }
);
The first object in the above
update
operation is the filter. This is
the same filter that can be used in a find
operation. Once we've
filtered for documents to update, the second object is the mutation. In
the above example, we're using the $inc
operator to increase the
stats.wins
field by a value of one.

There are quite a few operators that can be used when updating
documents. You can find more information in the official
documentation.
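As a quick sketch of a couple of those operators, the query below uses $set to change a single field and $push to append a new entry to the achievements array. The new xp value and the "First Win" achievement are made up for illustration:

use("gamedev")

db.profiles.update(
    { "_id": "nraboy" },
    {
        // $set changes (or creates) a field without touching the rest of the document.
        "$set": { "stats.xp": 325 },
        // $push appends a new element to an array field; this achievement is hypothetical.
        "$push": {
            "achievements": { "name": "First Win", "timestamp": 1599048000000 }
        }
    }
);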
Maybe we don't want to use an operator when updating the document. Maybe
we want to change a field or add a field that might not exist. We can do
something like the following:
1 use("gamedev") 2 3 db.profiles.update( 4 { "_id": "nraboy" }, 5 { "name": "Nicolas Raboy" } 6 );
The above query will filter for documents with an _id of nraboy, and then update the name field on the matched document to be a particular string, in this case "Nicolas Raboy". If the name field doesn't exist, it will be created and set. Be aware that because the second object contains no update operators, it acts as a replacement: the matched document ends up with only the _id and name fields, and fields like stats and achievements are removed. If you want to change or add a field while leaving the rest of the document intact, use the $set operator instead.

Got a document you want to remove? Let's look at the final part of the
CRUD operators.
Add the following to your playground:
1 use("gamedev") 2 3 db.profiles.remove({ "_id": "nraboy" })
The above
remove
operation uses a filter, just like what we saw with
the find
and update
operations. We provide it a filter of documents
to find and in this circumstance, any matches will be removed from the
profiles collection.

For a lot of applications, you might only ever need to use basic CRUD
operations when working with MongoDB. However, when you need to start
analyzing your data or manipulating your data for the sake of reporting,
running a bunch of CRUD operations might not be your best bet.
This is where a MongoDB data aggregation pipeline might come into use.
To get an idea of what a data aggregation pipeline is, think of it as a series of stages that your data passes through, where each stage transforms the documents before passing them along to the next.
Let's use a concrete example. Let's say that you want to look at your profiles collection and determine all the players who received a certain achievement after a certain date. However, you only want to know the specific achievement and basic information about the player, not every other field in the documents that matched your query.
Take a look at the following:
1 use("gamedev") 2 3 db.profiles.aggregate([ 4 { "$match": { "_id": "nraboy" } }, 5 { "$unwind": "$achievements" }, 6 { 7 "$match": { 8 "achievements.timestamp": { 9 "$gt": new Date().getTime() - (1000 * 60 * 60 * 24 * 1) 10 } 11 } 12 }, 13 { "$project": { "_id": 1, "achievements": 1 }} 14 ]);
There are four stages in the above pipeline. First we're doing a
$match
to find all documents that match our filter. Those documents
are pushed to the next stage of the pipeline. Rather than looking at and
trying to work with the achievements
field which is an array, we are
choosing to $unwind
it.

To get a better idea of what this looks like, at the end of the second
stage, any data that was found would look like this:
[
    {
        "_id": "nraboy",
        "name": "Nic Raboy",
        "stats": {
            "wins": 5,
            "losses": 10,
            "xp": 300
        },
        "achievements": {
            "name": "Massive XP",
            "timestamp": 1598961600000
        }
    },
    {
        "_id": "nraboy",
        "name": "Nic Raboy",
        "stats": {
            "wins": 5,
            "losses": 10,
            "xp": 300
        },
        "achievements": {
            "name": "Instant Loss",
            "timestamp": 1598896800000
        }
    }
]
Notice in the above JSON response that we are no longer working with an
array. We should have only matched on a single document, but the results
are actually two instead of one. That is because the
$unwind
split the
array into numerous objects.

So we've flattened the array; now we're onto the third stage of the
pipeline. We want to match any object in the result that has an
achievement timestamp greater than a specific time. The plan here is to
reduce the result-set of our flattened documents.
The final stage of our pipeline is to output only the fields that we're
interested in. With the
$project
we are saying we only want the _id
field and the achievements
field.

Our final output for this aggregation might look like this:
[
    {
        "_id": "nraboy",
        "achievements": {
            "name": "Instant Loss",
            "timestamp": 1598896800000
        }
    }
]
There are quite a few operators when it comes to the data aggregation
pipeline, many of which can do far more extravagant things than the four
pipeline stages that were used for this example. You can learn about the
other operators in the official
documentation.
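As one rough sketch of what else is possible, a $group stage can roll the flattened achievements back up into per-player counts, something that would be clumsy to do with basic CRUD operations alone:

use("gamedev")

db.profiles.aggregate([
    // One document per achievement, just like in the earlier pipeline.
    { "$unwind": "$achievements" },
    // Group the flattened documents back together, counting achievements per player.
    {
        "$group": {
            "_id": "$_id",
            "name": { "$first": "$name" },
            "totalAchievements": { "$sum": 1 }
        }
    }
]);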
You just got a taste of what you can do with MongoDB Atlas and the
MongoDB Query Language (MQL). While the point of this tutorial was to
get you comfortable with deploying a cluster and interacting with your
data, you can extend your knowledge and this example by exploring the
programming drivers.
Take the following quick starts for example:
In addition to the quick starts, you can also check out the MongoDB
University course,
MongoDB Aggregation, which focuses
on data aggregation.
As previously mentioned, you can take the same queries between languages
with minimal to no changes between them.
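As a rough illustration of that, here's a minimal sketch of the earlier find query using the official MongoDB Node.js driver. The connection string is a placeholder you'd replace with the one from your Atlas cluster:

const { MongoClient } = require("mongodb");

async function main() {
    // Placeholder connection string; copy the real one from your Atlas cluster.
    const client = new MongoClient("mongodb+srv://<username>:<password>@<cluster-host>/");

    try {
        await client.connect();

        // Same query we ran in the playground earlier in this tutorial.
        const profiles = client.db("gamedev").collection("profiles");
        const results = await profiles.find({ "name": "Nic Raboy" }).toArray();

        console.log(results);
    } finally {
        await client.close();
    }
}

main().catch(console.error);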