How to Build an Aggregation Pipeline in MongoDB Atlas

9 min • Published Nov 01, 2022
MongoDB, Aggregation Framework
00:00:00 Introduction to Aggregation Pipeline Builder
00:01:14 Exploring the Movies Collection
00:01:59 Building the Aggregation Pipeline: Initial Match Stage
00:04:00 Unwinding Arrays with the Unwind Operator
00:07:11 Narrowing Down Results with a Second Match Stage
00:09:15 Grouping Documents and Summing Runtime
00:11:17 Projecting Results and Omitting Unnecessary Fields
00:12:00 Reviewing the Aggregation Pipeline Stages
00:14:03 Conclusion and Encouragement to Engage with the Content
This video teaches viewers how to use the Aggregation Pipeline Builder in MongoDB Atlas to build multi-stage queries and reshape data sets.
🔑 Key Points
  • The video demonstrates the Aggregation Pipeline Builder in MongoDB Atlas.
  • It shows how to build an aggregation pipeline with multiple stages, including match, unwind, group, and project.
  • The example filters movie documents by cast member, specifically Ryan Reynolds, and calculates his total screen time by summing the runtime field.
  • The Aggregation Pipeline Builder provides a graphical tool to build, test, and preview the output of each stage of your pipeline.
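
The stages listed above can be sketched as a pipeline definition. This is a minimal sketch in Python, assuming the list-of-dicts form that PyMongo's `Collection.aggregate()` accepts. The output field name `screen_time` and its exact spelling are assumptions based on how the field is spoken in the video; the video builds the pipeline in the Atlas UI and does not show driver code.

```python
# Sketch of the pipeline described in the video, as a list of stage documents.
# "screen_time" is an assumed spelling of the field named in the video.
pipeline = [
    # Stage 1: keep only movies whose cast array contains the actor.
    {"$match": {"cast": "Ryan Reynolds"}},
    # Stage 2: emit one document per element of the cast array.
    {"$unwind": {"path": "$cast"}},
    # Stage 3: of the unwound documents, keep only the actor's own rows.
    {"$match": {"cast": "Ryan Reynolds"}},
    # Stage 4: put every row in one group (constant _id) and sum the runtimes.
    {"$group": {
        "_id": 1,
        "cast": {"$first": "$cast"},
        "screen_time": {"$sum": "$runtime"},
    }},
    # Stage 5: drop the constant _id and keep everything else.
    {"$project": {"_id": 0}},
]

# List the stage operators in order, as a quick sanity check.
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

With a connected PyMongo client, a list like this would typically be passed as `client["sample_mflix"]["movies"].aggregate(pipeline)`; the connection setup is left out here.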

Full Video Transcript
Hey everyone, I'm Nic Raboy from MongoDB, and in this video we're going to get a taste of what you can do with the Aggregation Pipeline Builder that can be accessed directly within MongoDB Atlas. On my screen you'll notice that I do have MongoDB Atlas loaded. I do have one cluster available. I am using a free-tier cluster in this example; it doesn't really matter.

I'm going to go ahead and click on Browse Collections, and I do have quite a few databases and collections. These are all from the sample data set. I'm going to go ahead and click on sample_mflix as the one that I'm going to be using for this example, and I'm going to go ahead and click on movies. To give you an idea of what these movies look like, we have information such as plot, title, cast, runtime, etc., anything that you would expect when it comes to, say, Netflix, IMDb, or similar services.

What we're going to be doing in this video is playing around with the Aggregation Pipeline Builder, a graphical tool that lets you build and test your aggregation pipelines, so that way you can get more comfortable when it comes to running more complex queries in MongoDB. With the movies collection open, I'm going to click on Aggregation, and that's going to drop us into the Aggregation Pipeline Builder, which you'll notice is nearly the same experience that you would get in MongoDB Compass, if you've ever worked with MongoDB Compass. So you have the opportunity to do it directly within Atlas or within Compass; it's totally up to you. If I scroll down into the Aggregation Pipeline Builder that we're offered through Atlas, you'll notice that I do have some sample documents to give us an idea of what we're going to be attempting to work with. This is just a sample data set, so you can get an idea that everything is working as expected.

So let's go ahead and do a first stage of this builder. We're going to start with something simple and work our way up into something a little more complicated, to give you an idea of some of the stuff that you can do. The first thing I want to do is a match query: I want to find specific documents that meet my particular criteria. So I'm going to say match, and what exactly do I want to match on? If I scroll through this document to see what exactly is available to me, I want to actually look at the cast. I want to find all movies that have a particular cast member, an actor. So I'm going to say cast, and let's search for Ryan Reynolds. You'll notice that it does generate a preview based on what I had for this particular stage in the pipeline, and this is useful because it gives you an idea of, well, is my query actually working? Is this aggregation working up to this part, and is it working correctly? If I scroll through some of these documents in the sample data set, I'm going to look for cast, which is an array even though I provided just a single string value, and sure enough we do have data that has Ryan Reynolds. You're not going to get much benefit from this particular aggregation over a simple find operation, because a find operation is in fact a match.

So we're going to scroll and do another stage in this aggregation pipeline. For this stage, let's go ahead and say that we don't want to work with arrays anymore. This cast array, it's cool, we're working with an array, and MongoDB has plenty of operators that allow you to work with arrays successfully, but just for the sake of this example we're going to say we're done with arrays. So I'm going to look for the unwind operator. The unwind operator will flatten this array for us. For this particular unwind, I'm just going to provide the path; I'm going to ignore the other optional fields. I'm going to say "$cast". We're using the dollar sign because we want to reference a particular field that exists within our document and not just provide some kind of string value. By saying "$cast", we're saying that we're referencing the cast field, and now instead of an array we have a flattened set of documents, one for each of the values in that cast array.

All right, so we've done a match and we've unwound the array. Let's take it a step further and narrow down our data set again. We initially searched for Ryan Reynolds documents, which reduced our data set significantly, and then we unwound the array. Let's go ahead and do another match based on the Ryan Reynolds cast member, so we can further narrow down the scope of our results. I'm going to say Add Stage, and I'm going to say match. For this match I'm once again going to say cast is going to be Ryan Reynolds, and you'll notice that once again we do get some sample documents back. It's only a sample set of 10, whereas we could end up with many more documents as part of the actual result set of this aggregation pipeline, but if we look for cast, we know that cast is in fact Ryan Reynolds here.

Great. Let's go ahead and further scope down the results of our aggregation pipeline. Let's say that this time we want to get the total screen time that Ryan Reynolds has for all of the movies that he's ever produced, movies, shows, whatever exists in this movies collection. So let's add another stage to our pipeline, and this time around let's say that we want to do a group. In this case we're going to group our documents based on certain criteria, and that criteria is going to include the sum of the runtime of these movies, because if I look at the samples that came back, we do have a runtime field. I'm going to assume that this is minutes; it could be something else, but it doesn't really matter for this example.

So let's go ahead and make some changes here. We do need to provide an _id and some fields, so we're going to start with that _id. We're just going to leave it as 1. The value of this _id doesn't really matter for this example because we don't need to identify our actual documents; we only care about the runtime, and we know that we're only going to get information about this one actor. For our first field, let's say that we want to include the cast member in the results. The accumulator that we're going to use is first: we only want the first actor returned, which should only ever be Ryan Reynolds because we did the match after the unwind. So we're going to say first, and we're going to get the first cast member returned; in this case it's going to be "$cast". You'll notice that it did provide us a sample that came back. We have an _id, which is fixed at 1, and we have a cast member, Ryan Reynolds. That really doesn't help us, because we're still trying to get the total runtime of these movies. So let's add another field to our group. This time we're going to call it screen time, and for screen time we're going to use another accumulator: sum. It's going to be the sum of our runtime, so we're going to say "$runtime", and this will return the total runtime, which we're assuming is maybe 1941 minutes, or whatever the summation of the runtimes is. Which is great.

If we wanted to get rid of this _id, which is not particularly useful to us in this aggregation pipeline, we can add another stage. I'm showing you as many stages as I can right now because that's the whole point of the aggregation pipeline and the Aggregation Pipeline Builder: you can add numerous stages to manipulate your data however you want it to be. In this case I'm going to say project, so I'm going to project which fields I want. Basically I want to say what I don't want, and what I don't want is the _id included. So what I can say is _id: 0. By saying _id: 0, we're saying omit the _id and include everything else. Now our sample is a single result, and that is Ryan Reynolds with a screen time.

So you'll notice that we do have quite a few stages in this aggregation pipeline. There are probably 100 different ways to do the same pipeline, or this pipeline might not actually be useful to you, but we have a match, followed by an unwind, followed by a match, followed by a group, followed by a project, and in each stage of this pipeline we do have some samples along the way that will help you when it comes to working with your data.

If you like this video, please take a moment to hit that like button and subscribe to the YouTube channel, and I'll see you in my next video.
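
To see how each of the five stages reshapes the data, the walkthrough can be imitated in plain Python on a few in-memory documents. This is only an illustrative sketch: the three documents, their titles, and their runtimes are hypothetical and are not taken from the sample data set, the field name `screen_time` is an assumed spelling, and the list comprehensions approximate the stage semantics rather than calling MongoDB.

```python
# Hypothetical documents; titles and runtimes are invented for illustration.
movies = [
    {"title": "Movie A", "cast": ["Ryan Reynolds", "Actor X"], "runtime": 100},
    {"title": "Movie B", "cast": ["Actor Y", "Ryan Reynolds"], "runtime": 90},
    {"title": "Movie C", "cast": ["Actor Z"], "runtime": 120},
]
ACTOR = "Ryan Reynolds"

# First match: on an array field, a document passes if any element equals the value.
matched = [doc for doc in movies if ACTOR in doc["cast"]]

# Unwind: one output document per cast element; cast is now a plain string.
unwound = [{**doc, "cast": member} for doc in matched for member in doc["cast"]]

# Second match: cast is a string after unwinding, so this is plain equality.
# Without it, the co-stars' rows would be counted in the sum as well.
actor_rows = [doc for doc in unwound if doc["cast"] == ACTOR]

# Group with a constant _id: every row lands in the same group.
# "first" takes the value from the first row; "sum" adds up the runtimes.
grouped = {
    "_id": 1,
    "cast": actor_rows[0]["cast"],
    "screen_time": sum(doc["runtime"] for doc in actor_rows),
}

# Project with _id set to 0: keep every field except _id.
result = {key: value for key, value in grouped.items() if key != "_id"}

print(len(matched), len(unwound), len(actor_rows))
print(result)
```

The intermediate counts show why the second match matters: the two matching movies become four rows after the unwind, and only two of those rows belong to the actor.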
