MongoDB Aggregation Pipeline Queries vs SQL Queries

Joe Karlsson7 min read • Published Feb 07, 2022 • Updated May 10, 2022

SQL MongoDB Aggregation Framework

Rate this tutorial

Let's be honest: Many devs coming to MongoDB are joining the community with a strong background in SQL. I would personally include myself in this subset of MongoDB devs. I think it's useful to map terms and concepts you might be familiar with in SQL to help "translate" your work into MongoDB Query Language (MQL). More specifically, in this post, I will be walking through translating the MongoDB Aggregation Pipeline from SQL.

What is the Aggregation Framework?

The aggregation framework allows you to analyze your data in real time. Using the framework, you can create an aggregation pipeline that consists of one or more stages. Each stage transforms the documents and passes the output to the next stage.

If you're familiar with the Unix pipe |, you can think of the aggregation pipeline as a very similar concept. Just as output from one command is passed as input to the next command when you use piping, output from one stage is passed as input to the next stage when you use the aggregation pipeline.

SQL is a declarative language. You have to declare what you want to see—that's why SELECT comes first. You have to think in sets, which can be difficult, especially for functional programmers. With MongoDB's aggregation pipeline, you can have stages that reflect how you think—for example, "First, let's group by X. Then, we'll get the top 5 from every group. Then, we'll arrange by price." This is a difficult query to do in SQL, but much easier using the aggregation pipeline framework.

The aggregation framework has a variety of stages available for you to use. Today, we'll discuss the basics of how to use $match, $group, $sort, and $limit. Note that the aggregation framework has many other powerful stages, including $count, $geoNear, $graphLookup, $project, $unwind, and others.

If you want to check out another great introduction to the MongoDB Aggregation Pipeline, be sure to check out Introduction to the MongoDB Aggregation Framework.

Terminology and Concepts

The following table provides an overview of common SQL aggregation terms, functions, and concepts and the corresponding MongoDB aggregation operators:

SQL Terms, Functions, and Concepts	MongoDB Aggregation Operators
WHERE	$match
GROUP BY	$group
HAVING	$match
SELECT	$project
LIMIT	$limit
OFFSET	$skip
ORDER BY	$sort
SUM()	$sum
COUNT()	$sum and $sortByCount
JOIN	$lookup
SELECT INTO NEW_TABLE	$out
MERGE INTO TABLE	$merge (Available starting in MongoDB 4.2)
UNION ALL	$unionWith (Available starting in MongoDB 4.4)

Alright, now that we've covered the basics of MongoDB Aggregations, let's jump into some examples.

SQL Setup

The SQL examples assume two tables, album and songs, that join by the song.album_id and the songs.id columns. Here's what the tables look like:

Albums

id	name	band_name	price	status
1	lo-fi chill hop songs to study to	Silicon Infinite	2.99	A
2	Moon Rocks	Silicon Infinite	1.99	B
3	Flavour	Organical	4.99	A

Songs

id	title	plays	album_id
1	Snow Beats	133	1
2	Rolling By	242	1
3	Clouds	3191	1
4	But First Coffee	562	3
5	Autumn	901	3
6	Milk Toast	118	2
7	Purple Mic	719	2
8	One Note Dinner Party	1242	2

I used a site called SQL Fiddle, and used PostgreSQL 9.6 for all of my examples. However, feel free to run these sample SQL snippets wherever you feel most comfortable. In fact, this is the code I used to set up and seed my tables with our sample data:

1 -- Creating the main albums table
2 CREATE TABLE IF NOT EXISTS albums (
3     id BIGSERIAL NOT NULL UNIQUE PRIMARY KEY,
4     name VARCHAR(40) NOT NULL UNIQUE,
5     band_name VARCHAR(40) NOT NULL,
6     price float8 NOT NULL,
7     status VARCHAR(10) NOT NULL
8 );
9 
10 -- Creating the songs table
11 CREATE TABLE IF NOT EXISTS songs (
12     id SERIAL PRIMARY KEY NOT NULL,
13     title VARCHAR(40) NOT NULL,
14     plays integer NOT NULL,
15     album_id BIGINT NOT NULL REFERENCES albums ON DELETE RESTRICT
16 );
17 
18 INSERT INTO albums (name, band_name, price, status)
19 VALUES
20     ('lo-fi chill hop songs to study to', 'Silicon Infinite', 7.99, 'A'),
21     ('Moon Rocks', 'Silicon Infinite', 1.99, 'B'),
22     ('Flavour', 'Organical', 4.99, 'A');
23 
24 INSERT INTO songs (title, plays, album_id)
25 VALUES
26     ('Snow Beats', 133, (SELECT id from albums WHERE name='lo-fi chill hop songs to study to')),
27     ('Rolling By', 242, (SELECT id from albums WHERE name='lo-fi chill hop songs to study to')),
28     ('Clouds', 3191, (SELECT id from albums WHERE name='lo-fi chill hop songs to study to')),
29     ('But First Coffee', 562, (SELECT id from albums WHERE name='Flavour')),
30     ('Autumn', 901, (SELECT id from albums WHERE name='Flavour')),
31     ('Milk Toast', 118, (SELECT id from albums WHERE name='Moon Rocks')),
32     ('Purple Mic', 719, (SELECT id from albums WHERE name='Moon Rocks')),
33     ('One Note Dinner Party', 1242, (SELECT id from albums WHERE name='Moon Rocks'));

MongoDB Setup

The MongoDB examples assume one collection albums that contains documents with the following schema:

1 {
2     name : 'lo-fi chill hop songs to study to',
3     band_name: 'Silicon Infinite',
4     price: 7.99,
5     status: 'A',
6     songs: [
7         { title: 'Snow beats', 'plays': 133 },
8         { title: 'Rolling By', 'plays': 242 },
9         { title: 'Sway', 'plays': 3191 }
10     ]
11 }

For this post, I did all of my prototyping in a MongoDB Visual Studio Code plugin playground. For more information on how to use a MongoDB Playground in Visual Studio Code, be sure to check out this post: How To Use The MongoDB Visual Studio Code Plugin. Once you have your playground all set up, you can use this snippet to set up and seed your collection. You can also follow along with this demo by using the MongoDB Web Shell.

1 // Select the database to use.
2 use('mongodbVSCodePlaygroundDB');
3 
4 // The drop() command destroys all data from a collection.
5 // Make sure you run it against the correct database and collection.
6 db.albums.drop();
7 
8 // Insert a few documents into the albums collection.
9 db.albums.insertMany([
10     {
11         'name' : 'lo-fi chill hop songs to study to', band_name: 'Silicon Infinite', price: 7.99, status: 'A',
12         songs: [
13             { title: 'Snow beats', 'plays': 133 },
14             { title: 'Rolling By', 'plays': 242 },
15             { title: 'Clouds', 'plays': 3191 }
16         ]
17     },
18     {
19         'name' : 'Moon Rocks', band_name: 'Silicon Infinite', price: 1.99, status: 'B',
20         songs: [
21             { title: 'Milk Toast', 'plays': 118 },
22             { title: 'Purple Mic', 'plays': 719 },
23             { title: 'One Note Dinner Party', 'plays': 1242 }
24         ]
25     },
26     {
27         'name' : 'Flavour', band_name: 'Organical', price: 4.99, status: 'A',
28         songs: [
29             { title: 'But First Coffee', 'plays': 562 },
30             { title: 'Autumn', 'plays': 901 }
31         ]
32     },
33 ]);

Quick Reference

Count all records from albums

SQL

1 SELECT COUNT(*) AS count
2 FROM albums

MongoDB

1 db.albums.aggregate( [
2     {
3         $group: {
4             _id: null,  //  An _id value of null on the $group operator accumulates values for all the input documents as a whole.
5             count: { $sum: 1 }
6         }
7     }
8 ] );

Sum the price field from albums

SQL

1 SELECT SUM(price) AS total
2 FROM albums

MongoDB

1 db.albums.aggregate( [
2     {
3         $group: {
4             _id: null,
5             total: { $sum: "$price" }
6         }
7     }
8 ] );

For each unique band_name, sum the price field

SQL

1 SELECT band_name,
2 SUM(price) AS total
3 FROM albums
4 GROUP BY band_name

MongoDB

1 db.albums.aggregate( [
2     {
3         $group: {
4             _id: "$band_name",
5             total: { $sum: "$price" }
6         }
7     }
8 ] );

For each unique band_name, sum the price field, results sorted by sum

SQL

1 SELECT band_name,
2     SUM(price) AS total
3 FROM albums
4 GROUP BY band_name
5 ORDER BY total

MongoDB

1 db.albums.aggregate( [
2     {
3         $group: {
4             _id: "$band_name",
5             total: { $sum: "$price" }
6         }
7     },
8     { $sort: { total: 1 } }
9 ] );

For band_name with multiple albums, return the band_name and the corresponding album count

SQL

1 SELECT band_name,
2     count(*)
3 FROM albums
4 GROUP BY band_name
5 HAVING count(*) > 1;

MongoDB

1 db.albums.aggregate( [
2     {
3       $group: {
4          _id: "$band_name",
5          count: { $sum: 1 }
6       }
7     },
8     { $match: { count: { $gt: 1 } } }
9  ] );

Sum the price of all albums with status A and group by unique band_name

SQL

1 SELECT band_name,
2     SUM(price) as total
3 FROM albums
4 WHERE status = 'A'
5 GROUP BY band_name

MongoDB

1 db.albums.aggregate( [
2     { $match: { status: 'A' } },
3     {
4         $group: {
5             _id: "$band_name",
6             total: { $sum: "$price" }
7         }
8     }
9 ] );

For each unique band_name with status A, sum the price field and return only where the sum is greater than $5.00

SQL

1 SELECT band_name,
2     SUM(price) as total
3 FROM albums
4 WHERE status = 'A'
5 GROUP BY band_name
6 HAVING SUM(price) > 5.00;

MongoDB

1 db.albums.aggregate( [
2     { $match: { status: 'A' } },
3     {
4         $group: {
5             _id: "$band_name",
6             total: { $sum: "$price" }
7         }
8     },
9     { $match: { total: { $gt: 5.00 } } }
10 ] );

For each unique band_name, sum the corresponding song plays field associated with the albums

SQL

1 SELECT band_name,
2     SUM(songs.plays) as total_plays
3 FROM albums,
4     songs
5 WHERE songs.album_id = albums.id
6 GROUP BY band_name;

MongoDB

1 db.albums.aggregate( [
2     { $unwind: "$songs" },
3     {
4         $group: {
5             _id: "$band_name",
6             qty: { $sum: "$songs.plays" }
7         }
8     }
9 ] );

For each unique album, get the song from album with the most plays

SQL

1 SELECT name, title, plays
2     FROM songs s1 INNER JOIN albums ON (album_id = albums.id)
3 WHERE plays=(SELECT MAX(s2.plays)
4     FROM songs s2
5 WHERE s1.album_id = s2.album_id)
6 ORDER BY name;

MongoDB

1 db.albums.aggregate( [
2     { $project:
3         {
4             name: 1,
5             plays: {
6                 $filter: {
7                     input: "$songs",
8                     as: "item",
9                     cond: { $eq: ["$item.plays", { $max: "$songs.plays" }] }
10                 }
11             }
12         }
13     }
14 ] );

Wrapping Up

This post is in no way a complete overview of all the ways that MongoDB can be used like a SQL-based database. This was only meant to help devs in SQL land start to make the transition over to MongoDB with some basic queries using the aggregation pipeline. The aggregation framework has many other powerful stages, including $count, $geoNear, $graphLookup, $project, $unwind, and others.

If you want to get better at using the MongoDB Aggregation Framework, be sure to check out MongoDB University: M121 - The MongoDB Aggregation Framework. Or, better yet, try to use some advanced MongoDB aggregation pipeline queries in your next project! If you have any questions, be sure to head over to the MongoDB Community Forums. It's the best place to get your MongoDB questions answered.

Resources:

MongoDB University: M121 - The MongoDB Aggregation Framework: https://mongodb.prakticum-team.ru/university.mongodb.com/courses/M121/about
How to Use Custom Aggregation Expressions in MongoDB 4.4: https://mongodb.prakticum-team.ru/developer.mongodb.com/how-to/use-function-accumulator-operators
Introduction to the MongoDB Aggregation Framework: https://mongodb.prakticum-team.ru/developer.mongodb.com/quickstart/introduction-aggregation-framework
How to Use the Union All Aggregation Pipeline Stage in MongoDB 4.4: https://mongodb.prakticum-team.ru/developer.mongodb.com/how-to/use-union-all-aggregation-pipeline-stage
Aggregation Framework with Node.js Tutorial: https://mongodb.prakticum-team.ru/developer.mongodb.com/quickstart/node-aggregation-framework
Aggregation Pipeline Quick Reference: https://mongodb.prakticum-team.ru/proxy/docs.mongodb.com/manual/meta/aggregation-quick-reference
SQL to Aggregation Mapping Chart: https://mongodb.prakticum-team.ru/proxy/docs.mongodb.com/manual/reference/sql-aggregation-comparison
SQL to MongoDB Mapping Chart: https://mongodb.prakticum-team.ru/proxy/docs.mongodb.com/manual/reference/sql-comparison
Questions? Comments? We'd love to connect with you. Join the conversation on the MongoDB Community Forums: https://mongodb.prakticum-team.ru/developer.mongodb.com/community/forums

Rate this tutorial

Tutorial

Storing Large Objects and Files in MongoDB

May 13, 2022 | 1 min read

Tutorial

Preparing Time Series Data for Analysis Tools With $densify and $fill

Sep 17, 2024 | 8 min read

Quickstart

Introduction to MongoDB and Helidon

Nov 04, 2024 | 6 min read

Tutorial

Install & Configure MongoDB on the Raspberry Pi

Sep 11, 2024 | 8 min read

What is the Aggregation Framework?
Terminology and Concepts
SQL Setup
MongoDB Setup
Quick Reference
Wrapping Up
Resources:

1	-- Creating the main albums table
2	CREATE TABLE IF NOT EXISTS albums (
3	id BIGSERIAL NOT NULL UNIQUE PRIMARY KEY,
4	name VARCHAR(40) NOT NULL UNIQUE,
5	band_name VARCHAR(40) NOT NULL,
6	price float8 NOT NULL,
7	status VARCHAR(10) NOT NULL
8	);
9
10	-- Creating the songs table
11	CREATE TABLE IF NOT EXISTS songs (
12	id SERIAL PRIMARY KEY NOT NULL,
13	title VARCHAR(40) NOT NULL,
14	plays integer NOT NULL,
15	album_id BIGINT NOT NULL REFERENCES albums ON DELETE RESTRICT
16	);
17
18	INSERT INTO albums (name, band_name, price, status)
19	VALUES
20	('lo-fi chill hop songs to study to', 'Silicon Infinite', 7.99, 'A'),
21	('Moon Rocks', 'Silicon Infinite', 1.99, 'B'),
22	('Flavour', 'Organical', 4.99, 'A');
23
24	INSERT INTO songs (title, plays, album_id)
25	VALUES
26	('Snow Beats', 133, (SELECT id from albums WHERE name='lo-fi chill hop songs to study to')),
27	('Rolling By', 242, (SELECT id from albums WHERE name='lo-fi chill hop songs to study to')),
28	('Clouds', 3191, (SELECT id from albums WHERE name='lo-fi chill hop songs to study to')),
29	('But First Coffee', 562, (SELECT id from albums WHERE name='Flavour')),
30	('Autumn', 901, (SELECT id from albums WHERE name='Flavour')),
31	('Milk Toast', 118, (SELECT id from albums WHERE name='Moon Rocks')),
32	('Purple Mic', 719, (SELECT id from albums WHERE name='Moon Rocks')),
33	('One Note Dinner Party', 1242, (SELECT id from albums WHERE name='Moon Rocks'));

1	{
2	name : 'lo-fi chill hop songs to study to',
3	band_name: 'Silicon Infinite',
4	price: 7.99,
5	status: 'A',
6	songs: [
7	{ title: 'Snow beats', 'plays': 133 },
8	{ title: 'Rolling By', 'plays': 242 },
9	{ title: 'Sway', 'plays': 3191 }
10	]
11	}

1	// Select the database to use.
2	use('mongodbVSCodePlaygroundDB');
3
4	// The drop() command destroys all data from a collection.
5	// Make sure you run it against the correct database and collection.
6	db.albums.drop();
7
8	// Insert a few documents into the albums collection.
9	db.albums.insertMany([
10	{
11	'name' : 'lo-fi chill hop songs to study to', band_name: 'Silicon Infinite', price: 7.99, status: 'A',
12	songs: [
13	{ title: 'Snow beats', 'plays': 133 },
14	{ title: 'Rolling By', 'plays': 242 },
15	{ title: 'Clouds', 'plays': 3191 }
16	]
17	},
18	{
19	'name' : 'Moon Rocks', band_name: 'Silicon Infinite', price: 1.99, status: 'B',
20	songs: [
21	{ title: 'Milk Toast', 'plays': 118 },
22	{ title: 'Purple Mic', 'plays': 719 },
23	{ title: 'One Note Dinner Party', 'plays': 1242 }
24	]
25	},
26	{
27	'name' : 'Flavour', band_name: 'Organical', price: 4.99, status: 'A',
28	songs: [
29	{ title: 'But First Coffee', 'plays': 562 },
30	{ title: 'Autumn', 'plays': 901 }
31	]
32	},
33	]);

1	db.albums.aggregate( [
2	{
3	$group: {
4	_id: null, // An _id value of null on the $group operator accumulates values for all the input documents as a whole.
5	count: { $sum: 1 }
6	}
7	}
8	] );

1	db.albums.aggregate( [
2	{
3	$group: {
4	_id: null,
5	total: { $sum: "$price" }
6	}
7	}
8	] );

1	SELECT band_name,
2	SUM(price) AS total
3	FROM albums
4	GROUP BY band_name

1	db.albums.aggregate( [
2	{
3	$group: {
4	_id: "$band_name",
5	total: { $sum: "$price" }
6	}
7	}
8	] );

1	SELECT band_name,
2	count(*)
3	FROM albums
4	GROUP BY band_name
5	HAVING count(*) > 1;

1	db.albums.aggregate( [
2	{ $match: { status: 'A' } },
3	{
4	$group: {
5	_id: "$band_name",
6	total: { $sum: "$price" }
7	}
8	}
9	] );

1	SELECT band_name,
2	SUM(songs.plays) as total_plays
3	FROM albums,
4	songs
5	WHERE songs.album_id = albums.id
6	GROUP BY band_name;

1	db.albums.aggregate( [
2	{ $unwind: "$songs" },
3	{
4	$group: {
5	_id: "$band_name",
6	qty: { $sum: "$songs.plays" }
7	}
8	}
9	] );

1	SELECT name, title, plays
2	FROM songs s1 INNER JOIN albums ON (album_id = albums.id)
3	WHERE plays=(SELECT MAX(s2.plays)
4	FROM songs s2
5	WHERE s1.album_id = s2.album_id)
6	ORDER BY name;

1	db.albums.aggregate( [
2	{ $project:
3	{
4	name: 1,
5	plays: {
6	$filter: {
7	input: "$songs",
8	as: "item",
9	cond: { $eq: ["$item.plays", { $max: "$songs.plays" }] }
10	}
11	}
12	}
13	}
14	] );

MongoDB

MongoDB Aggregation Pipeline Queries vs SQL Queries

What is the Aggregation Framework?

Terminology and Concepts

SQL Setup

MongoDB Setup

Quick Reference

Count all records from albums

SQL

MongoDB

Sum the price field from albums

SQL

MongoDB

For each unique band_name, sum the price field

SQL

MongoDB

For each unique band_name, sum the price field, results sorted by sum

SQL

MongoDB

For band_name with multiple albums, return the band_name and the corresponding album count

SQL

MongoDB

Sum the price of all albums with status A and group by unique band_name

SQL

MongoDB

For each unique band_name with status A, sum the price field and return only where the sum is greater than $5.00

SQL

MongoDB

For each unique band_name, sum the corresponding song plays field associated with the albums

SQL

MongoDB

For each unique album, get the song from album with the most plays

SQL

MongoDB

Wrapping Up

Resources:

Related

Storing Large Objects and Files in MongoDB

Preparing Time Series Data for Analysis Tools With $densify and $fill

Introduction to MongoDB and Helidon

Install & Configure MongoDB on the Raspberry Pi

Table of Contents