Definition
$lookupChanged in version 5.1.
Performs a left outer join to a collection in the same database to filter in documents from the foreign collection for processing. The
$lookupstage adds a new array field to each input document. The new array field contains the matching documents from the foreign collection. The$lookupstage passes these reshaped documents to the next stage.Starting in MongoDB 5.1, you can use
$lookupwith sharded collections.To combine elements from two different collections, use the
$unionWithpipeline stage.Important
Excessive use of
$lookupmay slow down query performance. To reduce reliance on$lookup, consider an embedded data model to store related data in a single collection.For details on
$lookupperformance, see Performance Considerations.
Compatibility
You can use $lookup for deployments hosted in the following
environments:
MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud
MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB
Syntax
The $lookup stage syntax:
{ $lookup: { from: <collection to join>, localField: <field from the input documents>, foreignField: <field from the documents of the "from" collection>, let: { <var_1>: <expression>, …, <var_n>: <expression> }, pipeline: [ <pipeline to run> ], as: <output array field> } }
The $lookup accepts a document with these fields:
Field | Necessity | Description |
|---|---|---|
Required | Specifies the foreign collection in the same database to join to the local collection. It is possible in some edge cases to subsitute Starting in MongoDB 5.1, the | |
Optional if | Specifies the field from the documents input to the
| |
Optional if | Specifies the foreign documents' If a foreign document does not contain a | |
Optional | Specifies variables to use in the pipeline stages. Use the variable expressions to
access the fields from the local collection's documents that are
input to the To reference variables in pipeline
stages, use the The let variables can be accessed by the
stages in the pipeline, including
additional
| |
Optional if | Specifies the The The To reference variables in pipeline
stages, use the The let variables can be accessed by the
stages in the pipeline, including
additional
| |
Required | Specifies the name of the new array field to add to the input
documents. The new array field contains the matching
documents from the |
Equality Match with a Single Join Condition
To perform an equality match between a field from the input documents
with a field from the documents of the foreign collection, the
$lookup stage has this syntax:
{ $lookup: { from: <collection to join>, localField: <field from the input documents>, foreignField: <field from the documents of the "from" collection>, pipeline: [ <pipeline to run> ], as: <output array field> } }
Note
In this example, pipeline is optional and runs after
the local and foreign equality stage.
The operation corresponds to this pseudo-SQL statement:
SELECT *, ( SELECT ARRAY_AGG(*) FROM <collection to join> WHERE <foreignField> = <collection.localField> ) AS <output array field> FROM collection;
Note
The SQL statements on this page are included for comparison to the MongoDB aggregation pipeline syntax. The SQL statements aren't runnable.
For MongoDB examples, see these pages:
Join Conditions and Subqueries on a Foreign Collection
MongoDB supports:
Executing a pipeline on a foreign collection.
Multiple join conditions.
Correlated and uncorrelated subqueries.
In MongoDB, an uncorrelated subquery means that every input document will return
the same result. A correlated subquery is a pipeline in a $lookup stage that uses the local
or input collection's fields to return results correlated to each incoming
document.
Note
Starting in MongoDB 5.0, for an uncorrelated subquery in a
$lookup pipeline stage containing a $sample
stage, the $sampleRate operator, or the
$rand operator, the subquery is always run again if
repeated. Previously, depending on the subquery output size, either the
subquery output was cached or the subquery was run again.
MongoDB correlated subqueries are comparable to SQL correlated subqueries, where the inner query references outer query values. An SQL uncorrelated subquery does not reference outer query values.
MongoDB 5.0 also supports concise correlated subqueries.
To perform correlated and uncorrelated subqueries with two collections,
and perform other join conditions besides a single equality match, use
this $lookup syntax:
{ $lookup: { from: <foreign collection>, let: { <var_1>: <expression>, …, <var_n>: <expression> }, pipeline: [ <pipeline to run on foreign collection> ], as: <output array field> } }
The operation corresponds to this pseudo-SQL statement:
SELECT *, <output array field> FROM collection WHERE <output array field> IN ( SELECT <documents as determined from the pipeline> FROM <collection to join> WHERE <pipeline> );
See the following examples:
Correlated Subqueries Using Concise Syntax
New in version 5.0.
Starting in MongoDB 5.0, you can use a concise syntax for a correlated
subquery. Correlated subqueries reference document fields from a foreign
collection and the "local" collection on which the
aggregate() method was run.
The following new concise syntax removes the requirement for an equality
match on the foreign and local fields inside of an $expr
operator:
{ $lookup: { from: <foreign collection>, localField: <field from local collection's documents>, foreignField: <field from foreign collection's documents>, let: { <var_1>: <expression>, …, <var_n>: <expression> }, pipeline: [ <pipeline to run> ], as: <output array field> } }
The operation corresponds to this pseudo-SQL statement:
SELECT *, <output array field> FROM localCollection WHERE <output array field> IN ( SELECT <documents as determined from the pipeline> FROM <foreignCollection> WHERE <foreignCollection.foreignField> = <localCollection.localField> AND <pipeline match condition> );
See this example:
Behavior
Views and Collation
If performing an aggregation that involves multiple views, such as
with $lookup or $graphLookup, the views must
have the same collation.
Restrictions
You cannot include the $out or the $merge
stage in the $lookup stage. That is, when specifying a
pipeline for the foreign collection, you cannot include either stage in
the pipeline field.
{ $lookup: { from: <collection to join>, let: { <var_1>: <expression>, …, <var_n>: <expression> }, pipeline: [ <pipeline to execute on the foreign collection> ], // Cannot include $out or $merge as: <output array field> } }
Atlas Search Support
Starting in MongoDB 6.0, you can specify the Atlas Search $search or $searchMeta stage
in the $lookup pipeline to search collections on the Atlas
cluster. The $search or the $searchMeta stage
must be the first stage inside the $lookup pipeline.
For example, when you Join Conditions and Subqueries on a Foreign Collection or run
Correlated Subqueries Using Concise Syntax, you can specify
$search or $searchMeta inside the pipeline as
shown below:
To see an example of $lookup with $search,
see the Atlas Search tutorial Run an Atlas Search $search Query
Using $lookup.
Sharded Collections
Starting in MongoDB 5.1, you can specify sharded collections in the from parameter of
$lookup stages.
You cannot use the $lookup stage within a transaction while
targeting a sharded collection.
Slot-Based Query Execution Engine
Note
Starting in version 7.0.17, the slot-based query execution engine is no longer enabled by default for patch versions of 7.0. If you want your queries to use the slot-based query execution engine, please upgrade to version 8.0, where it is enabled by default.
Starting in version 6.0, MongoDB can use the slot-based execution
query engine to execute $lookup stages
if all preceding stages in the pipeline can also be executed by the
slot-based execution engine and none of the following conditions are true:
The
$lookupoperation executes a pipeline on a foreign collection. To see an example of this kind of operation, see Join Conditions and Subqueries on a Foreign Collection.The
$lookup'slocalFieldorforeignFieldspecify numeric components. For example:{ localField: "restaurant.0.review" }.The
fromfield of any$lookupin the pipeline specifies a view or sharded collection.
For more information, see $lookup Optimization.
Performance Considerations
$lookup performance depends on the type of operation performed.
Refer to the following table for performance considerations for
different $lookup operations.
$lookup Operation | Performance Considerations |
|---|---|
| |
| |
|
For general performance strategies, see Indexing Strategies and Query Optimization.
Examples
The examples on this page use data from the sample_mflix sample dataset. For details on how to load this dataset into your self-managed MongoDB deployment, see Load the sample dataset. If you made any modifications to the sample databases, you may need to drop and recreate the databases to run the examples on this page.
Perform a Single Equality Join with $lookup
The following aggregation operation first filters the
movies collection to movies with a runtime greater
than 1000, then joins with the comments collection
on the _id and movie_id fields:
db.movies.aggregate( [ { $match: { runtime: { $gt: 1000 } } }, { $lookup: { from: "comments", localField: "_id", foreignField: "movie_id", as: "movie_comments" } }, { $project: { _id: 0, title: 1, year: 1, "movie_comments.name": 1, "movie_comments.text": 1, "movie_comments.date": 1 } } ] )
[ { title: 'Centennial', year: 1978, movie_comments: [ { name: 'Ellaria Sand', text: 'Excepturi nam nam eum possimus aspernatur autem. Quis nulla optio praesentium ut distinctio explicabo.', date: ISODate('1995-08-18T03:01:50.000Z') } ] }, { title: 'Baseball', year: 1994, movie_comments: [] } ]
The operation corresponds to this pseudo-SQL statement:
SELECT *, movie_comments FROM movies WHERE movie_comments IN ( SELECT * FROM comments WHERE movie_id = movies._id );
For more information, see Equality Match Performance Considerations.
Use $lookup with an Array
If the localField is an array, you can match the array
elements against a scalar foreignField without an
$unwind stage.
The following aggregation operation joins the movies
collection with the users collection, matching the
cast array field from movies against the scalar
name field from users:
db.movies.aggregate( [ { $match: { title: { $in: [ "Roger & Me", "The Sum of Us", "Centennial" ] } } }, { $lookup: { from: "users", localField: "cast", foreignField: "name", as: "cast_users" } }, { $project: { _id: 0, title: 1, year: 1, cast: 1, "cast_users.name": 1, "cast_users.email": 1 } }, { $sort: { year: 1 } } ] )
[ { cast: [ 'Raymond Burr', 'Barbara Carrera', 'Richard Chamberlain', 'Robert Conrad' ], title: 'Centennial', year: 1978, cast_users: [] }, { cast: [ 'Michael Moore', 'Roger B. Smith', 'Rhonda Britton', 'Fred Ross' ], title: 'Roger & Me', year: 1989, cast_users: [ { name: 'Michael Moore', email: 'michael_moore@fakegmail.com' } ] }, { cast: [ 'Jack Thompson', 'Russell Crowe', 'John Polson', 'Deborah Kennedy' ], title: 'The Sum of Us', year: 1994, cast_users: [ { name: 'Deborah Kennedy', email: 'deborah_kennedy@fakegmail.com' } ] } ]
Use $lookup with $mergeObjects
The $mergeObjects operator combines multiple
documents into a single document.
The following operation uses $lookup to join
the movies collection with the comments collection,
then uses $mergeObjects in
$replaceRoot to merge the first comment document
with the movie document:
db.movies.aggregate( [ { $match: { runtime: { $gt: 1000 } } }, { $lookup: { from: "comments", localField: "_id", foreignField: "movie_id", as: "movie_comments" } }, { $replaceRoot: { newRoot: { $mergeObjects: [ { $arrayElemAt: [ "$movie_comments", 0 ] }, "$$ROOT" ] } } }, { $project: { _id: 0, title: 1, year: 1, genres: 1, name: 1, email: 1, text: 1, date: 1 } } ] )
[ { name: 'Ellaria Sand', email: 'indira_varma@gameofthron.es', text: 'Excepturi nam nam eum possimus aspernatur autem. Quis nulla optio praesentium ut distinctio explicabo.', date: ISODate('1995-08-18T03:01:50.000Z'), genres: [ 'Action', 'Adventure', 'Drama' ], title: 'Centennial', year: 1978 }, { genres: [ 'Documentary', 'History', 'Sport' ], title: 'Baseball', year: 1994 } ]
Use Multiple Join Conditions and a Correlated Subquery
Pipelines can execute on a foreign collection and include
multiple join conditions. The $expr operator enables
more complex join conditions including conjunctions and
non-equality matches.
A join condition can reference a field in the local collection
on which the aggregate() method was
run and reference a field in the foreign collection. This
allows a correlated subquery between the two collections.
MongoDB 5.0 supports concise correlated subqueries.
The following example:
Joins the
moviesandcommentscollections using the_idandmovie_idfields.Filters comments to include only those posted after the movie's release year.
db.movies.aggregate( [ { $match: { title: { $in: [ "Class Action", "Kafka", "Corpse Bride" ] } } }, { $lookup: { from: "comments", localField: "_id", foreignField: "movie_id", let: { movie_year: "$year" }, pipeline: [ { $match: { $expr: { $gt: [ { $year: "$date" }, "$$movie_year" ] } } }, { $project: { _id: 0, name: 1, date: 1 } } ], as: "post_release_comments" } }, { $project: { _id: 0, title: 1, year: 1, post_release_comments: 1 } } ] )
[ { year: 1991, title: 'Class Action', post_release_comments: [ { name: 'Khal Drogo', date: ISODate('2016-12-06T07:17:03.000Z') } ] }, { year: 1991, title: 'Kafka', post_release_comments: [ { name: 'Khal Drogo', date: ISODate('1998-05-10T03:10:20.000Z') } ] }, { year: 2005, title: 'Corpse Bride', post_release_comments: [] } ]
The operation corresponds to this pseudo-SQL statement:
SELECT *, post_release_comments FROM movies WHERE post_release_comments IN ( SELECT name, date FROM comments WHERE movie_id = movies._id AND YEAR(date) > movies.year );
The $eq, $lt, $lte,
$gt, and $gte comparison operators placed in
an $expr operator can use an index on the from collection
referenced in a $lookup stage. Limitations:
Indexes can only be used for comparisons between fields and constants, so the
letoperand must resolve to a constant.For example, a comparison between
$aand a constant value can use an index, but a comparison between$aand$bcannot.Indexes are not used for comparisons where the
letoperand resolves to an empty or missing value.
For example, if the index { movie_id: 1 } exists on the
comments collection:
The equality match on the
comments.movie_idfield uses the index.
Perform an Uncorrelated Subquery with $lookup
An aggregation pipeline $lookup stage can execute
a pipeline on the foreign collection, which allows uncorrelated
subqueries. An uncorrelated subquery does not reference the
local document fields.
Note
Starting in MongoDB 5.0, for an uncorrelated subquery in a
$lookup pipeline stage containing a $sample
stage, the $sampleRate operator, or the
$rand operator, the subquery is always run again if
repeated. Previously, depending on the subquery output size, either the
subquery output was cached or the subquery was run again.
The following operation joins the users collection with
movies that have a runtime greater than 1000 minutes from the
movies collection:
db.users.aggregate( [ { $match: { email: { $in: [ "mark_addy@gameofthron.es", "lena_headey@gameofthron.es" ] } } }, { $lookup: { from: "movies", pipeline: [ { $match: { runtime: { $gt: 1000 } } }, { $project: { _id: 0, title: 1, year: 1 } } ], as: "long_movies" } }, { $project: { _id: 0, name: 1, email: 1, long_movies: 1 } } ] )
[ { name: 'Robert Baratheon', email: 'mark_addy@gameofthron.es', long_movies: [ { title: 'Centennial', year: 1978 }, { title: 'Baseball', year: 1994 } ] }, { name: 'Cersei Lannister', email: 'lena_headey@gameofthron.es', long_movies: [ { title: 'Centennial', year: 1978 }, { title: 'Baseball', year: 1994 } ] } ]
The operation corresponds to this pseudo-SQL statement:
SELECT *, long_movies FROM users WHERE long_movies IN ( SELECT title, year FROM movies WHERE runtime > 1000 );
For more information, see Uncorrelated Subquery Performance Considerations.
Perform a Concise Correlated Subquery with $lookup
New in version 5.0.
Starting in MongoDB 5.0, an aggregation pipeline
$lookup stage supports a concise correlated
subquery syntax
that improves joins between collections. The new concise syntax
removes the requirement for an equality match on the foreign and
local fields inside of an $expr operator in a
$match stage.
The following example:
Joins the
moviesandcommentscollections by matching the localField_idwith the foreignFieldmovie_id. The match is performed before thepipelineis run.Filters comments to include only those posted after the movie's release year, accessed using
$$movie_yearand$daterespectively.
db.movies.aggregate( [ { $match: { title: { $in: [ "I Don't Kiss", "Lucky Luke", "Mississippi Masala" ] } } }, { $lookup: { from: "comments", localField: "_id", foreignField: "movie_id", let: { movie_year: "$year" }, pipeline: [ { $match: { $expr: { $gt: [ { $year: "$date" }, "$$movie_year" ] } } }, { $project: { _id: 0, name: 1, date: 1 } } ], as: "post_release_comments" } }, { $project: { _id: 0, title: 1, year: 1, post_release_comments: 1 } } ] )
[ { title: "I Don't Kiss", year: 1991, post_release_comments: [ { name: 'Brandon Hardy', date: ISODate('2016-09-18T11:11:34.000Z') } ] }, { title: 'Lucky Luke', year: 1991, post_release_comments: [ { name: 'Kelsey Smith', date: ISODate('2010-01-13T17:55:01.000Z') } ] }, { title: 'Mississippi Masala', year: 1991, post_release_comments: [ { name: 'Phillip Collins', date: ISODate('2010-05-13T08:04:22.000Z') } ] } ]
This example uses the older verbose syntax from MongoDB versions before 5.0 and returns the same results as the previous concise example:
db.movies.aggregate( [ { $match: { title: { $in: [ "I Don't Kiss", "Lucky Luke", "Mississippi Masala" ] } } }, { $lookup: { from: "comments", let: { movie_id: "$_id", movie_year: "$year" }, pipeline: [ { $match: { $expr: { $and: [ { $eq: [ "$movie_id", "$$movie_id" ] }, { $gt: [ { $year: "$date" }, "$$movie_year" ] } ] } } }, { $project: { _id: 0, name: 1, date: 1 } } ], as: "post_release_comments" } }, { $project: { _id: 0, title: 1, year: 1, post_release_comments: 1 } } ] )
[ { title: "I Don't Kiss", year: 1991, post_release_comments: [ { name: 'Brandon Hardy', date: ISODate('2016-09-18T11:11:34.000Z') } ] }, { title: 'Lucky Luke', year: 1991, post_release_comments: [ { name: 'Kelsey Smith', date: ISODate('2010-01-13T17:55:01.000Z') } ] }, { title: 'Mississippi Masala', year: 1991, post_release_comments: [ { name: 'Phillip Collins', date: ISODate('2010-05-13T08:04:22.000Z') } ] } ]
The previous examples correspond to this pseudo-SQL statement:
SELECT *, post_release_comments FROM movies WHERE post_release_comments IN ( SELECT * FROM comments WHERE comments.movie_id = movies._id AND YEAR(comments.date) > movies.year );
For more information, see Correlated Subquery Performance Considerations.
The C# examples on this page use the sample_mflix database
from the Atlas sample datasets. To learn how to create a
free MongoDB Atlas cluster and load the sample datasets, see
Get Started in the MongoDB .NET/C#
Driver documentation.
The following Movie class models the documents in the sample_mflix.movies
collection:
public class Movie { public ObjectId Id { get; set; } public int Runtime { get; set; } public string Title { get; set; } public string Rated { get; set; } public List<string> Genres { get; set; } public string Plot { get; set; } public ImdbData Imdb { get; set; } public int Year { get; set; } public int Index { get; set; } public string[] Comments { get; set; } [] public DateTime LastUpdated { get; set; } }
Note
ConventionPack for Pascal Case
The C# classes on this page use Pascal case for their property names, but the
field names in the MongoDB collection use camel case. To account for this difference,
you can use the following code to register a ConventionPack when your
application starts:
var camelCaseConvention = new ConventionPack { new CamelCaseElementNameConvention() }; ConventionRegistry.Register("CamelCase", camelCaseConvention, type => true);
The following Comment class models the documents in the sample_mflix.comments
collection:
public class Comment { public Guid Id { get; set; } [] public Guid MovieId { get; set; } public string Text { get; set; } }
To use the MongoDB .NET/C# driver to add a $lookup stage to an aggregation
pipeline, call the Lookup() method on a PipelineDefinition object.
The following example creates a pipeline stage that performs a left outer join between the movies and comments collections.
The code joins the Id field from each Movie document to the MovieId
field in the Comment documents. The comments for each movie are stored in a field
named Comments in each Movie document.
var commentCollection = client .GetDatabase("aggregation_examples") .GetCollection<Comment>("comments"); var pipeline = new EmptyPipelineDefinition<Movie>() .Lookup<Movie, Movie, Comment, Movie>( foreignCollection: commentCollection, localField: m => m.Id, foreignField: c => c.MovieId, @as: m => m.Comments);
The Node.js examples on this page use the sample_mflix database from the
Atlas sample datasets. To learn how to create a free
MongoDB Atlas cluster and load the sample datasets, see Get Started in the MongoDB Node.js driver documentation.
To use the MongoDB Node.js driver to add a $lookup stage to an aggregation
pipeline, use the $lookup operator in a pipeline object.
The following example creates a pipeline stage that performs a left outer join between the movies and comments
collections. The code joins the _id field from each movie
document to the movie_id field in the comment documents. The
comments field stores the comments for each movie in each
movie document. The
example then runs the aggregation pipeline:
const pipeline = [ { $lookup: { from: "comments", localField: "_id", foreignField: "movie_id", as: "comments" } } ]; const cursor = collection.aggregate(pipeline); return cursor;