Aggregation Expression Operations
Overview
In this guide, you can learn how to use the MongoDB Java Driver to construct expressions for use in the aggregation pipeline. You can perform expression operations with discoverable, typesafe Java methods rather than BSON documents. Because these methods follow the fluent interface pattern, you can chain aggregation operations together to create code that is both more compact and more naturally readable.
The operations in this guide use methods from the com.mongodb.client.model.mql package. These methods provide an idiomatic way to use the Query API, the mechanism by which the driver interacts with a MongoDB deployment. To learn more about the Query API, see the Server manual documentation.
How to Use Operations
The examples in this guide assume that you include the following static imports in your code:
import static com.mongodb.client.model.Aggregates.*;
import static com.mongodb.client.model.Accumulators.*;
import static com.mongodb.client.model.Projections.*;
import static com.mongodb.client.model.Filters.*;
import static com.mongodb.client.model.mql.MqlValues.*;
import static java.util.Arrays.asList;
To access document fields in an expression, you must reference the current document being processed by the aggregation pipeline. Use the current() method to refer to this document. To access the value of a field, you must use the appropriately typed method, such as getString() or getDate(). When you specify the type for a field, you ensure that the driver provides only those methods which are compatible with that type. The following code shows how to reference a string field called name:
current().getString("name")
To specify a value in an operation, pass it to the of() constructor method to convert it to a valid type. The following code shows how to reference a value of 1.0:
of(1.0)
To create an operation, chain a method to your field or value reference. You can build more complex operations by chaining additional methods.
The following example creates an operation to find patients in New Mexico who have visited the doctor’s office at least once. The operation performs the following actions:
Checks if the size of the visitDates array is greater than 0 by using the gt() method
Checks if the state field value is “New Mexico” by using the eq() method
The and() method links these operations so that the pipeline stage matches only documents that meet both criteria.
current()
    .getArray("visitDates")
    .size()
    .gt(of(0))
    .and(current()
        .getString("state")
        .eq(of("New Mexico")));
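The way and() combines two checks can be sketched with the JDK's own Predicate type. This is only an analogy on in-memory objects, not driver code: the PatientFilterSketch class, its Patient type, and the constant names are hypothetical and exist purely for illustration.

```java
import java.util.List;
import java.util.function.Predicate;

class PatientFilterSketch {
    // Hypothetical in-memory stand-in for a patient document.
    static final class Patient {
        final String state;
        final List<String> visitDates;

        Patient(String state, List<String> visitDates) {
            this.state = state;
            this.visitDates = visitDates;
        }
    }

    // Plays the role of getArray("visitDates").size().gt(of(0)).
    static final Predicate<Patient> HAS_VISITED = p -> p.visitDates.size() > 0;

    // Plays the role of getString("state").eq(of("New Mexico")).
    static final Predicate<Patient> IN_NEW_MEXICO = p -> "New Mexico".equals(p.state);

    // Plays the role of and(): a patient matches only if both checks pass.
    static final Predicate<Patient> MATCHES = HAS_VISITED.and(IN_NEW_MEXICO);

    public static void main(String[] args) {
        Patient patient = new Patient("New Mexico", List.of("2023-05-14"));
        System.out.println(MATCHES.test(patient));
    }
}
```

Unlike this sketch, the driver expression is not evaluated in your application. The driver translates it into a Query API expression that the server evaluates.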
While some aggregation stages, such as group(), accept operations directly, other stages expect that you first include your operation in a method such as computed() or expr(). These methods, which take values of type TExpression, allow you to use your expressions in certain aggregations.
To complete your aggregation pipeline stage, include your expression in an aggregates builder method. The following list provides examples of how to include your expression in common aggregates builder methods:
match(expr(<expression>))
project(fields(computed("<field name>", <expression>)))
group(<expression>)
To learn more about these methods, see Aggregates Builders.
The examples use the asList() method to create a list of aggregation stages. This list is passed to the aggregate() method of MongoCollection.
Constructor Methods
You can use these constructor methods to define values for use in Java aggregation expressions.
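Method | Description |
---|---|
current() | References the current document being processed by the aggregation pipeline. |
currentAsMap() | References the current document being processed by the aggregation pipeline as a map value. |
of() | Returns an MqlValue type corresponding to the provided primitive. |
ofArray() | Returns an array of MqlValue types corresponding to the provided array of primitives. |
ofEntry() | Returns an entry value. |
ofMap() | Returns an empty map value. |
ofNull() | Returns the null value as exists in the Query API. |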
Important
When you provide a value to one of these methods, the driver treats it literally. For example, of("$x") represents the string value "$x", rather than a field named x.
See any of the sections in Operations for examples using these methods.
Operations
The following sections provide information and examples for aggregation expression operations available in the driver. The operations are categorized by purpose and functionality.
Each section has a table that describes aggregation methods available in the driver and corresponding expression operators in the Query API. The method names link to API documentation and the aggregation pipeline operator names link to descriptions and examples in the Server manual documentation. While each Java method is effectively equivalent to the corresponding Query API expression, they might differ in expected parameters and implementation.
Note
The driver generates a Query API expression that might be different from the Query API expression provided in each example. However, both expressions will produce the same aggregation result.
Important
The driver does not provide methods for all aggregation pipeline operators in the Query API. If you must use an unsupported operation in an aggregation, you must define the entire expression using the BSON Document type. To learn more about the Document type, see Documents.
Arithmetic Operations
You can perform an arithmetic operation on a value of type MqlInteger or MqlNumber using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
Suppose you have weather data for a specific year that includes the precipitation measurement (in inches) for each day. You want to find the average precipitation, in millimeters, for each month.
The multiply() method multiplies the precipitation field by 25.4 to convert the value to millimeters. The avg() accumulator method returns the average as the avgPrecipMM field. The group() method groups the values by the month given in each document's date field.
The following code shows the pipeline for this aggregation:
var month = current().getDate("date").month(of("UTC"));
var precip = current().getInteger("precipitation");
asList(group(
    month,
    avg("avgPrecipMM", precip.multiply(25.4))
));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $group: {
    _id: { $month: "$date" },
    avgPrecipMM: {
        $avg: { $multiply: ["$precipitation", 25.4] } }
} } ]
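To make the grouping and unit conversion concrete, the following plain-Java sketch performs the same calculation on in-memory data. It does not use the driver. The PrecipitationSketch class and its Reading type are hypothetical stand-ins for the weather documents.

```java
import java.time.LocalDate;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class PrecipitationSketch {
    static final double MM_PER_INCH = 25.4;

    // Hypothetical in-memory stand-in for one weather document.
    static final class Reading {
        final LocalDate date;
        final int precipitation; // inches

        Reading(LocalDate date, int precipitation) {
            this.date = date;
            this.precipitation = precipitation;
        }
    }

    // Groups readings by month number (1-12) and averages the converted
    // values, mirroring $group with $month, $multiply, and $avg.
    static Map<Integer, Double> avgPrecipMM(List<Reading> readings) {
        return readings.stream().collect(Collectors.groupingBy(
                r -> r.date.getMonthValue(),
                TreeMap::new,
                Collectors.averagingDouble(r -> r.precipitation * MM_PER_INCH)));
    }

    public static void main(String[] args) {
        List<Reading> readings = List.of(
                new Reading(LocalDate.of(2023, 1, 3), 1),
                new Reading(LocalDate.of(2023, 1, 9), 3),
                new Reading(LocalDate.of(2023, 2, 14), 2));
        System.out.println(avgPrecipMM(readings));
    }
}
```

In this sample, January averages 1 and 3 inches to 2 inches, or about 50.8 mm. The conversion is applied to each reading before averaging, in the same order as the pipeline.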
Array Operations
You can perform an array operation on a value of type MqlArray using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
Suppose you have a collection of movies, each of which contains an array of nested documents for upcoming showtimes. Each nested document contains an array that represents the total number of seats in the theater, where the first array entry is the number of premium seats and the second entry is the number of regular seats. Each nested document also contains the number of tickets that have already been bought for the showtime. A document in this collection might resemble the following:
{
    "_id": ...,
    "movie": "Hamlet",
    "showtimes": [
        {
            "date": "May 14, 2023, 12:00 PM",
            "seats": [ 20, 80 ],
            "ticketsBought": 100
        },
        {
            "date": "May 20, 2023, 08:00 PM",
            "seats": [ 10, 40 ],
            "ticketsBought": 34
        }
    ]
}
The filter() method returns only the array elements that match the provided predicate. In this case, the predicate uses sum() to calculate the total number of seats and compares that value to the number of ticketsBought with lt(). The project() method stores these filtered results as a new availableShowtimes array.
Tip
You must specify the type of the array that you retrieve with the getArray() method if you work with the values of the array as their specific type.
In this example, we specify that the showtimes array contains values of type MqlDocument so that we can extract nested fields from each array entry. Similarly, we specify that the seats array contains values of type MqlInteger so that sum() can add them.
The following code shows the pipeline for this aggregation:
var showtimes = current().<MqlDocument>getArray("showtimes");
asList(project(fields(
    computed("availableShowtimes", showtimes
        .filter(showtime -> {
            var seats = showtime.<MqlInteger>getArray("seats");
            var totalSeats = seats.sum(n -> n);
            var ticketsBought = showtime.getInteger("ticketsBought");
            var isAvailable = ticketsBought.lt(totalSeats);
            return isAvailable;
        }))
)));
Note
To improve readability, the previous example assigns intermediary values to the totalSeats and isAvailable variables. If you don't pull out these intermediary values into variables, the code still produces equivalent results.
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: {
    availableShowtimes: {
        $filter: {
            input: "$showtimes",
            as: "showtime",
            cond: { $lt: [ "$$showtime.ticketsBought", { $sum: "$$showtime.seats" } ] }
        } }
} } ]
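As a rough illustration of what the filter predicate decides for each showtime, here is a plain-Java version that runs on in-memory objects instead of the driver. The ShowtimeFilterSketch class and its Showtime type are hypothetical and mirror the sample document shown earlier in this section.

```java
import java.util.List;
import java.util.stream.Collectors;

class ShowtimeFilterSketch {
    // Hypothetical in-memory stand-in for one nested showtime document.
    static final class Showtime {
        final String date;
        final List<Integer> seats;
        final int ticketsBought;

        Showtime(String date, List<Integer> seats, int ticketsBought) {
            this.date = date;
            this.seats = seats;
            this.ticketsBought = ticketsBought;
        }
    }

    // Keeps a showtime only if fewer tickets were bought than the total
    // number of seats, mirroring ticketsBought.lt(totalSeats).
    static List<Showtime> availableShowtimes(List<Showtime> showtimes) {
        return showtimes.stream()
                .filter(s -> s.ticketsBought
                        < s.seats.stream().mapToInt(Integer::intValue).sum())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Showtime> showtimes = List.of(
                new Showtime("May 14, 2023, 12:00 PM", List.of(20, 80), 100),
                new Showtime("May 20, 2023, 08:00 PM", List.of(10, 40), 34));
        for (Showtime s : availableShowtimes(showtimes)) {
            System.out.println(s.date);
        }
    }
}
```

With the sample data, the first showtime is sold out (100 tickets for 100 seats), so only the May 20 showtime remains.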
Boolean Operations
You can perform a boolean operation on a value of type MqlBoolean using the methods described in this section.
Suppose you want to classify very low or high weather temperature readings (in degrees Fahrenheit) as extreme.
The or() method checks whether temperatures are extreme by comparing the temperature field to predefined values with lt() and gt(). The project() method records this result in the extremeTemp field.
The following code shows the pipeline for this aggregation:
var temperature = current().getInteger("temperature");
asList(project(fields(
    computed("extremeTemp", temperature
        .lt(of(10))
        .or(temperature.gt(of(95))))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: {
    extremeTemp: { $or: [
        { $lt: ["$temperature", 10] },
        { $gt: ["$temperature", 95] }
    ] }
} } ]
Comparison Operations
You can perform a comparison operation on a value of type MqlValue using the methods described in this section.
Tip
The cond() method is similar to the ternary operator in Java and you should use it for simple branches based on a boolean value. You should use the switchOn() methods for more complex comparisons such as performing pattern matching on the value type or other arbitrary checks on the value.
Java Method | Aggregation Pipeline Operator |
---|---|
The following example shows a pipeline that matches all the documents where the location field has the value "California":
var location = current().getString("location");
asList(match(expr(location.eq(of("California")))));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $match: { location: { $eq: "California" } } } ]
Conditional Operations
You can perform a conditional operation using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
Suppose you have a collection of customers with their membership information. Originally, customers were either members or not. Over time, membership levels were introduced and used the same field. The information stored in this field can be one of a few different types, and you want to create a standardized value indicating their membership level.
The switchOn() method checks each clause in order. If the value matches the type indicated by the clause, that clause determines the string value corresponding to the membership level. If the original value is a string, it represents the membership level and that value is used. If the data type is a boolean, it returns either Gold or Guest for the membership level. If the data type is an array, it returns the last string in the array, which represents the most recent membership level. If the member field is an unknown type, the switchOn() method provides a default value of Guest.
The following code shows the pipeline for this aggregation:
var member = current().getField("member");
asList(project(fields(
    computed("membershipLevel",
        member.switchOn(field -> field
            .isString(s -> s)
            .isBoolean(b -> b.cond(of("Gold"), of("Guest")))
            .<MqlString>isArray(a -> a.last())
            .defaults(d -> of("Guest"))))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: {
    membershipLevel: {
        $switch: {
            branches: [
                { case: { $eq: [ { $type: "$member" }, "string" ] },
                  then: "$member" },
                { case: { $eq: [ { $type: "$member" }, "bool" ] },
                  then: { $cond: {
                      if: "$member",
                      then: "Gold",
                      else: "Guest" } } },
                { case: { $eq: [ { $type: "$member" }, "array" ] },
                  then: { $last: "$member" } }
            ],
            default: "Guest" } }
} } ]
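The branch order described above can be sketched in plain Java with instanceof checks. This is an in-memory analogy rather than driver code. The MembershipSketch class is hypothetical, and it assumes that array values hold only strings.

```java
import java.util.List;

class MembershipSketch {
    // Mirrors the switchOn() clauses in order: string, boolean, array, default.
    static String membershipLevel(Object member) {
        if (member instanceof String) {
            return (String) member;
        }
        if (member instanceof Boolean) {
            return ((Boolean) member) ? "Gold" : "Guest";
        }
        if (member instanceof List) {
            List<?> levels = (List<?>) member;
            // Assumption: a non-empty list of strings. An empty list yields
            // null in this sketch.
            return levels.isEmpty() ? null : (String) levels.get(levels.size() - 1);
        }
        // Any other type, including null, falls through to the default.
        return "Guest";
    }

    public static void main(String[] args) {
        System.out.println(membershipLevel("Silver"));
        System.out.println(membershipLevel(true));
        System.out.println(membershipLevel(List.of("Bronze", "Silver")));
        System.out.println(membershipLevel(42));
    }
}
```

As with switchOn(), the first matching clause wins, so the order of the checks matters whenever a value could satisfy more than one of them.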
Convenience Operations
You can apply custom functions to values of type MqlValue using the methods described in this section.
To improve readability and allow for code reuse, you can move redundant code into static methods. However, it is not possible to directly chain static methods in Java. The passTo() method lets you chain values into custom static methods.
Java Method | Aggregation Pipeline Operator |
---|---|
No corresponding operator |
Suppose you want to determine how a class is performing against some benchmarks. You want to find the average final grade for each class and compare it against the benchmark values.
The following custom method gradeAverage() takes an array of documents and the name of an integer field shared across those documents. It calculates the average of that field across all the documents in the provided array. The evaluate() method compares a provided value to two provided range limits and generates a response string based on how the values compare:
public static MqlNumber gradeAverage(MqlArray<MqlDocument> students, String fieldName) {
    var sum = students.sum(student -> student.getInteger(fieldName));
    var avg = sum.divide(students.size());
    return avg;
}
public static MqlString evaluate(MqlNumber grade, MqlNumber cutoff1, MqlNumber cutoff2) {
    var message = grade.switchOn(on -> on
        .lte(cutoff1, g -> of("Needs improvement"))
        .lte(cutoff2, g -> of("Meets expectations"))
        .defaults(g -> of("Exceeds expectations")));
    return message;
}
Tip
One advantage of using the passTo() method is that you can reuse your custom methods for other aggregations. You can use the gradeAverage() method to find the average of grades for groups of students filtered by, for example, entry year or district, not just their class. You can use the evaluate() method to evaluate, for example, an individual student's performance, or an entire school's or district's performance.
The passArrayTo() method takes all of the students and calculates the average score by using the gradeAverage() method. Then, the passNumberTo() method uses the evaluate() method to determine how the classes are performing. This example stores the result as the evaluation field using the project() method.
The following code shows the pipeline for this aggregation:
var students = current().<MqlDocument>getArray("students");
asList(project(fields(
    computed("evaluation", students
        .passArrayTo(s -> gradeAverage(s, "finalGrade"))
        .passNumberTo(grade -> evaluate(grade, of(70), of(85))))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: {
    evaluation: { $switch: {
        branches: [
            { case: { $lte: [ { $avg: "$students.finalGrade" }, 70 ] },
              then: "Needs improvement" },
            { case: { $lte: [ { $avg: "$students.finalGrade" }, 85 ] },
              then: "Meets expectations" }
        ],
        default: "Exceeds expectations" } }
} } ]
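The idea of chaining a value through reusable static methods can be sketched with the JDK alone, here using Optional.map() as a loose analogy for passArrayTo() and passNumberTo(). This is not driver code. The GradeSketch class and its method signatures are hypothetical, and the sketch assumes a non-empty list of grades.

```java
import java.util.List;
import java.util.Optional;

class GradeSketch {
    // Plain-Java counterpart of gradeAverage(): mean of the given grades.
    // Assumes a non-empty list.
    static double gradeAverage(List<Integer> finalGrades) {
        int sum = 0;
        for (int grade : finalGrades) {
            sum += grade;
        }
        return (double) sum / finalGrades.size();
    }

    // Plain-Java counterpart of evaluate(): the first cutoff that the grade
    // does not exceed determines the message.
    static String evaluate(double grade, double cutoff1, double cutoff2) {
        if (grade <= cutoff1) {
            return "Needs improvement";
        }
        if (grade <= cutoff2) {
            return "Meets expectations";
        }
        return "Exceeds expectations";
    }

    // Chains the value through both static methods, left to right.
    static String evaluateClass(List<Integer> finalGrades) {
        return Optional.of(finalGrades)
                .map(GradeSketch::gradeAverage)
                .map(avg -> evaluate(avg, 70, 85))
                .get();
    }

    public static void main(String[] args) {
        System.out.println(evaluateClass(List.of(60, 70, 80)));
    }
}
```

Because both cutoffs use a less-than-or-equal comparison, an average of exactly 70 falls into the first branch and an average of exactly 85 into the second.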
Conversion Operations
You can perform a conversion operation to convert between certain MqlValue types using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
No corresponding operator | |
No corresponding operator | |
Suppose you have a collection of student data that includes their graduation years, which are stored as strings. You want to calculate the year of their five-year reunion and store this value in a new field.
The parseInteger() method converts the graduationYear to an integer so that add() can calculate the reunion year. The addFields() method stores this result as a new reunionYear field.
The following code shows the pipeline for this aggregation:
var graduationYear = current().getString("graduationYear");
asList(addFields(
    new Field("reunionYear", graduationYear
        .parseInteger()
        .add(5))
));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $addFields: { reunionYear: { $add: [ { $toInt: "$graduationYear" }, 5 ] } } } ]
Date Operations
You can perform a date operation on a value of type MqlDate using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
Suppose you have data about package deliveries and want to match deliveries that occurred on any Monday in the "America/New_York" time zone.
If the deliveryDate field contains any string values representing valid dates, such as "2018-01-15T16:00:00Z" or "Jan 15, 2018, 12:00 PM EST", you can use the parseDate() method to convert the strings into date types.
The dayOfWeek() method determines which day of the week the date falls on in the "America/New_York" time zone and returns that day as a number from 1 (Sunday) to 7 (Saturday). The eq() method compares this value to 2, which corresponds to Monday.
The following code shows the pipeline for this aggregation:
var deliveryDate = current().getString("deliveryDate");
asList(match(expr(deliveryDate
    .parseDate()
    .dayOfWeek(of("America/New_York"))
    .eq(of(2))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $match: {
    $expr: {
        $eq: [ {
            $dayOfWeek: {
                date: { $dateFromString: { dateString: "$deliveryDate" } },
                timezone: "America/New_York"
            } },
            2
        ] }
} } ]
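To see why the time zone parameter matters, the following java.time sketch computes the same day number for an instant in a given zone. It is a plain-Java analogy, not driver code. The DayOfWeekSketch class is hypothetical, and it accepts only ISO-8601 instant strings such as "2018-01-15T16:00:00Z", which is narrower than what the server-side conversion accepts.

```java
import java.time.Instant;
import java.time.ZoneId;

class DayOfWeekSketch {
    // Returns the day of the week for the instant in the given zone, using
    // the $dayOfWeek numbering: 1 = Sunday through 7 = Saturday.
    static int mongoDayOfWeek(String isoInstant, String zone) {
        // java.time numbers days from 1 = Monday through 7 = Sunday.
        int isoDay = Instant.parse(isoInstant)
                .atZone(ZoneId.of(zone))
                .getDayOfWeek()
                .getValue();
        return isoDay % 7 + 1;
    }

    public static void main(String[] args) {
        System.out.println(mongoDayOfWeek("2018-01-15T16:00:00Z", "America/New_York"));
        System.out.println(mongoDayOfWeek("2018-01-15T03:00:00Z", "America/New_York"));
    }
}
```

The second instant is early Monday morning in UTC but still Sunday evening in New York, so the two calls return different day numbers even though both timestamps carry the same calendar date in UTC.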
Document Operations
You can perform a document operation on a value of type MqlDocument using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
No corresponding operator | |
Suppose you have a collection of legacy customer data which includes addresses as child documents under the mailing.address field. You want to find all the customers who currently live in Washington state. A document in this collection might resemble the following:
{
    "_id": ...,
    "customer.name": "Mary Kenneth Keller",
    "mailing.address": {
        "street": "601 Mongo Drive",
        "city": "Vasqueztown",
        "state": "CO",
        "zip": 27017
    }
}
The getDocument() method retrieves the mailing.address field as a document so the nested state field can be retrieved with the getString() method. The eq() method checks if the value of the state field is "WA".
The following code shows the pipeline for this aggregation:
var address = current().getDocument("mailing.address");
asList(match(expr(address
    .getString("state")
    .eq(of("WA"))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $match: {
    $expr: {
        $eq: [ {
            $getField: {
                input: { $getField: { input: "$$CURRENT", field: "mailing.address" } },
                field: "state"
            } },
            "WA"
        ] }
} } ]
Map Operations
You can perform a map operation on a value of either type MqlMap or MqlEntry using the methods described in this section.
Tip
You should represent data as a map if the data maps arbitrary keys such as dates or item IDs to values.
Java Method | Aggregation Pipeline Operator |
---|---|
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator |
Suppose you have a collection of inventory data where each document represents an individual item you're responsible for supplying. Each document contains a field that is a map of all your warehouses and how many copies they currently have in their inventory of the item. You want to determine the total number of copies of items you have across all of your warehouses. A document in this collection might resemble the following:
{
    "_id": ...,
    "item": "notebook",
    "warehouses": {
        "Atlanta": 50,
        "Chicago": 0,
        "Portland": 120,
        "Dallas": 6
    }
}
The entries() method returns the map entries in the warehouses field as an array. The sum() method calculates the total number of items based on the values in the array retrieved with the getValue() method. This example stores the result as the new totalInventory field using the project() method.
The following code shows the pipeline for this aggregation:
var warehouses = current().<MqlNumber>getMap("warehouses");
asList(project(fields(
    computed("totalInventory", warehouses
        .entries()
        .sum(v -> v.getValue()))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: {
    totalInventory: {
        $sum: {
            $map: {
                input: { $objectToArray: "$warehouses" },
                as: "entry",
                in: "$$entry.v"
            } } }
} } ]
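The entries-then-sum pattern has a direct counterpart in java.util.Map. The following plain-Java sketch is an in-memory analogy, not driver code, and the InventorySketch class is hypothetical.

```java
import java.util.Map;

class InventorySketch {
    // Walks the map entries and adds up their values, mirroring
    // entries().sum(v -> v.getValue()).
    static int totalInventory(Map<String, Integer> warehouses) {
        return warehouses.entrySet().stream()
                .mapToInt(Map.Entry::getValue)
                .sum();
    }

    public static void main(String[] args) {
        Map<String, Integer> warehouses = Map.of(
                "Atlanta", 50,
                "Chicago", 0,
                "Portland", 120,
                "Dallas", 6);
        System.out.println(totalInventory(warehouses));
    }
}
```

For the sample warehouses, the total is 50 + 0 + 120 + 6 = 176. The keys play no part in the result, which is why a map suits data keyed by arbitrary names such as warehouse locations.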
String Operations
You can perform a string operation on a value of type MqlString using the methods described in this section.
Java Method | Aggregation Pipeline Operator |
---|---|
Suppose you want to generate lowercase usernames for employees of a company from the employees' last names and employee IDs.
The append() method combines the lastName and employeeID fields into a single username, while the toLower() method makes the entire username lowercase. This example stores the result as a new username field using the project() method.
The following code shows the pipeline for this aggregation:
var lastName = current().getString("lastName");
var employeeID = current().getString("employeeID");
asList(project(fields(
    computed("username", lastName
        .append(employeeID)
        .toLower())
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: { username: { $toLower: { $concat: ["$lastName", "$employeeID"] } } } } ]
Type-Checking Operations
You can perform a type-check operation on a value of type MqlValue using the methods described in this section.
These methods do not return boolean values. Instead, you provide a default value that matches the type specified by the method. If the checked value matches the method type, the checked value is returned. Otherwise, the supplied default value is returned. If you want to program branching logic based on the data type, see switchOn().
Java Method | Aggregation Pipeline Operator |
---|---|
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator | |
No corresponding operator |
Suppose you have a collection of rating data. An early version of the review schema allowed users to submit negative reviews without a star rating. You want to convert any of these negative reviews without a star rating to have the minimum value of 1 star.
The isNumberOr() method returns either the value of rating, or a value of 1 if rating is not a number or is null. The project() method returns this value as a new numericalRating field.
The following code shows the pipeline for this aggregation:
var rating = current().getField("rating");
asList(project(fields(
    computed("numericalRating", rating
        .isNumberOr(of(1)))
)));
The following code provides an equivalent aggregation pipeline in the Query API:
[ { $project: {
    numericalRating: {
        $cond: { if: { $isNumber: "$rating" }, then: "$rating", else: 1 } }
} } ]
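The value-or-default behavior of the type-checking methods can be sketched in plain Java as follows. This is an in-memory analogy, not driver code, and the RatingSketch class is hypothetical.

```java
class RatingSketch {
    // Returns the rating unchanged when it is numeric. Otherwise returns the
    // default of 1, mirroring isNumberOr(of(1)).
    static Number numericalRating(Object rating) {
        return (rating instanceof Number) ? (Number) rating : Integer.valueOf(1);
    }

    public static void main(String[] args) {
        System.out.println(numericalRating(4));
        System.out.println(numericalRating("no stars"));
        System.out.println(numericalRating(null));
    }
}
```

Note that the method returns a value rather than a boolean: a missing or non-numeric rating is replaced, and a numeric rating passes through untouched.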