Docs Menu

Docs HomeDevelop ApplicationsPython DriversPyMongo

Retrieve Distinct Field Values

On this page

  • Overview
  • Sample Data
  • distinct() Method
  • Retrieve Distinct Values Across a Collection
  • Retrieve Distinct Values Across Specified Documents
  • Modify Distinct Behavior
  • API Documentation

Within a collection, different documents might contain different values for a single field. For example, one restaurant document has a borough value of "Manhattan", and another has a borough value of "Queens". With PyMongo, you can retrieve all the distinct values that a field contains across multiple documents in a collection.

The examples in this guide use the sample_restaurants.restaurants collection from the Atlas sample datasets. To learn how to create a free MongoDB Atlas cluster and load the sample datasets, see the Get Started with PyMongo.

To retrieve the distinct values for a specified field, call the distinct() method and pass in the name of the field you want to find distinct values for.

The following example retrieves the distinct values of the borough field in the restaurants collection:

results = restaurants.distinct("borough")
for restaurant in results:
print(restaurant)
Bronx
Brooklyn
Manhattan
Missing
Queens
Staten Island

The results show every distinct value that appears in the borough field across all documents in the collection. Although several documents have the same value in the borough field, each value appears in the results only once.

You can provide a query filter to the distinct() method to find the distinct field values across a subset of documents in a collection. A query filter is an expression that specifies search criteria used to match documents in an operation. For more information about creating a query filter, see Specify a Query.

The following example retrieves the distinct values of the borough field for all documents that have a cuisine field value of "Italian":

results = restaurants.distinct("borough", {
"cuisine": "Italian"
})
for restaurant in results:
print(restaurant)
Bronx
Brooklyn
Manhattan
Queens
Staten Island

The distinct() method accepts optional parameters, which represent options you can use to configure the operation. If you don't specify any options, the driver does not customize the operation.

The following table describes the options you can set to customize distinct():

Property
Description
filter
A query filter that specifies the documents to retrieve distinct values from.
session
An instance of ClientSession.
comment
A comment to attach to the operation.
maxTimeMS
The maximum amount of time to allow the operation to run, in milliseconds.
collation
An instance of Collation.

The following example retrieves the distinct values of the name field for all documents that have a borough field value of "Bronx" and a cuisine field value of "Pizza". It also uses the comment option to add a comment to the operation.

results = restaurants.distinct("name",
{ "borough": "Bronx",
"cuisine": "Pizza" },
comment="Bronx pizza restaurants"
)
$1.25 Pizza
18 East Gunhill Pizza
2 Bros
Aenos Pizza
Alitalia Pizza Restaurant
...

To learn more about any of the methods or types discussed in this guide, see the following API documentation:

← Count Documents