Create a MongoClient
Overview
To connect to a MongoDB deployment, you need two things:
A connection URI, also known as a connection string, which tells PyMongo which MongoDB deployment to connect to.
A MongoClient object, which creates the connection to the MongoDB deployment and lets you perform operations on it.
You can also use either of these components to customize the way PyMongo behaves while connected to MongoDB.
This guide shows you how to create a connection string and use a MongoClient object to connect to MongoDB.
Connection URI
A standard connection string includes the following components:
Component | Description |
---|---|
`mongodb://` | Required. A prefix that identifies this as a string in the standard connection format. |
`username:password` | Optional. Authentication credentials. If you include these, the client authenticates the user against the database specified in `authSource`. |
`host[:port]` | Required. The host and optional port number where MongoDB is running. If you don't include the port number, the driver uses the default port, `27017`. |
`/defaultauthdb` | Optional. The authentication database to use if the connection string includes `username:password@` authentication credentials but not the `authSource` option. |
`?<options>` | Optional. A query string that specifies connection-specific options as `<name>=<value>` pairs. |
For more information about creating a connection string, see Connection Strings in the MongoDB Server documentation.
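As a rough illustration of how these components fit together, the following sketch assembles a connection string by using only the Python standard library. The `build_mongodb_uri()` helper, the sample credentials, and the `authSource` value are hypothetical and are not part of PyMongo. The sketch percent-encodes the username and password with `urllib.parse.quote_plus()` so that reserved characters such as `@` and `/` don't break the URI:

```python
from urllib.parse import quote_plus, urlencode


def build_mongodb_uri(host, port=None, username=None, password=None,
                      default_auth_db=None, options=None):
    """Assemble a standard-format connection string (hypothetical helper)."""
    # Optional credentials, percent-encoded so reserved characters stay valid.
    credentials = ""
    if username is not None and password is not None:
        credentials = f"{quote_plus(username)}:{quote_plus(password)}@"
    # Required host, with an optional port.
    host_part = host if port is None else f"{host}:{port}"
    # Optional default authentication database.
    path = f"/{default_auth_db}" if default_auth_db else "/"
    # Optional query string of name=value pairs.
    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{credentials}{host_part}{path}{query}"


uri = build_mongodb_uri(
    "localhost",
    27017,
    username="app_user",
    password="p@ss/word",
    options={"authSource": "admin"},
)
print(uri)
```

You can pass the resulting string to the MongoClient constructor in the same way as a hand-written URI.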
MongoClient
To create a connection to MongoDB, pass a connection URI as a string to the MongoClient constructor. In the following example, the driver uses a sample connection URI to connect to a MongoDB instance on port 27017 of localhost:
from pymongo import MongoClient

uri = "mongodb://localhost:27017/"
client = MongoClient(uri)
The following table describes the positional parameters that the MongoClient() constructor accepts. All parameters are optional.
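Parameter | Description |
---|---|
`host` | The hostname, IP address, or Unix domain socket path of the MongoDB deployment. If your application connects to a replica set or sharded cluster, you can specify multiple hostnames or IP addresses in a Python list. If you pass a literal IPv6 address, you must enclose the address in square brackets (`[` and `]`). PyMongo doesn't support multihomed and round-robin DNS addresses. Data type: `Union[str, Sequence[str]]` |
`port` | The port number MongoDB Server is running on. You can include the port number in the `host` argument instead of using this parameter. Data type: `int` |
`document_class` | The default class that the client uses to decode BSON documents returned by queries. This parameter supports the `bson.raw_bson.RawBSONDocument` type and subclasses of the `collections.abc.Mapping` type, such as `bson.son.SON`. If you specify `bson.son.SON` as the document class, you must also specify types for the key and value. Data type: `Type[_DocumentType]` |
`tz_aware` | If this parameter is `True`, the client treats `datetime` values as aware. Otherwise, it treats them as naive. For more information about aware and naive `datetime` values, see datetime in the Python documentation. Data type: `bool` |
`connect` | If this parameter is `True`, the client begins connecting to MongoDB in the background immediately after you create it. If this parameter is `False`, the client connects to MongoDB when it performs the first database operation. If your application is running in a function-as-a-service (FaaS) environment, the default value is `False`. Otherwise, the default value is `True`. Data type: `bool` |
`type_registry` | An instance of the `TypeRegistry` class to enable encoding and decoding of custom types. Data type: `TypeRegistry` |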
Concurrent Execution
The following sections describe PyMongo's support for concurrent execution mechanisms.
Multithreading
PyMongo is thread-safe and provides built-in connection pooling for threaded applications. Because each MongoClient object represents a pool of connections to the database, most applications require only a single instance of MongoClient, even across multiple requests.
Multiple Forks
PyMongo supports calling the fork() method to create a new process. However, if you fork a process, you must create a new MongoClient instance in the child process.
Important
Don't Pass a MongoClient to a Child Process
If you use the fork() method to create a new process, don't pass an instance of the MongoClient class from the parent process to the child process. This creates a high probability of deadlock among MongoClient instances in the child process.
PyMongo tries to issue a warning if this deadlock might occur.
For more information about deadlock in forked processes, see Forking a Process Causes a Deadlock.
Multiprocessing
PyMongo supports the Python multiprocessing module. However, on Unix systems, the multiprocessing module spawns processes by using the fork() method. This carries the same risks described in Multiple Forks.
To use multiprocessing with PyMongo, write code similar to the following example:
import multiprocessing

import pymongo

# Each process creates its own instance of MongoClient.
def func():
    db = pymongo.MongoClient().mydb
    # Do something with db.

proc = multiprocessing.Process(target=func)
proc.start()
Important
Do not copy an instance of the MongoClient class from the parent process to a child process.
Type Hints
If you're using Python v3.5 or later, you can add type hints to your Python code.
The following code example shows how to declare a type hint for a MongoClient object:
client: MongoClient = MongoClient()
In the previous example, the code doesn't specify a type for the documents that the MongoClient object will work with. To specify a document type, include the Dict[str, Any] type when you create the MongoClient object, as shown in the following example:
from typing import Any, Dict

client: MongoClient[Dict[str, Any]] = MongoClient()
Troubleshooting
MongoClient Fails with ConfigurationError
Providing invalid keyword argument names causes the driver to raise this error.
Ensure that the keyword arguments that you specify exist and are spelled correctly.
Forking a Process Causes a Deadlock
A MongoClient instance spawns multiple threads to run background tasks, such as monitoring connected servers. These threads share state that is protected by instances of the threading.Lock class, which are themselves not fork-safe.
PyMongo is subject to the same limitations as any other multithreaded code that uses the threading.Lock class, or any mutexes.
One of these limitations is that the locks become useless after calling the fork() method. When fork() executes, the child process receives copies of all the parent process's locks in the same state as they were in the parent. If they are locked in the parent process, they are also locked in the child process. The child process created by fork() has only one thread, so any locks held by other threads in the parent process are never released in the child process. The next time the child process attempts to acquire one of these locks, deadlock occurs.
Starting in PyMongo version 4.3, after you call the os.fork() method, the driver uses the os.register_at_fork() method to reset its locks and other shared state in the child process. Although this reduces the likelihood of a deadlock, PyMongo depends on libraries that aren't fork-safe in multithreaded applications, including OpenSSL and getaddrinfo(3). Therefore, a deadlock can still occur.
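To illustrate the general technique (not PyMongo's actual implementation), the following standard-library sketch registers an `after_in_child` hook that replaces a lock inherited from the parent with a fresh one. The `_state_lock` variable and both function names are hypothetical, and the code runs only on platforms that provide os.fork():

```python
import os
import threading

# Module-level lock standing in for a library's internal shared state (hypothetical).
_state_lock = threading.Lock()


def _reset_lock_in_child():
    """Replace the inherited lock with a fresh, unlocked one in the child process."""
    global _state_lock
    _state_lock = threading.Lock()


# Run the reset hook in every child process created by os.fork().
os.register_at_fork(after_in_child=_reset_lock_in_child)


def child_can_acquire_lock():
    """Fork while the parent holds the lock; report whether the child can acquire it."""
    read_fd, write_fd = os.pipe()
    with _state_lock:  # The lock is held at the moment of the fork.
        pid = os.fork()
        if pid == 0:
            try:
                # Child: the hook has already swapped in a fresh lock, so this succeeds.
                acquired = _state_lock.acquire(blocking=False)
                os.write(write_fd, b"1" if acquired else b"0")
            finally:
                os._exit(0)
    # Parent: read the child's one-byte report and reap the child process.
    os.close(write_fd)
    result = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return result == b"1"
```

Without the registered hook, the child would inherit the lock in its locked state and the non-blocking acquire would fail. A hook like this resets only the locks that the registering code owns, which is why locks inside other libraries can still cause deadlocks.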
The Linux manual page for fork(2) also imposes the following restriction:
After a fork() in a multithreaded program, the child can safely call only async-signal-safe functions (see signal-safety(7)) until such time as it calls execve(2).
Because PyMongo relies on functions that are not async-signal-safe, it can cause deadlocks or crashes when running in a child process.
Tip
For an example of a deadlock in a child process, see PYTHON-3406 in Jira.
For more information about the problems caused by Python locks in multithreaded contexts with fork(), see Issue 6721 in the Python Issue Tracker.
API Documentation
To learn more about creating a MongoClient object in PyMongo, see the MongoClient API documentation.