Custom Types
On this page
Overview
This guide explains how to use PyMongo to encode and decode custom types.
Encode a Custom Type
You might need to define a custom type if you want to store a data type that
the driver doesn't understand. For example, the
BSON library's Decimal128
type is distinct
from Python's built-in Decimal
type. Attempting
to save an instance of Decimal
with PyMongo results in an
InvalidDocument
exception, as shown in the following code example:
from decimal import Decimal num = Decimal("45.321") db["coll"].insert_one({"num": num})
Traceback (most recent call last): ... bson.errors.InvalidDocument: cannot encode object: Decimal('45.321'), of type: <class 'decimal.Decimal'>
The following sections show how to define a custom type for this Decimal
type.
Define a Type Codec Class
To encode a custom type, you must first define a type codec. A type codec
describes how an instance of a custom type is
converted to and from a type that the bson
module can already encode.
When you define a type codec, your class must inherit from one of the base classes in the
codec_options
module. The following table describes these base classes, and when
and how to implement them:
Base Class | When to Use | Members to Implement |
---|---|---|
codec_options.TypeEncoder | Inherit from this class to define a codec that
encodes a custom Python type to a known BSON type. |
|
codec_options.TypeDecoder | Inherit from this class to define a codec that
decodes a specified BSON type into a custom Python type. |
|
codec_options.TypeCodec | Inherit from this class to define a codec that
can both encode and decode a custom type. |
|
Because the example Decimal
custom type can be converted to and from a
Decimal128
instance, you must define how to encode and decode this type.
Therefore, the Decimal
type codec class must inherit from
the TypeCodec
base class:
from bson.decimal128 import Decimal128 from bson.codec_options import TypeCodec class DecimalCodec(TypeCodec): python_type = Decimal bson_type = Decimal128 def transform_python(self, value): return Decimal128(value) def transform_bson(self, value): return value.to_decimal()
Add Codec to the Type Registry
After defining a custom type codec, you must add it to PyMongo's type registry,
the list of types the driver can encode and decode.
To do so, create an instance of the TypeRegistry
class, passing in an instance
of your type codec class inside a list. If you create multiple custom codecs, you can
pass them all to the TypeRegistry
constructor.
The following code examples adds an instance of the DecimalCodec
type codec to
the type registry:
from bson.codec_options import TypeRegistry decimal_codec = DecimalCodec() type_registry = TypeRegistry([decimal_codec])
Note
Once instantiated, registries are immutable and the only way to add codecs to a registry is to create a new one.
Get a Collection Reference
Finally, define a codec_options.CodecOptions
instance, passing your TypeRegistry
object as a keyword argument. Pass your CodecOptions
object to the
get_collection()
method to obtain a collection that can use your custom type:
from bson.codec_options import CodecOptions codec_options = CodecOptions(type_registry=type_registry) collection = db.get_collection("test", codec_options=codec_options)
You can then encode and decode instances of the Decimal
class:
import pprint collection.insert_one({"num": Decimal("45.321")}) my_doc = collection.find_one() pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal('45.321')}
To see how MongoDB stores an instance of the custom type,
create a new collection object without the customized codec options, then use it
to retrieve the document containing the custom type. The following example shows
that PyMongo stores an instance of the Decimal
class as a Decimal128
value:
import pprint new_collection = db.get_collection("test") pprint.pprint(new_collection.find_one())
{'_id': ObjectId('...'), 'num': Decimal128('45.321')}
Encode a Subtype
You might also need to encode one or more types that inherit from your custom type.
Consider the following subtype of the Decimal
class, which contains a method to
return its value as an integer:
class DecimalInt(Decimal): def get_int(self): return int(self)
If you try to save an instance of the DecimalInt
class without first registering a type
codec for it, PyMongo raises an error:
collection.insert_one({"num": DecimalInt("45.321")})
Traceback (most recent call last): ... bson.errors.InvalidDocument: cannot encode object: Decimal('45.321'), of type: <class 'decimal.Decimal'>
To encode an instance of the DecimalInt
class, you must define a type codec for
the class. This type codec must inherit from the parent class's codec, DecimalCodec
,
as shown in the following example:
class DecimalIntCodec(DecimalCodec): def python_type(self): # The Python type encoded by this type codec return DecimalInt
You can then add the sublcass's type codec to the type registry and encode instances of the custom type:
import pprint from bson.codec_options import CodecOptions decimal_int_codec = DecimalIntCodec() type_registry = TypeRegistry([decimal_codec, decimal_int_codec]) codec_options = CodecOptions(type_registry=type_registry) collection = db.get_collection("test", codec_options=codec_options) collection.insert_one({"num": DecimalInt("45.321")}) my_doc = collection.find_one() pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal('45.321')}
Note
The transform_bson()
method of the DecimalCodec
class results in
these values being decoded as Decimal
, not DecimalInt
.
Define a Fallback Encoder
You can also register a fallback encoder, a callable to encode types not recognized by BSON and for which no type codec has been registered. The fallback encoder accepts an unencodable value as a parameter and returns a BSON-encodable value.
The following fallback encoder encodes Python's Decimal
type to a Decimal128
:
def fallback_encoder(value): if isinstance(value, Decimal): return Decimal128(value) return value
After declaring a fallback encoder, perform the following steps:
Construct a new instance of the
TypeRegistry
class. Use thefallback_encoder
keyword argument to pass in the fallback encoder.Construct a new instance of the
CodecOptions
class. Use thetype_registry
keyword argument to pass in theTypeRegistry
instance.Call the
get_collection()
method. Use thecodec_options
keyword argument to pass in theCodecOptions
instance.
The following code example shows this process:
type_registry = TypeRegistry(fallback_encoder=fallback_encoder) codec_options = CodecOptions(type_registry=type_registry) collection = db.get_collection("test", codec_options=codec_options)
You can then use this reference to a collection to store instances of the Decimal
class:
import pprint collection.insert_one({"num": Decimal("45.321")}) my_doc = collection.find_one() pprint.pprint(my_doc)
{'_id': ObjectId('...'), 'num': Decimal128('45.321')}
Note
Fallback encoders are invoked after attempts to encode the given value with standard BSON encoders and any configured type encoders have failed. Therefore, in a type registry configured with a type encoder and fallback encoder that both target the same custom type, the behavior specified in the type encoder takes precedence.
Encode Unknown Types
Because fallback encoders don't need to declare the types that they encode
beforehand, you can use them in cases where a TypeEncoder
doesn't work.
For example, you can use a fallback encoder to save arbitrary objects to MongoDB.
Consider the following arbitrary custom types:
class MyStringType(object): def __init__(self, value): self.__value = value def __repr__(self): return "MyStringType('%s')" % (self.__value,) class MyNumberType(object): def __init__(self, value): self.__value = value def __repr__(self): return "MyNumberType(%s)" % (self.__value,)
You can define a fallback encoder that pickles the objects it receives
and returns them as Binary
instances with a custom
subtype. This custom subtype allows you to define a TypeDecoder
class that
identifies pickled artifacts upon retrieval and transparently decodes them
back into Python objects:
import pickle from bson.binary import Binary, USER_DEFINED_SUBTYPE from bson.codec_options import TypeDecoder def fallback_pickle_encoder(value): return Binary(pickle.dumps(value), USER_DEFINED_SUBTYPE) class PickledBinaryDecoder(TypeDecoder): bson_type = Binary def transform_bson(self, value): if value.subtype == USER_DEFINED_SUBTYPE: return pickle.loads(value) return value
You can then add the PickledBinaryDecoder
to the codec options for a collection
and use it to encode and decode your custom types:
from bson.codec_options import CodecOptions,TypeRegistry codec_options = CodecOptions( type_registry=TypeRegistry( [PickledBinaryDecoder()], fallback_encoder=fallback_pickle_encoder ) ) collection = db.get_collection("test", codec_options=codec_options) collection.insert_one( {"_id": 1, "str": MyStringType("hello world"), "num": MyNumberType(2)} ) my_doc = collection.find_one() print(isinstance(my_doc["str"], MyStringType)) print(isinstance(my_doc["num"], MyNumberType))
True True
Limitations
PyMongo type codecs and fallback encoders have the following limitations:
You can't customize the encoding behavior of Python types that PyMongo already understands, like
int
andstr
. If you try to instantiate a type registry with one or more codecs that act upon a built-in type, PyMongo raises aTypeError
. This limitation also applies to all subtypes of the standard types.You can't chain type encoders. A custom type value, once transformed by a codec's
transform_python()
method, must result in a type that is either BSON-encodable by default, or can be transformed by the fallback encoder into something BSON-encodable. It cannot be transformed a second time by a different type codec.The
Database.command()
method doesn't apply custom type decoders while decoding the command response document.The
gridfs
class doesn't apply custom type encoding or decoding to any documents it receives or returns.
API Documentation
For more information about encoding and decoding custom types, see the following API documentation: