How to Set Up MongoDB Class Maps for C# for Optimal Query Performance and Storage Size
Starting out with MongoDB and C#? These tips will help you get your class maps right from the beginning to support your desired schema.
When I started my first projects with MongoDB and C# several years ago, what captivated me most was how easy it was to store plain old CLR objects (POCOs) in a collection without first having to create a static relational structure and then painfully maintain it over the course of development.
Though MongoDB and C# have their own set of data types and naming conventions, the MongoDB C# Driver connects the two in a very seamless manner. At the center of this, class maps are used to describe the details of the mapping.
This post shows how to fine-tune the mapping in key areas and offers solutions to common scenarios.
Even if you don't define a class map explicitly, the driver will create one as soon as the class is used for a collection. In this case, the properties of the POCO are mapped to elements in the BSON document based on the name. The driver also tries to match the property type to the BSON type of the element in MongoDB.
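To see what the automatically created class map does, the following minimal sketch serializes an unannotated POCO to a BsonDocument without connecting to a server. The Views property and the sample values are made up for illustration, and the sketch assumes that the MongoDB.Bson package is referenced.

```csharp
using System;
using MongoDB.Bson;

// No class map is registered; the driver auto-maps BlogPost on first use.
var post = new BlogPost { Title = "Hello", Views = 3 };
BsonDocument doc = post.ToBsonDocument();

// Print each element name together with the BSON type chosen for it.
foreach (BsonElement element in doc)
{
    Console.WriteLine($"{element.Name}: {element.Value.BsonType}");
}

// Hypothetical POCO used only for this illustration.
public class BlogPost
{
    public string Title { get; set; } = string.Empty;
    public int Views { get; set; }
}
```

The element names match the property names, and the string and int properties end up as BSON String and Int32 values.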
Though automatic mapping of a class will make sure that POCOs can be stored in a collection easily, tweaking the mapping is rewarded by better memory efficiency and enhanced query performance. Also, if you are working with existing data, customizing the mapping allows POCOs to follow C# and .NET naming conventions without changing the schema of the data in the collection.
Adjusting the class map can be as easy as adding attributes to the declaration of a POCO (declarative mapping). These attributes are used by the driver when the class map is auto-mapped. This happens when the class is first used to access data in a collection:
    public class BlogPost
    {
        // ...
        [BsonElement("title")]
        public string Title { get; set; } = string.Empty;
        // ...
    }
The above sample shows how the BsonElement attribute is used to adjust the name of the Title property in a document in MongoDB:

    {
      // ...
      "title": "Blog post title",
      // ...
    }
However, there are scenarios where declarative mapping is not applicable. If you cannot change the POCOs because they are defined in a third-party library, or if you want to keep your POCOs free of MongoDB-related code, you can instead define the class maps imperatively by calling methods in code:
    BsonClassMap.RegisterClassMap<BlogPost>(cm =>
    {
        cm.AutoMap();
        cm.MapMember(x => x.Title).SetElementName("title");
    });
The code above first performs the auto-mapping and then includes the Title property in the mapping as an element named title in BSON, thus overriding the auto-mapping for this specific property.

One thing to keep in mind is that the class map needs to be registered before the driver starts the automatic mapping process for a class. It is a good idea to include it in the bootstrapping process of the application.
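One way to fit the registration into the bootstrapping process is a small startup helper, sketched below. The MongoMappings class and its Register method are hypothetical names; only BsonClassMap.IsClassMapRegistered and BsonClassMap.RegisterClassMap are driver calls. The guard is there because registering a class map for the same type a second time fails.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;

// Call once at application startup, before any collection is accessed.
MongoMappings.Register();
MongoMappings.Register(); // a repeated call is a no-op thanks to the guard

BsonDocument doc = new BlogPost { Title = "Hello" }.ToBsonDocument();
Console.WriteLine(doc.ToJson());

// Hypothetical bootstrap helper; the name is an assumption for this sketch.
public static class MongoMappings
{
    public static void Register()
    {
        // Skip the registration if a class map for BlogPost already exists.
        if (BsonClassMap.IsClassMapRegistered(typeof(BlogPost)))
        {
            return;
        }

        BsonClassMap.RegisterClassMap<BlogPost>(cm =>
        {
            cm.AutoMap();
            cm.MapMember(x => x.Title).SetElementName("title");
        });
    }
}

public class BlogPost
{
    public string Title { get; set; } = string.Empty;
}
```

The check and the registration are not atomic, so this sketch assumes that Register is called from a single thread during startup.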
This post uses declarative mapping for better readability, but all of the adjustments can also be made using imperative mapping. You can find an imperative class map that contains all the samples at the end of the post.
Whether you are working with existing data or want to name properties differently in BSON for other reasons, you can use the BsonElement("specificElementName") attribute introduced above. This is especially handy if you only want to change the name of a limited set of properties.

If you want to change the naming scheme in a widespread fashion, you can use a convention that is applied when auto-mapping the classes. The driver offers a number of conventions out-of-the-box (see the namespace MongoDB.Bson.Serialization.Conventions) and offers the flexibility to create custom ones if those are not sufficient.
For example, you can name the POCO properties in Pascal case, in line with C# naming guidelines, but name the elements in camel case in BSON by adding the CamelCaseElementNameConvention:
    var pack = new ConventionPack();
    pack.Add(new CamelCaseElementNameConvention());
    ConventionRegistry.Register(
        "Camel Case Convention",
        pack,
        t => true);
Please note the predicate in the last parameter. This can be used to fine-tune whether the convention is applied to a type or not. In our sample, it is applied to all classes.
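As a sketch of a narrower predicate, the following variant applies the convention only to types in one namespace. The namespaces MyApp.Models and MyApp.Other and the AuditEntry class are assumptions made up for this example.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Conventions;

var pack = new ConventionPack();
pack.Add(new CamelCaseElementNameConvention());

// Apply the pack only to types in the (hypothetical) MyApp.Models namespace.
ConventionRegistry.Register(
    "Camel Case Convention For Models",
    pack,
    t => t.Namespace == "MyApp.Models");

BsonDocument post = new MyApp.Models.BlogPost { Title = "Hello" }.ToBsonDocument();
BsonDocument audit = new MyApp.Other.AuditEntry { Message = "Created" }.ToBsonDocument();

Console.WriteLine(post.ToJson());  // element is named "title"
Console.WriteLine(audit.ToJson()); // element keeps the name "Message"

namespace MyApp.Models
{
    public class BlogPost
    {
        public string Title { get; set; } = string.Empty;
    }
}

namespace MyApp.Other
{
    public class AuditEntry
    {
        public string Message { get; set; } = string.Empty;
    }
}
```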
The above code needs to be run before auto-mapping takes place. You can still apply a BsonElement attribute here and there if you want to override some of the names.

MongoDB uses ObjectIds as identifiers for documents by default for the "_id" field. An ObjectId is unique to a very high probability and takes up 12 bytes. If you are working with existing data, you will encounter ObjectIds for sure. Also, when setting up new documents, ObjectIds are the preferred choice for identifiers. In comparison to GUIDs (UUIDs), they require less storage space and are ordered so that identifiers that are created later receive higher values.
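The size and ordering properties can be checked with the ObjectId type from the MongoDB.Bson package. This is a small sketch; the two timestamps are arbitrary values chosen for the comparison.

```csharp
using System;
using MongoDB.Bson;

// An ObjectId occupies 12 bytes, a GUID 16 bytes.
ObjectId id = ObjectId.GenerateNewId();
Console.WriteLine(id.ToByteArray().Length);
Console.WriteLine(Guid.NewGuid().ToByteArray().Length);

// The leading bytes encode the creation time with second precision.
Console.WriteLine(id.CreationTime);

// Because the timestamp comes first, a later identifier compares as greater.
ObjectId older = ObjectId.GenerateNewId(new DateTime(2020, 1, 1, 0, 0, 0, DateTimeKind.Utc));
ObjectId newer = ObjectId.GenerateNewId(new DateTime(2023, 12, 1, 0, 0, 0, DateTimeKind.Utc));
Console.WriteLine(older.CompareTo(newer) < 0);
```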
In C#, properties can use ObjectId as their type. However, using string as the property type in C# simplifies the handling of the identifiers and increases interoperability with other frameworks that are not specific to MongoDB (e.g. OData).

In contrast, MongoDB should serialize the identifiers with the specific BSON type ObjectId to reduce storage size. In addition, performing a binary comparison on ObjectIds is much safer than comparing strings as you do not have to take letter casing, etc. into account.
    public class BlogPost
    {
        [BsonRepresentation(BsonType.ObjectId)]
        public string Id { get; set; } = ObjectId.GenerateNewId().ToString();
        // ...
        [BsonRepresentation(BsonType.ObjectId)]
        public ICollection<string> TopComments { get; set; } = new List<string>();
    }
By applying the BsonRepresentation attribute, the Id property is serialized as an ObjectId in BSON. The identifiers in the TopComments array are likewise stored as ObjectIds:

    {
      "_id" : ObjectId("6569b12c6240d94108a10d20"),
      // ...
      "TopComments" : [
        ObjectId("6569b12c6240d94108a10d21"),
        ObjectId("6569b12c6240d94108a10d22")
      ]
    }
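A quick way to confirm the representation without a server is to serialize a POCO to a BsonDocument and read it back. The sketch below reduces BlogPost to its Id property for brevity.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Attributes;

var post = new BlogPost();
BsonDocument doc = post.ToBsonDocument();

// The string property is stored as a BSON ObjectId in the "_id" element.
Console.WriteLine(doc["_id"].BsonType);

// Deserialization converts the ObjectId back into the same string.
BlogPost roundTripped = BsonSerializer.Deserialize<BlogPost>(doc);
Console.WriteLine(roundTripped.Id == post.Id);

// Reduced version of the BlogPost class, for illustration only.
public class BlogPost
{
    [BsonRepresentation(BsonType.ObjectId)]
    public string Id { get; set; } = ObjectId.GenerateNewId().ToString();
}
```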
While ObjectId is the default type of identifier for MongoDB, GUIDs or UUIDs are a data type that is used for identifying objects in a variety of programming languages. In order to store and query them efficiently, using a binary format instead of strings is also preferred.

In the past, GUIDs/UUIDs have been stored as BSON type binary of subtype 3; drivers for different programming environments serialized the value differently. Hence, reading GUIDs with the C# driver that had been serialized with a Java driver did not yield the same value. To fix this, the new binary subtype 4 was introduced by MongoDB. GUIDs/UUIDs are then serialized in the same way across drivers and languages.
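The difference between the legacy encodings can be made visible with the GuidConverter helper from the MongoDB.Bson package. The sketch below converts one arbitrary sample GUID to bytes using three representations and then deliberately reads the bytes back with the wrong one.

```csharp
using System;
using MongoDB.Bson;

Guid guid = Guid.Parse("00112233-4455-6677-8899-aabbccddeeff");

// The same GUID produces a different byte order per representation.
byte[] standard = GuidConverter.ToBytes(guid, GuidRepresentation.Standard);
byte[] csharpLegacy = GuidConverter.ToBytes(guid, GuidRepresentation.CSharpLegacy);
byte[] javaLegacy = GuidConverter.ToBytes(guid, GuidRepresentation.JavaLegacy);

Console.WriteLine(BitConverter.ToString(standard));
Console.WriteLine(BitConverter.ToString(csharpLegacy));
Console.WriteLine(BitConverter.ToString(javaLegacy));

// Reading bytes with a representation other than the one used for writing
// yields a different GUID.
Guid misread = GuidConverter.FromBytes(csharpLegacy, GuidRepresentation.JavaLegacy);
Console.WriteLine(misread == guid);
```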
To provide the flexibility to handle both existing and new values on a property level, the MongoDB C# Driver introduced a new way of handling GUIDs. This is referred to as GuidRepresentationMode.V3. For backward compatibility, when using version 2.x of the MongoDB C# Driver, the GuidRepresentationMode is V2 by default (resulting in binary subtype 3). This is set to change with MongoDB C# Driver version 3. It is a good idea to opt into using V3 now and specify the subtype that should be used for GUIDs on a property level. For new GUIDs, subtype 4 should be used.

This can be achieved by running the following code before creating the client:
    BsonDefaults.GuidRepresentationMode = GuidRepresentationMode.V3;
Keep in mind that this setting requires the representation of the GUID to be specified on a property level. Otherwise, a BsonSerializationException will be thrown informing you that "GuidSerializer cannot serialize a Guid when GuidRepresentation is Unspecified." To fix this, add a BsonGuidRepresentation attribute to the property:

    [BsonGuidRepresentation(GuidRepresentation.Standard)]
    public Guid MyGuid { get; set; } = Guid.NewGuid();
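The following sketch checks which binary subtype ends up in the document. It assumes version 2.x of the driver, where the BsonDefaults.GuidRepresentationMode switch exists but is marked as obsolete (hence the pragma); the Sample class is a made-up name.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Assumption: driver 2.x. Opt into V3 before anything is serialized.
#pragma warning disable CS0618
BsonDefaults.GuidRepresentationMode = GuidRepresentationMode.V3;
#pragma warning restore CS0618

BsonDocument doc = new Sample().ToBsonDocument();
BsonBinaryData binary = doc["MyGuid"].AsBsonBinaryData;

// With GuidRepresentation.Standard, the value is stored as binary subtype 4.
Console.WriteLine(binary.SubType);

// Hypothetical class used only for this illustration.
public class Sample
{
    [BsonGuidRepresentation(GuidRepresentation.Standard)]
    public Guid MyGuid { get; set; } = Guid.NewGuid();
}
```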
There are various settings available for GuidRepresentation. For new GUIDs, Standard is the preferred value, while the other values (e.g., CSharpLegacy) support the serialization of existing values in binary subtype 3.

Maybe you are working with existing data and only some part of the elements is relevant to your use case. Or you have older documents in your collection that contain elements that are not relevant anymore. Whatever the reason, you want to keep the POCO minimal so that it only comprises the relevant properties.
By default, the MongoDB C# Driver is strict and raises a FormatException if it encounters elements in a BSON document that cannot be mapped to a property on the POCO: "Element '[...]' does not match any field or property of class [...]." Those elements are called "extra elements."

One way to handle this is to simply ignore extra elements by applying the BsonIgnoreExtraElements attribute to the POCO:

    [BsonIgnoreExtraElements]
    public class BlogPost
    {
        // ...
    }
If you want to use this behavior on a large scale, you can again register a convention:
    var pack = new ConventionPack();
    pack.Add(new IgnoreExtraElementsConvention(true));
    ConventionRegistry.Register(
        "Ignore Extra Elements Convention",
        pack,
        t => true);
Be aware that if you store the document with a replace operation (e.g., ReplaceOne), extra elements that the POCO does not know about will be lost.
On the other hand, MongoDB's flexible schema is built for handling documents with different elements. If you are interested in the extra properties or you want to safeguard for a replace, you can add a dictionary to your POCO and mark it with a BsonExtraElements attribute. The dictionary is filled with the content of the properties upon deserialization:

    public class BlogPost
    {
        // ...
        [BsonExtraElements]
        public IDictionary<string, object> ExtraElements { get; set; } = new Dictionary<string, object>();
    }
Even when replacing a document that contains an extra-elements-dictionary, the key-value pairs of the dictionary are serialized as elements so that their content is not lost (or even updated if the value in the dictionary has been changed).
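The strict default and the extra-elements dictionary can be compared in a short sketch that works on an in-memory BsonDocument. The classes StrictPost and TolerantPost and the legacyRating element are hypothetical names invented for this comparison.

```csharp
using System;
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Attributes;

// A stored document with an element that the POCOs below do not declare.
var stored = new BsonDocument { { "Title", "Hello" }, { "legacyRating", 5 } };

// Strict default: the unknown element causes a FormatException.
try
{
    BsonSerializer.Deserialize<StrictPost>(stored);
}
catch (FormatException ex)
{
    Console.WriteLine(ex.Message);
}

// With an extra-elements dictionary, the unknown element is captured
// and written back when the object is serialized again.
TolerantPost post = BsonSerializer.Deserialize<TolerantPost>(stored);
BsonDocument written = post.ToBsonDocument();
Console.WriteLine(written.ToJson());

public class StrictPost
{
    public string Title { get; set; } = string.Empty;
}

public class TolerantPost
{
    public string Title { get; set; } = string.Empty;

    [BsonExtraElements]
    public IDictionary<string, object> ExtraElements { get; set; } = new Dictionary<string, object>();
}
```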
Pre-calculation is key for great query performance and is a common pattern when working with MongoDB. In POCOs, this is supported by adding read-only properties, e.g.:
    public class BlogPost
    {
        // ...
        public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
        public DateTime? UpdatedAt { get; set; }
        public DateTime LastChangeAt => UpdatedAt ?? CreatedAt;
    }
By default, the driver excludes read-only properties from serialization. This can be fixed easily by applying a BsonElement attribute to the property; you don't need to change the name:

    public class BlogPost
    {
        // ...
        public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
        public DateTime? UpdatedAt { get; set; }
        [BsonElement]
        public DateTime LastChangeAt => UpdatedAt ?? CreatedAt;
    }
After this change, the read-only property is included in the document and it can be used in indexes and queries:
    {
      // ...
      "CreatedAt" : ISODate("2023-12-01T12:16:34.441Z"),
      "UpdatedAt" : null,
      "LastChangeAt" : ISODate("2023-12-01T12:16:34.441Z")
    }
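The effect of the attribute can be verified side by side without a server. In this sketch, the two class names are hypothetical and differ only in whether the read-only property carries a BsonElement attribute.

```csharp
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

BsonDocument without = new PostWithoutAttribute().ToBsonDocument();
BsonDocument with = new PostWithAttribute().ToBsonDocument();

// The read-only property is only written when it is explicitly mapped.
Console.WriteLine(without.Contains("LastChangeAt"));
Console.WriteLine(with.Contains("LastChangeAt"));

// Without an update, the pre-calculated value equals the creation time.
Console.WriteLine(with["LastChangeAt"].Equals(with["CreatedAt"]));

public class PostWithoutAttribute
{
    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
    public DateTime? UpdatedAt { get; set; }
    public DateTime LastChangeAt => UpdatedAt ?? CreatedAt;
}

public class PostWithAttribute
{
    public DateTime CreatedAt { get; set; } = DateTime.UtcNow;
    public DateTime? UpdatedAt { get; set; }

    [BsonElement]
    public DateTime LastChangeAt => UpdatedAt ?? CreatedAt;
}
```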
Common scenarios are very well supported by the MongoDB C# Driver. If this is not enough, you can create a custom serializer that supports your specific scenario.
Custom serializers can be used to handle documents with different data for the same element. For instance, if some documents store the year as an integer and others as a string, a custom serializer can analyze the BSON type during deserialization and read the value accordingly.
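A minimal sketch of such a serializer for the year example could look as follows. The names FlexibleYearSerializer and Movie are assumptions; the sketch reads integers and strings and always writes integers.

```csharp
using System;
using System.Globalization;
using MongoDB.Bson;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Bson.Serialization.Serializers;

Movie fromInt = BsonSerializer.Deserialize<Movie>(new BsonDocument("Year", 1999));
Movie fromString = BsonSerializer.Deserialize<Movie>(new BsonDocument("Year", "1999"));
Console.WriteLine(fromInt.Year == fromString.Year);

// Hypothetical serializer that accepts the year as Int32 or String.
public class FlexibleYearSerializer : SerializerBase<int>
{
    public override int Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
    {
        var reader = context.Reader;
        // Inspect the BSON type of the current element and read it accordingly.
        switch (reader.GetCurrentBsonType())
        {
            case BsonType.Int32:
                return reader.ReadInt32();
            case BsonType.String:
                return int.Parse(reader.ReadString(), CultureInfo.InvariantCulture);
            default:
                throw new FormatException("Year must be stored as Int32 or String.");
        }
    }

    public override void Serialize(BsonSerializationContext context, BsonSerializationArgs args, int value)
    {
        // New and updated documents always store the year as an integer.
        context.Writer.WriteInt32(value);
    }
}

public class Movie
{
    [BsonSerializer(typeof(FlexibleYearSerializer))]
    public int Year { get; set; }
}
```

Writing only integers means that documents gradually converge on one type as they are saved again.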
However, this is a last resort that you will rarely need to use as the existing options offered by the MongoDB C# Driver cover the vast majority of use cases.
As you have seen, the MongoDB C# Driver offers a lot of options to tweak the mapping between POCOs and BSON documents. POCOs can follow C# conventions while at the same time building upon a schema that offers good query performance and reduced storage consumption.
    BsonClassMap.RegisterClassMap<BlogPost>(cm =>
    {
        // Perform auto-mapping to include properties
        // without specific mappings
        cm.AutoMap();
        // Serialize string as ObjectId
        cm.MapIdMember(x => x.Id)
            .SetSerializer(new StringSerializer(BsonType.ObjectId));
        // Serialize ICollection<string> as array of ObjectIds
        cm.MapMember(x => x.TopComments)
            .SetSerializer(
                new IEnumerableDeserializingAsCollectionSerializer<ICollection<string>, string, List<string>>(
                    new StringSerializer(BsonType.ObjectId)));
        // Change member name
        cm.MapMember(x => x.Title).SetElementName("title");
        // Serialize Guid as binary subtype 4
        cm.MapMember(x => x.MyGuid).SetSerializer(new GuidSerializer(GuidRepresentation.Standard));
        // Store extra members in dictionary
        cm.MapExtraElementsMember(x => x.ExtraElements);
        // Include read-only property
        cm.MapMember(x => x.LastChangeAt);
    });