Building with Patterns: The Schema Versioning Pattern

Daniel Coupal and Ken W. Alger
April 19, 2019 | Updated: July 12, 2019

It has been said that the only thing constant in life is change. This holds true for database schemas as well. Information we once thought we wouldn’t need, we now want to capture. Or new services become available and need to be included in a database record. Regardless of the reason behind the change, after a while we inevitably need to change the underlying schema design in our application. While this often poses challenges, and perhaps at least a few headaches, in a legacy tabular database system, in MongoDB we can use the Schema Versioning pattern to make the changes easier.

As mentioned, updating a database schema in a tabular database can be challenging. Typically the application needs to be stopped, the database migrated to support the new schema, and the application then restarted. This downtime can lead to a poor customer experience. Additionally, what happens if the migration isn’t a complete success? Reverting to the prior state is often an even larger challenge.

The Schema Versioning pattern takes advantage of MongoDB’s support for differently shaped documents in the same database collection. This polymorphic aspect of MongoDB is very powerful. It allows documents that have different fields, or even different field types for the same field, to peacefully exist side by side.

The Schema Versioning Pattern

The implementation of this pattern is relatively easy. Our applications start with an original schema which eventually needs to be altered. When that occurs, we can create documents in the new schema and save them to the database with a schema_version field. This field tells our application how to handle each particular document. Alternatively, we can have our application deduce the version based on the presence or absence of some given fields, but the former method is preferred. We can assume that documents that don’t have this field are version 1. Each new schema version would then increment the schema_version field value and could be handled accordingly in the application.

As new information is saved, we use the most current schema version. Depending on the application and use case, we can decide whether to update all documents to the new design, update each document when it is accessed, or not update existing documents at all. Inside the application, we would create handling functions for each schema version.
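The per-version handling functions described above could be sketched roughly as follows. This is a minimal illustration in Python using plain dictionaries rather than a live database; the function names (get_schema_version, read_contacts, and the two handlers) are hypothetical, and the field names are borrowed from the customer profile use case later in this post.

```python
from typing import Any, Callable, Dict

Document = Dict[str, Any]


def get_schema_version(doc: Document) -> int:
    """Return the document's schema version; a missing field means version 1."""
    return int(doc.get("schema_version", 1))


def contacts_v1(doc: Document) -> Dict[str, str]:
    """Version 1 keeps each contact method in its own top-level field."""
    return {key: doc[key] for key in ("home", "work", "mobile") if key in doc}


def contacts_v2(doc: Document) -> Dict[str, str]:
    """Version 2 keeps contact methods in a contact_method array."""
    contacts: Dict[str, str] = {}
    for entry in doc.get("contact_method", []):
        contacts.update(entry)
    return contacts


# One handling function per schema version (hypothetical registry).
HANDLERS: Dict[int, Callable[[Document], Dict[str, str]]] = {
    1: contacts_v1,
    2: contacts_v2,
}


def read_contacts(doc: Document) -> Dict[str, str]:
    """Dispatch to the handler that matches the document's schema version."""
    version = get_schema_version(doc)
    handler = HANDLERS.get(version)
    if handler is None:
        raise ValueError(f"unsupported schema_version: {version}")
    return handler(doc)
```

Callers only ever use read_contacts, so supporting a later schema version would mean adding one handler and one registry entry rather than touching every call site.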

Sample Use Case

As stated, just about every database needs to be changed at some point during its lifecycle, so this pattern is useful in many situations. Let’s take a look at a customer profile use case. We start keeping customer information before there is a wide range of contact methods. Customers can only be reached at home or at work:

{
    "_id": "<ObjectId>",
    "name": "Anakin Skywalker",
    "home": "503-555-0000",
    "work": "503-555-0010"
}

As the years go by and more and more customer records are saved, we notice that mobile numbers need to be saved as well. Adding that field is straightforward.

{
    "_id": "<ObjectId>",
    "name": "Darth Vader",
    "home": "503-555-0100",
    "work": "503-555-0110",
    "mobile": "503-555-0120"
}

More time goes by and now we’re discovering that fewer and fewer people have a home phone, and other contact methods are becoming more important to record. Items like Twitter, Skype, and Google Hangouts are becoming more popular and maybe weren’t even available when we first started keeping contact information. We also want to future-proof our application as much as possible, and after reading the Building with Patterns series we know about the Attribute Pattern and implement it as a contact_method array of values. In doing so, we create a new schema version.

{
    "_id": "<ObjectId>",
    "schema_version": "2",
    "name": "Anakin Skywalker (Retired)",
    "contact_method": [
        { "work": "503-555-0210" },
        { "mobile": "503-555-0220" },
        { "twitter": "@anakinskywalker" },
        { "skype": "AlwaysWithYou" }
    ]
}

The flexibility of the MongoDB document model allows all of this to occur without database downtime. The application, for its part, can be designed to read both versions of the schema. Rolling out this application change shouldn’t require downtime either, assuming there is more than a single app server involved.
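If we choose to update documents as they are accessed, the conversion from the original shape to version 2 might look something like the sketch below. It is written in Python against plain dictionaries and is only an illustration: upgrade_to_v2 and PHONE_FIELDS are hypothetical names, and writing the upgraded document back to the collection is left out.

```python
from typing import Any, Dict

Document = Dict[str, Any]

# Top-level contact fields used by the original (version 1) documents.
PHONE_FIELDS = ("home", "work", "mobile")


def upgrade_to_v2(doc: Document) -> Document:
    """Return a version 2 copy of a version 1 document.

    Documents already at version 2 or later are returned unchanged.
    The input document is never modified.
    """
    if int(doc.get("schema_version", 1)) >= 2:
        return doc

    # Copy everything except the old top-level contact fields.
    upgraded = {key: value for key, value in doc.items() if key not in PHONE_FIELDS}
    # Stored as a string to match the sample version 2 document.
    upgraded["schema_version"] = "2"
    # Move each contact field that is present into the contact_method array.
    upgraded["contact_method"] = [
        {key: doc[key]} for key in PHONE_FIELDS if key in doc
    ]
    return upgraded
```

Because the function leaves newer documents alone, it can be called on every read; whether the result is then saved back is the kind of migration decision the pattern leaves to us.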

Conclusion

The Schema Versioning pattern is great when application downtime isn’t an option, when updating the documents may take hours, days, or weeks to complete, when updating the documents to the new version isn’t a requirement, or any combination of these. It allows a new schema_version field to be added easily and the application to adjust to these changes. Additionally, it gives us as developers the opportunity to better decide when and how data migrations will take place. All of these things result in less future technical debt, another big advantage of this pattern.

As with the other patterns mentioned in this series, there are some things to consider with the Schema Versioning pattern too. If you have an index on a field that is not located at the same level of the document in every schema version, you may need two indexes while you are migrating the documents.
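To illustrate with the customer profile example: a mobile number sits at the top level in the original documents and inside the contact_method array in version 2. A lookup during the migration period would have to cover both locations, roughly as sketched below in Python. The helper name mobile_lookup_filter and the INDEX_KEYS list are hypothetical; the filter is an ordinary MongoDB query document that uses $or and dot notation.

```python
from typing import Any, Dict, List


def mobile_lookup_filter(number: str) -> Dict[str, Any]:
    """Build a query filter that matches a mobile number in either schema version."""
    return {
        "$or": [
            # Original shape: mobile is a top-level field.
            {"mobile": number},
            # Version 2 shape: mobile is a key inside the contact_method array.
            {"contact_method.mobile": number},
        ]
    }


# While both shapes exist, each location would need its own index
# for the lookup above to be served by an index on either branch.
INDEX_KEYS: List[str] = ["mobile", "contact_method.mobile"]
```

Once every document has been migrated to version 2, the first branch of the filter and the index on the top-level field would no longer be needed.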

One of the main benefits of this pattern is the simplicity of the data model itself. All that is required is to add the schema_version field and then let the application handle and process the different document versions.

Additionally, as seen in the use case example, we can combine schema design patterns for extra performance, in this case the Schema Versioning and Attribute patterns. The ability to make schema upgrades without downtime makes the Schema Versioning pattern particularly powerful in MongoDB and could very well be enough of a reason to use MongoDB’s document model versus a legacy tabular database for your next application.

The next post in this series will be a wrap-up of all of the patterns we’ve looked at thus far and will provide some additional information about use cases we’ve found to be particularly well suited for each pattern.

If you have questions, please leave comments below.

Previous Parts of Building with Patterns:

The Polymorphic pattern
The Attribute pattern
The Bucket pattern
The Outlier pattern
The Computed pattern
The Subset pattern
The Extended Reference pattern
The Approximation pattern
The Tree pattern
The Pre-allocation pattern
The Document Versioning pattern
