Building with Patterns: The Tree Pattern

Daniel Coupal and Ken W. Alger
March 21, 2019 | Updated: April 15, 2019

Many of the schema design patterns we've covered so far have stressed that saving time on JOIN operations is a benefit. Data that's accessed together should be stored together, and some data duplication is okay. A schema design pattern like Extended Reference is a good example. However, what if the data to be joined is hierarchical? For example, what if you would like to identify the reporting chain from an employee to the CEO? MongoDB provides the $graphLookup aggregation stage to navigate the data as a graph, and that could be one solution. However, if you need to run many queries against this hierarchical data structure, you may want to apply the same rule of storing together data that is accessed together. This is where we can use the Tree Pattern.

The Tree Pattern

There are many ways to represent a tree in a legacy tabular database. The two most common are for each node to list its parent or for each node to list its children. Both of these representations may require multiple accesses to build the chain of nodes.

Corporate structure with parent nodes

Corporate structure with Child nodes
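
To make the cost of the parent-reference representation concrete, here is a minimal sketch in plain Python. The employee names, the reports_to field, and the in-memory dictionary standing in for the database are all hypothetical; the point is only that building the chain takes one fetch per level of the hierarchy.

```python
# Hypothetical employee records that store only a reference to the parent.
# Each dictionary entry stands in for one stored row or document.
employees = {
    "Dana": {"name": "Dana", "reports_to": "Carlos"},
    "Carlos": {"name": "Carlos", "reports_to": "Beth"},
    "Beth": {"name": "Beth", "reports_to": "Alice"},
    "Alice": {"name": "Alice", "reports_to": None},  # CEO, top of the tree
}


def reporting_chain(name, store):
    """Walk parent references upward, counting one lookup per node fetched."""
    chain = []
    lookups = 1  # fetching the starting employee
    current = store[name]["reports_to"]
    while current is not None:
        chain.append(current)
        current = store[current]["reports_to"]
        lookups += 1  # one more fetch for each manager in the chain
    return chain, lookups


chain, lookups = reporting_chain("Dana", employees)
print(chain, lookups)
```

In a real database each of those lookups would be a separate access or join step, which is the overhead the rest of this post tries to avoid.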

Alternatively, we could store the full path from a node to the top of the hierarchy. In this case, we'd basically be storing the "parents" for each node. In a tabular database, it would likely be done by encoding a list of the parents. The approach in MongoDB is to simply represent this as an array.
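
As a rough illustration of the array approach, the sketch below adds an ancestors array to each record. The field names (reports_to, ancestors), the sample names, and the nearest-first ordering of the array are assumptions made for this example, not something the pattern prescribes.

```python
# Hypothetical documents that store the full path to the top of the hierarchy.
# Assumed ordering: nearest ancestor first, CEO last.
employees_with_paths = {
    "Alice": {"reports_to": None, "ancestors": []},
    "Beth": {"reports_to": "Alice", "ancestors": ["Alice"]},
    "Carlos": {"reports_to": "Beth", "ancestors": ["Beth", "Alice"]},
    "Dana": {"reports_to": "Carlos", "ancestors": ["Carlos", "Beth", "Alice"]},
}


def reporting_chain_from_array(name, store):
    """Return the whole reporting chain from a single document read."""
    return list(store[name]["ancestors"])


dana_chain = reporting_chain_from_array("Dana", employees_with_paths)
print(dana_chain)
```

The chain is now available after reading one document, at the price of repeating the same names in several documents.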

In this representation there is some data duplication. If the information is relatively static, as in genealogy, where your parents and ancestors won't change, this array is easy to manage. However, in our corporate structure example, when a restructuring happens, you will need to update the stored hierarchy for the affected nodes. This is still a small cost compared to the benefit of not recalculating the tree on every query.
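
The update cost mentioned above can be sketched as follows. This is one possible way to handle a restructuring, under the assumptions that each record has hypothetical reports_to and ancestors fields and that the array is ordered nearest ancestor first; an in-memory dictionary stands in for the collection.

```python
# Hypothetical org chart; field names and ordering are assumptions.
org = {
    "Alice": {"reports_to": None, "ancestors": []},
    "Beth": {"reports_to": "Alice", "ancestors": ["Alice"]},
    "Eve": {"reports_to": "Alice", "ancestors": ["Alice"]},
    "Carlos": {"reports_to": "Beth", "ancestors": ["Beth", "Alice"]},
    "Dana": {"reports_to": "Carlos", "ancestors": ["Carlos", "Beth", "Alice"]},
}


def move_node(store, name, new_parent):
    """Re-parent a node and rewrite the ancestor arrays of its whole subtree."""
    if new_parent == name or name in store[new_parent]["ancestors"]:
        raise ValueError("cannot move a node underneath itself")

    # The moved node's new path is its new parent plus that parent's path.
    new_path = [new_parent] + store[new_parent]["ancestors"]
    store[name]["reports_to"] = new_parent
    store[name]["ancestors"] = new_path

    # Every descendant keeps its path up to the moved node, then follows
    # the moved node's new path instead of the old one.
    for doc in store.values():
        path = doc["ancestors"]
        if name in path:
            cut = path.index(name) + 1
            doc["ancestors"] = path[:cut] + new_path


move_node(org, "Carlos", "Eve")
print(org["Carlos"]["ancestors"])
print(org["Dana"]["ancestors"])
```

Only the moved node and its descendants are rewritten; the rest of the tree is left alone, which is why infrequent restructurings stay cheap.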

Sample Use Case

Product catalogs are another very good example of using the Tree pattern. Often products belong to categories, which are part of other categories. For example, a Solid State Drive may be under Hard Drives, which is under Storage, which is under Computer Parts. Occasionally the organization of the categories may change, but not too frequently.

Image of a document with parent and ancestor nodes

Note the ancestor_categories field in the document above, which keeps track of the entire hierarchy. We also have the field parent_category. Duplicating the immediate parent in these two fields is a best practice we've developed after working with many customers using the Tree Pattern. Including the "parent" field is often handy, especially if you need to maintain the ability to use $graphLookup on your documents.
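
For readers without the image, a document of this shape might look like the sketch below, written as a Python dictionary. Only the field names ancestor_categories and parent_category come from this post; the product name, the category values (borrowed from the example earlier in this section), and the nearest-first ordering are assumptions. The small helper shows one way to check that the duplicated parent stays in sync with the array.

```python
# Hypothetical product document. Only the field names ancestor_categories and
# parent_category come from the article; everything else is illustrative.
product = {
    "name": "Example 1 TB SSD",
    "parent_category": "Solid State Drive",
    # Assumed ordering: nearest ancestor first, root category last.
    "ancestor_categories": [
        "Solid State Drive",
        "Hard Drives",
        "Storage",
        "Computer Parts",
    ],
}


def parent_is_consistent(doc):
    """Return True if parent_category duplicates the nearest ancestor."""
    ancestors = doc["ancestor_categories"]
    if not ancestors:
        return doc["parent_category"] is None
    return doc["parent_category"] == ancestors[0]


print(parent_is_consistent(product))
```

A check like this could run in application code or tests whenever the category hierarchy is edited, since the two fields have to be updated together.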

Keeping the ancestors in an array provides the ability to create a multikey index on those values. It allows all descendants of a given category to be found easily. As for the immediate children, they are accessible by looking at the documents that have our given category as their immediate "parent". We just told you that this field would be handy.
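
The two lookups described above can be sketched in plain Python. The category documents and the matches helper are hypothetical; the helper is a tiny stand-in for a query engine that mimics how an equality filter on an array field matches when any element equals the value. The filter dictionaries have the same shape you would pass to find() in a MongoDB driver.

```python
# Hypothetical category documents; values are illustrative assumptions.
categories = [
    {"name": "Computer Parts", "parent_category": None, "ancestor_categories": []},
    {"name": "Storage", "parent_category": "Computer Parts",
     "ancestor_categories": ["Computer Parts"]},
    {"name": "Memory", "parent_category": "Computer Parts",
     "ancestor_categories": ["Computer Parts"]},
    {"name": "Hard Drives", "parent_category": "Storage",
     "ancestor_categories": ["Storage", "Computer Parts"]},
    {"name": "Solid State Drive", "parent_category": "Hard Drives",
     "ancestor_categories": ["Hard Drives", "Storage", "Computer Parts"]},
]


def matches(doc, query):
    """Stand-in for an equality filter: array fields match on any element."""
    for field, wanted in query.items():
        value = doc.get(field)
        if isinstance(value, list):
            if wanted not in value:
                return False
        elif value != wanted:
            return False
    return True


def find_names(docs, query):
    return [doc["name"] for doc in docs if matches(doc, query)]


# All descendants of "Storage": filter on the ancestors array.
descendants = find_names(categories, {"ancestor_categories": "Storage"})
# Immediate children of "Computer Parts": filter on the parent field.
children = find_names(categories, {"parent_category": "Computer Parts"})
print(descendants)
print(children)
```

Against a real collection, an index on ancestor_categories would let the first filter avoid scanning every document, and an index on parent_category would do the same for the second.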

Conclusion

As with many patterns, there is often a tradeoff between simplicity and performance. In the case of the Tree Pattern, you get better performance by avoiding multiple joins; however, you will need to manage the updates to your graph.

The next post in this series will look at the Pre-Allocation Pattern.

If you have questions, please leave comments below.

Previous Parts of Building with Patterns:

The Polymorphic pattern
The Attribute pattern
The Bucket pattern
The Outlier pattern
The Computed pattern
The Subset pattern
The Extended Reference pattern
The Approximation pattern
