Safe Software Deployments: Through the Looking Glass

Mark Porter
January 11, 2022

We’ve covered a lot of ground in this Safe Software Deployment series, from the 180 Rule to Z Deployments to the Goldilocks Gauge. But there is an elephant in the room. Or should I say, a jabberwock.

In Lewis Carroll’s novel Through the Looking Glass, Alice discovers that the mirror above her mantel is not a mirror at all, but a doorway to another world in which things work very differently. When developers push software from staging to production, they often have a similar experience. Though they want to believe that staging and production are the same, they discover that staging is not a mirror at all, and production is another world, in which things work very differently. And out of that distortion come bugs and outages.

The bottom line is this: Staging ≠ Production. And it never will. There are simply too many variables between these two environments to ever achieve exact alignment. Those environments can be different in hardware: CPU cores, threads per core, cache size, and microcode; bus architectures; memory size; firmware. Or different in software or configuration: OS versions; compilers; libraries; network traffic profiles. Or different in network topology, edge caching, DNS, and directory services. And of course we know that no matter how diligently you test in staging, customer workloads exercise the software in different ways. But the major reason they are different is that each environment has had a different set of software deployed to it over time, with a different set of configuration parameters - and a different combination of patches, hacks, and rollbacks.

That staging is a mirror of production is one of the great delusions of software development. Developers often tell this lie to themselves, to overcome their fear of pushing to prod. Worse, developers often know the truth, but don’t really know how to explain this ambiguity to their management chain, leading to inevitable trust issues when deployments fail.

So what can we do about it?

First, accept reality. Modern, distributed software systems are nonlinear, and all test environments are simulacrums. It’s particularly important to help managers understand that even if it were possible to create exact duplicates of production – which it’s not – it would be practically and financially unjustifiable.

Second, approach testing like an actuary. The leap from staging to production is essentially an exercise in probability. You should know your architecture, operational characteristics, and costs well enough to prioritize tests and reduce risk. You may even want to create two or more test environments that are deliberately different to reduce the odds of failure. And you should continue to run tests after the release, so you can surface bugs before your customers do.
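To make the actuarial framing concrete, here is a minimal sketch of ranking candidate tests by the expected loss they cover per unit of effort, then picking as many as a fixed budget allows. Everything in it is an assumption for illustration: the `Check` class, the `prioritize` function, the greedy selection rule, and all of the probabilities and costs are made up, not taken from any real system. Your own estimates will be rough, and that is fine; the point is to make the trade-off explicit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """A candidate test plus rough, self-supplied risk estimates (all hypothetical)."""
    name: str
    failure_probability: float  # estimated chance the failure ships if untested (0..1)
    outage_cost: float          # estimated cost of that failure in production
    run_cost: float             # cost of running the test; must be greater than zero

    @property
    def expected_loss(self) -> float:
        # Classic actuarial estimate: probability times impact.
        return self.failure_probability * self.outage_cost


def prioritize(checks, budget):
    """Greedily pick checks with the most expected loss covered per unit of run cost."""
    ranked = sorted(checks, key=lambda c: c.expected_loss / c.run_cost, reverse=True)
    chosen, spent = [], 0.0
    for check in ranked:
        # Skip anything that no longer fits; cheaper checks further down may still fit.
        if spent + check.run_cost <= budget:
            chosen.append(check)
            spent += check.run_cost
    return chosen


# Illustrative numbers only.
candidates = [
    Check("failover under load", 0.02, 500_000, 40),
    Check("schema migration rollback", 0.05, 100_000, 10),
    Check("ui copy check", 0.10, 500, 5),
    Check("dns cutover", 0.01, 200_000, 25),
]

for check in prioritize(candidates, budget=60):
    print(f"{check.name}: expected loss covered ~{check.expected_loss:,.0f}")
```

A greedy ranking like this is not guaranteed to be optimal, but it is easy to explain to a management chain, which is half the battle.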

Third, if you can, treat both production and staging like cattle, not pets. If you have an enlightened software organization, one that believes in best practices, set up your systems so that you can blow away and recreate both your production and staging environments at regular intervals. This will reset much of the deviation and environment drift that builds up over time.
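Between rebuilds, it can help to make drift visible rather than guess at it. The sketch below compares two environment snapshots and reports every setting that differs. It assumes you can already collect each environment's facts into a flat dictionary; the `diff_environments` function, the `MISSING` marker, and the keys and values shown are hypothetical examples, not the output of any particular tool.

```python
MISSING = "<absent>"  # marker for a key that exists in only one environment


def diff_environments(staging, production):
    """Return {key: (staging_value, production_value)} for every key that differs."""
    drift = {}
    for key in sorted(staging.keys() | production.keys()):
        staging_value = staging.get(key, MISSING)
        production_value = production.get(key, MISSING)
        if staging_value != production_value:
            drift[key] = (staging_value, production_value)
    return drift


# Hypothetical snapshots; in practice these would come from your own inventory tooling.
staging = {"os_version": "22.04.3", "cpu_cores": 8, "libssl": "3.0.2"}
production = {
    "os_version": "22.04.1",
    "cpu_cores": 64,
    "libssl": "3.0.2",
    "hotfix_applied": True,
}

for key, (staging_value, production_value) in diff_environments(staging, production).items():
    print(f"{key}: staging={staging_value!r} production={production_value!r}")
```

A report like this will never be empty, and that is the point: it turns "staging is basically prod" into a list of known differences you can reason about.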

Finally, you need automated rollbacks (the 180 Rule) that work reliably (Z Deployments), and deployments that are the right size for optimum efficiency and safety (the Goldilocks Gauge). ;-)

I saved this column for last because completely eliminating the differences between staging and production is not a solvable problem. And frankly, you shouldn’t even try. But you don’t want to get caught flat-footed either, so you need a system of best practices that greatly reduces risk and fosters confidence. In other words, a system of Safe Software Deployment that helps you overcome the fear of pushing to prod.

And most importantly, everyone in your organization needs to have a common understanding of the problem, from the top down. So feel free to print this post out, slide it under the door of your manager, and slink away like that Cheshire Cat.

Have another technique for managing the divergence between staging and production? Share it with me at @MarkLovesTech.
