Building Gen AI Applications Using Iguazio and MongoDB

Wei You Pan, Ashwin Gangadhar, Yaron Haviv, and Zeev Rispler


AI can lead to major enterprise advancements and productivity gains. By offering new capabilities, it opens up opportunities for enhancing customer engagement, content creation, process automation, and more.

According to McKinsey & Company, generative AI has the potential to deliver an additional $200-340B in value for the banking industry. One popular use case is customer service, where gen AI chatbots have quickly transformed the way customers interact with organizations. They handle customer inquiries and provide personalized recommendations while empathizing with customers and offering nuanced support tailored to individual needs. Another less obvious use case is fraud detection and prevention. AI offers a transformative approach by interpreting regulations, supporting data cleansing, and enhancing the efficacy of surveillance systems. These systems can analyze transactions in real time and flag suspicious activities more accurately, which helps institutions prevent monetary losses.

In this post, we introduce the joint MongoDB and Iguazio gen AI solution which allows for the development and deployment of resilient and scalable gen AI applications. Before diving into how it works and its value for you, let’s first discuss the challenges enterprises face when operationalizing gen AI applications.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Challenges to operationalizing gen AI

Building an AI application starts with a proof of concept. However, enterprises need to successfully operationalize and deploy models in production to derive business value and ensure the solution is resilient. Doing so comes with its own set of challenges such as:

  • Engineering challenges - Deploying gen AI applications requires substantial engineering efforts from enterprises. They need to maintain technological consistency throughout the operational pipeline, set up sufficient infrastructure resources, and ensure the availability of a team equipped with a comprehensive ML and data skillset. Currently, AI development and deployment processes are slow, time-consuming, and fraught with friction.

  • LLM risks - When deploying LLMs, enterprises need to reduce privacy risks and comply with ethical AI standards. This includes preventing hallucinations, ensuring unbiased outputs, filtering out offensive content, protecting intellectual property, and aligning with regulatory standards.

  • Glue logic and standalone solutions - The AI landscape is vibrant, and new solutions are frequently being developed. Integrating each of these standalone solutions separately can create overhead for ops and data professionals, resulting in duplicate efforts, brittle architectures, time-consuming processes, and a lack of consistency.

Iguazio and MongoDB together: High-performing and simplified gen AI operationalization

The joint Iguazio and MongoDB solution leverages the innovation of these two leading platforms. The integrated solution allows customers to streamline data processing and storage, ensuring gen AI apps reach production while eliminating risks, improving performance, and enhancing governance.

MongoDB for end-to-end AI data management

MongoDB Atlas, an integrated suite of data services centered around a multi-cloud NoSQL database, enables developers to unify operational (structured and unstructured data), analytical, and AI data services into a single platform to streamline building AI-enriched applications. MongoDB’s flexible data model enables easy integration with different AI/ML platforms, allowing organizations to adapt to changes in the AI landscape without extensive infrastructure modifications. MongoDB meets the requirements of a modern AI and vector data store:

  • Operational and unified: MongoDB’s ability to serve as the operational data store (ODS) enables financial institutions to efficiently handle large volumes of real-time operational data and unifies AI/vector data, ensuring AI/ML models use the most accurate information. It also enables organizations to meet compliance and regulatory requirements (e.g., 3DS2, ISO 20022, PSD2) through the timely processing of large data volumes.

  • Multi-modal: Alongside structured data, there's a growing need for semi-structured and unstructured data in gen AI applications. MongoDB's JSON-based multi-modal document model allows you to handle and process diverse data types, including documents, network/knowledge graphs, geospatial data, and time series data. Atlas Vector Search lets you search unstructured data. You can create vector embeddings with ML models and store and index them in Atlas for retrieval augmented generation (RAG), semantic search, recommendation engines, dynamic personalization, and other use cases.

  • Flexible: MongoDB’s flexible schema design enables development teams to make application adjustments to meet changing data requirements and redeploy application changes in an agile manner.

  • Vector store: Alongside the operational data store, MongoDB serves as a vector store with vector indexing and search capabilities for performing semantic analysis. Using a RAG architecture together with the multi-modal operational data typically required by AI applications helps improve the accuracy of gen AI experiences and mitigate hallucination risks.

  • Deployment flexibility: MongoDB can be deployed self-managed on-premises, in the cloud, or in a SaaS environment. It can also be deployed across a hybrid cloud environment for institutions not ready to be entirely on the public cloud.
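To make the vector store idea above more concrete, here is a minimal sketch of what an Atlas Vector Search query could look like as an aggregation pipeline. The `$vectorSearch` stage and the `vectorSearchScore` metadata are part of Atlas Vector Search, but the index name, the embedding field, and the projected field names are assumptions made up for illustration. The query vector would normally come from the same embedding model used to index the documents.

```python
from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    index_name: str = "card_vector_index",  # hypothetical index name
    path: str = "embedding",                # hypothetical field holding the vectors
    limit: int = 3,
    num_candidates: int = 50,
) -> list[dict[str, Any]]:
    """Return an aggregation pipeline that runs an Atlas Vector Search query."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": num_candidates,  # nearest neighbors considered
                "limit": limit,                   # documents returned
            }
        },
        {
            # Keep a few illustrative fields (assumed names) plus the similarity score.
            "$project": {
                "_id": 0,
                "card_name": 1,
                "annual_fee": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.12, -0.03, 0.57])
# With pymongo, a pipeline like this would be passed to collection.aggregate(pipeline)
# on a collection that has a matching Atlas Vector Search index.
```

Building the pipeline as plain data keeps the sketch runnable without a cluster; the same list can be handed to a driver unchanged.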

Iguazio’s AI platform

Iguazio (acquired by McKinsey) is an AI platform designed to streamline the development of ML and gen AI applications in production at scale.

Iguazio’s gen AI-ready architecture includes capabilities for data management, model development, application deployment, and LiveOps. The platform (now part of QuantumBlack Horizon, McKinsey’s suite of AI development tools) addresses enterprises’ two biggest challenges when advancing from gen AI proofs of concept to live implementations within business environments:

  • Scalability: Ensures uninterrupted service regardless of workload demands, scaling gen AI applications when required.

  • Governance: Gen AI guardrails mitigate risk by directing essential monitoring, data privacy, and compliance activities.

By automating and orchestrating AI, Iguazio accelerates time-to-market, lowers operating costs, enables enterprise-grade governance, and enhances business profitability.

Iguazio’s platform includes LLM customization capabilities, GPU provisioning to improve utilization and reduce cost, and hybrid deployment options (including multi-cloud or on premises). This positions Iguazio to uniquely answer enterprise needs, even in highly regulated environments, either in a self-serve or managed services model (through QuantumBlack, McKinsey’s AI arm). Iguazio’s AI platform provides:

  • Structured and unstructured data pipelines for processing, versioning, and loading documents.

  • Automated flow of data prep, tuning, validating, and LLM optimization to specific data efficiently using elastic resources (CPUs, GPUs, etc.).

  • Rapid deployment of scalable real-time serving and application pipelines that use LLMs (locally hosted or external) as well as the required data integration and business logic.

  • Built-in monitoring for the LLM data, training, model, and resources, with automated model re-tuning and RLHF.

  • Ready-made gen AI application recipes and components.

  • An open solution with support for various frameworks and LLMs and flexible deployment options (any cloud, on-prem).

  • Built-in guardrails to eliminate risks and improve accuracy and control.

Examples: Building with Iguazio and MongoDB

#1 Building a smart customer care agent

The joint solution can be used to create smart customer care agents. The diagram below illustrates a production-ready gen AI agent application with its four main elements:

  1. Data pipeline for processing the raw data (eliminating risks, improving quality, encoding, etc.).

  2. Application pipelines for processing incoming requests (enriched with data from MongoDB’s multi-modal store), running the agent logic, and applying various guardrails and monitoring tasks.

  3. Development and CI/CD pipelines for fine-tuning and validating models, testing the application to detect accuracy risk challenges, and automatically deploying the application.

  4. A monitoring system collecting application and data telemetry to identify resource usage, application performance, risks, etc. The monitoring data can be used to improve the application performance further through an RLHF (reinforcement learning from human feedback) integration.

Diagram showing how Iguazio and MongoDB work together to build and scale gen AI apps efficiently. The diagram has 5 categories: Data pipelines, Application pipeline, Unified Multi-Modal Developer Data Platform, ML & CI/CD Pipeline, and Monitoring & Feedback. The user interacts with the front end of the application, which connects to the application pipeline. The application pipeline interacts with the developer data platform, powered by MongoDB. And finally, the developer data platform interacts with and receives data from the data pipeline, the CI/CD pipeline, and monitoring & feedback.
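The application pipeline described above (enrich the request, run the agent logic, apply guardrails, collect telemetry) can be sketched in plain Python. This is not Iguazio's API. Every name here (`Event`, `enrich`, `agent`, `guardrail`, `run_pipeline`) is hypothetical, the agent is a stub rather than an LLM call, and a dictionary stands in for the MongoDB store.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    customer_id: str
    message: str
    context: dict = field(default_factory=dict)
    reply: str = ""
    telemetry: list[str] = field(default_factory=list)


Step = Callable[[Event], Event]


def enrich(profiles: dict[str, dict]) -> Step:
    """Attach the customer profile (a dict stands in for the data store)."""
    def step(event: Event) -> Event:
        event.context = profiles.get(event.customer_id, {})
        return event
    return step


def agent(event: Event) -> Event:
    """Stub agent logic; a real application would call an LLM here."""
    name = event.context.get("name", "there")
    event.reply = f"Hello {name}, you asked: {event.message}"
    return event


def guardrail(banned: set[str]) -> Step:
    """Replace any reply that contains a banned term."""
    def step(event: Event) -> Event:
        if any(word in event.reply.lower() for word in banned):
            event.reply = "Sorry, I can't help with that."
        return event
    return step


def run_pipeline(event: Event, steps: list[tuple[str, Step]]) -> Event:
    """Run the steps in order and record each step name as simple telemetry."""
    for name, step in steps:
        event = step(event)
        event.telemetry.append(name)
    return event


steps = [
    ("enrich", enrich({"c1": {"name": "Olivia"}})),
    ("agent", agent),
    ("guardrail", guardrail({"guaranteed"})),
]
result = run_pipeline(Event("c1", "I need a card"), steps)
print(result.reply, result.telemetry)
```

Keeping each stage as a small function makes it easy to add, reorder, or monitor steps independently, which is the point of separating the pipeline from the agent logic.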

#2 Building a hyper-personalized banking agent

In this example, accompanied by a demo video, we show a banking agent based on a modular RAG architecture that helps customers choose the right credit card for them. The agent has access to a MongoDB Atlas data platform with a list of credit cards and a large array of customer details. When a customer chats with the agent, it chooses the best credit card for them, based on the data and additional personal customer information, and can converse with them in an appropriate tone. The bank can further hyperpersonalize the chat to make it more appealing to the client and improve the odds of conversion, or add guardrails to minimize AI hallucinations and improve interaction accuracy.

Example customer #1: Olivia

Olivia is a young client requesting a credit card. The agent looks at her credit card history and annual income and recommends a card with low fees. The tone of the conversation is casual.

Screenshot of a conversation between a bank customer and a credit card AI chatbot. The conversation starts with the chatbot asking how they can assist the customer. The customer, Olivia, replies: I'm looking for a new credit card. The chatbot then says: Based on your income, Olivia, I'd recommend the KMK Card for you. It's super cool because it has no annual or joining fees. How does that sound to you? Olivia replies: Sounds good, what else can you tell me about it? The chatbot finishes the conversation by saying: the KMK card from Kotak Mahindra Bank is a pretty cool options! It offers a bunch of benefits like rewards points, cashback on your dining and movie transactions, and even free PVR movie tickets. Plus, you can save when you shop at department and grocery stores. And the best part? There are no annual or joining fees. Sounds like a sweet deal, right?

When Olivia asks for more information, the agent accesses the card data while retaining the same youthful and fun tone.

Example customer #2: Miss Jessope

The second example involves an older woman whom the agent calls “Ms Jessope”. When she asks for a new card, the agent accesses her credit card history to choose the best card for her. The conversation takes place in a respectful tone.

Another screenshot example of a customer interacting with an AI chatbot. The conversation starts with the customer Miss Jessope, asking for a new credit card. The chatbot replies, Based on our policy and your income class, I would recommend the Clark Bank Credit Card. It has no annual fees and no joining fees, making it a cost-effective choice for you. Is there anything else I can assist you with? Miss Jessope then asks for more information, and the chatbot provides some additional details about the card.

When she requests more information, the response is more informative and detailed, and the language remains respectful.

How does this work under the hood?

As you can see from the figure below, the tool has access to customer profile data in MongoDB Atlas collection bfsi.user_data and is able to hyperpersonalize its response and recommendations based on various aspects of the customer profile.

Screenshot of bfsi.user_data.

A RAG process is implemented using the Iguazio AI Platform with the MongoDB Atlas data platform. Atlas Vector Search is used to find the relevant operational data stored in MongoDB (card name, annual fees, client occupation, interest rates, and more), which augments the contextual data during the interaction itself and personalizes it. The virtual agent is also able to talk to another agent tool that has a view of the credit card data in bfsi.card_info (such as card name, annual and joining fees, card perks such as cashback, and more), to pick a credit card that would best suit the needs of the customer.

Screenshot of bfsi.card_info in the MongoDB Atlas dashboard.
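As a rough illustration of the augmentation step in this RAG flow, the sketch below combines a customer profile with retrieved card documents into a prompt and picks a tone. It is not the code behind the demo. The field names (`age`, `annual_income`, `card_name`, `annual_fee`, `joining_fee`), the age threshold, and the prompt wording are all assumptions, and the retrieved cards are passed in as plain dictionaries instead of coming from a live vector search.

```python
def choose_tone(age: int, casual_below: int = 30) -> str:
    """Pick a conversational tone; the age threshold is a made-up example."""
    return "casual and friendly" if age < casual_below else "formal and respectful"


def build_prompt(profile: dict, cards: list[dict], question: str) -> str:
    """Augment the customer's question with profile and retrieved card data."""
    tone = choose_tone(profile["age"])
    card_lines = "\n".join(
        f"- {c['card_name']}: annual fee {c['annual_fee']}, joining fee {c['joining_fee']}"
        for c in cards
    )
    return (
        f"You are a banking agent. Reply in a {tone} tone.\n"
        f"Customer: {profile['name']}, annual income {profile['annual_income']}.\n"
        f"Candidate cards:\n{card_lines}\n"
        f"Question: {question}"
    )


# Illustrative values only; real documents would come from bfsi.user_data and bfsi.card_info.
profile = {"name": "Olivia", "age": 24, "annual_income": 40000}
cards = [{"card_name": "KMK Card", "annual_fee": 0, "joining_fee": 0}]
print(build_prompt(profile, cards, "I'm looking for a new credit card."))
```

The resulting string would then be sent to the LLM, so the model answers from the retrieved card data instead of relying on what it memorized during training.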

To ensure the client gets the best choice of card, a guardrail built into the agent tool filters the chosen cards according to the data gathered by the agent. In addition, another set of guardrails validates that the card offered suits the customer by comparing it with the optimal cards recommended for the customer’s age range.
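The two guardrails can be pictured as simple checks that run before a recommendation reaches the customer. The sketch below is one possible shape for them, not the implementation used in the demo. The `min_income` field, the age bands, and the function names are hypothetical.

```python
def filter_eligible(cards: list[dict], annual_income: float) -> list[dict]:
    """First guardrail: keep only cards the customer qualifies for."""
    return [c for c in cards if c["min_income"] <= annual_income]


def validate_for_age(
    card_name: str,
    age: int,
    recommended_by_age: dict[tuple[int, int], set[str]],
) -> bool:
    """Second guardrail: check the card is among those recommended for the age range."""
    for (low, high), names in recommended_by_age.items():
        if low <= age <= high:
            return card_name in names
    return False  # no matching age band, so reject to stay on the safe side


# Illustrative data only.
cards = [
    {"card_name": "Card A", "min_income": 20000},
    {"card_name": "Card B", "min_income": 90000},
]
recommended = {(18, 30): {"Card A"}, (31, 120): {"Card B"}}

eligible = filter_eligible(cards, annual_income=40000)
approved = [c for c in eligible if validate_for_age(c["card_name"], 24, recommended)]
print([c["card_name"] for c in approved])
```

Running the checks on structured data instead of on the generated text means a hallucinated or unsuitable card can be rejected before the agent ever mentions it.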

This whole process is straightforward to set up and configure using the Iguazio AI Platform, with seamless integration to MongoDB. The user only needs to create the agent workflow and connect it to MongoDB Atlas, and everything works out of the box.

Lastly, as you can see from the demo above, the agent was able to leverage the vector search capabilities of MongoDB Atlas to retrieve, summarize, and personalize the messaging on the card information and benefits in the same tone as the user’s.

For more detailed information and resources on how MongoDB and Iguazio can transform your gen AI applications, we encourage you to apply for an exclusive innovation workshop with MongoDB's industry experts to explore bespoke modern app development and tailored solutions for your organization.

Additionally, you can enjoy these resources:

Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the MongoDB and DeepLearning.AI course "Prompt Compression and Query Optimization" for free today.