Better Business Loans with MongoDB and Generative AI

Wei You Pan and Paul Claret

#genAI

Business loans are a cornerstone of banking operations, providing significant benefits to both financial institutions and broader economies. For example, in 2023 the value of commercial and industrial loans in the United States reached nearly $2.8 trillion. However, these loans can present unique challenges and risks that banks must navigate. Besides credit risk, where the borrower may default, banks also face business risk, in which economic downturns or sector-specific declines can impact borrowers' ability to repay loans.

In this post, we dive into the potential of generative AI to generate detailed risk assessments for business loans, and how MongoDB’s multimodal features can be leveraged for comprehensive and multidimensional risk analyses.

The critical business plan

A business plan is essential for a business loan because it serves as a comprehensive roadmap detailing the borrower's plans, strategies, and financial projections. It helps lenders understand the business's goals, viability, and profitability, and it demonstrates how the loan will be used for growth and repayment. A detailed business plan includes market analysis, competitive positioning, operational plans, and financial forecasts, which together build a compelling case for the lender's investment and for the business’s ability to manage risks effectively, increasing the likelihood of securing the loan.

Reading through borrower credit information and detailed business plans (roughly 15-20 pages long) poses significant challenges for loan officers due to time constraints, the material’s complexity, and the difficulty of extracting key metrics from detailed financial projections, market analyses, and risk factors. Navigating technical details and industry-specific jargon can also be challenging and may require specialized knowledge. Identifying critical risk factors and mitigation strategies adds further complexity, as does ensuring accuracy and consistency among loan officers and approval committees.

To overcome these challenges, gen AI can assist loan officers by efficiently analyzing business plans, extracting essential information, identifying key risks, and providing consistent interpretations, thereby facilitating informed decision-making.

Assessing loans with gen AI

Interactive risk analysis with gen AI-powered chatbots

Gen AI can help analyze business plans when built on a flexible developer data platform like MongoDB Atlas. One approach is implementing a gen AI-powered chatbot that allows loan officers to "discuss" the business plan. The chatbot can analyze the input and provide insights on the various risks associated with lending to the borrower for the proposed business. MongoDB sits at the heart of many customer support applications due to its flexible data model that makes it easy to build a single, 360-degree view of data from a myriad of siloed backend source systems.

Figure 1 below shows an example of how ChatGPT-4o responds when asked to assess the risk of a business loan. Although the input of the loan purpose and business description is simplistic, gen AI can offer a detailed analysis.

Figure 1: Example of how ChatGPT-4o could respond when asked to assess the risk of a business loan
Example screenshot of how ChatGPT-4o could respond when asked to assess the risk of a business loan. The response lists out different categories with some information within each category, and then summarizes everything into a conclusion at the end.

Hallucinations or ignorance?

By applying gen AI to risk assessments, lenders can explore additional risk factors, such as the risk of natural disasters or broader climate risks. In Figure 2 below, we added flood risk as a specific factor to the previous question to see what ChatGPT-4o comes back with.

Figure 2: Example of how ChatGPT-4o responded to flood risk as a factor
Example screenshot of how ChatGPT-4o responded to flood risk as a factor. The bot's response begins with an analysis of the location.

According to this response, there is a low risk of flooding. To validate this, we asked ChatGPT-4o the question differently, focusing on its knowledge of flood data. It suggested reviewing FEMA flood maps and local flood history, indicating that it might not have the latest information.

Figure 3: Asking location-specific flood questions
Screenshot of location-specific flood questions being asked of the AI bot. The bot then provides a response about flood risk assessment.

In the query shown in Figure 3 above, ChatGPT gave the opposite answer, indicating that there is “significant flooding” and providing references to flood evidence. It found this evidence through an internet search across four sites, a search it had not performed previously.

From this example, we can see that when ChatGPT does not have the relevant data, it can make false claims, which can be considered hallucinations. Initially, it indicated a low flood risk because it lacked information. However, when asked specifically about flood risk in the second query, it suggested reviewing external sources like FEMA flood maps, recognizing its limitations and the need for external validation.

Gen AI-powered chatbots can recognize their knowledge gaps and intelligently seek additional data sources to fill them. However, a casual web search won’t provide the level of detail required.

Retrieval-augmented generation-assisted risk analysis

The promising example above shows how gen AI can help loan officers analyze business loans. However, interacting with a gen AI chatbot relies on loan officers repeatedly prompting and augmenting the context with relevant information. This can be time-consuming and impractical when loan officers lack prompt engineering skills or the data needed.

The demo described below is a simplified solution showing how gen AI can augment the risk analysis process by filling the LLM's knowledge gap. It uses MongoDB as an operational data store and leverages geospatial queries to find floods recorded within 5 km of the proposed business location. The prompt for this risk analysis emphasizes the flood risk assessment rather than the financial projections.
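To make the "within 5 km" idea concrete, here is a minimal in-memory sketch of what such a radius filter computes. It is not the demo's code: it swaps the database query for a plain haversine distance calculation, and the function names, field names, and sample documents are all made up for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def floods_within(events, lon, lat, radius_m=5_000):
    """Return events within radius_m of (lon, lat), nearest first."""
    hits = []
    for event in events:
        # GeoJSON stores coordinates as [longitude, latitude].
        ev_lon, ev_lat = event["location"]["coordinates"]
        distance = haversine_m(lon, lat, ev_lon, ev_lat)
        if distance <= radius_m:
            hits.append({**event, "distance_m": round(distance)})
    return sorted(hits, key=lambda e: e["distance_m"])


# Made-up sample documents, for illustration only (not real flood data).
SAMPLE_EVENTS = [
    {"name": "event-a", "location": {"type": "Point", "coordinates": [-73.99, 40.73]}},
    {"name": "event-b", "location": {"type": "Point", "coordinates": [-73.80, 40.90]}},
]

nearby = floods_within(SAMPLE_EVENTS, lon=-73.985, lat=40.7484)
```

With these placeholder coordinates, only `event-a` falls inside the 5 km radius. In a real deployment the database would do this work using a geospatial index rather than scanning documents in application code.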

A similar test was performed on Llama 3, hosted by our MAAP partner Fireworks.AI. It tested the model’s knowledge of flood data and revealed a knowledge gap similar to that of ChatGPT-4o. Interestingly, rather than providing misleading answers, Llama 3 provided a “hallucinated list of flood data,” but highlighted that “this data is fictional and for demonstration purposes only. In reality, you would need to access reliable sources such as FEMA's flood data or other government agencies' reports to obtain accurate information.”

Figure 4: LLM’s response with fictional flood locations
Screenshot of the LLM's response when asked to list out flood locations within a 5km radius of a specific address. The LLM responded with dates of when floods occurred, the exact address the flood occurred, and how much damage happened.

This consistent demonstration of LLMs' knowledge gaps in specialized areas reinforces the need to explore how retrieval-augmented generation (RAG) with a multimodal data platform can help.

In this simplified demo, you select a business location and provide a loan purpose and a description of the business plan. To make input easier, an “Example” button uses gen AI to generate a brief sample business description, so you don't need to key in a description from scratch.

Figure 5: Choosing a location on the map and writing a brief plan description
Screenshot of the LLM being given a location on a map and asked to write a brief description of a loan purpose and business plan.

Upon submission, the demo uses RAG with appropriate prompt engineering to produce a simplified analysis of the business. The analysis takes into account the location as well as the flood risk data previously downloaded from external flood data sources.
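The augmentation step of RAG can be sketched as a function that folds the retrieved flood records into the prompt before it is sent to the LLM. This is only an illustration under stated assumptions: the function name, the record fields (`date`, `summary`, `distance_m`, `source`), and the prompt wording are hypothetical and are not taken from the demo.

```python
def build_risk_prompt(purpose, description, address, flood_events):
    """Assemble a prompt that grounds the LLM's analysis in retrieved flood records."""
    if flood_events:
        context = "\n".join(
            f"- {e['date']}: {e['summary']} "
            f"({e['distance_m']} m from the site; source: {e['source']})"
            for e in flood_events
        )
    else:
        context = "No flood records were retrieved for this location."

    return (
        "You are assisting a loan officer with a business loan risk assessment.\n"
        f"Business location: {address}\n"
        f"Loan purpose: {purpose}\n"
        f"Business plan summary: {description}\n\n"
        "Flood records retrieved within 5 km of the location:\n"
        f"{context}\n\n"
        "Using only the flood records above for the flood risk section, "
        "assess the overall risk of this loan. If the records are insufficient, "
        "say so instead of guessing."
    )


# Placeholder inputs, for illustration only.
example_prompt = build_risk_prompt(
    purpose="Working capital",
    description="A neighborhood bakery expanding to a second site.",
    address="(selected map location)",
    flood_events=[
        {"date": "(date)", "summary": "(flood description)",
         "distance_m": 1200, "source": "(data source)"},
    ],
)
```

Putting the retrieved records and their sources directly in the prompt, and telling the model to say when they are insufficient, is one common way to discourage the kind of invented flood data shown earlier.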

Figure 6: Loan risk response using RAG
Screenshot of the LLM's response regarding risk assessment. The LLM takes into account the property value, location, flood risk, and then formulates a final risk analysis

In the Flood Risk Assessment section, gen AI-powered geospatial analytics enable loan officers to quickly understand historical flood occurrences and identify the data sources.

You can also reveal all the sample flood locations in the vicinity of the selected business location by clicking the “Pin” icon. The pins mark the flood locations, and the blue circle indicates the 5 km radius within which flood data is queried using a simple geospatial aggregation stage, $geoNear.
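As a rough sketch, a `$geoNear` aggregation pipeline for this kind of radius query could be built as below. The stage options used (`near`, `distanceField`, `maxDistance`, `spherical`) are standard `$geoNear` options, but the function name, the `distance_m` output field, the result limit, and the collection name in the usage comment are assumptions for illustration rather than details of the demo.

```python
def flood_geonear_pipeline(lon, lat, radius_m=5_000, limit=50):
    """Build an aggregation pipeline that finds flood documents near a point."""
    return [
        {
            # $geoNear must be the first stage of the pipeline and needs a
            # geospatial index (for example 2dsphere) on the location field.
            "$geoNear": {
                # GeoJSON point; coordinates are [longitude, latitude].
                "near": {"type": "Point", "coordinates": [lon, lat]},
                # Name of the output field that receives the computed distance.
                "distanceField": "distance_m",
                # With a GeoJSON point, maxDistance is expressed in metres.
                "maxDistance": radius_m,
                "spherical": True,
            }
        },
        {"$limit": limit},
        {"$project": {"_id": 0}},
    ]


pipeline = flood_geonear_pipeline(lon=-73.985, lat=40.7484)

# Hypothetical usage with PyMongo, assuming a collection named "floods":
#   results = list(db.floods.aggregate(pipeline))
```

`$geoNear` returns documents ordered from nearest to farthest, so the results can be passed to the map or to the prompt without an extra sort.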

Figure 7: Flood locations displayed with pins
Map view of New York where the LLM has placed pins showing where floods have occurred

The following diagram provides a logical architecture overview of the RAG data process implemented in this solution, highlighting the technologies used, including MongoDB, Meta Llama 3, and Fireworks.AI.

Figure 8: RAG data flow architecture diagram
Diagram of the RAG data flow architecture. The AI app is at the center of the diagram and contains the MongoDB agg pipeline and MongoDB geospatial. Connecting to the app is the private knowledge base on MongoDB and the GenAI application.

With MongoDB's multimodal capabilities, developers can enhance the RAG process by utilizing features such as network graphs, time series, and vector search. This enriches the context for the gen AI agent, enabling it to provide more comprehensive and multidimensional risk analysis through multimodal analytics.

Building risk assessments with MongoDB

When combined with RAG and a multimodal developer data platform like MongoDB Atlas, gen AI applications can provide more accurate, context-aware insights, reducing hallucinations and augmenting a complex business loan risk assessment process.

Because the RAG process is iterative, new data and feedback can be added to the knowledge base over time, improving the context supplied to the gen AI model. This should lead to increasingly accurate risk assessments and fewer hallucinations. A multimodal data platform also lets you take full advantage of the capabilities of multimodal AI models.

Head over to our quick-start guide to get started with Atlas Vector Search today.

If you would like to discover how MongoDB can help you on this multimodal gen AI application journey, we encourage you to apply for an exclusive innovation workshop with MongoDB's industry experts to explore bespoke modern app development and solutions tailored to your organization.

Additionally, you can enjoy these resources: