- Use cases: Gen AI
- Industries: Financial Services, Insurance
- Products and tools: Atlas, Spark Streaming Connector, Vector Search
- Partners: Langchain, Fireworks.ai
In this solution, you’ll learn how the convergence of alternative data, artificial intelligence, and generative AI (gen AI) is reshaping the foundations of credit scoring. Alternative credit scoring methods overcome the challenges of traditional models by offering a more inclusive and nuanced assessment of creditworthiness. This solution uses an online credit card application process to illustrate how gen AI and MongoDB can be used to improve credit scoring. The same approach applies to other credit products, such as personal loans, mortgages, corporate loans, and trade finance credit lines, and is not confined to credit cards.
The pursuit of credit can be a labyrinthine journey, especially when credit assessment processes, and credit scoring mechanisms in particular, pose significant challenges. Here are some of the challenges or limitations of traditional credit scoring models:
An online application process generally aims to provide convenience and efficiency, allowing individuals to access tailored financial products while ensuring transparency and accuracy in the application information. However, the limitations of traditional credit scoring methods hinder this experience, especially for individuals whose credit profiles look limited under the traditional approach.
In this solution accelerator, we will explore how MongoDB can help transform the credit application in the following key aspects of the process:
Applying for credit cards or other credit products can often be a lengthy and intricate process. Let’s delve into the details:
Application process complexity: Obtaining a credit card involves several steps, which can be time-consuming. Here’s a brief overview of the process:
Redundant information collection: Unfortunately, banks often collect “redundant” data that they should already have. For instance:
In summary, streamlining this process by eliminating redundant requests and leveraging existing data could significantly enhance the user experience.
The application form for a credit card may be relatively simple, but the complexity increases with other credit products (e.g., auto loans, mortgages, trade finance). An application form can contain both tabular and hierarchical information, and may also require alternative data sourced from third-party data sources that the borrower has authorized. MongoDB’s flexible developer data platform natively supports JSON data and does not require documents to have the same schema, which makes it easier to handle these varied types of data.
Using JSON for online credit application forms simplifies data capture and speeds up data processing. JSON's structured data representation is well suited to organizing the multifaceted information within credit applications, encompassing personal, financial, and employment details. Its human-readable format makes the data model easy for developers to understand and edit collaboratively, while interoperability across platforms supports seamless data exchange. The flexibility of JSON also matches the dynamic nature of credit application requirements, enabling straightforward modifications and additions.
MongoDB is a strong choice for processing JSON documents in credit applications because it natively supports BSON, a JSON-like format. The database's flexibility allows for dynamic schema adjustments, aligning well with the evolving nature of credit application forms. MongoDB's ability to handle hierarchical data structures, coupled with robust querying and indexing capabilities, ensures efficient retrieval and organization of complex credit application information. As a scalable solution, MongoDB accommodates growing volumes of credit data while maintaining performance. Its integration with JavaScript and other popular programming languages, tools, and technologies (e.g., Spark, Kafka) enhances development workflows, while features such as document validation and support for open banking standards further contribute to data integrity and standardized information exchange. In essence, MongoDB provides a versatile and efficient platform for storing and processing JSON documents, which makes it well suited to online credit applications.
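As a rough illustration of the flexible schema and document validation points above, the sketch below shows two application documents with different shapes and a `$jsonSchema` validator that constrains only their shared fields. Every field name here (`applicant`, `requested_limit`, `alternative_data`, and so on) is a hypothetical example and is not taken from the solution's repository. The helper `missing_required` is a local stand-in for the check the server would perform.

```python
import json

# Hypothetical credit card application: nested applicant details only.
card_application = {
    "product": "credit_card",
    "applicant": {
        "name": "Jane Doe",
        "employment": {"status": "employed", "annual_income": 72000},
    },
    "requested_limit": 5000,
}

# Hypothetical mortgage application: same collection, extra hierarchical fields.
mortgage_application = {
    "product": "mortgage",
    "applicant": {
        "name": "John Roe",
        "employment": {"status": "self_employed", "annual_income": 95000},
    },
    "property": {"value": 400000, "address": {"city": "Leeds", "country": "GB"}},
    "alternative_data": [{"source": "utility_payments", "on_time_ratio": 0.97}],
}

# A $jsonSchema validator that enforces only the fields every application shares.
application_validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["product", "applicant"],
        "properties": {
            "product": {"bsonType": "string"},
            "applicant": {
                "bsonType": "object",
                "required": ["name"],
                "properties": {"name": {"bsonType": "string"}},
            },
        },
    }
}


def missing_required(doc, validator):
    """Local stand-in for server-side validation: list absent top-level required fields."""
    required = validator["$jsonSchema"]["required"]
    return [field for field in required if field not in doc]


# With PyMongo, the validator could be attached when creating the collection, e.g.
# db.create_collection("applications", validator=application_validator)
for doc in (card_application, mortgage_application):
    print(doc["product"], "missing:", missing_required(doc, application_validator))
print(json.dumps(application_validator["$jsonSchema"]["required"]))
```

Validating only the common core keeps the integrity guarantees while leaving product-specific fields free to vary from one application type to the next.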
Leveraging MongoDB’s developer data platform — an integrated suite of data services centered around a cloud database — we can create a comprehensive customer/user banking profile by combining relevant data points.
Below, we will show you how it can be done. Here is an architectural diagram of the data processing pipeline for predicting the probability of delinquency and for credit scoring:
The data pipeline for credit scoring a customer involves the following steps:
The goal is to accurately assess the creditworthiness of a customer to make informed lending decisions and financial product recommendations. The pipeline is a demonstration of existing risk-scoring pipelines maintained by organizations.
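The text does not say how a predicted probability of delinquency is turned into a score. One common convention in scorecard building is log-odds scaling with a "points to double the odds" (PDO) parameter. The sketch below assumes that convention. The parameter values (`base_score=600`, `base_odds=50`, `pdo=20`) are illustrative assumptions, not values used by this solution.

```python
import math


def probability_to_score(p_delinquent, base_score=600.0, base_odds=50.0, pdo=20.0):
    """Map a probability of delinquency to a score via log-odds scaling.

    base_score is the score assigned at good:bad odds of base_odds, and the
    score rises by pdo points each time those odds double. All three
    defaults are illustrative assumptions.
    """
    if not 0.0 < p_delinquent < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    factor = pdo / math.log(2)
    offset = base_score - factor * math.log(base_odds)
    odds_good = (1.0 - p_delinquent) / p_delinquent
    return offset + factor * math.log(odds_good)


# Lower probability of delinquency yields a higher score.
for p in (0.01, 0.05, 0.20):
    print(f"p(delinquent)={p:.2f} -> score {probability_to_score(p):.0f}")
```

With these assumed parameters, an applicant at 50:1 odds of repaying scores 600 and one at 100:1 scores 620.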
When it comes to credit application declination, understanding the reasons behind it is crucial. Let’s explore how MongoDB and large language models (LLMs) can shed light on XGBoost model predictions (the model used in this tutorial).
Here is the architecture diagram explaining credit scoring using an LLM.
As explained in the earlier section, the risk profiling ML pipeline produces a probability score that defines the risk associated with the profile for product recommendation. Typically, the result is communicated back to the user in a templatized manner, where only the final status of the application is shared. In the proposed architecture with LLMs, prompt engineering can be used to explain to the end customer the reasons behind the final status of the application.
Here, you can find the code and example responses. The code to generate a similar message can be written in Python in a Jupyter Notebook. The details on setting up MongoDB Atlas and fetching a connection string are available at this link.
Below is one example of a rejection explanation.
This sort of messaging to the customer can be categorized as a form of explainable AI where the features used in the model to perform risk profiling can be ranked and used as a part of the custom prompt to the LLM. This can help generate more descriptive reasons for the end customer to explain their user profile, as shown above. LLMs can also help summarize the list of descriptive reasons to provide a simplified view of the description. The application can then allow drill-downs to the details if the customer wants to find out more to enhance their credit profile and user experience.
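The idea of ranking model features and folding them into a custom prompt can be sketched as follows. This is a minimal illustration and not the code from the solution's notebook: the function name `build_explanation_prompt`, the feature names, the importance values, and the prompt wording are all hypothetical. The resulting string would then be sent to whichever LLM the application uses.

```python
def build_explanation_prompt(status, feature_importance, applicant_values, top_n=3):
    """Assemble an LLM prompt from the top-ranked model features."""
    # Rank features by importance, highest first, and keep the top_n.
    ranked = sorted(feature_importance.items(), key=lambda kv: kv[1], reverse=True)
    lines = [
        f"- {name} (importance {weight:.2f}): "
        f"applicant value = {applicant_values.get(name, 'unknown')}"
        for name, weight in ranked[:top_n]
    ]
    return (
        "You are a credit advisor. Explain in plain language why the "
        f"application was {status}, using only the factors below, and "
        "suggest how the applicant could improve their profile.\n"
        + "\n".join(lines)
    )


# Hypothetical importances and applicant values, for illustration only.
importance = {
    "credit_utilization": 0.31,
    "num_credit_inquiries": 0.12,
    "repayment_history": 0.27,
    "outstanding_debt": 0.18,
}
applicant = {
    "credit_utilization": "82%",
    "repayment_history": "3 late payments",
    "outstanding_debt": 14500,
}

prompt = build_explanation_prompt("rejected", importance, applicant)
print(prompt)
```

Restricting the prompt to the highest-ranked features keeps the explanation focused on what drove the decision, and the full ranked list remains available for the drill-down view.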
In this demo, there are two approaches used to score the credit application. The credit application status is determined using an ML approach as described in the earlier section with the use of more than 20 credit-related features/dimensions. Here is a subset of the top 15 most important features:
For more details on features used in this demo and the explanation of each feature used, please have a look at the source code provided via the GitHub repository.
To demonstrate the difference between the ML and traditional credit scoring approaches, the bottom of the Status Explanation screen also shows how a typical traditional credit scoring method may score the same credit application, using only a handful of dimensions. This demo uses five features typically used by leading credit score providers: the credit applicant’s repayment history, credit utilization, credit history, outstanding debt, and number of credit inquiries. This shows how the traditional score and the ML result may vary as the credit profile changes.
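To make the contrast concrete, a traditional-style score over those five dimensions could be approximated as a weighted sum. The sketch below is an assumption-heavy illustration: the weights, the 300 to 850 range, and the normalization of each dimension to a subscore between 0 and 1 are all hypothetical and are not the formula used in the demo or by any credit score provider.

```python
# Hypothetical weights over the five dimensions; purely illustrative.
WEIGHTS = {
    "repayment_history": 0.35,
    "credit_utilization": 0.30,
    "credit_history": 0.15,
    "outstanding": 0.10,
    "credit_inquiries": 0.10,
}


def traditional_score(subscores, low=300, high=850):
    """Combine per-dimension subscores in [0, 1] (1 = best) into a low..high score."""
    for name in WEIGHTS:
        if not 0.0 <= subscores[name] <= 1.0:
            raise ValueError(f"{name} must be in [0, 1]")
    weighted = sum(WEIGHTS[name] * subscores[name] for name in WEIGHTS)
    return round(low + (high - low) * weighted)


# A hypothetical thin-file applicant: good behavior but a very short credit history.
thin_file = {
    "repayment_history": 0.9,
    "credit_utilization": 0.8,
    "credit_history": 0.1,
    "outstanding": 0.9,
    "credit_inquiries": 0.7,
}
print(traditional_score(thin_file))
```

Because only five inputs contribute, a weak value in any one of them, such as a short credit history, shifts the result more than it would in a model that draws on twenty or more features.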
The credit institution should always try to cross-sell to the customer with a relevant product that meets their needs as they are already engaged in the process and application portal.
Financial institutions can implement a product recommendation system that provides a human-friendly explanation of the rationale for the new recommendation, which would open up new revenue opportunities that legacy systems today do not provide. Providing the rationales can create a more personalized relationship with clients and further increase the acceptance of the recommended product. Here is an example of a data architecture that is used to achieve this.
MongoDB Atlas Vector Search is a feature that allows you to perform semantic search and generative AI over any type of data. It integrates your operational database and vector search in a single, unified, and fully managed platform with a MongoDB native interface. You can create vector embeddings with machine learning models, then store and index them in MongoDB Atlas for retrieval augmented generation (RAG), semantic search, recommendation engines, dynamic personalization, and other use cases. Visit the Atlas Vector Search Quick Start guide to try our semantic search tool now.
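A query against a vector index is expressed as an aggregation pipeline whose first stage is `$vectorSearch`. The sketch below only builds that pipeline as a Python structure. The index name `vector_index`, the `embedding` field, the projected `name` and `description` fields, and the three-dimensional query vector are hypothetical placeholders. A real query vector would come from the same embedding model used to index the documents.

```python
def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 path="embedding", limit=3, num_candidates=100):
    """Return an aggregation pipeline that uses the $vectorSearch stage."""
    if num_candidates < limit:
        raise ValueError("num_candidates should be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,              # name of the Atlas Vector Search index
                "path": path,                     # document field holding the embedding
                "queryVector": list(query_vector),
                "numCandidates": num_candidates,  # neighbors considered before ranking
                "limit": limit,                   # documents returned
            }
        },
        {
            "$project": {
                "_id": 0,
                "name": 1,
                "description": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


pipeline = build_vector_search_pipeline([0.12, -0.03, 0.88])
# With PyMongo, this would be executed as: results = collection.aggregate(pipeline)
print(pipeline[0]["$vectorSearch"]["limit"])
```

Raising `numCandidates` relative to `limit` generally trades some latency for better recall in the approximate nearest-neighbor search.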
Retrieval-augmented generation (RAG) is a paradigm that uses vector search to retrieve relevant documents based on the input query. It then provides these retrieved documents as context to the LLMs to help generate a more informed and accurate response.
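The retrieve-then-generate flow can be illustrated without any external service. The sketch below is a toy stand-in: it ranks documents by cosine similarity in memory instead of calling a vector index, uses made-up three-dimensional embeddings and card products, and stops at building the prompt that would be sent to an LLM. All names and data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, documents, k=2):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(query_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(question, retrieved):
    """Place the retrieved documents in the prompt as context for the LLM."""
    context = "\n".join(f"- {d['name']}: {d['description']}" for d in retrieved)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer using only the context."


# Toy product catalog with made-up 3-dimensional embeddings.
cards = [
    {"name": "Travel Card", "description": "Air miles, no foreign transaction fees.",
     "embedding": [0.9, 0.1, 0.0]},
    {"name": "Cashback Card", "description": "2% cashback on groceries.",
     "embedding": [0.1, 0.9, 0.1]},
    {"name": "Secured Card", "description": "Low limit, helps build credit history.",
     "embedding": [0.0, 0.2, 0.9]},
]

query_embedding = [0.05, 0.3, 0.95]  # stands in for an embedded user profile or question
top = retrieve(query_embedding, cards, k=2)
rag_prompt = build_rag_prompt("Which card suits a customer with a thin credit file?", top)
print(rag_prompt)
```

In a real deployment, the `retrieve` step would be replaced by a vector search query against the database, while the prompt assembly would stay largely the same.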
The tutorial above mentions technologies that can be used to solve a credit card product recommendation use case. The steps involved in the process are described below:
Here, you can find the code and examples of alternative product recommendations. Below are a few examples. The code to generate a product recommendation and customize its description can be written in Python in a Jupyter Notebook.
In conclusion, credit scoring is undergoing a transformative phase with the integration of gen AI. As we explore the dynamics of traditional models, challenges faced by borrowers, and the future envisioned with generative AI, it becomes evident that transparency, efficiency, and personalization are at the forefront of the evolving credit scoring landscape. The synergy of technology and financial acumen is shaping a future where credit decisions are not only accurate but also empowering for borrowers.
The code demonstrating the MongoDB features used to build this solution is available in the following GitHub repositories:
The proposed solution's functional and nonfunctional features include:
Create this demo by following the instructions and associated models in this solution’s repository.
Learn how MongoDB’s developer data platform supports a wide range of use cases in the lending and leasing space.
Explore how gen AI can be leveraged to mitigate bias and improve accuracy of credit scoring.
Discover how Toyota is financing the next generation of mobility services with MongoDB.