Jack Yallop


Enhancing Retail with Retrieval-Augmented Generation (RAG)

In the rapidly evolving retail landscape, tech innovations are reshaping how businesses operate and interact with customers. Generative AI could add up to $275 billion of profit to the apparel, fashion, and luxury sectors by 2028, according to McKinsey analysis. One of the most promising developments in this realm is retrieval-augmented generation (RAG), a powerful application of artificial intelligence (AI) that combines the strength of data retrieval with generative capabilities to supercharge retail enterprises. RAG offers compelling advantages specifically tailored for retailers looking to enhance their operations and customer engagement, from personalization to enhanced efficiency. Let’s delve into how RAG is revolutionizing the retail sector.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Why RAG in retail

Imagine a customer walks into your store, and based on their previous opt-in online interactions, your technology recognizes their preferences and seamlessly guides them through a personalized service, a feat made possible by RAG. Central to RAG’s effectiveness is its ability to integrate and analyze diverse data sources scattered across data warehouses. This integration enables retailers to gain comprehensive insights into their business performance, understand consumer behavior patterns, and make data-driven decisions swiftly.

Below are some of the compelling advantages that RAG can offer:

Personalization: RAG enables retailers to deliver highly personalized customer experiences by leveraging AI to understand and predict individual preferences based on past interactions.
Operational efficiency: By integrating diverse data sources and optimizing processes like supply chain management, RAG helps retailers streamline operations, reduce costs, and improve overall efficiency. For example, RAG aids in tracking shipments and optimizing logistics, a traditional pain point in the industry.
Data utilization: RAG allows retailers to harness the power of big data by integrating and analyzing disparate data sources, providing actionable insights for informed decision-making.
Customer engagement: RAG facilitates proactive customer engagement strategies through features like autonomous recommendation engines and hyper-personalized marketing campaigns, thereby increasing customer satisfaction and loyalty.

In essence, RAG empowers retailers to harness AI's full potential to deliver superior customer experiences, optimize operations, and maintain a competitive edge in the dynamic retail landscape. But without a clear roadmap, even the most sophisticated AI solutions can falter. By pinpointing specific challenges, such as optimizing inventory management or enhancing customer service, retailers can leverage RAG to tailor solutions that deliver measurable business outcomes.

Despite its transformative potential, retailers must first be AI-ready and able to integrate it in a way that enhances operational efficiency without overwhelming existing systems. To achieve this, retailers need to address data silos, ensure data privacy, and establish robust ethical guidelines for AI use. According to a Workday Global Survey, only 4% of respondents said their data is fully accessible, and 59% said their enterprise data is somewhat or completely siloed. Without a solid data foundation, retailers will struggle to achieve the benefits they are looking for from AI.

Embracing the future of retail with RAG and MongoDB

By harnessing the power of data integration, precise use case definition, and cutting-edge AI technologies like RAG, retail enterprises can not only streamline operations but also elevate customer experiences to unprecedented levels of personalization and efficiency. Building a gen AI operational data layer (ODL) enables retailers to make the most of their AI-enabled applications.
A data layer is an architectural pattern that centrally integrates and organizes siloed enterprise data, making it available to consuming applications. As shown below in Figure 1, pulling data into a single database eliminates data silos, centralizes data management, and improves data integrity. Using MongoDB Atlas to unify structured and unstructured operational data offers a cohesive solution by centralizing all data management in a scalable, cloud-based platform. This unification simplifies data management, enhances data consistency, and improves the efficiency of AI and machine learning workflows by providing a single source of truth. With a flexible data schema, retailers can accommodate any data structure, format, or source, which is critical for the 80% of real-world data that is unstructured.

Figure 1: Generative AI data layer

As AI continues to evolve, the retail industry is poised to see rapid advancements, driven by the innovative use of technologies like RAG. The future of retail lies in seamlessly integrating data and AI to create smarter, more responsive business models.

If you would like to learn more about RAG for Retail, visit the following resources:

Presentation: Retrieval-Augmented Generation (RAG) to Supercharge Retail Enterprises
White Paper: Enhancing Retail Operations with AI and Vector Search: The Business Case for Adoption
The MongoDB Solutions Library is curated with tailored solutions to help developers kick-start their projects.

Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the MongoDB and DeepLearning.AI course "Prompt Compression and Query Optimization" for free today.
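To make the retrieve-then-generate idea behind RAG more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the implementation behind this post: the catalog, the `embed`, `retrieve`, and `build_prompt` helpers, and the word-count "embedding" are all hypothetical stand-ins. A production system would likely use a real embedding model and a vector search service over the operational data layer, and would send the resulting prompt to an LLM, which this sketch does not do.

```python
import math
import re
from collections import Counter

# Hypothetical in-memory catalog standing in for documents in a data layer.
CATALOG = [
    {"sku": "A1", "text": "waterproof hiking boots for winter trails"},
    {"sku": "B2", "text": "lightweight running shoes for road races"},
    {"sku": "C3", "text": "insulated winter jacket with waterproof shell"},
]


def embed(text):
    """Toy bag-of-words vector (word counts); a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Retrieval step: return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(query_vec, embed(d["text"])), reverse=True)
    return ranked[:k]


def build_prompt(query, docs):
    """Augmentation step: place the retrieved documents in the prompt as context."""
    context = "\n".join(f"- [{d['sku']}] {d['text']}" for d in docs)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    question = "waterproof boots for winter hiking"
    print(build_prompt(question, retrieve(question, CATALOG)))
```

The generation step is left out on purpose: the point of the sketch is that the model only sees what retrieval puts in front of it, which is why the quality and accessibility of the underlying data matter so much.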

July 30, 2024

The Converged AI and Application Datastore for Insurance

In the inherently information-driven insurance industry, companies ingest, analyze, and process massive amounts of data, requiring extensive decision-making. To manage this, they rely on a myriad of technologies and IT support staff to keep operations running smoothly, but these technologies are often outdated and therefore ineffective. Artificial intelligence (AI) holds great promise for insurers by streamlining processes, enhancing decision-making, and improving customer experiences with significantly less time, resources, and staff compared with traditional IT systems. The convergence of AI and innovative application datastores is transforming how insurers work with data. In this post, we’ll look at how these elements are reshaping the insurance industry and offering greater potential for AI-powered applications, with MongoDB at the heart of the converged AI and application datastore.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Scenario planning and flexible data layers

One of the primary concerns for IT leaders and decision-makers in the insurance industry is making smart technology investments. The goal is to consolidate existing technology portfolios, which often include a variety of systems like SQL Server, Oracle, and IBM IMS. Consolidation helps reduce inventory and prepare for the future. But what does future-proofing really look like?

Scenario planning is an effective strategy for future-proofing. This involves imagining different plausible futures and investing in the common elements that remain beneficial across all scenarios. For insurance companies, a crucial common thread is the data layer. By making data easier to work with, companies can ensure that their technology investments remain valuable regardless of how future scenarios unfold. MongoDB’s flexible developer data platform offers a distinct architectural advantage by making data easier to work with, regardless of the cloud vendor or AI application in use.
This flexibility is vital for preparing for disruptive future scenarios, whether they involve regulatory changes, market shifts, or technological advancements.

Watch now: The Converged AI and Application Datastore: How API's, AI & Data are Reshaping Insurance

The role of AI and data in insurance

Generative AI is revolutionizing the insurance sector, offering new ways to manage and utilize data. According to Celent's 2023 Technology Insight and Strategy Survey, 33% of companies across different industries have AI projects in planning, 29% in development, and 19% in production (shown in Figure 1 below). This indicates a significant shift towards AI-driven solutions by insurers actively experimenting with gen AI.

Figure 1: Celent Technology Insight and Strategy Survey 2023

However, there's tension between maintaining existing enterprise systems and innovating with AI. Insurance companies must balance keeping the lights on with investing in AI to meet the expectations of boards and stakeholders. The solution lies in integrating AI in a way that enhances operational efficiency without overwhelming existing systems. However, data challenges need to be addressed to achieve this, specifically around access to data. According to a Workday Global Survey, only 4% of respondents said their data is fully accessible, and 59% said their enterprise data is somewhat or completely siloed. Without a solid data foundation, insurers will struggle to achieve the benefits they are looking for from AI.

Data architectures and unstructured data

When adopting advanced technologies like AI and ML, which require data as the foundation, organizations often grapple with the challenge of integrating these innovations into legacy systems due to their inflexibility and resistance to modification. A robust data architecture is essential for future-proofing and consolidating technology investments.
Insurance companies often deal with a vast amount of unstructured data, such as claim images and videos, which can be challenging to manage. By leveraging AI, specifically through vector search and large language models, companies can efficiently process and analyze this data. MongoDB is ideal for managing unstructured data due to its flexible, JSON-like document model, which accommodates a wide variety of data types and structures without requiring a predefined schema. Additionally, MongoDB’s flexibility enables insurers to integrate seamlessly with various technologies, making it a versatile and powerful solution for unstructured data management.

For example, consider an insurance adjuster assessing damage from claim photos. Traditionally, this would require manually reviewing each image. With AI, the photos can be converted into vector embeddings and matched against a database of similar claims, drastically speeding up the process. This not only improves efficiency but also enhances the accuracy of assessments.

The converged AI and application datastore with MongoDB

Building a single view of data across various systems is a game-changer for the insurance industry. Data warehouses and data lakes have long provided single views of customer and claim data, but they often rely on historical data, which may be outdated. The next step is integrating real-time data with these views to make them more dynamic and actionable.

A versatile database platform plays a crucial role in this integration. By consolidating data into a single, easily accessible view, insurance companies can ensure that various personas, from underwriters to data scientists, can interact with the data effectively. This integration allows for more responsive and informed decision-making, which is crucial for staying competitive in a rapidly evolving market. This can be achieved with a converged AI and application datastore, as shown in Figure 2 below.
This is where operational data, analytics insights, and unstructured data become operationally ready for the applications that leverage AI.

Figure 2: Converged AI and application datastore reference architecture

The convergence of AI, data, and application datastores is reshaping the insurance industry. By making smart technology investments, leveraging AI to manage unstructured data, and building robust data architectures, insurance companies can future-proof their operations and embrace innovation. A versatile and flexible data platform provides the foundation for these advancements, enabling companies to make their data more accessible, actionable, and valuable.

The MongoDB Atlas developer data platform puts powerful AI and analytics capabilities directly in the hands of developers and offers the capabilities to enrich applications by consolidating, ingesting, and acting on any data type instantly. Because MongoDB, with its flexible document model, serves as the operational data store (ODS), insurers can efficiently handle large volumes of data in real time. By integrating MongoDB with AI/ML platforms, insurers can develop models trained on the most accurate and up-to-date data, thereby addressing the critical need for adaptability and agility in the face of evolving technologies.

With built-in security controls across all data, whether managed in a customer environment or through MongoDB Atlas, a fully managed cloud service, MongoDB ensures robust security with features such as authentication (single sign-on and multi-factor authentication), role-based access controls, and comprehensive data encryption. These security measures act as a safeguard for sensitive data, mitigating the risk of unauthorized access from external parties and providing organizations with the confidence to embrace AI and ML technologies.
If you would like to learn more about the convergence of AI and application datastores, visit the following resources:

Video: The Converged AI and Application Datastore: How API's, AI & Data are Reshaping Insurance
Paper: Innovation in Insurance with Artificial Intelligence

Head over to our quick-start guide to get started with Atlas Vector Search today.

Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the DeepLearning.AI course "Prompt Compression and Query Optimization" for free today.

July 18, 2024

Anti-Money Laundering and Fraud Prevention With MongoDB Vector Search and OpenAI

Fraud and anti-money laundering (AML) are major concerns for both businesses and consumers, affecting sectors like financial services and e-commerce. Traditional methods of tackling these issues, including static, rule-based systems and predictive artificial intelligence (AI) methods, work but have limitations, such as a lack of context and the feature engineering overhead of keeping models relevant, which can be time-consuming and costly. Vector search can significantly improve fraud detection and AML efforts by addressing these limitations, representing the next step in the evolution of machine learning for combating fraud. Any organization that is already benefiting from real-time analytics will find that this breakthrough in anomaly detection takes fraud and AML detection accuracy to the next level. In this post, we examine how real-time analytics powered by Atlas Vector Search enables organizations to uncover deeply hidden insights before fraud occurs.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

The evolution of fraud and risk technology

Over the past few decades, fraud and risk technology have evolved in stages, with each stage building upon the strengths of previous approaches while also addressing their weaknesses:

Risk 1.0: In the early stages (the late 1990s to 2010), risk management relied heavily on manual processes and human judgment, with decision-making based on intuition, past experiences, and limited data analysis. Rule-based systems emerged during this time, using predefined rules to flag suspicious activities. These rules were often static and lacked adaptability to changing fraud patterns.

Risk 2.0: With the evolution of machine learning and advanced analytics (from 2010 onwards), risk management entered a new era. Predictive modeling techniques were employed to forecast future risks and detect fraudulent behavior. Systems were trained on historical data and became more integrated, allowing for real-time data processing and the automation of decision-making processes. However, these systems faced limitations such as:

Feature engineering overhead: Risk 2.0 systems often require manual feature engineering.
Lack of context: Risk 1.0 and Risk 2.0 may not incorporate a wide range of variables and contextual information.

Risk 2.0 solutions are often used in combination with rule-based approaches because rules cannot be avoided. Companies have their business- and domain-specific heuristics and other rules that must be applied. Here is an example fraud detection solution based on Risk 1.0 and Risk 2.0 with a rules-based and traditional AI/ML approach.

Risk 3.0: The latest stage (2023 and beyond) in fraud and risk technology evolution is driven by vector search. This advancement leverages real-time data feeds and continuous monitoring to detect emerging threats and adapt to changing risk landscapes, addressing the limitations of data imbalance, manual feature engineering, and the need for extensive human oversight while incorporating a wider range of variables and contextual information.

Depending on the particular use case, organizations can use these solutions individually or in combination to effectively manage and mitigate risks associated with fraud and AML. Now, let us look into how MongoDB Atlas Vector Search (Risk 3.0) can help enhance existing fraud detection methods.

How Atlas Vector Search can help

A vector database is an organized collection of information that makes it easier to find similarities and relationships between different pieces of data. By this definition, MongoDB itself is particularly well suited to the task, with no need for a standalone or bolt-on vector database.
The versatility of MongoDB’s developer data platform empowers users to store their operational data, metadata, and vector embeddings on MongoDB Atlas and seamlessly use Atlas Vector Search to index, retrieve, and build performant gen AI applications.

Watch how you can revolutionize fraud detection with MongoDB Atlas Vector Search.

The combination of real-time analytics and vector search offers a powerful synergy that enables organizations to discover insights that are otherwise elusive with traditional methods. MongoDB facilitates this through Atlas Vector Search integrated with OpenAI embedding, as illustrated in Figure 1 below.

Figure 1: Atlas Vector Search in action for fraud detection and AML

Business perspective: Fraud detection vs. AML

Understanding the distinct business objectives and operational processes driving fraud detection and AML is crucial before diving into the use of vector embeddings.

Fraud detection is centered on identifying unauthorized activities aimed at immediate financial gain through deceptive practices. The detection models, therefore, look for specific patterns in transactional data that indicate such activities. For instance, they might focus on high-frequency, low-value transactions, which are common indicators of fraudulent behavior.

AML, on the other hand, targets the complex process of disguising the origins of illicitly gained funds. The models here analyze broader and more intricate transaction networks and behaviors to identify potential laundering activities. For instance, AML could look at the relationships between transactions and entities over a longer period.

Creation of Vector Embeddings for Fraud and AML

Fraud and AML models require different approaches because they target distinct types of criminal activities. To accurately identify these activities, machine learning models use vector embeddings tailored to the features of each type of detection.
In the solution highlighted in Figure 1, vector embeddings for fraud detection are created using a combination of text, transaction, and counterparty data. Conversely, the embeddings for AML are generated from data on transactions, relationships between counterparties, and their risk profiles. The selection of data sources, including the use of unstructured data and the creation of one or more vector embeddings, can be customized to meet specific needs. This particular solution utilizes OpenAI for generating vector embeddings, though other software options can also be employed.

Historical vector embeddings are representations of past transaction data and customer profiles encoded into a vector format. The demo database is prepopulated with synthetically generated test data for both fraud and AML embeddings. In real-world scenarios, you can create embeddings by encoding historical transaction data and customer profiles as vectors.

Regarding the fraud and AML detection workflow, as shown in Figure 1, the aggregated fraud and AML text for each incoming transaction is used to generate embeddings using OpenAI. These embeddings are then analyzed using Atlas Vector Search based on the percentage of previous transactions with similar characteristics that were flagged for suspicious activity.

In Figure 1, the term "Classified Transaction" indicates a transaction that has been processed and categorized by the detection system. This classification helps determine whether the transaction is considered normal, potentially fraudulent, or indicative of money laundering, thus guiding further actions.

If flagged for fraud: The transaction request is declined.
If not flagged: The transaction is completed successfully, and a confirmation message is shown.

For rejected transactions, users can contact case management services with the transaction reference number for details. No action is needed for successful transactions.
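The classification rule described above (look up similar past transactions and check what share of them were flagged) can be sketched in a few lines of Python. This is a simplified illustration, not the solution's actual code: the three-dimensional vectors, the `HISTORY` list, the neighbor count `k`, and the 0.5 threshold are all hypothetical, and the brute-force cosine ranking stands in for the nearest-neighbor lookup that the solution performs with Atlas Vector Search over OpenAI embeddings.

```python
import math

# Hypothetical historical transactions: (embedding, was_flagged_as_suspicious).
# Real embeddings would come from an embedding model and have many more dimensions.
HISTORY = [
    ([0.9, 0.1, 0.0], True),
    ([0.8, 0.2, 0.1], True),
    ([0.7, 0.3, 0.0], False),
    ([0.1, 0.9, 0.2], False),
    ([0.0, 0.8, 0.3], False),
]


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def flagged_share(vector, history, k=3):
    """Share of the k most similar past transactions that were flagged."""
    nearest = sorted(history, key=lambda item: cosine(vector, item[0]), reverse=True)[:k]
    return sum(1 for _, flagged in nearest if flagged) / len(nearest)


def classify(vector, history, k=3, threshold=0.5):
    """Decline when the flagged share reaches the (assumed) threshold."""
    return "declined" if flagged_share(vector, history, k) >= threshold else "completed"


if __name__ == "__main__":
    incoming = [0.85, 0.15, 0.05]  # hypothetical embedding of an incoming transaction
    print(flagged_share(incoming, HISTORY), classify(incoming, HISTORY))
```

Both `k` and the threshold are tuning choices: a larger `k` smooths out noisy neighbors, while a lower threshold trades more false positives for fewer missed cases.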
Combining Atlas Vector Search for fraud detection

With the use of Atlas Vector Search with OpenAI embeddings, organizations can:

Eliminate the need for batch and manual feature engineering required by predictive (Risk 2.0) methods.
Dynamically incorporate new data sources to perform more accurate semantic searches, addressing emerging fraud trends.
Adopt this method for mobile solutions, as traditional methods are often costly and performance-intensive.

Why MongoDB for AML and fraud prevention

Fraud and AML detection require a holistic platform approach as they involve diverse data sets that are constantly evolving. Customers choose MongoDB because it is a unified data platform (as shown in Figure 2 below) that eliminates the need for niche technologies, such as a dedicated vector database. What’s more, MongoDB’s document data model incorporates any kind of data (any structure, whether structured, semi-structured, or unstructured; any format; any source) no matter how often it changes, allowing you to create a holistic picture of customers to better predict transaction anomalies in real time.

By incorporating Atlas Vector Search, institutions can:

Build intelligent applications powered by semantic search and generative AI over any type of data.
Store vector embeddings right next to your source data and metadata. Vectors inserted or updated in the database are automatically synchronized to the vector index.
Optimize resource consumption, improve performance, and enhance availability with Search Nodes.
Remove operational heavy lifting with the battle-tested, fully managed MongoDB Atlas developer data platform.

Figure 2: Unified risk management and fraud detection data platform

Given the broad and evolving nature of fraud detection and AML, these areas typically require multiple methods and a multimodal approach. Therefore, a unified risk data platform offers several advantages for organizations that are aiming to build effective solutions.
Using MongoDB, you can develop solutions for Risk 1.0, Risk 2.0, and Risk 3.0, either separately or in combination, tailored to meet your specific business needs. The concepts are demonstrated with two examples: a card fraud solution accelerator for Risk 1.0 and Risk 2.0 and a new Vector Search solution for Risk 3.0, as discussed in this blog. It's important to note that the vector search-based Risk 3.0 solution can be implemented on top of Risk 1.0 and Risk 2.0 to enhance detection accuracy and reduce false positives.

If you would like to discover more about how MongoDB can help you supercharge your fraud detection systems, take a look at the following resources:

Revolutionizing Fraud Detection with Atlas Vector Search
Card Fraud solution accelerator (Risk 1.0 and Risk 2.0)
Risk, AML, and Fraud detection solution GitHub repository

Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the DeepLearning.AI course "Prompt Compression and Query Optimization" for free today.

July 17, 2024

Building Gen AI-Powered Predictive Maintenance with MongoDB

In today’s fast-evolving industrial landscape, digital transformation has become a necessity. From manufacturing plants to connected vehicles, the push towards predictive maintenance excellence is driving organizations to embrace smarter, more efficient ways of managing operations. One of the most compelling advancements in this domain is predictive maintenance powered by generative AI, a cutting-edge approach that will revolutionize how industries maintain and optimize their equipment.

For manufacturers seeking maintenance excellence, a unified data store and a developer data platform are key enablers. These tools provide the foundation for integrating AI applications that can analyze sensor data, predict failures, and optimize maintenance schedules. MongoDB Atlas is the only multi-cloud developer data platform available that is designed to streamline and speed up developers' data handling. With MongoDB Atlas, developers can enhance end-to-end value chain optimization through AI/ML, advanced analytics, and real-time data processing, supporting cutting-edge mobile, edge, and IoT applications.

In this post, we’ll explore the basics of predictive maintenance and how MongoDB can be used for maintenance excellence.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Understanding the need for predictive maintenance

Predictive maintenance is about anticipating and addressing equipment failures before they occur, ensuring minimal disruption to operations. Traditional maintenance strategies, like time-based or usage-based maintenance, are less effective than predictive maintenance because they don’t account for the varying conditions and complexities of machinery. Unanticipated equipment breakdown can result in line stoppage and substantial throughput losses, potentially leading to millions of dollars in revenue loss.
Since the pandemic, many organizations have begun significant digital transformations to improve efficiency and resilience. However, a concerning gap exists between tech adoption and return on investment. While 89% of organizations have begun digital and AI transformations, only 31% have seen the expected revenue lift, and only 25% have realized the expected cost savings. These numbers highlight the importance of implementing new technologies strategically. Manufacturers need to carefully consider how AI can address their specific challenges and then integrate it into existing processes effectively.

Predictive maintenance boosts efficiency and saves money

Predictive maintenance uses data analysis to identify problems in machines before they fail. This allows organizations to schedule maintenance at the optimal time, maximizing machine reliability and efficiency. Indeed, according to Deloitte, predictive maintenance can lead to a variety of benefits, including:

3-5% reduction in new equipment costs
5-20% increase in labor productivity
15-20% reduction in facility downtime
10-30% reduction in inventory levels
5-20% reduction in carrying costs

Since the concept was introduced, predictive maintenance has constantly evolved. We've moved beyond basic threshold-based monitoring to advanced techniques like machine learning (ML) models. These models can not only predict failures but also diagnose the root cause, allowing for targeted repairs. The latest trend in predictive maintenance is automated strategy creation. This involves using AI to not only predict equipment breakdowns but also to generate repair plans, ensuring the right fixes are made at the right time.

Generative AI in predictive maintenance

To better understand how gen AI can be used to build robust predictive maintenance solutions, let's dig into the characteristics of organizations that have successfully implemented AI.
They exhibit common traits across five key areas:

Identifying high-impact value drivers and AI use cases: Efforts should be concentrated on domains where artificial intelligence yields maximal utility rather than employing it arbitrarily.
Aligning AI strategy with data strategy: Organizations must establish a strong data foundation with a data strategy that directly supports their AI goals.
Continuous data enrichment and accessibility: High-quality data, readily available and usable across the organization, is essential for the success of AI initiatives.
Empowering talent and fostering development: By equipping their workforce with training and resources, organizations can empower them to leverage AI effectively.
Enabling scalable AI adoption: Building a strong and scalable infrastructure is key to unlocking the full potential of AI by enabling its smooth and ongoing integration across the organization.

Implementing predictive maintenance using MongoDB Atlas

When combined with a robust data management platform like MongoDB Atlas, gen AI can predict failures with remarkable accuracy and suggest optimal maintenance schedules. MongoDB Atlas is the only multi-cloud developer data platform designed to accelerate and simplify how developers work with data. Developers can power end-to-end value chain optimization with AI/ML, advanced analytics, and real-time data processing for innovative mobile, edge, and IoT applications.

MongoDB Atlas offers a suite of features perfectly suited for building a predictive maintenance system, as shown in Figure 1 below. Its ability to handle both structured and unstructured data allows for comprehensive condition monitoring and anomaly detection.
Here’s how you can build a generative AI-powered predictive maintenance software using MongoDB Atlas: Machine prioritization: This stage prioritizes machines for the maintenance excellence program using a retrieval-augmented generation (RAG) system that takes in structured and unstructured data related to maintenance costs and past failures. Generative AI revolutionizes this process by reducing manual analysis time and minimizing investment risks. At the end of this stage, the organization knows exactly which equipment or assets are well-suited for sensorization. Utilizing MongoDB Atlas, which stores both structured and unstructured data, allows for semantic searches that provide accurate context to AI models. This results in precise machine prioritization and criticality analysis. Failure prediction: MongoDB Atlas provides the necessary tools to implement failure prediction, offering a unified view of operational data, real-time processing, integrated monitoring, and seamless machine learning integration. Sensors on machines, like milling machines, collect data (e.g., air temperature and torque) and process it through Atlas Stream Processing , allowing continuous, real-time data handling. This data is then analyzed by trained models in MongoDB, with results visualized using Atlas Charts and alerts pushed via Atlas Change Streams to mobile devices, establishing an end-to-end failure prediction system. Repair plan generation: To implement a comprehensive repair strategy, generating a detailed maintenance work order is crucial. This involves integrating structured data, such as repair instructions and spare parts, with unstructured data from machine manuals. MongoDB Atlas serves as the operational data layer, seamlessly combining these data types. By leveraging Atlas Vector Search and aggregation pipelines , the system extracts and vectorizes information from manuals and past work orders. 
This data feeds into a large language model (LLM), which generates the work order template, including inventory and resource details, resulting in an accurate and efficient repair plan. Maintenance guidance generation: Generative AI is used to integrate service notes and additional information with the repair plan, providing enhanced guidance for technicians. For example, if service notes in another language are found in the maintenance management system, we extract and translate the text to suit our application. This information is then combined with the repair plan using a large language model. The updated plan is pushed to the technician’s mobile app via Atlas Change Streams. The system generates step-by-step instructions by analyzing work orders and machine manuals, ensuring comprehensive guidance without manually sifting through extensive documents. Figure 1: Achieving end-to-end predictive maintenance with MongoDB Atlas Developer Data Platform In the quest for operational excellence, predictive maintenance powered by generative AI and MongoDB Atlas stands out as a game-changer. This innovative approach not only enhances the reliability and efficiency of industrial operations but also sets the stage for a future where AI-driven insights and actions become the norm. By leveraging the advanced capabilities of MongoDB Atlas, manufacturers can unlock new levels of performance and productivity, heralding a new era of smart manufacturing and connected systems. If you would like to learn more about generative AI-powered predictive maintenance, visit the following resources: [Video] How to Build a Generative AI-Powered Predictive Maintenance Software [Whitepaper] Generative AI in Predictive Maintenance Applications [Whitepaper] Critical AI Use Cases in Manufacturing and Motion: Realizing AI-powered innovation with MongoDB Atlas
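To make the repair plan generation step more concrete, here is a minimal sketch of how an aggregation pipeline with a `$vectorSearch` stage might retrieve manual excerpts for one machine. The index name, field names (`embedding`, `machine_id`, `text`, `source`), and the helper function are assumptions made for illustration; they are not taken from the solution described above.

```python
from typing import Any


def build_manual_search_pipeline(
    query_vector: list[float],
    machine_id: str,
    limit: int = 5,
) -> list[dict[str, Any]]:
    """Build a pipeline that retrieves the manual chunks closest to a query vector."""
    return [
        {
            "$vectorSearch": {
                "index": "manual_chunks_vector_index",  # hypothetical index name
                "path": "embedding",  # hypothetical field that stores the vectors
                "queryVector": query_vector,
                # Consider more candidates than we return to improve recall.
                "numCandidates": limit * 20,
                "limit": limit,
                # Hypothetical filter field; it must be declared in the index.
                "filter": {"machine_id": {"$eq": machine_id}},
            }
        },
        {
            "$project": {
                "_id": 0,
                "text": 1,
                "source": 1,
                # Expose the similarity score computed by the search stage.
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]
```

With PyMongo, the returned list could be passed to `collection.aggregate(...)`, and the resulting excerpts added to the LLM prompt that drafts the work order.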

June 27, 2024

Payments Modernization and the Role of the Operational Data Layer

To stay relevant and competitive, payment solution providers must enhance their payment processes to adapt to changing customer expectations, regulatory demands, and advancing technologies. The imperative for modernization is clear: payment systems must become faster, more secure, and seamlessly integrated across platforms. Driven by multiple factors—real-time payments, regulatory shifts like Payment Services Directive 2 (PSD2), heightened customer expectations, the power of open banking, and the disruptive force of fintech startups—the need for payment modernization has never been more pressing. But transformation is not without its challenges. Complex systems, industry reliance on outdated technology, high upgrade costs, and technical debt all pose formidable obstacles. This article will explore modernization approaches and how MongoDB helps smooth transformations. Approaches to modernization As businesses work to modernize their payment systems, they need to overcome the complexities inherent in updating legacy systems. Forward-thinking organizations embrace innovative strategies to streamline their operations, enhance scalability, and facilitate agile responses to evolving market demands. Two such approaches gaining prominence in the realm of payment system modernization are domain-driven design and microservices architecture : Domain-driven design: This approach focuses on a business's core operations to develop scalable and easier-to-manage systems. Domain-driven design ensures that technology serves strategic business goals by aligning system development with business needs. At its core, this approach seeks to break down complex business domains into manageable components, or "domains," each representing a distinct area of business functionality. 
Microservices architecture: Unlike traditional monolithic architectures, characterized by tightly coupled and interdependent components, a microservices architecture decomposes applications into a collection of loosely coupled services, each of which is responsible for a specific business function or capability. It introduces more flexibility and allows for quicker updates, facilitating agile responses to changing business requirements. Discover how Wells Fargo launched their next-generation card payments by building an operational data store with MongoDB . Modernizing with an operational data layer In the payments modernization process, the significance of an operational data layer (ODL) cannot be overstated. An ODL is an architectural pattern that centrally integrates and organizes siloed enterprise data, making it available to consuming applications. The simplest representation of this pattern looks something like the sample reference architecture below. Figure 1: Operational Data Layer structure An ODL is deployed in front of legacy systems to enable new business initiatives and to meet new requirements that the existing architecture can’t handle—without the difficulty and risk of fully replacing legacy systems. It can reduce the workload on source systems, improve availability, reduce end-user response times, combine data from multiple systems into a single repository, serve as a foundation for re-architecting a monolithic application into a suite of microservices, and more. The ODL becomes a system of innovation, allowing the business to take an iterative approach to digital transformation. Here's why an ODL is considered ideal for payment operations: Unified data management: Payment systems involve handling a vast amount of diverse data, including transaction details, customer information, and regulatory compliance data. An ODL provides a centralized repository for storing and managing this data, eliminating silos and ensuring data integrity. 
Real-time processing: An ODL enables real-time processing of transactions, allowing businesses to handle high numbers of transactions swiftly and efficiently. This capability is essential for meeting customer expectations for instant payments and facilitating seamless transactions across various channels. Scalability and flexibility: Payment systems must accommodate fluctuating transaction volumes and evolving business needs. An ODL offers scalability and flexibility, allowing businesses to scale their infrastructure as demand grows. Enhanced security: An ODL incorporates robust security features —such as encryption, access controls, and auditing capabilities—to safeguard data integrity and confidentiality. By centralizing security measures within the ODL, businesses can ensure compliance with regulatory requirements and mitigate security risks effectively. Support for payments data monetization: Payment systems generate a wealth of data that can provide valuable insights into customer behavior, transaction trends, and business performance. An ODL facilitates real-time analytics and reporting by providing a unified platform for collecting, storing, and analyzing this data. Transform with MongoDB MongoDB’s fundamental technology principles ensure companies can reap the advantages of microservices and domain-driven design—specifically, our flexible data model and built-in redundancy, automation, and scalability. Indeed, the document model is tailor-made for the intricacies of payment data, ensuring adaptability and scalability as market demands evolve. Here’s how MongoDB helps with domain-driven design and microservice implementation to adopt industry best practices: Ease of use: MongoDB’s document model makes it simple to model or remodel data to fit the needs of payment applications. Documents are a natural way of describing data. 
They present a single data structure, with related data embedded as sub-documents and arrays, making it simpler and faster for developers to model how data in the application will be mapped to data stored in the database. In addition, MongoDB guarantees the multi-record ACID transactional semantics that developers are familiar with, making it easier to reason about data. Flexibility: MongoDB’s dynamic schema is ideal for handling the requirements of microservices and a domain-driven design. Domain-driven design emphasizes modeling the domain to reflect the business requirements, which may evolve over time. MongoDB's flexible schema allows you to store domain objects as documents without rigid schema constraints, facilitating agile development and evolution of the domain model. Speed: Using MongoDB for an ODL means you can get better performance when accessing data, and write less code to do so. A document is a single place for the database to read and write data for an entity. This locality of data ensures the complete document can be accessed in a single database operation that avoids the need internally to pull data from many different tables and rows. Data access and microservice-based APIs: MongoDB integrates seamlessly with modern technologies and frameworks commonly used in microservices architectures. MongoDB's flexible data model and ability to handle various data types, including structured and unstructured data, is a great fit for orchestrating your open API ecosystem to make data flow between banks, third parties, and consumers possible. Scalability: Even if an ODL starts at a small scale, you need to be prepared for growth as new source systems are integrated, adding data volume, and new consuming systems are developed, increasing workload. MongoDB provides horizontal scale-out on low-cost, commodity hardware or cloud infrastructure using sharding to meet the needs of an ODL with large data sets and high throughput requirements. 
High availability: Microservices architectures require high availability to ensure that individual services remain accessible even in the event of failures. MongoDB provides built-in replication and failover capabilities, ensuring data availability and minimal downtime in case of server failures. Payment modernization is not merely a trend but a strategic imperative. By embracing modern payment solutions and leveraging the power of an ODL with MongoDB, organizations can unlock new growth opportunities, enhance operational efficiency, and deliver superior customer experiences. Learn how to build an operational data layer with MongoDB using this Payments Modernization Solution Accelerator. Learn more about how MongoDB is powering industries on our solution library.
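As an illustration of the document model described above, the following sketch merges records from three siloed source systems into a single payment document with related data embedded as sub-documents and arrays. The source systems, field names, and the `build_odl_payment_document` helper are all hypothetical; a real ODL would define its own schema.

```python
from typing import Any


def _without_key(record: dict[str, Any], key: str) -> dict[str, Any]:
    """Return a copy of the record without the given key."""
    return {k: v for k, v in record.items() if k != key}


def build_odl_payment_document(
    core_banking_record: dict[str, Any],
    card_system_events: list[dict[str, Any]],
    compliance_checks: list[dict[str, Any]],
) -> dict[str, Any]:
    """Merge records from three hypothetical source systems into one document."""
    payment_id = core_banking_record["payment_id"]
    return {
        "_id": payment_id,
        "amount": {
            "value": core_banking_record["amount"],
            "currency": core_banking_record["currency"],
        },
        "customer": {
            "id": core_banking_record["customer_id"],
            "name": core_banking_record["customer_name"],
        },
        # Related data is embedded, so one read returns the whole entity.
        "events": [
            _without_key(e, "payment_id")
            for e in card_system_events
            if e["payment_id"] == payment_id
        ],
        "compliance": [
            _without_key(c, "payment_id")
            for c in compliance_checks
            if c["payment_id"] == payment_id
        ],
    }
```

Because everything about a payment lives in one document, a consuming service can fetch it in a single operation instead of joining rows from several tables.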

May 15, 2024

From Relational Databases to AI: An Insurance Data Modernization Journey

Imagine you’re a data architect, a developer, or a data engineer at an insurance company. Management has asked you and your team to build a new AI claim adjustment system, a customer-facing LLM-powered chatbot, and an application to streamline the underwriting process. However, doing so is far from straightforward due to the challenges you face on a daily basis. The bulk of your time is spent navigating your company’s outdated legacy systems, which were built in the 1970s and 1980s. Some of these legacy platforms were written in COBOL and CICS, and today very few people on your team know how to develop and maintain those technologies. Moreover, the data models you work with are another source of frustration. Every interaction with them is a reminder of the intricate structures that have evolved over time, making data manipulation and analysis a nightmare. In sum, legacy systems are preventing your team, and your company, from innovating and keeping up with both your industry and customer demands. Whether you want to modernize your legacy systems to improve operational efficiency and boost developer productivity, or build AI-powered apps that integrate with large language models (LLMs), MongoDB has a solution for that. In this post, we’ll walk you through a journey that starts with refactoring a relational data model into MongoDB collections, continues with vectorizing and querying unstructured data, and ends with retrieval-augmented generation (RAG): asking LLMs questions about data in natural language. Identifying, modernizing, and storing the data Our journey starts with an assessment of the data sources we want to work with. As shown below, we can bucket the data into three different categories: Structured legacy data: Tables of claims, coverages, billings, and more. Is your data locked in rigid relational schemas? 
This tutorial is a step-by-step guide on how to migrate a real-life insurance relational model with the help of MongoDB Relational Migrator, refactoring 21 tables to only five MongoDB collections. Structured data (JSON): You might have files of policies, insurance products, or forms in JSON format. Check out our docs to learn how to insert those into a MongoDB collection. Unstructured data (PDFs, audio, images, etc.): If you need to create and store a numerical representation (vector embedding) of, for instance, claim-related photos of accidents or PDFs of policy guidelines, have a look at this blog, which walks you through generating embeddings of pictures of car crashes and persisting them alongside existing fields in a MongoDB collection. Figure 1: Storing different types of data into MongoDB Regardless of the original format or source, our data has now landed in MongoDB Atlas, in what we call a Converged AI Data Store: a platform that centrally integrates and organizes enterprise data, including vectors, and enables the development of ML- and AI-powered applications. Accessing, experimenting, and interacting with the data It’s time to put the data to work. The Converged AI Data Store unlocks a plethora of use cases and efficiency gains, both for the business and for developers. The next step of the journey is about the different ways we can interact with our data: Database and Full Text Search: Learn how to run database queries, starting from the basics and moving up to advanced features such as facets, fuzzy search, autocomplete, highlighting, and more with Atlas Search. Vector Search: We can finally leverage unstructured data. The Image Search blog we mentioned earlier also explains how to create a Vector Search index and run vector queries against embeddings of photos. 
RAG: By combining Vector Search with the power of LLMs, we can interact with our data in natural language (see Figure 2 below), asking complex questions and getting detailed answers. Follow this tutorial to become a RAG expert. Figure 2: Retrieval-augmented generation (RAG) diagram where we dynamically combine our custom data with the LLM to generate reliable and relevant outputs Having explored the different ways we can ask questions of the data, we have reached the end of our journey. You are now ready to modernize your company’s systems and keep up with the business’s demands. What will you build next? If you would like to discover more about Converged AI and Application Data Stores with MongoDB, take a look at the following resources: AI, Vectors, and the Future of Claims Processing: Why Insurance Needs to Understand The Power of Vector Databases Build a ML-Powered Underwriting Engine in 20 Minutes with MongoDB and Databricks
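The RAG step described above can be sketched in a few lines of plain Python. This is a toy illustration, not the tutorial's implementation: it ranks in-memory documents by cosine similarity instead of calling Atlas Vector Search, it assumes non-zero embedding vectors, and the function names and prompt wording are invented for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vector: list[float], documents: list[dict], k: int = 2) -> list[dict]:
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(query_vector, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context_docs: list[dict]) -> str:
    """Combine the retrieved text with the question into a grounded prompt."""
    context = "\n".join(f"- {d['text']}" for d in context_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )
```

The prompt returned by `build_prompt` would then be sent to an LLM of your choice; in a production setup the `retrieve` step would be replaced by a vector query against the database.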

March 14, 2024

Reducing Bias in Credit Scoring with Generative AI

Credit scoring plays a pivotal role in determining who gets access to credit and on what terms. Despite its importance, traditional credit scoring systems have long been plagued by critical issues: bias and discrimination, limited data consideration, and scalability challenges. For example, a study of US loans showed that minority borrowers were charged higher interest rates (+8%) and had their loan applications rejected more often (+14%) than borrowers from more privileged groups. Because these systems are rigid, they can be slow to adapt to changes in the economy and in consumer behavior, which harms the individuals they overlook. To overcome this, banks and other lenders are looking to adopt artificial intelligence to develop increasingly sophisticated credit scoring models. This article explores the fundamentals of credit scoring and the challenges current systems present, and examines how artificial intelligence (AI), in particular generative AI (GenAI), can reduce bias and improve accuracy. From the incorporation of alternative data sources to the development of machine learning models, we'll uncover how AI can reshape credit scoring. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. What is credit scoring? Credit scoring is an integral part of the financial landscape. It serves as a numerical gauge of an individual's creditworthiness. Lenders use this key metric to evaluate the risk of extending credit to individuals or businesses. 
Traditionally, banks rely on predefined rules and statistical models, often built using linear regression or logistic regression. These models are based on historical credit data and focus on factors such as payment history, credit utilization ratio, and length of credit history. However, assessing new credit applicants is a challenge, which creates a need for more accurate profiling. To serve underserved or unserved segments that have historically been discriminated against, fintechs and digital banks are increasingly supplementing traditional credit history with alternative data to build a more complete picture of an individual's financial behavior. Challenges with the traditional system Credit scores are part of everyday life because they are a determining factor in many financial transactions, such as securing a loan, renting an apartment, obtaining insurance, and even hiring processes. Because seeking credit can be an obstacle course, here are some of the challenges or limitations of traditional credit scoring models that often stand in the way of obtaining credit. Limited credit history: Many individuals, especially those who are new to credit, face a significant hurdle: limited or nonexistent credit history. Traditional credit scoring models rely heavily on past credit behavior, so individuals without a robust credit history find it harder to prove their creditworthiness. Roughly 45 million Americans have no credit score simply because that data does not exist for them. 
Irregular income: Irregular income, typical of part-time or freelance work, poses a challenge for traditional credit scoring models, which associate higher risk with these individuals and may therefore deny or restrict credit. For the US in 2023, data sources differ on the number of self-employed workers. According to one source, more than 27 million Americans filed Schedule C tax forms, which cover the profit or loss of a business. Different assessment methods are therefore needed for these workers. High utilization of existing credit: Heavy reliance on existing credit is often seen as a sign of potential financial strain, which influences lending decisions. Credit applications may be rejected or approved on less favorable terms, reflecting concerns about the applicant's ability to manage the repayment of additional credit responsibly. Lack of clarity in rejection reasons: Understanding the reasons behind a rejection allows applicants to address the root causes. In the UK, a study covering April 2022 to April 2023 found that the main reasons for rejection were "poor credit history" (38%), "inability to repay" (28%), and "too much existing credit" (19%); 10% of respondents said they were not told why their application was rejected. Even when explanations are given, they are often too vague, leaving applicants in the dark. That makes it hard for them to address the root causes and improve their creditworthiness for future applications. The lack of transparency is not only a problem for customers; it can also penalize banks. 
For example, in 2023, a Berlin bank was fined for a lack of transparency when it rejected a credit card application. Lack of flexibility: Shifts in consumer behavior, especially among younger generations who prefer digital transactions, challenge traditional models. Factors such as the rise of the gig economy, non-traditional employment, student loan debt, and the high cost of living complicate the assessment of income stability and financial health. Traditional credit risk predictions fall short during unprecedented disruptions such as the COVID-19 pandemic, because scoring models do not account for them. Recognizing these challenges highlights the need for alternative models that can adapt to evolving financial behaviors, handle non-traditional data sources, and provide a more complete and accurate assessment of creditworthiness in a constantly changing financial landscape. Using alternative data This approach uses non-traditional data sources (also called "alternative data") and methods to assess an individual's creditworthiness. While the traditional system relies heavily on credit history from the major credit bureaus, the alternative approach incorporates a broader range of factors to better capture an individual's financial behavior. 
Here are some of the most widely used alternative data sources: Utility payments: Beyond credit history, consistent payment of utilities such as water, gas, and electricity shows whether the applicant meets financial obligations and provides crucial information beyond traditional metrics. Rent payments: For those without a mortgage, rent payment records are a key alternative data source. Showing that the applicant pays rent regularly and on time gives a fuller picture of financial responsibility and reliability. Mobile phone usage: The ubiquity of mobile phones opens up a wealth of alternative data. Analyzing call and text patterns provides information about an individual's network, stability, and social connections, which is valuable input for credit scoring. Online shopping habits: Analyzing the frequency, type, and amount of online purchases provides valuable information about spending behavior and a more nuanced understanding of financial habits. Educational and employment background: Alternative credit scoring takes an individual's educational and employment history into account. Positive signals, such as strong academic results and stable employment, play a major role in assessing financial stability. These alternative data sources represent a shift toward a more inclusive, nuanced, and holistic approach to credit assessment. As financial technology continues to advance, using these alternative data sets helps ensure a more complete assessment of creditworthiness, marking a pivotal step in the evolution of credit scoring models. 
Using artificial intelligence Using AI also makes it possible to address the challenges of traditional systems, for several reasons: Ability to reduce bias: Like traditional statistical models, AI models, including large language models (LLMs), that are trained on biased historical data will carry over the biases present in that data, leading to discriminatory outcomes. LLMs may focus on certain features more than others, or may not always understand the broader context of an individual's financial situation, which results in biased decisions. However, there are several techniques for mitigating this problem: Mitigation strategies: Efforts begin with using diverse and representative training data to avoid reinforcing existing biases. Inadequate or ineffective mitigation strategies can cause biased outcomes to persist in AI credit scoring models. To address this, careful attention to the data collected and to model development is essential. Incorporating alternative data plays a key role in reducing bias. Rigorous bias detection tools, fairness constraints, and regularization techniques during model training improve reliability. Adjusting feature representation and applying post-processing techniques and specialized algorithms also help reduce bias. Inclusive model evaluation, continuous monitoring, and iterative improvement, combined with adherence to ethical guidelines and governance practices, complete a multi-faceted approach to reducing bias in AI models. 
This is particularly important for addressing concerns about demographic or socioeconomic biases that may be present in historical credit data. Regular bias audits: Conduct regular audits to identify and mitigate biases in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly. Transparency and explainability: Improve the transparency and explainability of LLMs to understand how decisions are made. This can help identify and correct biased decision-making processes. Trade Ledger, a lending software-as-a-service (SaaS) platform, uses a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source. Ability to analyze vast and diverse datasets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a myriad of information, including non-traditional data sources. The goal is a more complete assessment of an individual's creditworthiness, ensuring that a broader range of financial behaviors is taken into account. Unmatched adaptability: As economic conditions change and consumer behaviors evolve, AI-powered models can quickly adapt and learn from new data. Continuous learning ensures that credit scoring stays relevant and effective in a constantly changing financial sector. 
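The regular bias audits mentioned above can start with something as simple as comparing approval rates across groups. The sketch below is one possible way to do that, under assumptions of our own: decisions arrive as `(group, approved)` pairs, and the disparity is summarized as the ratio of the lowest to the highest approval rate. The function names and the input format are illustrative, and any threshold for flagging a disparity would have to be chosen by the lender.

```python
from collections import defaultdict
from typing import Iterable


def approval_rates(decisions: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """Compute the share of approved applications for each group."""
    totals: dict[str, int] = defaultdict(int)
    approved: dict[str, int] = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates: dict[str, float]) -> float:
    """Ratio of the lowest to the highest approval rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        # Nobody was approved in any group, so there is no disparity to report.
        return 1.0
    return min(rates.values()) / highest
```

A ratio well below 1.0 signals that one group is approved much less often than another, which is a prompt to inspect the model's features and training data rather than a verdict on its own.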
The most common objections from banks that are reluctant to use AI for this purpose concern the transparency and explainability of credit decisions. The inherent complexity of some AI models, especially deep learning algorithms, can make it difficult to give clear explanations for credit decisions. Fortunately, the transparency and interpretability of AI models have advanced significantly. Techniques such as SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) plots, along with several other advances in explainable AI (XAI), now allow us to understand how a model arrives at specific decisions. This not only builds trust in the credit scoring process but also addresses the common criticism that AI models are "black boxes." Aware of the challenges of alternative data, which often comes in semi-structured or unstructured form, financial institutions work with MongoDB to improve their credit application processes with a faster, simpler, and more flexible way to make payments and extend credit. Amar Bank, Indonesia's leading digital bank, is fighting bias by providing microloans to people who would not be able to get financial services from traditional banks (the unbanked and underserved). Because traditional underwriting processes were not suited to customers without a credit history or collateral, the bank streamlined lending decisions by using unstructured data. Building on MongoDB Atlas, it developed a predictive analytics model that combines structured and unstructured data to assess borrowers' creditworthiness. 
MongoDB's scalability and ability to handle diverse data types were instrumental in expanding and optimizing the bank's operations. The vast majority of Indians struggle to obtain credit because of strict regulations and a lack of credit data. Through modern underwriting systems, slice, a leading innovator in India's fintech ecosystem, is helping broaden access to credit in India by streamlining its KYC process for a smoother credit experience. By using MongoDB Atlas across different use cases, including as a real-time ML feature store, slice transformed its onboarding process and cut processing times to under a minute. slice uses the real-time feature store with MongoDB and ML models to compute more than 100 variables instantly, so that credit eligibility can be determined in less than 30 seconds. Transforming credit scoring with generative AI Beyond the use of alternative data and AI in credit scoring, GenAI has the potential to revolutionize credit scoring through its ability to create synthetic data and understand complex patterns, offering a more nuanced, adaptive, and predictive approach. Its ability to synthesize diverse datasets addresses one of the main limitations of traditional models: the reliance on historical credit data. By creating synthetic data that mirrors real-world financial behaviors, GenAI models enable a more inclusive assessment of creditworthiness. This shift promotes financial inclusion by opening credit to more people. Adaptability is essential for handling the dynamic nature of economic conditions and changing consumer behaviors. 
Unlike traditional models, which struggle to adapt to unforeseen disruptions, GenAI's ability to learn and adapt continuously keeps credit scoring effective in real time. It thereby provides a more resilient and responsive tool for assessing credit risk. Beyond its predictive power, GenAI can contribute to transparency and interpretability. Models can explain their decisions, which gives a clearer view of credit assessments and strengthens trust among consumers, regulators, and financial institutions.

However, one of the main concerns with GenAI is hallucination, in which the model generates nonsensical or entirely false information. Several techniques exist to mitigate this risk. One of them is retrieval-augmented generation (RAG). By retrieving factual information from up-to-date sources, RAG helps ensure that the model's answers reflect the most recent and accurate information available. For example, Patronus AI uses RAG with MongoDB Atlas to let engineers score and compare LLM performance in real-world scenarios, generate adversarial test cases at scale, and monitor for hallucinations and other unexpected or unsafe behavior. This can help detect LLM errors at scale and deploy AI products safely. Robust Intelligence also works with MongoDB. The company's AI Firewall protects LLMs in production by validating inputs and outputs in real time.
It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injections and the extraction of personally identifiable information (PII).

As generative AI continues to mature, its integration into credit scoring and broader credit application systems promises not just a technological advance but a fundamental transformation in how credit is assessed and granted.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring and marks a pivotal moment for the financial industry. Adopting alternative credit scoring methods, which offer a more inclusive and nuanced assessment, addresses the challenges of traditional models. Despite the risk of hallucination, generative AI sits at the forefront of innovation. It not only revolutionizes technological capabilities but also fundamentally redefines how credit is assessed, fostering a new era of financial inclusion, efficiency, and fairness.

If you would like to learn more about building AI-enriched applications with MongoDB, see the following resources:

Digitizing the lending and leasing experience with MongoDB
Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect
Discover how slice enables credit approval in less than a minute for millions
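The retrieval step behind RAG, as discussed in this article, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Patronus AI or MongoDB Atlas implement it: the policy snippets are invented, and word-count cosine similarity stands in for the embedding vectors and vector index a real system would typically use.

```python
import math
import re
from collections import Counter

# Hypothetical policy snippets standing in for an up-to-date knowledge base.
DOCUMENTS = [
    "Applicants with twelve months of on-time rent payments qualify for a thin-file review.",
    "Credit utilization above eighty percent triggers a manual affordability check.",
    "Self-employed applicants may submit Schedule C filings as proof of income.",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=1):
    """Return the top_k documents most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Build a prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(retrieve(question, documents))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


question = "What proof of income can self-employed applicants submit?"
prompt = build_prompt(question, DOCUMENTS)
```

The prompt would then be passed to an LLM; because the answer is constrained to retrieved text, there is less room for the model to invent facts.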

February 20, 2024

Reducing Bias in Credit Scoring with Generative AI

Credit scoring plays a fundamental role in determining who gets access to credit and on what terms. Despite its importance, however, traditional credit scoring systems have long been plagued by critical problems, from bias and discrimination to limited use of data and scalability challenges. For example, a study of US loans showed that minority borrowers were charged higher interest rates (more than 8% higher) and were denied loans more often (more than 14% more often) than borrowers from more privileged groups. The rigidity of credit systems means they can be slow to adapt to economic change and evolving consumer behavior, leaving some people underserved and overlooked. To overcome this, banks and other lenders are looking to adopt artificial intelligence to build increasingly sophisticated models for scoring credit risk.

In this article, we explore the fundamentals of credit scoring and the challenges of current systems, and look in depth at how artificial intelligence (AI), and generative AI (genAI) in particular, can be used to mitigate bias and improve accuracy. From incorporating alternative data sources to developing machine learning (ML) models, we uncover AI's transformative potential to reshape the future of credit scoring.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is credit scoring?

Credit scoring is an integral part of the financial landscape, serving as a numerical indicator of a person's creditworthiness.
Lenders use this vital metric to assess the potential risk of extending credit or loans to individuals or businesses. Traditionally, banks rely on predefined rules and statistical models, often built with linear or logistic regression. These models draw on historical credit data, focusing on factors such as payment history, credit utilization, and length of credit history. Assessing new credit applicants is a challenge, however, which creates a need for more accurate profiling. To serve unserved and underserved segments that have traditionally faced discrimination, fintechs and digital banks are increasingly supplementing traditional credit history with alternative data to build a more complete view of a person's financial behavior.

Challenges of traditional credit scoring

Credit scores are integral to modern life because they are a crucial determinant in many financial transactions, including obtaining loans, renting an apartment, getting insurance, and sometimes even employment screening. The pursuit of credit can be a labyrinthine journey. Below are some of the challenges and limitations of traditional credit scoring models that often cloud the path to approval.

Limited credit history: Many people, especially those new to credit, run into a major obstacle: a limited or nonexistent credit history. Traditional credit scoring models rely heavily on past credit behavior, which makes it hard for people without a solid credit history to demonstrate their creditworthiness.
Roughly 45 million Americans have no credit score simply because those data points do not exist for them.

Irregular income: Irregular income, typical of part-time work or self-employment, poses a challenge for traditional credit scoring models, which may label such applicants as higher risk, leading to denied applications or restrictive credit limits. For the United States in 2023, data sources differ on how many people are self-employed. One source shows that more than 27 million Americans filed Schedule C tax documents, which cover net income or loss from a business, highlighting the need for different credit scoring methods for the self-employed.

High utilization of existing credit: Heavy reliance on existing credit is often seen as a sign of potential financial strain, which influences credit decisions. Applications may be rejected, or approved on less favorable terms, reflecting concern about the applicant's ability to manage additional credit sensibly.

Lack of clarity about reasons for rejection: Not understanding why an application was rejected prevents applicants from addressing the root causes. In the UK, a study conducted between April 2022 and April 2023 found that the main reasons for rejection included "poor credit history" (38%), "unable to afford the repayments" (28%), and "too much other credit" (19%), while 10% said they were not told why. Even when reasons are given, they are often too vague, leaving applicants without the insight they need to address the root cause and improve their creditworthiness for future applications.
The lack of transparency is not only a problem for customers; it can also result in penalties for banks. For example, a Berlin bank was fined €300,000 in 2023 for a lack of transparency when rejecting a credit card application.

Lack of flexibility: Changes in consumer behavior, especially among younger generations who prefer digital transactions, challenge traditional models. Factors such as the rise of the gig economy, non-traditional employment, student loan debt, and high living costs complicate the assessment of income stability and financial health. Traditional credit risk predictions fall short during unprecedented disruptions such as the COVID-19 pandemic, which scoring models do not take into account.

Recognizing these challenges highlights the need for alternative credit scoring models that can adapt to evolving financial behavior, handle non-traditional data sources, and provide a more inclusive and accurate assessment of creditworthiness in today's dynamic financial landscape.

Credit scoring with alternative data

Alternative credit scoring refers to the use of non-traditional data sources (also known as alternative data) and methods to assess a person's creditworthiness. While traditional credit scoring relies heavily on credit history from the major credit bureaus, alternative credit scoring incorporates a wider range of factors to build a more complete picture of a person's financial behavior.
Below are some of the most popular alternative data sources:

Utility payments: Beyond credit history, consistent payments for utilities such as electricity and water are a powerful indicator of financial responsibility and show a commitment to meeting financial obligations, providing crucial insight beyond traditional metrics.

Rental history: For people without a mortgage, rent payment history is a key alternative data source. A record of consistent, on-time rent payments gives a rounded picture of financial discipline and reliability.

Mobile phone usage patterns: The ubiquity of mobile phones unlocks a wealth of alternative data. Analyzing call and text patterns offers insight into a person's network, stability, and social connections, contributing valuable information to the credit assessment.

Online shopping behavior: Examining purchase frequency, the type of purchases, and the amount spent online offers valuable insight into spending behavior, contributing to a more nuanced understanding of financial habits.

Educational and employment background: Alternative credit scoring takes a person's educational and employment history into account. Positive indicators, such as educational attainment and stable employment, play a crucial role in assessing financial stability.

These alternative data sources represent a shift toward a more inclusive, nuanced, and holistic approach to credit assessment.
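As an illustration of how payment records like the rent and utility histories listed above might be turned into model inputs, here is a minimal sketch. The `Payment` record, its fields, and the feature names are hypothetical and do not describe any particular lender's schema.

```python
from dataclasses import dataclass


@dataclass
class Payment:
    kind: str       # hypothetical category, e.g. "rent" or "utility"
    days_late: int  # 0 means paid on or before the due date


def on_time_ratio(payments, kind):
    """Share of payments of the given kind that were made on time."""
    relevant = [p for p in payments if p.kind == kind]
    if not relevant:
        return None  # no history of this kind, which differs from a poor history
    return sum(1 for p in relevant if p.days_late == 0) / len(relevant)


def alternative_features(payments):
    """Turn raw payment records into features a scoring model could consume."""
    return {
        "rent_on_time_ratio": on_time_ratio(payments, "rent"),
        "utility_on_time_ratio": on_time_ratio(payments, "utility"),
    }


history = [
    Payment("rent", 0),
    Payment("rent", 0),
    Payment("rent", 5),
    Payment("rent", 0),
    Payment("utility", 0),
    Payment("utility", 0),
]
features = alternative_features(history)
```

Returning `None` for a missing history, rather than zero, keeps "no data" distinct from "bad data", which matters for applicants with thin credit files.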
As financial technology continues to advance, drawing on these alternative datasets allows a more complete assessment of creditworthiness, marking a transformative step in the evolution of credit scoring models.

Alternative credit scoring with artificial intelligence

Beyond the use of alternative data, AI has emerged as a transformative alternative method for addressing the challenges of traditional credit scoring, for several reasons:

Ability to mitigate bias: Like traditional statistical models, AI models, including LLMs, that are trained on biased historical data will inherit the biases in that data, leading to discriminatory outcomes. LLMs may focus on some features more than others, or may be unable to grasp the broader context of a person's financial situation, which leads to biased decisions. However, several techniques can mitigate bias in AI models:

Mitigation strategies: Efforts begin with diverse and representative training data to avoid reinforcing existing biases. Inadequate or ineffective mitigation strategies can leave biased outcomes in place in AI credit scoring models. Close attention to data collection and model development is crucial to mitigating this bias. Incorporating alternative data into credit scoring plays a fundamental role in reducing bias.

Rigorous bias detection tools, fairness constraints, and regularization techniques during training improve model accountability: Balancing feature representation and applying post-processing techniques and specialized algorithms contribute to bias mitigation.
Inclusive model evaluation, continuous monitoring, and iterative improvement, together with adherence to ethical guidelines and governance practices, complete a multifaceted approach to reducing bias in AI models. This is particularly significant for addressing concerns about demographic or socioeconomic biases that may be present in historical credit data.

Regular bias audits: Conduct regular audits to identify and mitigate bias in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly.

Transparency and explainability: Increase transparency and explainability in LLMs to understand how decisions are made. This can help identify and address biased decision-making. Trade Ledger, a lending software as a service (SaaS) tool, takes a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source.

Ability to analyze large and diverse datasets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a vast range of information, including non-traditional data sources, to build a more complete assessment of a person's creditworthiness, so that a wider range of financial behavior is taken into account.

AI brings unprecedented adaptability: As economic conditions change and consumer behavior evolves, AI-driven models can adjust quickly and learn from new data.
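The regular bias audit described above can start with something as simple as comparing approval rates across groups. The sketch below computes per-group approval rates and the ratio between the lowest and highest rate. The group labels and decisions are invented for illustration, and this single metric is only one of many possible fairness checks.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Approval rate per group, given (group, approved) pairs."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(decisions):
    """Lowest group approval rate divided by the highest one (1.0 means parity)."""
    rates = approval_rates(decisions)
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


# Hypothetical model decisions for two demographic groups.
decisions = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
rates = approval_rates(decisions)
ratio = disparate_impact_ratio(decisions)
```

A ratio well below 1.0, as in this made-up sample, would be a signal to investigate the model and its training data further; what threshold counts as acceptable is a policy and regulatory question rather than something the code can decide.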
This continuous learning keeps credit scoring relevant and effective as the financial landscape changes.

Banks' most common objections to using AI in credit scoring concern the transparency and explainability of credit decisions. The inherent complexity of some AI models, deep learning algorithms in particular, can make it hard to give clear explanations for credit decisions. Fortunately, the transparency and interpretability of AI models have advanced significantly. Techniques such as SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) plots, along with other advances in explainable AI (XAI), now let us understand how a model arrives at specific credit decisions. This not only improves trust in the credit scoring process but also addresses the common criticism that AI models are "black boxes."

Understanding the importance of alternative data, which often comes in a semi-structured or unstructured format, financial institutions work with MongoDB to improve their credit application processes with a faster, simpler, and more flexible way to make payments and offer credit:

Amar Bank, Indonesia's leading digital bank, is combating bias by providing microloans to people who could not otherwise obtain financial services from traditional banks (the unbanked and underserved). Traditional underwriting processes were inadequate for customers without credit history or collateral, so the bank streamlined its lending decisions by drawing on unstructured data.
With MongoDB Atlas, the bank developed a predictive analytics model that combines structured and unstructured data to assess borrowers' creditworthiness. MongoDB's scalability and its ability to manage diverse data types were instrumental in expanding and optimizing its lending operations.

For the vast majority of Indians, obtaining credit is often a challenge because of strict regulations and a lack of credit data. Using modern underwriting systems, Slice, a leading innovator in India's fintech ecosystem, is helping to broaden access to credit in India by simplifying its KYC process for a smoother credit experience. By using MongoDB Atlas across several use cases, including as a real-time ML feature store, slice transformed its onboarding process, cutting processing times to under a minute. slice uses the real-time feature store with MongoDB and ML models to compute more than 100 variables instantly, which allows credit eligibility to be determined in under 30 seconds.

Transforming credit scoring with generative AI

Beyond the use of alternative data and AI in credit scoring, generative AI has the potential to revolutionize credit scoring and assessment through its ability to create synthetic data and understand intricate patterns, offering a more nuanced, adaptive, and predictive approach.

Generative AI's ability to synthesize diverse datasets addresses one of the key limitations of traditional credit scoring: reliance on historical credit data. By creating synthetic data that mirrors real-world financial behavior, generative AI models enable a more inclusive assessment of creditworthiness.
This transformative shift promotes financial inclusion and opens the door to credit for a broader demographic.

Adaptability plays a crucial role in navigating changing economic conditions and shifting consumer behavior. Unlike traditional models, which struggle to adapt to unforeseen disruptions, generative AI's ability to learn and adapt continuously keeps credit scoring effective in real time, offering a more resilient and responsive tool for assessing credit risk.

Beyond its predictive capability, generative AI can contribute to transparency and interpretability in credit scoring. Models can generate explanations for their decisions, providing clearer insight into credit assessments and improving trust among consumers, regulators, and financial institutions.

However, a key concern when using generative AI is hallucination, where the model presents information that is nonsensical or entirely false. Several techniques exist to mitigate this risk, and one of them is retrieval-augmented generation (RAG). RAG reduces hallucinations by grounding the model's answers in factual information from up-to-date sources, which helps ensure that the answers reflect the most current and accurate information available. For example, Patronus AI uses RAG with MongoDB Atlas to let engineers score and compare the performance of large language models (LLMs) in real-world scenarios, generate adversarial test cases at scale, and monitor for hallucinations and other unexpected or unsafe behavior.
This can help detect LLM errors at scale and deploy AI products safely and reliably. Another MongoDB technology partner is Robust Intelligence. The company's AI Firewall protects LLMs in production by validating inputs and outputs in real time. It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injections and the extraction of personally identifiable information (PII).

As generative AI continues to mature, its integration into credit scoring and broader credit application systems promises not just a technological advance but a fundamental transformation in how we assess and extend credit.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring, marking a pivotal moment for the financial industry. The challenges of traditional models are being overcome through the adoption of alternative credit scoring methods, which offer a more inclusive and nuanced assessment. Generative AI, while it brings the potential challenge of hallucination, sits at the forefront of innovation, not only revolutionizing technological capabilities but fundamentally redefining how credit is assessed and fostering a new era of financial inclusion, efficiency, and fairness.
If you would like to learn more about building AI-enriched applications with MongoDB, take a look at the following resources:

Digitizing the lending and leasing experience with MongoDB
Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect
Discover how slice enables credit approval in less than a minute for millions

February 20, 2024

Reducing Bias in Credit Scoring with Generative AI

Credit scoring plays a decisive role in determining who gets access to credit and on what terms. Despite its importance, traditional credit scoring systems have long been plagued by a range of critical problems, from bias and discrimination to limited consideration of data and scalability issues. For instance, a study of US loans showed that minority borrowers were charged higher interest rates (+8%) and were rejected for loans more often (+14%) than borrowers from more privileged groups. The rigid nature of credit systems means they adapt only slowly to changing economic conditions and consumer behavior, leaving some people underserved and overlooked. To solve this problem, banks and other lenders are turning to artificial intelligence to develop increasingly sophisticated models for assessing credit risk.

In this article, we look at the fundamentals of credit scoring, the challenges of current systems, and how artificial intelligence (AI), and generative AI (genAI) in particular, can be used to reduce bias and improve accuracy. From incorporating alternative data sources to developing machine learning (ML) models, we uncover AI's transformative potential to reshape the future of credit scoring.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is credit scoring?

Credit scoring is an integral part of the financial landscape and serves as a numerical measure of a person's creditworthiness.
Lenders use this important metric to assess the potential risk of extending credit or loans to individuals or businesses. Traditionally, banks rely on predefined rules and statistical models, often based on linear or logistic regression. These models draw on historical credit data and focus on factors such as payment history, credit utilization, and length of credit history. Assessing new credit applicants is a challenge, however, so more accurate profiling is needed. To serve underserved and unserved segments that have traditionally faced discrimination, fintechs and digital banks are increasingly supplementing traditional credit history with alternative data to gain a fuller view of a person's financial behavior.

Challenges of traditional credit scoring

Credit scores are an indispensable part of modern life because they are a decisive factor in many financial transactions, such as securing loans, renting an apartment, taking out insurance, and sometimes even employment screening. Because the pursuit of credit can be a convoluted affair, here are some of the challenges and limitations of traditional credit scoring models that often block the path to approval.

Limited credit history: Many people, especially those new to credit, run into a major hurdle: a limited or nonexistent credit history. Traditional credit scoring models rely heavily on past credit behavior, so it is difficult for people without a solid credit history to demonstrate their creditworthiness.
About 45 million Americans have no credit score because this data does not exist for them.

Irregular income: Irregular income, typically from part-time or freelance work, poses a challenge for traditional credit scoring models and can lead to people being classified as higher risk, with applications denied or credit limits restricted. For the United States in 2023, data sources differ on how many people are self-employed. One source indicates that more than 27 million Americans filed Schedule C tax documents, which cover net income or loss from a business, underscoring the need for different credit scoring methods for the self-employed.

High utilization of existing credit: Heavy use of existing credit is often perceived as a sign of potential financial strain and influences credit decisions. Applications may be rejected, or approved on less favorable terms, because of concerns about whether the applicant can manage additional credit sensibly.

Lack of clarity about reasons for rejection: Not understanding the reasons for a rejection prevents applicants from addressing the underlying causes. A UK study conducted between April 2022 and April 2023 found that the main reasons for rejection included "poor credit history" (38%), "couldn't afford the repayments" (28%), and "too much other credit" (19%). Another 10% said they were not told the reasons.
Even when reasons are given, they are often too vague, leaving applicants in the dark and making it harder for them to fix the cause and improve their creditworthiness for future applications. The lack of transparency is not only a nuisance for customers; it can also lead to penalties for banks. For example, a Berlin bank was fined €300,000 in 2023 for a lack of transparency when rejecting a credit card application.

Lack of flexibility: Changing consumer behavior, especially among younger generations who prefer digital transactions, challenges traditional models. Factors such as the rise of the gig economy, non-traditional employment, student loan debt, and high living costs complicate the assessment of income stability and financial health. Traditional credit risk predictions are of limited use during unprecedented disruptions such as COVID-19, because scoring models do not account for them.

These challenges underscore the need for alternative credit scoring models that adapt to changing financial behavior, can handle non-traditional data sources, and provide a more comprehensive and accurate assessment of creditworthiness in today's dynamic financial landscape.

Credit scoring with alternative data

Alternative credit scoring refers to the use of non-traditional data sources (also known as alternative data) and methods to assess a person's creditworthiness. While traditional credit scoring relies heavily on credit history from the major credit bureaus, alternative credit scoring draws on a broader range of factors to build a fuller picture of a person's financial behavior.
Below are some of the most popular alternative data sources:

Utility payments: Beyond credit history, consistent payments for utilities such as electricity and water are a strong indicator of financial responsibility and show a commitment to meeting financial obligations, providing insight beyond traditional metrics.

Rental history: For those without a mortgage, rent payment history is an important alternative data source. A record of consistent, on-time rent payments paints a comprehensive picture of financial discipline and reliability.

Mobile phone usage patterns: The ubiquity of mobile phones unlocks a wealth of alternative data. Analyzing call and text patterns offers insight into an individual's network, stability, and social connections, contributing valuable information for credit assessments.

Online shopping behavior: Examining the frequency, type, and amount of online purchases offers insight into spending behavior and contributes to a more nuanced understanding of financial habits.

Educational and employment background: Alternative credit scoring considers an individual's educational and employment history. Positive indicators, such as educational attainment and stable employment, play a crucial role in assessing financial stability.

These alternative data sources represent a shift toward a more inclusive, nuanced, and holistic approach to credit assessment.
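As a rough illustration of how alternative data sources like those listed above might feed a score, the following sketch combines a few such signals in a small logistic model. The feature names, weights, and bias term are all invented for illustration and are not taken from any real lender; a production model would learn its parameters from data and would need the bias checks discussed elsewhere in this article.

```python
import math
from dataclasses import dataclass


@dataclass
class AlternativeData:
    """Hypothetical alternative-data features for one applicant."""
    on_time_utility_ratio: float   # share of utility bills paid on time, 0..1
    on_time_rent_ratio: float      # share of rent payments made on time, 0..1
    years_in_current_job: float    # rough proxy for employment stability


# Illustrative weights only; a real model would learn these from data.
WEIGHTS = {
    "on_time_utility_ratio": 2.0,
    "on_time_rent_ratio": 2.5,
    "years_in_current_job": 0.3,
}
BIAS = -3.0


def repayment_probability(applicant: AlternativeData) -> float:
    """Logistic model: sigmoid of a weighted sum of the features."""
    z = BIAS + sum(w * getattr(applicant, name) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


reliable = AlternativeData(0.98, 1.0, 4.0)
irregular = AlternativeData(0.60, 0.70, 0.5)
print(f"reliable payer:  {repayment_probability(reliable):.2f}")
print(f"irregular payer: {repayment_probability(irregular):.2f}")
```

The point of the sketch is only that an applicant with no bureau history can still produce usable signals; which signals are appropriate, and how much weight they deserve, is a modeling and fairness question the code does not answer.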
As financial technology continues to advance, using these alternative data sets enables a more comprehensive assessment of creditworthiness and marks a transformative step in the evolution of credit scoring models.

Alternative credit scoring with artificial intelligence

Besides the use of alternative data, AI as an alternative method has emerged as a transformative force for addressing the challenges of traditional credit scoring, for several reasons:

Ability to mitigate bias: Like traditional statistical models, AI models, including large language models (LLMs), that are trained on biased historical data will inherit the biases in that data, leading to discriminatory outcomes. LLMs may focus more on certain features than others, or may be unable to understand the broader context of an individual's financial situation, leading to biased decisions. However, there are various techniques to mitigate the bias of AI models:

Mitigation strategies: Initiatives begin with diverse and representative training data to avoid reinforcing existing biases. Inadequate or ineffective mitigation strategies can result in biased outcomes persisting in AI credit scoring models. Careful attention to the data collected and to model development is crucial to mitigating this bias. Incorporating alternative data for credit scoring plays a critical role in reducing bias.
Rigorous bias detection tools, fairness constraints, and regularization techniques during training strengthen model accountability: Balancing feature representation and employing post-processing techniques and specialized algorithms contribute to bias mitigation. Comprehensive model evaluation, continuous monitoring, and iterative improvement, together with adherence to ethical guidelines and governance practices, complete a multifaceted approach to reducing bias in AI models. This is particularly important for addressing concerns about demographic or socioeconomic biases that may be present in historical credit data.

Regular bias audits: Conduct regular audits to identify and mitigate biases in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly.

Transparency and explainability: Increase transparency and explainability in LLMs to understand how decisions are made. This can help identify and address biased decision-making processes. Trade Ledger, a lending software-as-a-service (SaaS) tool, uses a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source.

Ability to analyze vast and diverse data sets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a wide range of information, including non-traditional data sources, to build a more comprehensive assessment of an individual's creditworthiness, ensuring that a broader range of financial behaviors is considered.
AI brings unparalleled adaptability: As economic conditions change and consumer behaviors evolve, AI-powered models can quickly adjust and learn from new data. This continuous learning keeps credit scoring relevant and effective as the financial landscape changes.

The most common objections from banks to using AI in credit scoring concern transparency and explainability in credit decisions. The inherent complexity of some AI models, especially deep learning algorithms, can make it hard to provide clear explanations for credit decisions. Fortunately, the transparency and interpretability of AI models have improved significantly. Techniques such as SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) plots, along with other advances in explainable AI (XAI), now make it possible to understand how a model arrives at specific credit decisions. This not only builds trust in the credit scoring process but also addresses the common criticism that AI models are "black boxes."

Financial institutions recognize the importance of alternative data, which often comes in a semi-structured or unstructured format. They work with MongoDB to improve their credit application processes with a faster, simpler, and more flexible way to make payments and offer credit:

Amar Bank, Indonesia's leading digital bank, is fighting bias by providing microloans to people who cannot obtain financial services from traditional banks (the unbanked and underserved).
Traditional underwriting processes were inadequate for customers lacking credit history or collateral, so the bank streamlined its lending decisions by using unstructured data. Using MongoDB Atlas, it developed a predictive analytics model that integrates structured and unstructured data to assess borrower creditworthiness. MongoDB's scalability and ability to manage diverse data types were instrumental in expanding and optimizing its lending operations.

For the vast majority of Indians, getting credit is difficult because of stringent regulations and a lack of credit data. Through the use of modern underwriting systems, Slice, a leading innovator in India's fintech ecosystem, is helping broaden access to credit in India by streamlining the KYC process for a smoother credit experience. By using MongoDB Atlas across different use cases, including as a real-time ML feature store, Slice transformed its onboarding process and cut processing times to under a minute. Slice uses the real-time feature store with MongoDB and ML models to compute more than 100 variables instantly, enabling credit eligibility to be determined in less than 30 seconds.

Transforming credit scoring with generative AI

Besides the use of alternative data and AI in credit scoring, GenAI has the potential to revolutionize credit scoring and assessment through its ability to create synthetic data and understand intricate patterns, offering a more nuanced, adaptive, and predictive approach. GenAI's ability to synthesize diverse data sets addresses one of the key limitations of traditional credit scoring: the reliance on historical credit data.
By creating synthetic data that mirrors real-world financial behavior, GenAI models enable a more inclusive assessment of creditworthiness. This shift promotes financial inclusivity and opens doors for a broader demographic to access credit.

Adaptability plays a crucial role in navigating the dynamic nature of economic conditions and changing consumer behavior. Unlike traditional models, which struggle to adjust to unforeseen disruptions, GenAI's ability to continuously learn and adapt keeps credit scoring effective in real time, offering a more resilient and responsive tool for assessing credit risk. In addition to its predictive capabilities, GenAI can contribute to transparency and interpretability in credit scoring. Models can generate explanations for their decisions, giving clearer insight into credit assessments and strengthening trust among consumers, regulators, and financial institutions.

One key concern with GenAI, however, is hallucination, where the model presents information that is either nonsensical or outright false. There are several techniques to mitigate this risk. One is the retrieval-augmented generation (RAG) approach. RAG minimizes hallucinations by grounding the model's responses in factual information from up-to-date sources, so that responses reflect the most current and accurate information available.
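To make the grounding step of RAG concrete, here is a minimal sketch of the retrieval and prompt-building half of such a pipeline. Everything in it is an assumption made for illustration: the policy snippets, the function names, and the use of simple word-count cosine similarity. A production system would typically use learned embeddings and a vector index instead, and would send the resulting prompt to an LLM, which this sketch does not do.

```python
import math
import re
from collections import Counter

# Hypothetical policy snippets standing in for an up-to-date knowledge base.
DOCUMENTS = [
    "Applicants with fewer than six months of rent payment history need a manual review.",
    "Credit utilization above 80 percent of existing limits lowers the offered credit limit.",
    "Utility payments made on time for twelve months count as positive repayment evidence.",
]


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words representation of a text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str) -> str:
    """Ground the question in retrieved context before it goes to a model."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


print(build_prompt("How does high credit utilization affect the credit limit?"))
```

The instruction to answer only from the supplied context is what ties the model's output to retrievable facts; it reduces, but does not eliminate, the risk of hallucination.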
Patronus AI, for example, uses RAG with MongoDB Atlas to enable engineers to score and benchmark large language model (LLM) performance on real-world scenarios, generate adversarial test cases at scale, and monitor hallucinations and other unexpected and unsafe behavior. This can help detect LLM mistakes at scale and deploy AI products safely and with confidence.

Another MongoDB technology partner is Robust Intelligence. The company's AI Firewall protects LLMs in production by validating inputs and outputs in real time. It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injections and extraction of personally identifiable information (PII).

As generative AI continues to mature, its integration into credit scoring and broader credit application systems promises not just a technological advance but a fundamental transformation in how we evaluate and extend credit.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring and marks a pivotal moment for the financial industry. The challenges of traditional models are being overcome through alternative credit scoring methods that offer a more inclusive and nuanced assessment. Generative AI, while it introduces the potential challenge of hallucination, is at the forefront of innovation. It not only revolutionizes technological capabilities but also fundamentally redefines how credit is evaluated, fostering a new era of financial inclusivity, efficiency, and fairness.
If you would like to learn more about building AI-enriched applications with MongoDB, take a look at the following resources:

Digitizing the lending and leasing experience with MongoDB
Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect
Discover how Slice enables credit approval in less than a minute for millions

February 20, 2024

Reducing Bias in Credit Scoring with Generative AI

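Credit scoring plays a pivotal role in determining who gets access to credit and on what terms. Despite its importance, however, traditional credit scoring systems have long been plagued by a series of critical issues, from bias and discrimination to limited data consideration and scalability challenges. For example, a study of US loans showed that minority borrowers were charged higher interest rates (+8%) and were rejected for loans more often (+14%) than borrowers from more privileged groups. Rigid credit systems are slow to adapt to changing economic conditions and consumer behavior, leaving some individuals underserved and overlooked. To overcome this, banks and other lenders are looking to adopt artificial intelligence to develop increasingly sophisticated models for scoring credit risk. In this article, we explore the fundamentals of credit scoring, the challenges current systems present, and how artificial intelligence (AI), in particular generative AI (genAI), can be used to mitigate bias and improve accuracy. From the incorporation of alternative data sources to the development of machine learning (ML) models, we examine the transformative potential of AI in reshaping the future of credit scoring. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is credit scoring?

Credit scoring is an integral part of the financial landscape, serving as a numerical gauge of an individual's creditworthiness. Lenders use this metric to evaluate the potential risk of extending credit or loans to individuals or businesses. Traditionally, banks rely on predefined rules and statistical models, often built using linear regression or logistic regression. These models are based on historical credit data and focus on factors such as payment history, credit utilization, and length of credit history. However, assessing new credit applicants is a challenge and calls for more accurate profiling. To serve underserved or unserved groups that have traditionally been discriminated against, fintechs and digital banks are increasingly combining information beyond traditional credit history with alternative data to build a more comprehensive view of an individual's financial behavior.

Challenges with traditional credit scoring

Credit scores are integral to modern life because they serve as a crucial determinant in various financial transactions, including securing loans, renting an apartment, obtaining insurance, and sometimes even employment screening. The pursuit of credit can be a labyrinthine journey, and traditional credit scoring models have several challenges or limitations that often stand in the way of credit approval.

Limited credit history: Many individuals, especially those new to credit, encounter a significant hurdle: limited or non-existent credit history. Traditional credit scoring models rely heavily on past credit behavior, making it difficult for individuals without a solid credit history to prove their creditworthiness. Around 45 million Americans lack a credit score simply because those data points do not exist for them.

Inconsistent income: Irregular income, common in part-time work or freelancing, poses a challenge for traditional credit scoring models, potentially labeling individuals as higher risk and leading to application denials or restricted credit limits. Data sources differ on how many people in the United States were self-employed in 2023. One source shows that more than 27 million Americans filed Schedule C tax documents, which cover net income or loss from a business, highlighting the need for different credit scoring methods for the self-employed.

High utilization of existing credit: Heavy reliance on existing credit is often perceived as a signal of potential financial strain, influencing credit decisions. Credit applications may be rejected or approved on less favorable terms, reflecting concerns about the applicant's ability to manage additional credit responsibly.

Lack of clarity in rejection reasons: Applicants who do not understand why they were rejected cannot address the root causes. In the UK, a study covering April 2022 to April 2023 showed that the main reasons for rejection included "poor credit history" (38%), "couldn't afford the repayments" (28%), and "having too much other credit" (19%), while 10% said they were not told why. The reasons, even when given, are often too vague, leaving applicants in the dark and making it hard to address the root cause and improve their creditworthiness for future applications. The lack of transparency is not only a problem for customers; it can also lead to penalties for banks. For example, a Berlin bank was fined €300,000 in 2023 for lacking transparency when declining a credit card application.

Lack of flexibility: Shifts in consumer behavior, especially the preference of younger generations for digital transactions, challenge traditional models. Factors such as the rise of the gig economy, non-traditional employment, student loan debt, and high living costs complicate the assessment of income stability and financial health. Traditional credit risk predictions are of limited use during unprecedented disruptions such as COVID-19, because scoring models do not account for such events.

Recognizing these challenges highlights the need for alternative credit scoring models that can adapt to evolving financial behavior, handle non-traditional data sources, and provide a more inclusive and accurate assessment of creditworthiness in today's dynamic financial landscape.

Credit scoring with alternative data

Alternative credit scoring refers to the use of non-traditional data sources (also known as alternative data) and methods to assess an individual's creditworthiness. While traditional credit scoring relies heavily on credit history from the major credit bureaus, alternative credit scoring incorporates a broader range of factors to give a fuller picture of an individual's financial behavior. Below are some commonly used alternative data sources:

Utility payments: Beyond credit history, consistent payments for utilities such as electricity and water are a strong indicator of financial responsibility and show a commitment to meeting financial obligations, providing insight beyond traditional metrics.

Rental history: For those without a mortgage, rent payment history is an important alternative data source. A record of consistent, on-time rent payments gives a comprehensive picture of financial discipline and reliability.

Mobile phone usage patterns: The ubiquity of mobile phones unlocks a wealth of alternative data. Analyzing call and text patterns offers insight into an individual's network, stability, and social connections, contributing valuable information for credit assessments.

Online shopping behavior: Examining the frequency, type, and amount of online purchases offers insight into spending behavior and contributes to a more nuanced understanding of financial habits.

Educational and employment background: Alternative credit scoring considers an individual's educational and employment history. Positive indicators, such as educational attainment and stable employment, play a crucial role in assessing financial stability.

These alternative data sources represent a shift toward a more inclusive, nuanced, and holistic approach to credit assessment. As financial technology continues to advance, using these alternative data sets enables a more comprehensive assessment of creditworthiness and marks a transformative step in the evolution of credit scoring models.

Alternative credit scoring with artificial intelligence

Besides the use of alternative data, AI as an alternative method has emerged as a transformative force for addressing the challenges of traditional credit scoring, for several reasons:

Ability to mitigate bias: Like traditional statistical models, AI models, including large language models (LLMs), that are trained on biased historical data will inherit the biases in that data, leading to discriminatory outcomes. LLMs may focus more on certain features than others, or may be unable to understand the broader context of an individual's financial situation, leading to biased decisions. However, there are various techniques to mitigate the bias of AI models:

Mitigation strategies: Start with diverse and representative training data to avoid reinforcing existing biases. Inadequate or ineffective mitigation strategies can result in biased outcomes persisting in AI credit scoring models. Careful attention to the data collected and to model development is crucial to mitigating this bias. Incorporating alternative data for credit scoring plays a critical role in reducing bias.

Rigorous bias detection tools, fairness constraints, and regularization techniques during training strengthen model accountability: Balancing feature representation and employing post-processing techniques and specialized algorithms contribute to bias mitigation. Comprehensive model evaluation, continuous monitoring, and iterative improvement, together with adherence to ethical guidelines and governance practices, reduce bias in AI models on several levels. This is particularly important for addressing concerns about demographic or socioeconomic biases that may be present in historical credit data.

Regular bias audits: Conduct regular audits to identify and mitigate biases in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly.

Transparency and explainability: Increase transparency and explainability in LLMs to understand how decisions are made. This can help identify and address biased decision-making processes. Trade Ledger, a lending software-as-a-service (SaaS) tool, uses a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source.

Ability to analyze vast and diverse data sets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a wide range of information, including non-traditional data sources, to build a more comprehensive assessment of an individual's creditworthiness, ensuring that a broader range of financial behaviors is considered.

AI brings unparalleled adaptability: As economic conditions change and consumer behaviors evolve, AI-powered models can quickly adjust and learn from new data. Continuous learning keeps credit scoring relevant and effective in a fast-changing financial landscape.

The most common objections from banks to using AI in credit scoring concern transparency and explainability in credit decisions. The inherent complexity of some AI models, especially deep learning algorithms, can make it hard to provide clear explanations for credit decisions. Fortunately, the transparency and interpretability of AI models have improved significantly. Techniques such as SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) plots, along with other advances in explainable AI (XAI), now make it possible to understand how a model arrives at specific credit decisions. This not only builds trust in the credit scoring process but also addresses the common criticism that AI models are "black boxes."

Understanding the importance of alternative data, which often comes in a semi-structured or unstructured format, financial institutions work with MongoDB to improve their credit application processes with a faster, simpler, and more flexible way to make payments and offer credit:

Amar Bank, a leading digital bank in Indonesia, is fighting bias by providing microloans to people who cannot obtain financial services from traditional banks (the unbanked and underserved). Because traditional underwriting processes were inadequate for customers lacking credit history or collateral, the bank streamlined its lending decisions by using unstructured data. Using MongoDB Atlas, it developed a predictive analytics model that integrates structured and unstructured data to assess borrower creditworthiness. MongoDB's scalability and ability to manage diverse data types helped the bank expand and optimize its lending operations.

For the vast majority of Indians, getting approved for credit is usually difficult because of stringent regulations and a lack of credit data. Through the use of modern underwriting systems, Slice, a leading innovator in India's fintech ecosystem, is streamlining its KYC process to provide a smoother credit experience and broaden access to credit in India. By using MongoDB Atlas across different use cases, including as a real-time ML feature store, Slice transformed its onboarding process and cut processing times to under a minute. Slice uses the real-time feature store with MongoDB and ML models to compute more than 100 variables instantly, enabling credit eligibility to be determined in less than 30 seconds.

Transforming credit scoring with generative AI

Besides the use of alternative data and AI in credit scoring, GenAI has the potential to revolutionize credit scoring and assessment through its ability to create synthetic data and understand intricate patterns, offering a more nuanced, adaptive, and predictive approach. GenAI's ability to synthesize diverse data sets addresses one of the key limitations of traditional credit scoring: the reliance on historical credit data. By creating synthetic data that mirrors real-world financial behavior, GenAI models enable a more inclusive assessment of creditworthiness. This shift promotes financial inclusivity and opens doors for a broader demographic to access credit.

Adaptability plays a pivotal role in navigating dynamic economic conditions and changing consumer behavior. Unlike traditional models, which struggle to adjust to unforeseen disruptions, GenAI's ability to continuously learn and adapt keeps credit scoring effective in real time, offering a more resilient and responsive tool for assessing credit risk. In addition to its predictive capabilities, GenAI can improve transparency and interpretability in credit scoring. Models can generate explanations for their decisions, giving clearer insight into credit assessments and strengthening trust among consumers, regulators, and financial institutions.

One key concern with GenAI, however, is hallucination, where the model presents information that is either nonsensical or outright false. There are several techniques to mitigate this risk, and one of them is the retrieval-augmented generation (RAG) approach. RAG minimizes hallucinations by grounding the model's responses in factual information from up-to-date sources, so that responses reflect the most current and accurate information available.

Patronus AI, for example, uses RAG with MongoDB Atlas to enable engineers to score and benchmark large language model (LLM) performance on real-world scenarios, generate adversarial test cases at scale, and monitor hallucinations and other unexpected and unsafe behavior. This can help detect LLM mistakes at scale and deploy AI products safely and with confidence.

Another MongoDB technology partner is Robust Intelligence. The company's AI Firewall protects LLMs in production by validating inputs and outputs in real time. It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injections and extraction of personally identifiable information (PII).

As generative AI continues to mature, its integration into credit scoring and broader credit application systems promises not just a technological advance but a fundamental transformation in how we evaluate and extend credit.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring and marks a pivotal moment for the financial industry. The challenges of traditional models are being overcome through alternative credit scoring methods that offer a more inclusive and nuanced assessment. Generative AI, while it introduces the potential challenge of hallucination, is at the forefront of innovation. It not only revolutionizes technological capabilities but also fundamentally redefines how credit is evaluated, fostering a new era of financial inclusivity, efficiency, and fairness.

If you would like to learn more about building AI-enriched applications with MongoDB, take a look at the following resources:

Digitizing the lending and leasing experience with MongoDB
Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect
Discover how Slice enables credit approval in less than a minute for millions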

February 20, 2024

Reducing Bias in Credit Scoring with Generative AI

Credit scoring plays a pivotal role in determining who gets access to credit and on what terms. Despite its importance, however, traditional credit scoring systems have long been plagued by a series of critical issues, from bias and discrimination to limited data consideration and scalability challenges. For example, a study of US loans showed that minority borrowers were charged higher interest rates (+8%) and were rejected for loans more often (+14%) than borrowers from more privileged groups. The rigid nature of credit systems means they can be slow to adapt to changing economic landscapes and evolving consumer behavior, leaving some individuals underserved and overlooked. To overcome this, banks and other lenders are looking to adopt artificial intelligence to develop increasingly sophisticated models for scoring credit risk. In this article, we explore the fundamentals of credit scoring, the challenges current systems present, and how artificial intelligence (AI), in particular generative AI (genAI), can be used to mitigate bias and improve accuracy. From the incorporation of alternative data sources to the development of machine learning (ML) models, we examine the transformative potential of AI in reshaping the future of credit scoring. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is credit scoring?

Credit scoring is an integral part of the financial landscape, serving as a numerical gauge of an individual's creditworthiness. Lenders use this metric to evaluate the potential risk of extending credit or loans to individuals or businesses.
Traditionally, banks rely on predefined rules and statistical models, often built using linear regression or logistic regression. These models are based on historical credit data and focus on factors such as payment history, credit utilization, and length of credit history. However, assessing new credit applicants is a challenge and calls for more accurate profiling. To serve underserved or unserved segments that have traditionally been discriminated against, fintechs and digital banks are increasingly combining information beyond traditional credit history with alternative data to build a more comprehensive view of an individual's financial behavior.

Challenges with traditional credit scoring

Credit scores are integral to modern life because they serve as a crucial determinant in various financial transactions, including securing loans, renting an apartment, obtaining insurance, and sometimes even employment screening. Because the pursuit of credit can be a labyrinthine journey, here are some of the challenges or limitations of traditional credit scoring models that often cloud the path to credit application approval.

Limited credit history: Many individuals, especially those new to credit, encounter a significant hurdle: limited or non-existent credit history. Traditional credit scoring models rely heavily on past credit behavior, making it difficult for individuals without a solid credit history to prove their creditworthiness. Around 45 million Americans lack a credit score simply because those data points do not exist for them.
Inconsistent income: Irregular income, typical of part-time work or freelancing, poses a challenge for traditional credit scoring models, potentially labeling individuals as higher risk and leading to application denials or restrictive credit limits. Data sources differ on how many people in the United States were self-employed in 2023. One source shows that more than 27 million Americans filed Schedule C tax documents, which cover net income or loss from a business, highlighting the need for different credit scoring methods for the self-employed.

High utilization of existing credit: Heavy reliance on existing credit is often perceived as a sign of potential financial strain, influencing credit decisions. Credit applications may be rejected or approved on less favorable terms, reflecting concerns about the applicant's ability to manage additional credit responsibly.

Lack of clarity in rejection reasons: Applicants who do not understand why they were rejected cannot address the root causes. In the UK, a study covering April 2022 to April 2023 showed that the main reasons for rejection included "poor credit history" (38%), "couldn't afford the repayments" (28%), and "having too much other credit" (19%), while 10% said they were not told why. The reasons, even when given, are often too vague, leaving applicants in the dark and making it hard to address the root cause and improve their creditworthiness for future applications. The lack of transparency is not only a problem for customers; it can also lead to penalties for banks. For example, a Berlin bank was fined €300,000 in 2023 for lacking transparency when declining a credit card application.
Lack of flexibility: Shifts in consumer behavior, especially among younger generations who prefer digital transactions, challenge traditional models. Factors such as the rise of the gig economy, non-traditional employment, student loan debt, and high living costs complicate the assessment of income stability and financial health. Traditional credit risk predictions are of limited use during unprecedented disruptions such as COVID-19, because scoring models do not account for such events.

Recognizing these challenges highlights the need for alternative credit scoring models that can adapt to evolving financial behavior, handle non-traditional data sources, and provide a more inclusive and accurate assessment of creditworthiness in today's dynamic financial landscape.

Credit scoring with alternative data

Alternative credit scoring refers to the use of non-traditional data sources (also known as alternative data) and methods to assess an individual's creditworthiness. While traditional credit scoring relies heavily on credit history from the major credit bureaus, alternative credit scoring incorporates a broader range of factors to create a more comprehensive picture of a person's financial behavior. Below are some of the popular alternative data sources:

Utility payments: Beyond credit history, consistent payments for utilities such as electricity and water are a strong indicator of financial responsibility and show a commitment to meeting financial obligations, providing insight beyond traditional metrics.

Rental history: For those without a mortgage, rent payment history is an important alternative data source.
A record of consistent, on-time rent payments paints a comprehensive picture of financial discipline and reliability.

Mobile phone usage patterns: The ubiquity of mobile phones unlocks a wealth of alternative data. Analyzing call and text patterns offers insight into an individual's network, stability, and social connections, contributing valuable information for credit assessments.

Online shopping behavior: Examining the frequency, type, and amount of online purchases offers insight into spending behavior and contributes to a more nuanced understanding of financial habits.

Educational and employment background: Alternative credit scoring considers an individual's educational and employment history. Positive indicators, such as educational attainment and stable employment, play a crucial role in assessing financial stability.

These alternative data sources represent a shift toward a more inclusive, nuanced, and holistic approach to credit assessment. As financial technology continues to advance, using these alternative data sets enables a more comprehensive assessment of creditworthiness and marks a transformative step in the evolution of credit scoring models.

Alternative credit scoring with artificial intelligence

Besides the use of alternative data, AI as an alternative method has emerged as a transformative force for addressing the challenges of traditional credit scoring, for several reasons:

Ability to mitigate bias: Like traditional statistical models, AI models, including LLMs, that are trained on biased historical data will inherit the biases in that data, leading to discriminatory outcomes.
LLMs may focus more on certain features than others, or may be unable to understand the broader context of an individual's financial situation, leading to biased decisions. However, there are various techniques to mitigate the bias of AI models:

Mitigation strategies: Initiatives begin with diverse and representative training data to avoid reinforcing existing biases. Inadequate or ineffective mitigation strategies can result in biased outcomes persisting in AI credit scoring models. Careful attention to the data collected and to model development is crucial to mitigating this bias. Incorporating alternative data for credit scoring plays a critical role in reducing bias.

Rigorous bias detection tools, fairness constraints, and regularization techniques during training strengthen model accountability: Balancing feature representation and employing post-processing techniques and specialized algorithms contribute to bias mitigation. Inclusive model evaluation, continuous monitoring, and iterative improvement, together with adherence to ethical guidelines and governance practices, complete a multifaceted approach to reducing bias in AI models. This is particularly important for addressing concerns about demographic or socioeconomic biases that may be present in historical credit data.

Regular bias audits: Conduct regular audits to identify and mitigate biases in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly.

Transparency and explainability: Increase transparency and explainability in LLMs to understand how decisions are made. This can help identify and address biased decision-making processes.
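A bias audit of the kind described above can start from something as simple as comparing approval rates across demographic groups. The sketch below does that for a made-up decision log. The group labels, the data, and the 0.8 threshold are assumptions for illustration; the threshold borrows the "four-fifths" rule of thumb from employment-selection practice, and whether it is an appropriate bar for lending decisions is a policy question the code does not settle.

```python
from collections import defaultdict

# Hypothetical audit log: (demographic group, did the model approve?)
DECISIONS = [
    ("group_a", True), ("group_a", True), ("group_a", True), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False), ("group_b", False),
]


def approval_rates(decisions):
    """Share of approved applications per group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, ok in decisions:
        total[group] += 1
        approved[group] += int(ok)
    return {group: approved[group] / total[group] for group in total}


def disparate_impact_ratio(rates):
    """Lowest group approval rate divided by the highest."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest else 1.0


THRESHOLD = 0.8  # assumed review threshold, not a legal standard for lending

rates = approval_rates(DECISIONS)
ratio = disparate_impact_ratio(rates)
status = "flag for review" if ratio < THRESHOLD else "within threshold"
print(rates, f"ratio={ratio:.2f}", status)
```

A ratio well below the threshold does not prove discrimination on its own, but it tells the audit team where to look first, for example at which features drive the gap.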
Trade Ledger, a lending software-as-a-service (SaaS) tool, uses a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source.

Ability to analyze vast and diverse data sets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a wide range of information, including non-traditional data sources, to build a more comprehensive assessment of an individual's creditworthiness, ensuring that a broader range of financial behaviors is considered.

AI brings unparalleled adaptability: As economic conditions change and consumer behaviors evolve, AI-powered models can quickly adjust and learn from new data. Continuous learning keeps credit scoring relevant and effective in an ever-changing financial landscape.

The most common objections from banks to using AI in credit scoring concern transparency and explainability in credit decisions. The inherent complexity of some AI models, especially deep learning algorithms, can make it hard to provide clear explanations for credit decisions. Fortunately, the transparency and interpretability of AI models have seen significant advances. Techniques such as SHAP (SHapley Additive exPlanations) values and LIME (Local Interpretable Model-Agnostic Explanations) plots, along with other advances in XAI (explainable AI), now make it possible to understand how a model arrives at specific credit decisions. This not only builds trust in the credit scoring process but also addresses the common criticism that AI models are "black boxes."
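To give a feel for what Shapley-based explanations report, the sketch below computes exact Shapley values by brute force for a tiny, made-up linear scoring model. It does not use the SHAP library, which relies on more efficient approximations for real models. The feature names, weights, applicant, and baseline are all hypothetical, and treating a "missing" feature as equal to its baseline value is a simplifying assumption.

```python
from itertools import permutations
from math import factorial

# Hypothetical linear scoring model; names and numbers are illustrative only.
WEIGHTS = {"payment_history": 3.0, "credit_utilization": -2.0, "history_length": 0.5}
BASELINE = {"payment_history": 0.8, "credit_utilization": 0.4, "history_length": 6.0}
APPLICANT = {"payment_history": 0.5, "credit_utilization": 0.9, "history_length": 2.0}


def score(features):
    """Weighted sum of feature values."""
    return sum(WEIGHTS[name] * value for name, value in features.items())


def shapley_values(applicant, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    names = list(applicant)
    contributions = dict.fromkeys(names, 0.0)
    for order in permutations(names):
        current = dict(baseline)  # start from the baseline applicant
        for name in order:
            before = score(current)
            current[name] = applicant[name]  # reveal this feature's real value
            contributions[name] += score(current) - before
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in contributions.items()}


phi = shapley_values(APPLICANT, BASELINE)
for name, value in phi.items():
    print(f"{name}: {value:+.2f}")
print(f"sum of contributions:   {sum(phi.values()):+.2f}")
print(f"score minus baseline:   {score(APPLICANT) - score(BASELINE):+.2f}")
```

The contributions add up to the gap between the applicant's score and the baseline score, which is the property that makes such values usable as a per-feature explanation of a single credit decision. Brute-force enumeration grows factorially with the number of features, so it is only practical for toy examples like this one.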
Understanding the importance of leveraging alternative data, which often comes in a semi-structured or unstructured format, financial institutions work with MongoDB to enhance their credit application processes with a faster, simpler, and more flexible way to make payments and offer credit:

Amar Bank, Indonesia's leading digital bank, is combatting bias by providing microloans to people who wouldn't be able to get financial services from traditional banks (the unbanked and underserved). Traditional underwriting processes were inadequate for customers lacking credit history or collateral, so the bank streamlined lending decisions by harnessing unstructured data. Using MongoDB Atlas, it developed a predictive analytics model that integrates structured and unstructured data to assess borrower creditworthiness. MongoDB's scalability and ability to manage diverse data types were instrumental in expanding and optimizing its lending operations.

For the vast majority of Indians, getting credit is typically challenging due to stringent regulations and a lack of credit data. Through the use of modern underwriting systems, slice, a leading innovator in India's fintech ecosystem, is helping broaden access to credit in India by streamlining its KYC process for a smoother credit experience. By using MongoDB Atlas across different use cases, including as a real-time ML feature store, slice transformed its onboarding process, cutting processing times to under a minute. slice uses the real-time feature store with MongoDB and ML models to compute over 100 variables instantly, enabling credit eligibility determination in less than 30 seconds.
Transforming credit scoring with generative AI

Beyond the use of alternative data and AI in credit scoring, GenAI has the potential to revolutionize credit scoring and assessment with its ability to create synthetic data and understand intricate patterns, offering a more nuanced, adaptive, and predictive approach.

GenAI's ability to synthesize diverse data sets addresses one of the key limitations of traditional credit scoring: the reliance on historical credit data. By creating synthetic data that mirrors real-world financial behaviors, GenAI models enable a more inclusive assessment of creditworthiness. This transformative shift promotes financial inclusion, opening doors for a broader demographic to access credit opportunities.

Adaptability plays a crucial role in navigating the dynamic nature of economic conditions and changing consumer behaviors. Unlike traditional models, which struggle to adjust to unforeseen disruptions, GenAI's ability to continuously learn and adapt ensures that credit scoring remains effective in real time, offering a more resilient and responsive tool for assessing credit risk.

In addition to its predictive power, GenAI can contribute to transparency and interpretability in credit scoring. Models can generate explanations for their decisions, providing clearer insights into credit assessments and enhancing trust among consumers, regulators, and financial institutions.

One key concern in using GenAI, however, is the problem of hallucination, where the model may present information that is either nonsensical or outright false. There are several techniques to mitigate this risk, and one of them is Retrieval-Augmented Generation (RAG).
RAG minimizes hallucinations by grounding the model's responses in factual information from up-to-date sources, ensuring the responses reflect the most current and accurate information available. Patronus AI, for example, uses RAG with MongoDB Atlas to enable engineers to score and benchmark the performance of large language models (LLMs) on real-world scenarios, generate adversarial test cases at scale, and monitor hallucinations and other unexpected and unsafe behavior. This can help detect LLM mistakes at scale and deploy AI products safely and confidently.

Another MongoDB technology partner is Robust Intelligence. The firm's AI Firewall protects LLMs in production by validating inputs and outputs in real time. It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injections and extraction of personally identifiable information (PII).

As generative AI continues to mature, its integration into credit scoring and the broader credit application systems promises not just a technological advancement, but a fundamental transformation in how we evaluate and extend credit.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring, marking a pivotal moment in the financial industry. The challenges of traditional models are being overcome through the adoption of alternative credit scoring methods, offering a more inclusive and nuanced assessment. Generative AI, while introducing the potential challenge of hallucination, represents the forefront of innovation, not only revolutionizing technological capabilities but fundamentally redefining how credit is evaluated, fostering a new era of financial inclusion, efficiency, and fairness.
If you would like to learn more about building AI-enriched applications with MongoDB, take a look at the following resources:

Digitizing the lending and leasing experience with MongoDB
Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect
Discover how slice enables credit approval in less than a minute for millions

February 20, 2024

Reducing Bias in Credit Scoring with Generative AI

This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文.

Credit scoring plays a pivotal role in determining who gets access to credit and on what terms. Despite its importance, however, traditional credit scoring systems have long been plagued by a series of critical issues, from biases and discrimination to limited data consideration and scalability challenges. For example, a study of US loans showed that minority borrowers were charged higher interest rates (+8%) and were rejected for loans more often (+14%) than borrowers from more privileged groups.

The rigid nature of credit systems means that they can be slow to adapt to changing economic landscapes and evolving consumer behaviors, leaving some individuals underserved and overlooked. To overcome this, banks and other lenders are looking to adopt artificial intelligence to develop increasingly sophisticated models for scoring credit risk.

In this article, we'll explore the fundamentals of credit scoring and the challenges current systems present, and delve into how artificial intelligence (AI), and generative AI (GenAI) in particular, can be leveraged to mitigate bias and improve accuracy. From the incorporation of alternative data sources to the development of machine learning (ML) models, we'll uncover the transformative potential of AI in reshaping the future of credit scoring.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

What is credit scoring?

Credit scoring is an integral aspect of the financial landscape, serving as a numerical gauge of an individual's creditworthiness. Lenders use this metric to evaluate the potential risk associated with extending credit or lending money to individuals or businesses. Traditionally, banks rely on predefined rules and statistical models, often built using linear regression or logistic regression.
These models are based on historical credit data, focusing on factors such as payment history, credit utilization, and length of credit history. However, assessing new credit applicants poses a challenge, leading to the need for more accurate profiling. To cater to the underserved or unserved segments traditionally discriminated against, fintechs and digital banks are increasingly incorporating information beyond traditional credit history with alternative data to create a more comprehensive view of an individual's financial behavior.

Challenges with traditional credit scoring

Credit scores are integral to modern life because they serve as a crucial determinant in various financial transactions, including securing loans, renting an apartment, obtaining insurance, and sometimes even employment screenings. Because the pursuit of credit can be a labyrinthine journey, here are some of the challenges or limitations of traditional credit scoring models that often cloud the path to credit application approval.

Limited credit history: Many individuals, especially those new to the credit game, encounter a significant hurdle: limited or non-existent credit history. Traditional credit scoring models rely heavily on past credit behavior, making it difficult for individuals without a robust credit history to prove their creditworthiness. Roughly 45 million Americans lack credit scores simply because those data points do not exist for them.

Inconsistent income: Irregular income, typical in part-time work or freelancing, poses a challenge for traditional credit scoring models, potentially labeling individuals as higher risk and leading to application denials or restrictive credit limits. In 2023 in the United States, data sources differ on how many people are self-employed.
One source shows more than 27 million Americans filed Schedule C tax documents, which cover net income or loss from a business, highlighting the need for different methods of credit scoring for the self-employed.

High utilization of existing credit: Heavy reliance on existing credit is often perceived as a signal of potential financial strain, influencing credit decisions. Credit applications may face rejection, or approval on less favorable terms, reflecting concerns about the applicant's ability to judiciously manage additional credit.

Lack of clarity in rejection reasons: Not understanding the reasons behind a rejection hinders applicants from addressing the root causes. In the UK, a study between April 2022 and April 2023 showed the main reasons for rejection included "poor credit history" (38%), "couldn't afford the repayments" (28%), and "having too much other credit" (19%), while 10% said they weren't told why. Even when reasons are given, they are often too vague, which leaves applicants in the dark and makes it difficult for them to enhance their creditworthiness for future applications. The lack of transparency is not only a problem for customers; it can also lead to penalties for banks. For example, a Berlin bank was fined €300k in 2023 for lacking transparency in declining a credit card application.

Lack of flexibility: Shifts in consumer behavior, especially among younger generations who prefer digital transactions, challenge traditional models. Factors like the rise of the gig economy, non-traditional employment, student loan debt, and high living costs complicate the assessment of income stability and financial health. Traditional credit risk predictions are also limited during unprecedented disruptions like COVID-19, which scoring models do not take into account.
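To make the traditional approach concrete, here is a minimal sketch of a logistic regression score over the three factors named earlier (payment history, credit utilization, and length of credit history). The weights, intercept, feature names, and applicant values are all hypothetical and chosen only for illustration; a real lender would fit them on historical repayment data. Treating missing features as zero is also an assumption of this sketch.

```python
import math

# Hypothetical coefficients for illustration only (not fitted on real data).
WEIGHTS = {
    "on_time_payment_rate": 3.0,   # share of past payments made on time (0..1)
    "credit_utilization": -2.0,    # share of available credit in use (0..1)
    "credit_history_years": 0.15,  # length of credit history in years
}
INTERCEPT = -1.5


def repayment_probability(applicant: dict) -> float:
    """Logistic regression: sigmoid of a weighted sum of the features.

    Features missing from the applicant record are treated as 0.0,
    which is an assumption of this sketch.
    """
    z = INTERCEPT + sum(
        weight * applicant.get(name, 0.0) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# An applicant with a long, clean bureau record.
established = {
    "on_time_payment_rate": 0.98,
    "credit_utilization": 0.25,
    "credit_history_years": 12,
}
# A "thin-file" applicant with no bureau data at all.
thin_file = {}

print(round(repayment_probability(established), 3))
print(round(repayment_probability(thin_file), 3))
```

In this sketch the thin-file applicant receives a low score purely because the inputs are absent, not because of any observed behavior, which is the limited-credit-history problem described above.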
Recognizing these challenges highlights the need for alternative credit scoring models that can adapt to evolving financial behaviors, handle non-traditional data sources, and provide a more inclusive and accurate assessment of creditworthiness in today's dynamic financial landscape.

Credit scoring with alternative data

Alternative credit scoring refers to the use of non-traditional data sources (also known as alternative data) and methods to assess an individual's creditworthiness. While traditional credit scoring relies heavily on credit history from major credit bureaus, alternative credit scoring incorporates a broader range of factors to create a more comprehensive picture of a person's financial behavior. Below are some of the popular alternative data sources:

Utility payments: Beyond credit history, consistent payments for utilities like electricity and water offer a powerful indicator of financial responsibility and reveal a commitment to meeting financial obligations, providing crucial insights beyond traditional metrics.

Rental history: For those without a mortgage, rental payment history emerges as a key alternative data source. Consistent and timely rent payments paint a comprehensive picture of financial discipline and reliability.

Mobile phone usage patterns: The ubiquity of mobile phones unlocks a wealth of alternative data. Analyzing call and text patterns provides insights into an individual's network, stability, and social connections, contributing valuable information for credit assessments.

Online shopping behavior: Examining the frequency, type, and amount of online purchases offers valuable insights into spending behaviors, contributing to a more nuanced understanding of financial habits.

Educational and employment background: Alternative credit scoring considers an individual's educational and employment history.
Positive indicators, such as educational achievements and stable employment, play a crucial role in assessing financial stability.

These alternative data sources represent a shift towards a more inclusive, nuanced, and holistic approach to credit assessments. As financial technology continues to advance, leveraging these alternative data sets ensures a more comprehensive evaluation of creditworthiness, marking a transformative step in the evolution of credit scoring models.

Alternative credit scoring with artificial intelligence

Besides the use of alternative data, the use of AI as an alternative method has emerged as a transformative force to address the challenges of traditional credit scoring for a number of reasons:

Ability to mitigate bias: Like traditional statistical models, AI models, including LLMs, that are trained on biased historical data will inherit the biases present in that data, leading to discriminatory outcomes. LLMs might focus on certain features more than others, or may lack the ability to understand the broader context of an individual's financial situation, leading to biased decision-making. However, there are various techniques to mitigate the bias of AI models:

Mitigation strategies: Initiatives begin with the use of diverse and representative training data to avoid reinforcing existing biases. Inadequate or ineffective mitigation strategies can result in biased outcomes persisting in AI credit scoring models. Careful attention to the data collected and to model development is crucial in mitigating this bias. Incorporating alternative data for credit scoring plays a critical role in reducing biases. Rigorous bias detection tools, fairness constraints, and regularization techniques during training enhance model accountability: balancing feature representation and employing post-processing techniques and specialized algorithms contribute to bias mitigation.
Inclusive model evaluation, continuous monitoring, and iterative improvement, coupled with adherence to ethical guidelines and governance practices, complete a multifaceted approach to reducing bias in AI models. This is particularly significant in addressing concerns related to demographic or socioeconomic biases that may be present in historical credit data.

Regular bias audits: Conduct regular audits to identify and mitigate biases in LLMs. This may involve analyzing model outputs for disparities across demographic groups and adjusting the algorithms accordingly.

Transparency and explainability: Increase transparency and explainability in LLMs to understand how decisions are made. This can help identify and address biased decision-making processes. Trade Ledger, a lending software as a service (SaaS) tool, uses a data-driven approach to make informed decisions with greater transparency and traceability by bringing data from multiple sources with different schemas into a single data source.

Ability to analyze vast and diverse datasets: Unlike traditional models that rely on predefined rules and historical credit data, AI models can process a myriad of information, including non-traditional data sources, to create a more comprehensive assessment of an individual's creditworthiness, ensuring that a broader range of financial behaviors is considered.

AI brings unparalleled adaptability to the table: As economic conditions change and consumer behaviors evolve, AI-powered models can quickly adjust and learn from new data. This continuous learning ensures that credit scoring remains relevant and effective in the face of ever-changing financial landscapes.

The most common objections banks raise to using AI in credit scoring concern transparency and explainability in credit decisions. The inherent complexity of some AI models, especially deep learning algorithms, can make it difficult to provide clear explanations for credit decisions.
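The regular bias audits described above, which compare model outputs across demographic groups, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a complete fairness methodology: the group labels and decisions are made-up sample data, and the 0.8 review threshold is borrowed from the "four-fifths" rule of thumb used in employment-selection practice rather than taken from this article.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> {group: approval rate}."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        approved[group] += int(was_approved)
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest group approval rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was approved, so there is no disparity to measure
    return min(rates.values()) / highest


# Hypothetical audit sample: (demographic group, did the model approve?)
sample = (
    [("A", True)] * 80 + [("A", False)] * 20
    + [("B", True)] * 60 + [("B", False)] * 40
)

FLAG_BELOW = 0.8  # assumed review threshold, not a value from the article

rates = approval_rates(sample)
ratio = disparate_impact_ratio(rates)
status = "needs review" if ratio < FLAG_BELOW else "ok"
print(rates, round(ratio, 2), status)
```

A ratio well below 1.0 does not prove discrimination on its own, but it flags where a deeper review of features and training data is warranted.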
Fortunately, the transparency and interpretability of AI models have seen significant advancements. Techniques like SHapley Additive exPlanations (SHAP) values and Local Interpretable Model-Agnostic Explanations (LIME) plots, along with several other advancements in the domain of Explainable AI (XAI), now allow us to understand how a model arrives at specific credit decisions. This not only enhances trust in the credit scoring process but also addresses the common critique that AI models are "black boxes."

Understanding the criticality of leveraging alternative data, which often comes in a semi-structured or unstructured format, financial institutions work with MongoDB to enhance their credit application processes with a faster, simpler, and more flexible way to make payments and offer credit:

Amar Bank, Indonesia's leading digital bank, is combatting bias by providing microloans to people who wouldn't be able to get financial services from traditional banks (the unbanked and underserved). Traditional underwriting processes were inadequate for customers lacking credit history or collateral, so the bank streamlined lending decisions by harnessing unstructured data. Leveraging MongoDB Atlas, it developed a predictive analytics model that integrates structured and unstructured data to assess borrower creditworthiness. MongoDB's scalability and capability to manage diverse data types were instrumental in expanding and optimizing its lending operations.

For the vast majority of Indians, getting credit is typically challenging due to stringent regulations and a lack of credit data. Through the use of modern underwriting systems, slice, a leading innovator in India's fintech ecosystem, is helping broaden access to credit in India by streamlining its KYC process for a smoother credit experience.
By using MongoDB Atlas across different use cases, including as a real-time ML feature store, slice transformed its onboarding process, cutting processing times to under a minute. slice uses the real-time feature store with MongoDB and ML models to compute over 100 variables instantly, enabling credit eligibility determination in less than 30 seconds.

Transforming credit scoring with generative AI

Besides the use of alternative data and AI in credit scoring, GenAI has the potential to revolutionize credit scoring and assessment with its ability to create synthetic data and understand intricate patterns, offering a more nuanced, adaptive, and predictive approach.

GenAI's capability to synthesize diverse data sets addresses one of the key limitations of traditional credit scoring: the reliance on historical credit data. By creating synthetic data that mirrors real-world financial behaviors, GenAI models enable a more inclusive assessment of creditworthiness. This transformative shift promotes financial inclusivity, opening doors for a broader demographic to access credit opportunities.

Adaptability plays a crucial role in navigating the dynamic nature of economic conditions and changing consumer behaviors. Unlike traditional models, which struggle to adjust to unforeseen disruptions, GenAI's ability to continuously learn and adapt ensures that credit scoring remains effective in real time, offering a more resilient and responsive tool for assessing credit risk.

In addition to its predictive prowess, GenAI can contribute to transparency and interpretability in credit scoring. Models can generate explanations for their decisions, providing clearer insights into credit assessments and enhancing trust among consumers, regulators, and financial institutions.

One key concern in using GenAI, however, is the problem of hallucination, where the model may present information that is either nonsensical or outright false.
There are several techniques to mitigate this risk, and one of them is Retrieval-Augmented Generation (RAG). RAG minimizes hallucinations by grounding the model's responses in factual information from up-to-date sources, ensuring the responses reflect the most current and accurate information available. Patronus AI, for example, leverages RAG with MongoDB Atlas to enable engineers to score and benchmark the performance of large language models (LLMs) on real-world scenarios, generate adversarial test cases at scale, and monitor hallucinations and other unexpected and unsafe behavior. This can help detect LLM mistakes at scale and deploy AI products safely and confidently.

Another MongoDB technology partner is Robust Intelligence. The firm's AI Firewall protects LLMs in production by validating inputs and outputs in real time. It assesses and mitigates operational risks such as hallucinations, ethical risks including model bias and toxic outputs, and security risks such as prompt injections and extraction of personally identifiable information (PII).

As generative AI continues to mature, its integration into credit scoring and the broader credit application systems promises not just a technological advancement, but a fundamental transformation in how we evaluate and extend credit.

A pivotal moment in the history of credit

The convergence of alternative data, artificial intelligence, and generative AI is reshaping the foundations of credit scoring, marking a pivotal moment in the financial industry. The challenges of traditional models are being overcome through the adoption of alternative credit scoring methods, offering a more inclusive and nuanced assessment.
Generative AI, while introducing the potential challenge of hallucination, represents the forefront of innovation, not only revolutionizing technological capabilities but fundamentally redefining how credit is evaluated, fostering a new era of financial inclusivity, efficiency, and fairness.

If you would like to discover more about building AI-enriched applications with MongoDB, take a look at the following resources:

Digitizing the lending and leasing experience with MongoDB
Deliver AI-enriched apps with the right security controls in place, and at the scale and performance users expect
Discover how slice enables credit approval in less than a minute for millions
Solution: Credit card application with generative AI
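For readers who want to see the RAG grounding step from this article in concrete terms, the following is a minimal sketch that retrieves the most relevant snippets for a question and builds a prompt restricted to that context. Everything here is an assumption made for illustration: the policy snippets are invented, the word-overlap ranking is a simple stand-in for the embedding-based similarity search a production system would typically use, and no model is actually called.

```python
import re

# Hypothetical policy snippets standing in for an up-to-date knowledge base.
DOCUMENTS = [
    "Applicants with twelve months of on-time rent payments qualify for the thin-file program.",
    "Credit utilization above 80 percent triggers a manual underwriting review.",
    "Utility payment history may be submitted as alternative data for scoring.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, k=2):
    """Rank documents by the number of words they share with the question.

    Word overlap is a toy stand-in for vector similarity search.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Assemble a prompt that tells the model to answer only from retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


question = "Can rent payments help a thin-file applicant qualify?"
prompt = build_prompt(question, DOCUMENTS)
print(prompt)
```

The resulting prompt would then be sent to an LLM. Because the instruction limits the answer to retrieved, current documents, the model has less room to invent facts, which is the mechanism by which RAG reduces hallucinations.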

February 20, 2024