Vector Search


Announcing Hybrid Search Support for LlamaIndex

MongoDB is excited to announce enhancements to our LlamaIndex integration. By combining MongoDB's robust database capabilities with LlamaIndex's innovative framework for context-augmented large language models (LLMs), the enhanced MongoDB-LlamaIndex integration unlocks new possibilities for generative AI development. Specifically, it supports vector search (powered by Atlas Vector Search), full-text search (powered by Atlas Search), and hybrid search, enabling developers to blend precise keyword matching with semantic search for more context-aware applications, depending on their use case.

Building AI applications with LlamaIndex

LlamaIndex is one of the world's leading AI frameworks for building with LLMs. It streamlines the integration of external data sources, allowing developers to combine LLMs with relevant context from various data formats. This makes it ideal for building application features like retrieval-augmented generation (RAG), where accurate, contextual information is critical. LlamaIndex empowers developers to build smarter, more responsive AI systems while reducing the complexities involved in data handling and query management. Advantages of building with LlamaIndex include:

- Simplified data ingestion with connectors that integrate structured databases, unstructured files, and external APIs, removing the need for manual processing or format conversion.
- Organizing data into structured indexes or graphs, significantly enhancing query efficiency and accuracy, especially when working with large or complex datasets.
- An advanced retrieval interface that responds to natural language prompts with contextually enhanced data, improving accuracy in tasks like question-answering, summarization, or data retrieval.
- Customizable APIs that cater to all skill levels—high-level APIs enable quick data ingestion and querying for beginners, while lower-level APIs offer advanced users full control over connectors and query engines for more complex needs.

MongoDB's LlamaIndex integration

Developers can build powerful AI applications using LlamaIndex as a foundational AI framework alongside MongoDB Atlas as the long-term memory database. With MongoDB's developer-friendly document model and powerful vector search capabilities within MongoDB Atlas, developers can easily store and search vector embeddings for building RAG applications. And because of MongoDB's low-latency transactional persistence capabilities, developers can do much more with the MongoDB integration in LlamaIndex to build AI applications in an enterprise-grade manner.

LlamaIndex's flexible architecture supports customizable storage components, allowing developers to leverage MongoDB Atlas as a powerful vector store and a key-value store. By using Atlas Vector Search capabilities, developers can:

- Store and retrieve vector embeddings efficiently (llama-index-vector-stores-mongodb)
- Persist ingested documents (llama-index-storage-docstore-mongodb)
- Maintain index metadata (llama-index-storage-index-store-mongodb)
- Store key-value pairs (llama-index-storage-kvstore-mongodb)

Figure adapted from Liu, Jerry and Agarwal, Prakul (May 2023). "Build a ChatGPT with your Private Data using LlamaIndex and MongoDB". Medium. https://medium.com/llamaindex-blog/build-a-chatgpt-with-your-private-data-using-llamaindex-and-mongodb-b09850eb154c

Adding hybrid and full-text search support

Developers may use different approaches to search for different use cases.
Full-text search retrieves documents by matching exact keywords or linguistic variations, making it efficient for quickly locating specific terms within large datasets, such as in legal document review where exact wording is critical. Vector search, on the other hand, finds content that is "semantically" similar, even if it does not contain the same keywords. Hybrid search combines full-text search with vector search to identify both exact matches and semantically similar content. This approach is particularly valuable in advanced retrieval systems or AI-powered search engines, enabling results that are both precise and aligned with the needs of the end user.

With this integration, it is simple for developers to try out powerful retrieval capabilities on their data and improve the accuracy of their AI applications. In the LlamaIndex integration, the MongoDBAtlasVectorSearch class is used for vector search. To enable full-text search, use VectorStoreQueryMode.TEXT_SEARCH in the same class; similarly, to use hybrid search, enable VectorStoreQueryMode.HYBRID. To learn more, check out the GitHub repository.

With the MongoDB-LlamaIndex integration's support, developers no longer need to navigate the intricacies of Reciprocal Rank Fusion implementation or determine the optimal way to combine vector and text searches—we've taken care of the complexities for you. The integration also includes sensible defaults and robust support, ensuring that building advanced search capabilities into AI applications is easier than ever. This means that MongoDB handles the intricacies of storing and querying your vectorized data, so you can focus on building!

We're excited for you to work with our LlamaIndex integration. Here are some resources to expand your knowledge on this topic:

- Check out how to get started with our LlamaIndex integration
- Build a content recommendation system using MongoDB and LlamaIndex with our helpful tutorial
- Experiment with building a RAG application with LlamaIndex, OpenAI, and our vector database
- Learn how to build with private data using LlamaIndex, guided by one of its co-founders
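Switching between the three modes is a small change at query time. Below is a minimal sketch, assuming a populated Atlas collection; the connection string, database, collection, and index names are placeholders, and the alpha weighting parameter is an assumption about the integration's query options, so check the repository for exact signatures.

```python
import pymongo
from llama_index.core import VectorStoreIndex
from llama_index.core.vector_stores.types import VectorStoreQueryMode
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

client = pymongo.MongoClient("<your-atlas-connection-string>")  # placeholder URI
store = MongoDBAtlasVectorSearch(
    client,
    db_name="rag_db",                   # hypothetical database name
    collection_name="docs",             # hypothetical collection name
    vector_index_name="vector_index",   # Atlas Vector Search index
    fulltext_index_name="text_index",   # Atlas Search index, used by TEXT_SEARCH and HYBRID
)
index = VectorStoreIndex.from_vector_store(store)

# Default mode: pure vector (semantic) search.
vector_engine = index.as_query_engine()

# Full-text search via Atlas Search.
text_engine = index.as_query_engine(
    vector_store_query_mode=VectorStoreQueryMode.TEXT_SEARCH
)

# Hybrid search: vector + full-text, fused by the integration.
hybrid_engine = index.as_query_engine(
    vector_store_query_mode=VectorStoreQueryMode.HYBRID,
    alpha=0.5,  # assumed knob weighting vector scores against text scores
)
print(hybrid_engine.query("How do I rotate my API keys?"))
```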

October 17, 2024

Vector Quantization: Scale Search & Generative AI Applications

We are excited to announce a robust set of vector quantization capabilities in MongoDB Atlas Vector Search. These capabilities will reduce vector sizes while preserving performance, enabling developers to build powerful semantic search and generative AI applications with more scale—and at a lower cost. In addition, unlike relational or niche vector databases, MongoDB's flexible document model—coupled with quantized vectors—allows for greater agility in testing and deploying different embedding models quickly and easily. Support for scalar quantized vector ingestion is now generally available, and will be followed by several new releases in the coming weeks. Read on to learn how vector quantization works, and visit our documentation to get started!

The challenges of large-scale vector applications

While the use of vectors has opened up a range of new possibilities, such as content summarization and sentiment analysis, natural language chatbots, and image generation, unlocking insights within unstructured data can require storing and searching through billions of vectors—which can quickly become infeasible. Vectors are effectively arrays of floating-point numbers representing unstructured information in a way that computers can understand, and datasets can range from a few hundred to billions of vectors. As the number of vectors increases, so does the index size required to search over them. As a result, large-scale vector-based applications using full-fidelity vectors often have high processing costs and slow query times, hindering their scalability and performance.

Vector quantization for cost-effectiveness, scalability, and performance

Vector quantization, a technique that compresses vectors while preserving their semantic similarity, offers a solution to this challenge. Imagine converting a full-color image into grayscale to reduce storage space on a computer. This involves simplifying each pixel's color information by grouping similar colors into primary color channels or "quantization bins," and then representing each pixel with a single value from its bin. The binned values are then used to create a new grayscale image that is smaller in size but retains most of the original details, as shown in Figure 1.

Figure 1: Illustration of quantizing an RGB image into grayscale

Vector quantization works similarly: it shrinks full-fidelity vectors into fewer bits to significantly reduce memory and storage costs without compromising the important details. Maintaining this balance is critical, as search and AI applications need to deliver relevant insights to be useful. Two effective quantization methods are scalar quantization (converting each floating-point value into an integer) and binary quantization (converting each floating-point value into a single bit, 0 or 1).

Current and upcoming quantization capabilities will empower developers to maximize the potential of Atlas Vector Search. The most impactful benefit of vector quantization is increased scalability and cost savings through reduced computing resources and efficient processing of vectors. And when combined with Search Nodes—MongoDB's dedicated infrastructure for independent scalability through workload isolation and memory-optimized infrastructure for semantic search and generative AI workloads—vector quantization can further reduce costs and improve performance, even at the highest volume and scale, unlocking more use cases.

"Cohere is excited to be one of the first partners to support quantized vector ingestion in MongoDB Atlas," said Nils Reimers, VP of AI Search at Cohere.
"Embedding models, such as Cohere Embed v3, help enterprises see more accurate search results based on their own data sources. We're looking forward to providing our joint customers with accurate, cost-effective applications for their needs."

In our tests, compared to full-fidelity vectors, BSON-type vectors—MongoDB's JSON-like binary serialization format for efficient document storage—reduced storage size by 66% (from 41 GB to 14 GB). And as shown in Figures 2 and 3, the tests illustrate significant memory reduction (73% to 96% less) and latency improvements using quantized vectors. Scalar quantization preserves recall performance, and binary quantization's recall performance is maintained with rescoring: a process of evaluating a small subset of the quantized outputs against full-fidelity vectors to improve the accuracy of the search results.

Figure 2: Significant storage reduction with good recall and latency performance with quantization on different embedding models

Figure 3: Remarkable improvement in recall performance for binary quantization when combined with rescoring

In addition, thanks to the reduced cost advantage, vector quantization facilitates more advanced, multiple-vector use cases that would have been too computationally taxing or cost-prohibitive to implement. For example, vector quantization can help users:

- Easily A/B test different embedding models using multiple vectors produced from the same source field during prototyping. MongoDB's document model—coupled with quantized vectors—allows for greater agility at lower costs. The flexible document schema lets developers quickly deploy and compare embedding models' results without the need to rebuild the index or provision an entirely new data model or set of infrastructure.
- Further improve the relevance of search results or context for large language models (LLMs) by incorporating vectors from multiple sources of relevance, such as different source fields (product descriptions, product images, etc.) embedded within the same or different models.

How to get started, and what's next

Now, with support for the ingestion of scalar quantized vectors, developers can import and work with quantized vectors from their embedding model providers of choice (such as Cohere, Nomic, Jina, Mixedbread, and others)—directly in Atlas Vector Search. Read the documentation and tutorial to get started. And in the coming weeks, additional vector quantization features will equip developers with a comprehensive toolset for building and optimizing applications with quantized vectors:

- Support for ingestion of binary quantized vectors will enable further reduction of storage space, allowing for greater cost savings and giving developers the flexibility to choose the type of quantized vectors that best fits their requirements.
- Automatic quantization and rescoring will provide native capabilities for scalar quantization as well as binary quantization with rescoring in Atlas Vector Search, making it easier for developers to take full advantage of vector quantization within the platform.

With support for quantized vectors in MongoDB Atlas Vector Search, you can build scalable and high-performing semantic search and generative AI applications with flexibility and cost-effectiveness. Check out our documentation and tutorial to get started, and head over to our quick-start guide to begin with Atlas Vector Search today.
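To make the storage math tangible, here is a toy sketch of the two schemes described above, written with NumPy. It illustrates only the arithmetic idea, not MongoDB's internal implementation: scalar quantization maps each float32 component onto an int8 bucket (4x smaller), and binary quantization keeps one bit per component (32x smaller).

```python
import numpy as np

def scalar_quantize(vectors: np.ndarray) -> tuple[np.ndarray, float, float]:
    """Map float32 components onto int8 buckets (toy illustration)."""
    lo, hi = float(vectors.min()), float(vectors.max())
    scaled = (vectors - lo) / (hi - lo)                 # normalize to [0, 1]
    q = np.round(scaled * 255 - 128).astype(np.int8)    # 256 buckets, 4x smaller
    return q, lo, hi                                    # keep lo/hi to dequantize

def binary_quantize(vectors: np.ndarray) -> np.ndarray:
    """Keep one bit per component: 1 if positive, else 0 (32x smaller)."""
    return np.packbits(vectors > 0, axis=-1)

# 1,000 hypothetical 1,024-dimensional embeddings.
embeddings = np.random.default_rng(0).normal(size=(1000, 1024)).astype(np.float32)
int8_vecs, lo, hi = scalar_quantize(embeddings)
bit_vecs = binary_quantize(embeddings)

# ~4.1 MB float32 -> ~1.0 MB int8 -> ~0.13 MB packed bits
print(embeddings.nbytes, int8_vecs.nbytes, bit_vecs.nbytes)
```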

October 7, 2024

MongoDB.local London 2024: Better Applications, Faster

Since we kicked off MongoDB's series of 2024 events in April, we've connected with thousands of customers, partners, and community members in cities around the world—from Mexico City to Mumbai. Yesterday marked the nineteenth stop of the 2024 MongoDB.local tour, and we had a blast welcoming folks across industries to MongoDB.local London, where we discussed the latest technology trends, celebrated customer innovations, and unveiled product updates that make it easier than ever for developers to build next-gen applications.

Over the past year, MongoDB's more than 50,000 customers have been telling us that their needs are changing. They're increasingly focused on three areas:

- Helping developers build faster and more efficiently
- Empowering teams to create AI-powered applications
- Moving from legacy systems to modern platforms

Across these areas, there's a common need for a solid foundation: each requires a resilient, scalable, secure, and highly performant database. The updates we shared at MongoDB.local London reflect these priorities. MongoDB is committed to ensuring that our products are built to exceed our customers' most stringent requirements, and that they provide the strongest possible foundation for building a wide range of applications, now and in the future. Indeed, during yesterday's event, Sahir Azam, MongoDB's Chief Product Officer, discussed the foundational role data plays in his keynote address. He also shared the latest advancement from our partner ecosystem: an AI solution powered by MongoDB, Amazon Web Services, and Anthropic that makes it easier for customers to deploy gen AI customer care applications.

MongoDB 8.0: The best version of MongoDB ever

The biggest news at .local London was the general availability of MongoDB 8.0, which delivers significant performance improvements and reduced scaling costs, and adds scalability, resilience, and data security capabilities to the world's most popular document database. Architectural optimizations in MongoDB 8.0 have significantly reduced memory usage and query times, and MongoDB 8.0 has more efficient batch processing capabilities than previous versions. Specifically, MongoDB 8.0 features 36% better read throughput, 56% faster bulk writes, and 20% faster concurrent writes during data replication. In addition, MongoDB 8.0 can handle higher volumes of time series data and can perform complex aggregations more than 200% faster—with lower resource usage and costs. Last (but hardly least!), Queryable Encryption now supports range queries, ensuring data security while enabling powerful analytics.

For more on MongoDB.local London's product announcements—which are designed to accelerate application development, simplify AI innovation, and speed developer upskilling—please read on!

Accelerating application development

Improved scaling and elasticity capabilities on MongoDB Atlas

New enhancements to MongoDB Atlas's control plane allow customers to scale clusters faster, respond to resource demands in real time, and optimize performance—all while reducing operational costs. First, our new granular resource provisioning and scaling features—including independent shard scaling and extended storage and IOPS on Azure—allow customers to optimize resources precisely where needed. Second, Atlas customers will experience up to 50% quicker cluster scaling times by scaling clusters in parallel by node type.
Finally, MongoDB Atlas users will enjoy more responsive auto-scaling, with a 5X improvement in responsiveness thanks to enhancements in our scaling algorithms and infrastructure. These enhancements are being rolled out to all Atlas customers, who should start seeing benefits immediately.

IntelliJ plugin for MongoDB

Announced in private preview, the MongoDB for IntelliJ Plugin is designed to enhance the way developers work with MongoDB in IntelliJ IDEA, one of the most popular IDEs among Java developers. The plugin allows enterprise Java developers to write and test Java queries faster, receive proactive performance insights, and reduce runtime errors right in their IDE. By enhancing the database-to-IDE integration, JetBrains and MongoDB have partnered to deliver a seamless experience for their shared user base and unlock their potential to build modern applications faster. Sign up for the private preview here.

MongoDB Copilot Participant for VS Code (Public Preview)

Now in public preview, the new MongoDB Participant for GitHub Copilot integrates domain-specific AI capabilities directly with a chat-like experience in the MongoDB Extension for VS Code.

October 3, 2024

Top Use Cases for Text, Vector, and Hybrid Search

Search is how we discover new things. Whether you're looking for a pair of new shoes, the latest medical advice, or insights into corporate data, search provides the means to unlock the truth. Search habits—and the accompanying end-user expectations—have evolved along with changes to the search experiences offered by consumer apps like Google and Amazon. The days of the standard 10 blue links may well be behind us, as new paradigms like vector search and generative AI (gen AI) have upended long-held search norms. But are all forms of search created equal, or should we be seeking out the right "flavor" of search for specific jobs? In this blog post, we will define and dig into various flavors of search, including text, vector and AI-powered search, and hybrid search, and discuss when to use each, including sample use cases where one type of search might be superior to others.

Information retrieval revolutionized with text search

The concept of text search has been baked into user behavior from the early days of the web, with the rudimentary text box entry and 10 blue link results based on text relevance to the initial query. This behavior and its associated business model have produced trillions in revenue and become one of the fiercest battlegrounds across the internet. Text search allows users to quickly find specific information within a large set of data by entering keywords or phrases. When a query is entered, the text search engine scans through indexed documents to locate and retrieve the most relevant results based on the keywords. Text search is a good solution for queries requiring exact matches where the overarching meaning isn't as critical. Some of the most common uses include (see the pipeline sketch after this list):

- Catalog and content search: Using the search bar to find specific products or content based on keywords from customer inquiries. For example, a customer searching for "size 10 men trainers" or "installation guide" can instantly find the exact items they're looking for, like how Nextar tapped into Atlas Search to enable physical retailers to create online catalogs.
- In-application search: This is well-suited for organizations with straightforward offerings that want to make it easier for users to locate key resources, but that don't require advanced features like semantic retrieval or contextual re-ranking. For instance, if a user searches for "songs key of G," they can quickly receive relevant materials. This streamlines asset retrieval, letting users focus on the task at hand and boosting overall satisfaction. For a company like Yousician, Atlas Search enabled their 20 million monthly active users to tackle their music lessons with ease.
- Customer 360: Unifying data from different sources to create a single, holistic view. Consolidated information such as user preferences, purchase history, and interaction data can be used to enhance business visibility and simplify the management, retrieval, and aggregation of user data. Consider a support agent searching for all information related to customer "John Doe": they can quickly access relevant attributes and interaction history, ensuring more accurate and efficient service. Helvetia achieved this after migrating to MongoDB, using Atlas Search to deliver a single, 360-degree real-time view across all customer touchpoints and insurance products.
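A catalog query like the one above maps to a single $search stage in an aggregation pipeline. The following pymongo sketch uses hypothetical connection, collection, index, and field names:

```python
import pymongo

# Hypothetical Atlas collection with an Atlas Search index named "default".
coll = pymongo.MongoClient("<atlas-uri>")["shop"]["products"]

results = coll.aggregate([
    {"$search": {
        "index": "default",
        "text": {
            "query": "size 10 men trainers",
            "path": ["title", "description"],  # fields to match against
        },
    }},
    {"$limit": 5},
    {"$project": {"title": 1, "score": {"$meta": "searchScore"}}},
])
for doc in results:
    print(doc)
```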
AI and a new paradigm with vector search

With advances in technology, vector search has emerged to help solve the challenge of providing relevant results even when the user may not know what they're looking for. Vector search allows you to take any type of media or content, convert it into a vector using machine learning algorithms, and then search to find results similar to the target term. Similarity is determined by converting your data into numerical high-dimensional vectors and then calculating the distance between them: the closer the vectors, the higher the relevance. There is a wide range of practical, powerful use cases powered by vector search—notably semantic search and retrieval-augmented generation (RAG) for gen AI.

Semantic search focuses on meaning and prioritizes user intent by deciphering not just what users type but why they're searching, in order to provide more accurate and context-oriented search results. Some examples of semantic search include:

- Content/knowledge base search: Vast amounts of organizational data, structured and unstructured, with hidden insights can benefit significantly from semantic search. Questions like "What's our remote work policy?" can return accurate results even when the source materials do not contain the "remote" keyword, but rather have "return to office," "hybrid," or other keywords. A real-world example of content search is the National Film and Sound Archive of Australia, which uses Atlas Vector Search to power semantic search across petabytes of text, audio, and visual content in its collections.
- Recommendation engines: Understanding users' interests and intent is a strong competitive advantage, like how Netflix provides a personalized selection of shows and movies based on your watch history, or how Amazon recommends products based on your purchase history. This is particularly powerful in e-commerce, media & entertainment, financial services, and product/service-oriented industries where the customer experience tightly influences the bottom line. A success story is Delivery Hero, which leverages vector search-powered real-time recommendations to increase customer satisfaction and revenue.
- Anomaly detection: Identifying and preventing fraud, security breaches, and other system anomalies is paramount for all organizations. By grouping similar vectors and using vector search to identify outliers, potential threats can be detected early, enabling timely responses. Companies like VISO TRUST and Extrac are among the innovators that build their core offerings using semantic search for security and risk management.

With the rise of large language models (LLMs), vector search is increasingly becoming essential in gen AI application development. It augments LLMs by providing domain-specific context outside of what the LLMs "know," ensuring the relevance and accuracy of the gen AI output. In this case, the semantic search outputs are used to enhance RAG. By providing relevant information from a vector database, vector search helps the RAG model generate responses that are more contextually relevant. And by grounding the generated text in factual information, vector search helps reduce hallucinations and improve the accuracy of the response. Some common RAG applications are chatbots and virtual assistants, which provide users with relevant responses and carry out tasks based on the user query, delivering an enhanced user experience.
Two real-world examples of such chatbot implementations are from our customers Okta and Kovai. Another popular application is using RAG to help generate content like articles, blog posts, scripts, code, and more, based on user prompts or data. This significantly accelerates content production, allowing organizations including Novo Nordisk and Scalestack to save time and produce content at scale, all at an accuracy level that was not possible without RAG.

Beyond RAG, an emerging use of vector search is in agentic systems. Such a system is an architecture encompassing one or more AI agents with autonomous decision-making capabilities, able to access and use various system components and resources to achieve defined objectives while adapting to environmental feedback. Vector search enables efficient and semantically meaningful information retrieval in these systems, facilitating relevant context for LLMs, optimized tool selection, semantic understanding, and improved relevance ranking.

Hybrid search: The best of both worlds

Hybrid search combines the strengths of text search with the advanced capabilities of vector search to deliver more accurate and relevant search results. Hybrid search shines in scenarios where there's a need for both precision (where text search excels) and recall (where vector search excels), and where user queries can vary from simple to complex, including both keyword and natural language queries. Hybrid search delivers a more comprehensive, flexible information retrieval process, helping RAG models access a wider range of relevant information. For example, in a customer support context, hybrid search can ensure that the RAG model retrieves not only documents containing exact keywords but also semantically similar content, resulting in more informative and helpful responses. Hybrid search can also help reduce information overload by prioritizing the most relevant results. This allows RAG models to focus on processing and understanding the most critical information, leading to faster, more accurate responses and an improved user experience.

Powering your AI and search applications with MongoDB

As your organization continues to innovate in the rapidly evolving technology ecosystem, building robust AI and search applications supporting customer, employee, and stakeholder experiences can deliver powerful competitive advantages. With MongoDB, you can efficiently deploy full-text search, vector search, and hybrid search capabilities. Start building today—simplify your developer experience while increasing impact in MongoDB's fully managed, secure vector database, integrated with a vast AI partner ecosystem, including all major cloud providers, generative AI model providers, and system integrators. Head over to our quick-start guide to get started with Atlas Vector Search today.
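To ground the semantic half in code, here is a hedged pymongo sketch of a $vectorSearch stage retrieving nearest neighbors for an embedded query. The embedding function, connection string, index name, and field names are placeholders; a hybrid pipeline would additionally run a keyword $search query and fuse the two ranked lists (for example, with reciprocal rank fusion).

```python
import pymongo

def embed(text: str) -> list[float]:
    """Placeholder for your embedding model call (OpenAI, Cohere, etc.)."""
    raise NotImplementedError

# Hypothetical collection whose "embedding" field is covered by a
# vector search index named "vector_index".
coll = pymongo.MongoClient("<atlas-uri>")["kb"]["articles"]

results = coll.aggregate([
    {"$vectorSearch": {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": embed("what's our remote work policy?"),
        "numCandidates": 100,  # breadth of the approximate nearest-neighbor scan
        "limit": 5,
    }},
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
])
for doc in results:
    print(doc)
```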

September 16, 2024

Boosting Customer Lifetime Value with Agmeta and MongoDB

Nobody likes calling customer service. The phone trees, the wait times, the janky music, and how often your issue just isn't resolved can make the whole process one most people would rather avoid. For business owners, the customer contact center can also be a source of frustration, simultaneously creating customer churn and unhappiness, while also acting as a black hole of information as to why that churn occurred. It doesn't have to be this way. What if, instead, customer service centers offered valuable ways to increase the Customer Lifetime Value (CLTV) of customers, pipelines of upsell opportunities, and valuable sources of information? That's the goal of Agmeta.AI, a startup dedicated to giving businesses actionable insights to fight churn, identify key customers primed for upsell, and improve customer service overall.

Lost in translation

"We started with a very simple thesis: people call into contact centers because they have a problem. That is a real make-or-break moment. The opportunity for churn is very high… or that customer can be a great target for upselling," said Samir Agarwal, CEO and co-founder of Agmeta. "All of this data sits in a contact center, and businesses don't ever get to see it," he added.

According to Samir, even the businesses that think they are collecting useful information on customer service interactions are instead collecting incorrect or incomplete information. Or worse, they're analyzing the information they do record incorrectly. Every business today talks about the importance of customer experience (CX), but the challenge businesses face is how to quantify that CX. Many contact centers substitute call sentiment for CX, or use keywords to determine canned responses. For example, imagine a customer calls into a service center and has what appears to be a positive conversation with an agent. They use words and phrases like "thank you" and "yes, I understand," and reply "no, I do not have anything else to ask" at the end of a call in which their complaint is not resolved. After putting the phone down, the customer goes on to cancel the service, or worse, initiate a chargeback request with their credit card provider. In some businesses the customer service agent may manually mark such a call as 'positive.' The agent, after all, 'answered all the customers' concerns.' As this example illustrates, the sentiment of a call should not be confused with the measure of customer experience.

Another common way businesses try to gather feedback is by sending a post-call survey. However, a problem with this approach is that industry response rates for surveys are close to 3%. This means decisions get made on that small sample, without taking into account the other 97% of customers who didn't respond. Survey results are also frequently skewed, as those most likely to respond are also the ones who were most unhappy with the contact center interaction and want their voices heard.

The MongoDB advantage

Using machine learning and generative AI, backed by MongoDB Atlas, Agmeta's software understands not only the content of the call, but the context too. Taking our example above, Agmeta's software would detect that the customer is unhappy, despite their polite and 'positive' sounding conversation with the agent, and flag the customer as a potential churn or chargeback candidate in need of immediate attention.
"We will give you a CSAT (customer satisfaction) score and a reason for that CSAT score within seconds of the call ending, for 100% of the interactions," said Samir.

For Agmeta to work, Samir and his team had to have a database ready to accept all kinds of data, including voice recordings, unstructured text, and a constantly evolving schema. "We didn't have a fixed schema; we needed a database that was as flexible as Agmeta needed to be. I've known of MongoDB forever, so when I started to look at databases it seemed an obvious choice to me," he said.

The ability to quickly and easily work with vectorized data for gen AI was also crucial. "MongoDB provides vector search capabilities in an operational database. Rather than having to bolt on a vector database and figure out the ETL, MongoDB solved this issue for me in a single product. The way I look at it, if you do a good job on vector search, then my life as an entrepreneur and software builder becomes much easier," Samir said.

After assessing database options and multiple LLMs, Samir and his team chose to pair MongoDB Atlas with Google Cloud, taking advantage of Gemini on Google's generative AI platform. "With Atlas on Google Cloud, there are zero worries about database administration, maintenance, and availability. This frees us up to focus on creating business value," Samir said. "Another benefit of using MongoDB is the flexibility to use the customer's MongoDB setup, which gives the customer peace of mind from the perspective of the security and privacy of their data."

Customer service first

With the power of generative AI and MongoDB, Agmeta can deliver a CSAT score that measures the customer's true takeaway from the call. The CSAT score is a multi-dimensional score that takes into account areas including resolution (as the customer sees it), politeness, the onus on the customer, and many other attributes. In the short term, the primary use for this technology is to detect and flag customers at risk of churning or filing a charge dispute with their card provider, as well as candidates for upselling, giving businesses an opportunity to "see" what they could never find out before.

"When we talk to customers, the number one thing they are concerned about is customer churn. Right now they operate completely blind with no idea why people are leaving them," said Samir. "One large telecoms customer Agmeta is in talks with had no idea where their churn was happening. But when we described being able to assign every customer a CSAT score, they were very excited," he added.

And it's not just about preventing churn. Businesses can identify happy customers too, targeting them for upsell opportunities. "One of the things we do is spot patterns of unanswered questions from product support interactions," Samir added. "When we see 'Oh look, suddenly there are a lot more calls because of a release,' then we can flag this to product teams as a must-fix issue."

The future of customer service

Agmeta aims to amalgamate customer information with current and past experiences to give businesses a more holistic and nuanced picture of their customers, and more precise next steps they can take. "What we want to do is look back in time and see what else happened with this customer," Samir said. "The goal is to provide businesses with targeted directives to minimize churn and grow customer lifetime value." Retrieval-augmented generation plays a key role in Agmeta's vision.
This also means an expanded role both for MongoDB's vector database, as the source of information against which semantic searches can be run, and for Gemini, for both analysis and presentation of the directives for the business.

You can learn more about how innovators across the world are using MongoDB by reviewing our Building AI case studies. If your team is building AI apps, sign up for the AI Innovators Program. Successful companies get access to free Atlas credits and technical enablement, as well as connections into the broader AI ecosystem. Additionally, if your company is interested in being featured in a story like this, we'd love to hear from you! Reach out to us at ai_adopters@mongodb.com. Head over to our quick-start guide to get started with Atlas Vector Search today.

September 10, 2024

Find Hidden Insights in Vector Databases: Semantic Clustering

Vector databases, a powerful class of databases designed to optimize the storage, processing, and retrieval of large-volume, multi-dimensional data, have increasingly been instrumental to generative AI (gen AI) applications, with Forrester predicting a 200% increase in the adoption of vector databases in 2024. But their power extends far beyond these applications. Semantic vector clustering, a technique within vector databases, can unlock hidden knowledge within your organization's data, democratizing insights across teams.

Mining diverse data for hidden knowledge

Imagine your organization's data as a library of diverse knowledge—a treasure trove of information waiting to be unearthed. Traditionally, uncovering valuable insights from data often relied on asking the right questions, which can be a challenge for developers, data scientists, and business leaders alike. They might spend vast amounts of time sifting through limited, siloed datasets, potentially missing hidden gems buried within the organization's vast data troves. Simply put, without knowing the right questions to ask, these valuable insights often remain undiscovered, leading to missed opportunities or losses.

Enter vector databases and semantic vector clustering. A vector database is designed to store and manage unstructured data efficiently. Within a vector database, semantic vector clustering is a technique for organizing information by grouping vectors with similar meaning together. Text analysis, sentiment analysis, knowledge classification, and uncovering semantic connections between data sets—these are just a few examples of how semantic vector clustering empowers organizations to vastly improve data mining.

Semantic vector clustering offers a multifaceted approach to organizational improvement. By analyzing text data, it can illuminate customer and employee sentiments, behaviors, and preferences, informing strategic decisions, enhancing customer service, and optimizing employee satisfaction. Furthermore, it revolutionizes knowledge management by categorizing information into easily accessible clusters, thereby boosting collaboration and efficiency. Finally, by bridging data silos and uncovering hidden relationships, semantic vector clustering facilitates informed decision-making and breaks down organizational barriers.

For example, a business can gain significant insights from its customer interaction data, which is routinely kept, classified, or summarized. Those data points (texts, numbers, images, videos, etc.) can be vectorized, and semantic vector clustering applied to identify the most prominent customer patterns (the densest vector clusters) from those interactions, classifications, or summaries. From the identified patterns, the business can take further actions or make more informed decisions that it wouldn't have been able to make otherwise.

The power of semantic vector clustering

So, how does semantic vector clustering achieve all this? (A minimal sketch follows this list.)

- Discover semantic structures: Clustering groups similar LLM-embedded vector sets together, allowing for fast retrieval of themes. Beyond clustering regular vectors (individual data points or concepts), clustering RAG vectors (summarizations of themes and concepts) can provide superior LLM contexts compared to basic semantic search.
- Reduce data complexity via clustering: Data points are grouped based on overall similarity, effectively reducing the complexity of the data. This reveals patterns and summarizes key features, making it easier to grasp the bigger picture. Imagine organizing the library by theme or genre, making it easier to navigate vast amounts of information.
- Semantic auto-aggregation: Here is the coolest part. We can classify groups of vectors into hierarchies by effectively "auto-aggregating" them semantically. This means the data itself "figures out" these groups and "self-organizes." Imagine a library with an efficient automated catalog system that organizes sections based on thematic connections, without a set of pre-built questions, allowing researchers to find what they need quickly and easily. This lets you identify patterns within the vast, semantically diverse data in your organization.

Unlock hidden insights in your vector database

The semantic clustering of vector embeddings is a powerful tool to go beyond the surface of data and identify meanings that otherwise would not have been discovered. By unlocking hidden relationships and patterns, you can extract valuable insights that drive better decision-making, enhance customer experiences, and improve overall business efficiency—all enabled through MongoDB's secure, unified, and fully managed vector database capabilities. Check out our tutorial to learn how to get started. Head over to our quick-start guide to get started with Atlas Vector Search today. Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the MongoDB and DeepLearning.AI course "Prompt Compression and Query Optimization" for free today.
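As a minimal sketch of the idea, assuming embeddings are already stored alongside summary text in a MongoDB collection, one could cluster them with an off-the-shelf algorithm and inspect the densest groups. The collection, field names, and choice of k-means (a stand-in for any clustering method) are all assumptions:

```python
import numpy as np
import pymongo
from sklearn.cluster import KMeans

# Hypothetical collection of customer interactions with precomputed embeddings.
coll = pymongo.MongoClient("<atlas-uri>")["support"]["interactions"]
docs = list(coll.find({}, {"summary": 1, "embedding": 1}))
X = np.array([d["embedding"] for d in docs])

# Group semantically similar interactions; the cluster count is a tunable guess.
labels = KMeans(n_clusters=8, n_init="auto", random_state=0).fit_predict(X)

# The densest clusters surface the most prominent customer patterns.
for cluster_id in np.bincount(labels).argsort()[::-1][:3]:
    samples = [d["summary"] for d, lbl in zip(docs, labels) if lbl == cluster_id][:3]
    print(f"cluster {cluster_id} ({(labels == cluster_id).sum()} items): {samples}")
```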

August 19, 2024

MongoDB AI Course in Partnership with Andrew Ng and DeepLearning.AI

MongoDB is committed to empowering developers and meeting them where they are. With a thriving community of 7 million developers across 117 regions, MongoDB has become a cornerstone in the world of database technology. Building on this foundation, we're excited to announce our collaboration with AI pioneer Andrew Ng and DeepLearning.AI, a leading educational technology company specializing in AI and machine learning. Together, we've created an informative course that bridges the gap between database technology and modern AI applications, further enhancing our mission to support developers in their journey to build innovative solutions.

Introducing "Prompt Compression and Query Optimization"

MongoDB's latest course on DeepLearning.AI, Prompt Compression and Query Optimization, covers the prominent form factor of modern AI applications today: retrieval-augmented generation (RAG). This course showcases how MongoDB Atlas Vector Search capabilities enable developers to build sophisticated AI applications, leveraging MongoDB as both an operational and a vector database. Beyond introducing learners to vector search, the course also presents prompt compression, a technique for reducing the operational cost of running AI applications in production.

"RAG, or retrieval augmented generation, has moved from being an interesting new idea a few months ago to becoming a mainstream large-scale application." — Andrew Ng, DeepLearning.AI

Key course highlights

- RAG applications: Learn to build and optimize the most prominent form of AI applications using MongoDB Atlas and the MongoDB Query Language (MQL).
- MongoDB Atlas Vector Search: Leverage the power of vector search for efficient information retrieval.
- MongoDB document model: Explore MongoDB's flexible, JSON-like document model, which represents complex data structures and is ideal for storing and querying diverse AI-related data.
- Prompt compression: Use techniques to reduce the operational costs of AI applications in production environments.

In this course, you'll learn techniques to enhance your RAG applications' efficiency, search relevance, and cost-effectiveness. As AI applications become more sophisticated, efficient data retrieval and processing become crucial. This course bridges the gap between traditional database operations and modern vector search capabilities, enabling you to confidently build robust, scalable AI applications that can handle real-world challenges.

MongoDB's document model: The perfect fit for AI

A key aspect of this course is that it introduces learners to MongoDB's document model and its numerous benefits for AI applications:

- Python-compatible structure: MongoDB's BSON format aligns seamlessly with Python dictionaries, enabling effortless data representation and manipulation.
- Schema flexibility: Adapt to varied data structures without predefined schemas, matching the dynamic nature of AI applications.
- Nested data structures: Easily represent complex, hierarchical data often found in AI models and datasets.
- Efficient data ingestion: Directly ingest data without complex transformations, speeding up the data preparation process.

Leveraging the combined insights from MongoDB and DeepLearning.AI, this course offers a perfect blend of practical database knowledge and advanced AI concepts.

Who should enroll?
This course is ideal for developers who:

- Are familiar with vector search concepts
- Are building RAG applications and agentic systems
- Have a basic understanding of Python and MongoDB and are curious about AI
- Want to optimize their RAG applications for better performance and cost-efficiency

This course offers an opportunity to grasp key techniques in AI application development. You'll gain the skills to build more efficient, powerful, and cost-effective RAG applications, from advanced query optimization to innovative prompt compression. With hands-on code, detailed walkthroughs, and real-world applications, you'll be equipped to tackle complex AI challenges using MongoDB's robust features.

Take advantage of this chance to stay ahead in the rapidly evolving field of AI. Whether you're a seasoned developer or just starting your AI journey, this course will provide invaluable insights and practical skills to enhance your capabilities. Improve your AI application development skills with MongoDB's practical course, learn to build efficient RAG applications using vector search and prompt compression, and enroll now to enhance your developer toolkit.

August 8, 2024

Anti-Money Laundering and Fraud Prevention With MongoDB Vector Search and OpenAI

Fraud and anti-money laundering (AML) are major concerns for both businesses and consumers, affecting sectors like financial services and e-commerce. Traditional methods of tackling these issues, including static, rule-based systems and predictive artificial intelligence (AI) methods, work but have limitations, such as a lack of context and the feature-engineering overhead of keeping models relevant, which can be time-consuming and costly. Vector search can significantly improve fraud detection and AML efforts by addressing these limitations, representing the next step in the evolution of machine learning for combating fraud. Any organization that is already benefiting from real-time analytics will find that this breakthrough in anomaly detection takes fraud and AML detection accuracy to the next level. In this post, we examine how real-time analytics powered by Atlas Vector Search enables organizations to uncover deeply hidden insights before fraud occurs. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

The evolution of fraud and risk technology

Over the past few decades, fraud and risk technology has evolved in stages, with each stage building upon the strengths of previous approaches while also addressing their weaknesses:

- Risk 1.0: In the early stages (the late 1990s to 2010), risk management relied heavily on manual processes and human judgment, with decision-making based on intuition, past experiences, and limited data analysis. Rule-based systems emerged during this time, using predefined rules to flag suspicious activities. These rules were often static and lacked adaptability to changing fraud patterns.
- Risk 2.0: With the evolution of machine learning and advanced analytics (from 2010 onwards), risk management entered a new era. Predictive modeling techniques were employed to forecast future risks and detect fraudulent behavior. Systems were trained on historical data and became more integrated, allowing for real-time data processing and the automation of decision-making processes. However, these systems faced limitations such as feature engineering overhead (Risk 2.0 systems often require manual feature engineering) and a lack of context (Risk 1.0 and Risk 2.0 may not incorporate a wide range of variables and contextual information). Risk 2.0 solutions are often used in combination with rule-based approaches because rules cannot be avoided: companies have their business- and domain-specific heuristics and other rules that must be applied. Here is an example fraud detection solution based on Risk 1.0 and Risk 2.0 with a rules-based and traditional AI/ML approach.
- Risk 3.0: The latest stage (2023 and beyond) in fraud and risk technology evolution is driven by vector search. This advancement leverages real-time data feeds and continuous monitoring to detect emerging threats and adapt to changing risk landscapes, addressing the limitations of data imbalance, manual feature engineering, and the need for extensive human oversight, while incorporating a wider range of variables and contextual information.

Depending on the particular use case, organizations can combine or use these solutions to effectively manage and mitigate risks associated with fraud and AML. Now, let us look into how MongoDB Atlas Vector Search (Risk 3.0) can help enhance existing fraud detection methods.
How Atlas Vector Search can help

A vector database is an organized collection of information that makes it easier to find similarities and relationships between different pieces of data. This uniquely positions MongoDB as particularly effective, compared with a standalone or bolt-on vector database. The versatility of MongoDB's developer data platform empowers users to store their operational data, metadata, and vector embeddings on MongoDB Atlas and seamlessly use Atlas Vector Search to index, retrieve, and build performant gen AI applications. Watch how you can revolutionize fraud detection with MongoDB Atlas Vector Search.

The combination of real-time analytics and vector search offers a powerful synergy that enables organizations to discover insights that are otherwise elusive with traditional methods. MongoDB facilitates this through Atlas Vector Search integrated with OpenAI embeddings, as illustrated in Figure 1 below.

Figure 1: Atlas Vector Search in action for fraud detection and AML

Business perspective: Fraud detection vs. AML

Understanding the distinct business objectives and operational processes driving fraud detection and AML is crucial before diving into the use of vector embeddings. Fraud detection is centered on identifying unauthorized activities aimed at immediate financial gain through deceptive practices. The detection models, therefore, look for specific patterns in transactional data that indicate such activities. For instance, they might focus on high-frequency, low-value transactions, which are common indicators of fraudulent behavior. AML, on the other hand, targets the complex process of disguising the origins of illicitly gained funds. The models here analyze broader and more intricate transaction networks and behaviors to identify potential laundering activities. For instance, AML could look at the relationships between transactions and entities over a longer period.

Creation of Vector Embeddings for Fraud and AML

Fraud and AML models require different approaches because they target distinct types of criminal activities. To accurately identify these activities, machine learning models use vector embeddings tailored to the features of each type of detection. In the solution highlighted in Figure 1, vector embeddings for fraud detection are created using a combination of text, transaction, and counterparty data. Conversely, the embeddings for AML are generated from data on transactions, relationships between counterparties, and their risk profiles. The selection of data sources, including the use of unstructured data and the creation of one or more vector embeddings, can be customized to meet specific needs. This particular solution utilizes OpenAI for generating vector embeddings, though other software options can also be employed.

Historical vector embeddings are representations of past transaction data and customer profiles encoded into a vector format. The demo database is prepopulated with synthetically generated test data for both fraud and AML embeddings. In real-world scenarios, you can create embeddings by encoding historical transaction data and customer profiles as vectors.

In the fraud and AML detection workflow shown in Figure 1, aggregated text from incoming transactions is used to generate embeddings using OpenAI. These embeddings are then analyzed using Atlas Vector Search, based on the percentage of previous transactions with similar characteristics that were flagged for suspicious activity.
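A hedged sketch of that scoring step might look like the following; the embedding model choice, connection string, index and field names, the "flagged" label field, and the 20-neighbor window are all assumptions for illustration, not the demo's exact implementation.

```python
import pymongo
from openai import OpenAI

# Hypothetical collection of historical transactions, each carrying an
# "embedding" vector and a boolean "flagged" label, indexed by a vector
# search index named "fraud_vector_index".
coll = pymongo.MongoClient("<atlas-uri>")["risk"]["transactions"]
oai = OpenAI()  # reads OPENAI_API_KEY from the environment

def fraud_score(txn_text: str) -> float:
    """Share of the most similar historical transactions that were flagged."""
    emb = oai.embeddings.create(
        model="text-embedding-3-small", input=txn_text
    ).data[0].embedding
    neighbors = list(coll.aggregate([
        {"$vectorSearch": {
            "index": "fraud_vector_index",
            "path": "embedding",
            "queryVector": emb,
            "numCandidates": 200,
            "limit": 20,
        }},
        {"$project": {"flagged": 1}},
    ]))
    if not neighbors:
        return 0.0
    return sum(bool(n.get("flagged")) for n in neighbors) / len(neighbors)

# Decline and route to case management above an assumed threshold.
if fraud_score("ATM withdrawal, $9,900, 03:12 AM, new device") > 0.5:
    print("transaction declined; reference sent to case management")
```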
In Figure 1, the term "Classified Transaction" indicates a transaction that has been processed and categorized by the detection system. This classification helps determine whether the transaction is considered normal, potentially fraudulent, or indicative of money laundering, thus guiding further actions:

- If flagged for fraud: The transaction request is declined.
- If not flagged: The transaction is completed successfully, and a confirmation message is shown.

For rejected transactions, users can contact case management services with the transaction reference number for details. No action is needed for successful transactions.

Combining Atlas Vector Search for fraud detection

With the use of Atlas Vector Search and OpenAI embeddings, organizations can:

- Eliminate the need for the batch and manual feature engineering required by predictive (Risk 2.0) methods.
- Dynamically incorporate new data sources to perform more accurate semantic searches, addressing emerging fraud trends.
- Adopt this method for mobile solutions, as traditional methods are often costly and performance-intensive.

Why MongoDB for AML and fraud prevention

Fraud and AML detection require a holistic platform approach, as they involve diverse data sets that are constantly evolving. Customers choose MongoDB because it is a unified data platform (as shown in Figure 2 below) that eliminates the need for niche technologies, such as a dedicated vector database. What's more, MongoDB's document data model incorporates any kind of data—any structure (structured, semi-structured, and unstructured), any format, any source—no matter how often it changes, allowing you to create a holistic picture of customers to better predict transaction anomalies in real time. By incorporating Atlas Vector Search, institutions can:

- Build intelligent applications powered by semantic search and generative AI over any type of data.
- Store vector embeddings right next to your source data and metadata. Vectors inserted or updated in the database are automatically synchronized to the vector index.
- Optimize resource consumption, improve performance, and enhance availability with Search Nodes.
- Remove operational heavy lifting with the battle-tested, fully managed MongoDB Atlas developer data platform.

Figure 2: Unified risk management and fraud detection data platform

Given the broad and evolving nature of fraud detection and AML, these areas typically require multiple methods and a multimodal approach. Therefore, a unified risk data platform offers several advantages for organizations aiming to build effective solutions. Using MongoDB, you can develop solutions for Risk 1.0, Risk 2.0, and Risk 3.0, either separately or in combination, tailored to meet your specific business needs. The concepts are demonstrated with two examples: a card fraud solution accelerator for Risk 1.0 and Risk 2.0, and a new vector search solution for Risk 3.0, as discussed in this blog. It's important to note that the vector search-based Risk 3.0 solution can be implemented on top of Risk 1.0 and Risk 2.0 to enhance detection accuracy and reduce false positives.
If you would like to discover more about how MongoDB can help you supercharge your fraud detection systems, take a look at the following resources:

- Revolutionizing Fraud Detection with Atlas Vector Search
- Card Fraud solution accelerator (Risk 1.0 and Risk 2.0)
- Risk 3.0 fraud detection solution GitHub repository

Add vector search to your arsenal for more accurate and cost-efficient RAG applications by enrolling in the DeepLearning.AI course "Prompt Compression and Query Optimization" for free today.

July 17, 2024

MongoDB Atlas Once Again Voted Most Loved Vector Database

The 2024 Retool State of AI report has just been released, and for the second year in a row, MongoDB Atlas was named the most loved vector database. Atlas Vector Search received the highest net promoter score (NPS), a measure of how likely a user is to recommend a solution to their peers. This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文.

Interested in discovering how to leverage AI to boost productivity, streamline development, and solve real engineering challenges? Check out our on-demand webinar with Retool to learn more.

The Retool State of AI report is a global annual survey of developers, tech leaders, and IT decision-makers that provides insights into the current and future state of AI, including vector databases, retrieval-augmented generation (RAG), AI adoption, and challenges innovating with AI. MongoDB Atlas commanded the highest NPS in Retool's inaugural 2023 report, and it was the second most widely used vector database within just five months of its release. This year, MongoDB came in a virtual tie for the most popular vector database, with 21.1% of the vote, just a hair behind pgvector (PostgreSQL), which received 21.3%.

The survey also points to the increasing adoption of RAG as the preferred approach for generating more accurate answers with up-to-date and relevant context that large language models (LLMs) aren't trained on. Although LLMs are trained on huge corpuses of data, not all of that data is up to date, nor does it reflect proprietary data. And in those areas where blind spots exist, LLMs are notorious for confidently providing inaccurate "hallucinations." Fine-tuning is one way to customize the data that LLMs are trained on, and 29.3% of Retool survey respondents leverage this approach. But among enterprises with more than 5,000 employees, one-third now leverage RAG for accessing time-sensitive data (such as stock market prices) and internal business intelligence, like customer and transaction histories. This is where MongoDB Atlas Vector Search truly shines. Customers can easily utilize their stored data in MongoDB to augment and dramatically improve the performance of their generative AI applications, during both the training and evaluation phases.

In the course of one year, vector database utilization among Retool survey respondents rose dramatically, from 20% in 2023 to an eye-popping 63.6% in 2024. Respondents reported that their primary evaluation criteria for choosing a vector database were performance benchmarks (40%), community feedback (39.3%), and proof-of-concept experiments (38%).

One of the pain points the report clearly highlights is difficulty with the AI tech stack. More than 50% of respondents indicated they were only somewhat satisfied, not very satisfied, or not at all satisfied with their AI stack. Respondents also reported difficulty getting internal buy-in, which is often complicated by procurement efforts when a new solution needs to be onboarded. One way to reduce much of this friction is through an integrated suite of solutions that streamlines the tech stack and eliminates the need to onboard multiple unknown vendors. Vector search is a native feature of MongoDB's developer data platform, Atlas, so there's no need to bolt on a standalone solution. If you're already using MongoDB Atlas, creating AI-powered experiences involves little more than adding vector data into your existing data collections in Atlas.
If you're a developer and want to start using Atlas Vector Search to build generative AI-powered apps, we have several helpful resources:
- Learn how to build an AI research assistant agent that uses MongoDB as the memory provider, Fireworks AI for function calling, and LangChain for integrating and managing conversational components.
- Get an introduction to LangChain and MongoDB Vector Search and learn to create your own chatbot that can read lengthy documents and provide insightful answers to complex queries.
- Watch Sachin Smotra of Dataworkz as he delves into the intricacies of scaling RAG (retrieval-augmented generation) applications.
- Read our tutorial that shows you how to combine Google Gemini's advanced natural language processing with MongoDB, facilitated by Vertex AI Extensions, to enhance the accessibility and usability of your database.
- Browse our Resources Hub for articles, analyst reports, case studies, white papers, and more.
Want to find out more about recent AI trends and adoption? Read the full 2024 Retool State of AI report. Head over to our quick-start guide to get started with Atlas Vector Search today.

June 21, 2024


Exact Nearest Neighbor Vector Search for Precise Retrieval

With its ability to efficiently handle high-dimensional, unstructured data, vector search delivers relevant results even when users don't know what they're looking for, and it uses machine learning models to find similar results across any data type. Rapidly emerging as a key technology for modern applications, vector search empowers developers to build next-generation search and generative AI applications faster and more easily. MongoDB Atlas Vector Search goes beyond approximate nearest neighbor (ANN) methods with the introduction of exact nearest neighbor (ENN) vector search. This capability guarantees retrieval of the absolute closest vectors to your query, eliminating the accuracy limitations inherent in ANN. In sum, ENN vector search can help you unleash a new level of precision in your search and generative AI applications, improving benchmarking and helping you move to production faster.

When exact nearest neighbor (ENN) vector search benefits developers

While ANN shines in searching across large datasets, ENN vector search offers advantages in specific scenarios:

- Small-scale vector data: For datasets under 10,000 vectors, the linear time complexity of ENN vector search makes it a viable option, especially considering the added development complexity of tuning ANN parameters.

- Recall benchmarking of ANN queries: ANN queries are fast, particularly as the scale of your indexed vectors increases, but it may not be easy to know whether the documents retrieved by vector relevance correspond to the guaranteed closest vectors in your index. ENN can provide that exact result set for comparison with your approximate result set, using Jaccard similarity or other rank-aware recall metrics. This gives you much greater confidence that your ANN queries are accurate, since you can build quantitative benchmarks as your data evolves.

- Multi-tenant architectures: Imagine a scenario with millions of vectors categorized by tenant. You might search for the closest vectors within a specific tenant (identified by a tenant ID). In cases where the overall vector collection is large (in the millions) but the number of vectors per tenant is small (a few thousand), ANN's accuracy suffers when highly selective filters are applied. ENN vector search thrives in this multi-tenant scenario, delivering precise results even with small result sets.

Example use cases

A small dataset allows for exhaustive search within a reasonable timeframe, making the exact nearest neighbor approach a viable option for finding the most similar data points and improving accuracy confidence in a number of use cases, such as:

- Multi-tenant data service: You might be building a business providing an agentic service that understands your customers' data and takes actions on their behalf. When retrieving relevant proprietary data for that agent, it is critical that the right metadata filter be applied and that ENN be executed so that only the documents corresponding to the appropriate tenant IDs are retrieved (see the sketch after this list).

- Proof of concept development: For instance, a new recommendation engine might have a limited library compared to established ones. Here, ENN vector search can be used to recommend products to a small set of early adopters. Since the data is limited, an exhaustive search becomes practical, ensuring the user gets the most relevant recommendations from the available options.
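The multi-tenant pattern above translates directly into a $vectorSearch aggregation stage. What follows is a minimal sketch using PyMongo; the connection string, database, collection, index name, embedding field, vector dimensionality, and tenant_id filter field are all hypothetical placeholders, and the sketch assumes a vector index that covers the embedding path and declares tenant_id as a filter field. Setting exact to true requests exhaustive search, so no numCandidates value is needed:

```python
from pymongo import MongoClient

client = MongoClient("<your-atlas-connection-string>")  # placeholder
collection = client["media"]["documents"]               # hypothetical namespace

# Hypothetical query embedding; in practice this comes from the same
# embedding model used to vectorize the stored documents.
query_vector = [0.01] * 1536

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_index",   # assumed Atlas Vector Search index name
            "path": "embedding",       # assumed field holding the stored vectors
            "queryVector": query_vector,
            "exact": True,             # ENN: exhaustive search, numCandidates omitted
            # Assumes tenant_id is declared as a filter field in the index definition.
            "filter": {"tenant_id": {"$eq": "tenant_42"}},
            "limit": 10,
        }
    },
    {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
]

for doc in collection.aggregate(pipeline):
    print(doc)
```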
How ENN vector search works on MongoDB Atlas

The ENN vector search feature in Atlas integrates seamlessly with the existing $vectorSearch stage in your Atlas aggregation pipelines. Its key characteristics include:

- Guaranteed accuracy: Unlike ANN, ENN always returns the closest vectors to your query, adhering to the specified limit.

- Eventual consistency: Like approximate vector search, ENN vector search follows an eventual consistency model.

- Simplified configuration: Unlike approximate vector search, where tuning numCandidates is crucial, ENN vector search only requires specifying the desired limit of returned vectors.

- Scalable recall evaluation: Atlas allows querying a large number of indexed vectors, facilitating the calculation of comprehensive recall sets for effective evaluation (a benchmarking sketch appears at the end of this post).

- Fast query execution: ENN vector search can maintain sub-second latency for unfiltered queries over up to 10,000 documents. It can also provide low-latency responses for highly selective filters that restrict a broad set of documents to 10,000 documents or fewer, ordered by vector relevance.

Build more with ENN vector search

ENN vector search can be a powerful tool when building a proof of concept for retrieval-augmented generation (RAG), semantic search, or recommendation systems powered by vector search. It simplifies the developer experience by minimizing overhead complexity and latency while giving you the flexibility to implement and benchmark precise retrieval. To explore more use cases and build applications faster, start experimenting with ENN vector search. Head over to our quick-start guide to get started with Atlas Vector Search today.
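To follow up on the recall evaluation point above, here is a minimal benchmarking sketch that reuses the collection and query_vector from the previous example. It runs the same query in ENN and ANN modes and compares the returned _id sets with Jaccard similarity; the index and field names remain hypothetical, and this is one simple recall proxy among the metrics mentioned earlier, not a prescribed methodology:

```python
def result_ids(collection, query_vector, *, exact, limit=100, num_candidates=None):
    """Run a $vectorSearch query and return the set of returned document _ids."""
    stage = {
        "index": "vector_index",   # assumed index name, as in the sketch above
        "path": "embedding",       # assumed vector field
        "queryVector": query_vector,
        "limit": limit,
    }
    if exact:
        stage["exact"] = True                    # ENN: guaranteed closest vectors
    else:
        stage["numCandidates"] = num_candidates  # ANN: candidate pool to tune
    cursor = collection.aggregate([{"$vectorSearch": stage},
                                   {"$project": {"_id": 1}}])
    return {doc["_id"] for doc in cursor}


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two result-ID sets."""
    return len(a & b) / len(a | b) if (a | b) else 1.0


truth = result_ids(collection, query_vector, exact=True)   # ENN ground truth
approx = result_ids(collection, query_vector, exact=False,
                    num_candidates=200)                    # ANN run to evaluate
print(f"Jaccard similarity (recall proxy): {jaccard(truth, approx):.3f}")
```

Re-running a comparison like this as your data evolves is one way to build the quantitative benchmarks described above.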

June 20, 2024

AI-Powered Media Personalization: MongoDB and Vector Search

In recent years, the media industry has grappled with a range of serious challenges, from adapting to digital platforms and on-demand consumption to monetizing digital content and competing with tech giants and new media upstarts. Economic pressures from declining sources of revenue like advertising, trust issues due to misinformation, and the difficulty of navigating regulatory environments have added to the complexities facing the industry. Additionally, keeping pace with technological advancements, ensuring cybersecurity, engaging audiences with personalized and interactive content, and addressing globalization issues all require significant innovation and investment to maintain content quality and relevance.

In particular, a surge in digital content has saturated the media market, making it increasingly difficult to capture and retain audience attention. Furthermore, a decline in referral traffic, primarily from social media platforms and search engines, has put significant pressure on traditional media outlets. An industry survey of more than 300 digital leaders from more than 50 countries and territories shows that traffic to news sites from Facebook fell 48% in 2023, with traffic from X/Twitter declining by 27%. As a result, publishers are seeking ways to stabilize their user bases and to enhance engagement sustainably, with 77% looking to invest more in direct channels to deal with the loss of referrals.

Enter artificial intelligence: generative AI-powered personalization has become a critical tool for driving the future of media channels. The approach we discuss here offers a roadmap for publishers navigating the shifting dynamics of news consumption and user engagement. Indeed, using AI for backend news automation (56%) is considered the most important use of the technology by publishers. In this post, we'll walk you through using MongoDB Atlas and Atlas Vector Search to transform how content is delivered to users. Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

The shift in news consumption

Today's audiences rarely rely on a single news source. Instead, they use multiple platforms to stay informed, a trend driven by the rise of social media, video-based news formats, and skepticism towards traditional media due to the prevalence (or fear) of "fake news." This diversification in news sources presents a dilemma for publishers, who have come to depend on traffic from social media platforms like Facebook and Twitter. However, both platforms have started to deprioritize news content in favor of posts from individual creators and non-news content, leading to a sharp decline in media referrals. The key to retaining audiences lies in making content personalized and engaging. AI-powered personalization and recommendation systems are essential tools for achieving this.

Content suggestions and personalization

By drawing on user data, behavior analytics, and the multi-dimensional vectorization of media content, MongoDB Atlas and Atlas Vector Search can be applied to multiple AI use cases to revolutionize media channels and improve end-user experiences. By doing so, media organizations can suggest content that aligns more closely with individual preferences and past interactions. This not only enhances user engagement but also increases the likelihood of converting free users into paying subscribers. The essence of leveraging Atlas and Vector Search is to understand the user.
By analyzing interactions and consumption patterns, the solution not only grasps what content resonates but also predicts what users are likely to engage with in the future. This insight allows for crafting a highly personalized content journey. The image below shows a reference architecture highlighting where MongoDB can be leveraged to achieve AI-powered personalization. To achieve this, you can integrate several advanced capabilities:

- Content suggestions and personalization: The solution can suggest content that aligns with individual preferences and past interactions, enhancing user engagement and increasing the likelihood of converting free users into paying subscribers. By integrating MongoDB's vector search to perform k-nearest neighbor (k-NN) searches, you can streamline and optimize how content is matched (a query sketch appears at the end of this post). Vectors are embedded directly in MongoDB documents, which has several advantages. For instance:
  - No complexities of a polyglot persistence architecture.
  - No need to extract, transform, and load (ETL) data between different database systems, which simplifies the data architecture and reduces overhead.
  - MongoDB's built-in scalability and resilience can support vector search operations more reliably. Organizations can scale their operations vertically or horizontally, even choosing to scale search nodes independently from operational database nodes, flexibly adapting to the specific load scenario.

- Content summarization and reformatting: In an age of information overload, this solution provides concise summaries and adapts content formats based on user preferences and device specifications. This tailored approach addresses the diverse consumption habits of users across different platforms.

- Keyword extraction: Essential information is drawn from content through advanced keyword extraction, enabling users to grasp key news dimensions quickly and enhancing the searchability of content within the platform. Keywords are fundamental to how content is indexed and found in search engines, and they significantly influence the SEO (search engine optimization) performance of digital content. In traditional publishing workflows, selecting these keywords can be a highly manual and labor-intensive task, requiring content creators to identify and incorporate relevant keywords meticulously. This process is not only time-consuming but also prone to human error, with significant keywords often overlooked or underutilized, which can diminish the content's visibility and engagement. With the help of the underlying LLM, the solution extracts keywords automatically and with high sophistication.

- Automatic creation of insights and dossiers: The solution can automatically generate comprehensive insights and dossiers from multiple articles. This feature is particularly valuable for users interested in deep dives into specific topics or events, providing them with a rich, contextual experience. This capability leverages the power of one or more large language models (LLMs) to generate natural language output, enhancing the richness and accessibility of information derived from multiple source articles. This process is agnostic to the specific LLMs used, providing flexibility and adaptability to integrate with any leading language model that fits the publisher's requirements.
Whether the publisher chooses to employ more widely recognized models (like OpenAI's GPT series) or other emerging technologies, the solution seamlessly incorporates these tools to synthesize and summarize vast amounts of data. Here's a deeper look at how this works:

- Integration with multiple sources: The system pulls content from a variety of articles and data sources, retrieved with MongoDB Atlas Vector Search. Found items are then compiled into dossiers, which provide users with a detailed and contextual exploration of topics, curated to offer a narrative or analytical perspective that adds value beyond the original content.

- Customizable output: The output is highly customizable. Publishers can set parameters based on their audience's preferences or specific project requirements, including the level of detail, the use of technical versus layman's terms, and the inclusion of multimedia elements to complement the text.

This feature significantly enhances user engagement by delivering highly personalized and context-rich content. It caters to users looking for quick summaries as well as those seeking in-depth analyses, thereby broadening the appeal of the platform and encouraging deeper interaction with the content. By using LLMs to automate these processes, publishers can maintain a high level of productivity and innovation in content creation, ensuring they remain at the cutting edge of media delivery.

Future directions

As media consumption habits continue to evolve, AI-powered personalization stands out as a vital tool for publishers. By using AI to deliver tailored content and to automate backend processes, publishers can address the decline in traditional referrals and build stronger, more direct relationships with their audiences. If you would like to learn more about AI-powered media personalization, visit the following resources:

- AI-Powered Personalization to Drive Next-Generation Media Channels
- AI-Powered Innovation in Telecommunications and Media
- GitHub Repository: Create a local version of this solution by following the instructions in the repository

Head over to our quick-start guide to get started with Atlas Vector Search today.
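As promised above, here is a minimal, hypothetical sketch of the k-NN retrieval step that could power both content suggestions and the source-gathering stage of dossier generation. The connection string, database, collection, index name, field names, and vector dimensionality are assumptions for illustration, not part of the solution described in this post:

```python
from pymongo import MongoClient

client = MongoClient("<your-atlas-connection-string>")  # placeholder
articles = client["newsroom"]["articles"]               # hypothetical namespace

# Each article document carries its embedding directly alongside its content, e.g.:
# {"title": "...", "summary": "...", "embedding": [0.12, -0.08, ...]}

# Hypothetical embedding of a user's recent reading history, produced by the
# same embedding model used to vectorize articles at ingestion time.
user_profile_vector = [0.02] * 1536

pipeline = [
    {
        "$vectorSearch": {
            "index": "article_vector_index",  # assumed Atlas Vector Search index
            "path": "embedding",              # vectors live inside the documents
            "queryVector": user_profile_vector,
            "numCandidates": 200,             # ANN candidate pool before ranking
            "limit": 5,                       # the "k" in k-NN: top-5 matches
        }
    },
    {"$project": {"title": 1, "summary": 1,
                  "score": {"$meta": "vectorSearchScore"}}},
]

recommendations = list(articles.aggregate(pipeline))
# Surface these directly as suggestions, or hand the retrieved articles to an
# LLM of the publisher's choice to synthesize a dossier.
```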

June 13, 2024