
Securing Digital Transformation with MongoDB and RegData

Data security and privacy have long been paramount to the financial industry, but they are especially critical for institutions undergoing digital transformations or those implementing new technology. For example, the integration of artificial intelligence (AI) and machine learning (ML) into organizations’ infrastructure and offerings introduces security and privacy complexities, making it all the more essential for financial organizations to safeguard sensitive information while complying with regulations.

The consequences of a data breach are extensive and significantly impactful. These incidents have transformed from simple cybersecurity concerns into catalysts for financial losses, reputational harm, legal challenges, regulatory penalties, and a significant decline in consumer trust. Even with an increased focus on data security, organizations must adopt modern data architecture to effectively mitigate these risks. For example, using a database solution like MongoDB with built-in encryption, role-based access control, and audit logging can help organizations safeguard sensitive data and respond proactively to potential vulnerabilities.

Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.

The challenge of data security in finance

Financial institutions face numerous challenges in protecting data integrity during modernization efforts. The increasing sophistication of cyberattacks, coupled with the need to comply with evolving regulations like the General Data Protection Regulation (GDPR) and the Digital Operational Resilience Act (DORA), creates a complex environment for data management. Institutions must also navigate technical sprawl, where diverse applications and data management systems complicate compliance and operational efficiency. Addressing these challenges requires a holistic approach that integrates data protection into the core design of digital transformation initiatives.
Financial institutions need to adopt robust data management practices, ensure the encryption of sensitive data, and maintain vigilant cybersecurity measures. Collaboration with trusted third-party vendors, adopting a privacy-first strategy, and complying with global data protection regulations are essential steps toward safeguarding data privacy in this rapidly evolving digital landscape. Discover how the RegData Protection Suite (RPS), built on MongoDB, enables you to balance technological advancement with regulatory requirements.

The solution: MongoDB and RegData

MongoDB offers unparalleled reliability, scalability, and flexibility, making it an ideal choice for financial services. MongoDB enables financial institutions to combine operational and AI data in a unified interface and can be deployed on-premises with Enterprise Advanced or across any major cloud provider with MongoDB Atlas, including multi-cloud and hybrid cloud deployments when needed. When combined with RegData's Protection Suite (RPS), organizations can effectively tackle the challenges of digital transformation. RPS is a cloud-native application security platform designed to protect sensitive data through advanced techniques such as encryption, anonymization, and tokenization.

Figure 1. Simplified architecture of the RPS solution.

Key features of RegData Protection Suite:

- Core Configuration: Provides services and a user interface to configure the protection of data.
- RPS Engine: A sophisticated core engine equipped with various data protection tools. This module is the heart of the application and is responsible for all data protection. It consists of encryption, anonymization, tokenization, and pseudonymization services.
- RPS Reporting: A vital component focused on data protection oversight. It gathers and analyzes information on the business application activities protected by RPS to generate a range of valuable reports.
- RPS Manager: Provides end-to-end monitoring capabilities for the components of the RPS platform.
- RPS Integration: RPS seamlessly integrates with various applications, ensuring that sensitive data is protected across diverse environments.

The synergy between MongoDB and RegData shines through in practical applications. For instance, a private bank can leverage hybrid cloud deployments to modernize its operations while maintaining data security. By utilizing RPS, the bank can protect sensitive information during cloud migrations and ensure compliance with regulatory requirements. Additionally, as financial institutions explore outsourcing, RPS helps mitigate risks by anonymizing sensitive data, allowing organizations to maintain control over their data even when leveraging external service providers.

Embracing a zero-trust approach for gen AI applications

With the rise of AI, and particularly generative AI (gen AI), banks are developing more and more AI- and gen AI-powered applications. While on-premises model development and testing provides a high level of data security and confidentiality, a production-grade GPU compute pool, or one large enough to offer sufficient scalability and economy of scale, may not be within the bank’s budget. Faced with this dilemma, banks have begun developing models in private clouds and then deploying them on the public cloud to leverage its scalability and economy of scale. MongoDB can serve as the unified operational data layer for a variety of data sources, whether structured, semi-structured, or unstructured, that may also come in different forms (e.g., tabular, geospatial, network graph, or time series) for model development, training, fine-tuning, and testing. Once the model is tested and found to be working, it can be deployed to the public cloud to serve the AI and gen AI applications. The figure below shows the high-level architecture of how a private bank implemented its gen AI application with MongoDB and RPS.

Figure 2. Gen AI data flow architecture focused on data protection.
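To make the protection techniques above more concrete, here is a minimal, hypothetical sketch of field-level pseudonymization applied to a document before it leaves a private environment. This is an illustration of the general concept only, not RegData's actual implementation; the field names and token format are assumptions.

```python
import hashlib
import hmac
import secrets

# Illustrative only -- NOT RegData's actual implementation.
# Fields treated as sensitive in this hypothetical example:
SENSITIVE_FIELDS = {"name", "iban"}

def pseudonymize(document, key):
    """Replace sensitive field values with keyed, deterministic tokens."""
    protected = {}
    for field, value in document.items():
        if field in SENSITIVE_FIELDS:
            digest = hmac.new(key, str(value).encode(), hashlib.sha256)
            protected[field] = "tok_" + digest.hexdigest()[:16]
        else:
            protected[field] = value
    return protected

key = secrets.token_bytes(32)
doc = {"name": "Jane Doe", "iban": "CH93 0076 2011 6238 5295 7", "balance": 1204.50}
safe = pseudonymize(doc, key)

# Deterministic under a given key: the same value always maps to the same
# token, so joins and lookups still work on the protected data.
assert pseudonymize(doc, key) == safe
```

Because the tokens are deterministic per key, analytics and matching can run on the pseudonymized documents in the public cloud while the key, and therefore re-identification, stays in the private environment.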
The road to modernization

As financial institutions navigate the complexities of digital transformation, the partnership between MongoDB and RegData offers a robust solution for securing data. By adopting a comprehensive data protection strategy, organizations can innovate confidently while ensuring compliance with regulatory standards. Embracing these technologies not only enhances data security but also paves the way for a more resilient and agile financial sector.

Establishing a robust data architecture with a modern data platform like MongoDB Atlas enables financial institutions to modernize effectively by consolidating and analyzing data in any format in real time, driving value-added services and features for consumers while ensuring privacy and security concerns are adequately addressed with built-in security controls across all data. Whether managed in a customer environment or through MongoDB Atlas, a fully managed cloud service, MongoDB ensures robust security with features such as authentication (single sign-on and multi-factor authentication), role-based access controls, and comprehensive data encryption. These security measures act as a safeguard for sensitive financial data, mitigating the risk of unauthorized access by external parties and giving organizations the confidence to embrace AI and ML technologies.

Are you prepared to harness these capabilities for your projects, or do you have questions? Reach out to us at industry.solutions@mongodb.com or info@regdata.ch. You can also take a look at the following resources:

- RegData & MongoDB: Securing Digital Transformation
- Streamline Data Control and Compliance with RegData & MongoDB
- Implementing an Operational Data Layer

Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan.

January 23, 2025

Building a Unified Data Platform for Gen AI

In today’s digital-first world, data is the lifeblood of innovation and decision-making. Yet businesses often find themselves constrained by outdated and fragmented systems that fail to meet the demands of a fast-paced, interconnected landscape. Legacy architectures—such as the 1970s-era mainframes still used in industries like banking—create inefficiencies, siloed data, and operational bottlenecks, leaving organizations struggling to deliver timely, actionable insights. The pressure to adapt is mounting as customer expectations for real-time interactions and personalized services continue to grow. To thrive in this competitive environment, organizations must embrace a transformative approach to managing their data estates—one that integrates advanced technologies seamlessly.

Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.

Unified data platforms powered by operational data layers (ODLs), generative AI (gen AI), and vector search are the solution. These innovations do more than just modernize data handling; they unlock new opportunities for agility, efficiency, and value creation, empowering businesses to make informed decisions, improve customer experiences, and drive growth. Let’s explore how these technologies are reshaping the way businesses consume, integrate, and leverage their data.

Figure 1. Conceptual model of a Converged AI Data Store, showing multimodal data ingest.

From stale to real-time data: The case for operational data layers

In the rapidly evolving digital landscape, businesses can no longer afford to rely on outdated, batch-processed data. The demands of modern operations require instant access to fresh, accurate information. Yet many organizations continue to struggle with fragmented systems that deliver stale data, creating roadblocks in decision-making and customer engagement. This is where the concept of an ODL becomes transformative.
Acting as a centralized hub, an ODL integrates data from multiple transactional systems in real time, ensuring businesses have a unified and up-to-date view of their operations. Let’s explore how ODLs can revolutionize business processes:

1. Enabling real-time customer interactions

Imagine a customer service representative handling a support call. Without real-time access to the latest data—such as a customer’s recent transactions, support history, or preferences—the interaction may feel disconnected and inefficient. An ODL solves this problem by consolidating and providing real-time data. For example, a telecom provider could use an ODL to ensure its agents have immediate access to recent billing information, technical issues reported, and ongoing resolutions. This not only empowers the agents but also leaves the customer with a seamless and satisfactory experience.

2. Streamlining account management

Real-time data isn’t just about resolving customer issues; it’s also critical for proactive engagement. In industries like banking and retail, customers often need immediate updates on their accounts, such as current balances, transaction details, or loyalty points. By integrating APIs with the ODL, businesses can offer instantaneous responses to these queries. For instance, a retail bank could enable customers to check recent purchases or transfers through a chatbot that queries the ODL in real time, delivering fast, accurate results.

3. Enhancing compliance and reporting

Highly regulated industries, such as finance and healthcare, face additional challenges in managing large volumes of historical data for audits and compliance. Traditional systems often struggle to handle such demands, resulting in time-consuming manual processes. ODLs, when combined with gen AI, enable businesses to extract, summarize, and structure this data efficiently.
For instance, a financial institution could use an ODL to generate compliance reports that pull data from diverse sources—both structured and unstructured—and ensure they meet regulatory standards with minimal manual intervention.

4. Supporting metadata and governance

Another often-overlooked advantage of an ODL is its ability to support metadata management and data governance. For large enterprises operating across multiple geographies, changes in localized data models are frequent and complex. An ODL can act as a centralized repository, capturing these updates and enabling advanced search functionalities for impact analysis. For example, a global enterprise could use an ODL to track changes in data definitions, understand usage patterns, and ensure compliance with governance policies across regions—all while reducing the risk of errors.

The transformative power of gen AI and vector search

As businesses transition to real-time data strategies powered by ODLs, the potential to unlock even greater insights lies in adopting cutting-edge tools like gen AI and vector search. These technologies are revolutionizing the way organizations consume and interpret data, enabling unprecedented efficiency and intelligence.

- Gen AI: By generating actionable insights, predictions, and content, gen AI enables businesses to turn static data into a strategic resource. For example, a retailer could use gen AI to analyze customer purchase histories and recommend personalized product bundles.
- Vector search: This technology translates high-dimensional data like text, images, and audio into vectors, enabling accurate, intuitive searches. For instance, healthcare providers can search for similar patient cases by symptoms, enhancing diagnostics and treatment planning.

By incorporating these tools into an ODL, businesses can go beyond basic data integration, creating smarter, more agile operations capable of delivering value in real time.

Figure 2.
Retrieval-augmented generation (RAG) implementation, using the converged AI data store to provide context to the LLM prompt.

New opportunities: Revolutionizing operations with gen AI and operational data layers

The integration of gen AI and vector search into ODLs opens up a world of opportunities for businesses to enhance customer experience, streamline operations, and innovate at scale. Here’s how these technologies drive transformation:

- Enhanced data discovery: With vector search, organizations can quickly and accurately retrieve relevant data from massive datasets, simplifying complex searches.
- Improved customer experience: Gen AI–powered ODLs analyze customer behavior to deliver personalized recommendations, building stronger customer relationships.
- Increased operational efficiency: Automating routine data tasks with gen AI reduces manual effort, enabling teams to focus on strategic initiatives.
- Enhanced agility and innovation: By enabling rapid development of AI-driven applications, businesses can quickly adapt to market changes and stay ahead of the competition.

As organizations embrace these capabilities, they position themselves to thrive in an increasingly competitive and data-driven world.

Architectural options for data processing

Modernizing data platforms requires a robust architecture that can handle both batch and real-time processing. Depending on their needs, organizations often choose between lambda and kappa architectures, and MongoDB can serve as a flexible operational layer for both.

The lambda architecture

The lambda architecture is ideal for organizations that need to process both batch and real-time data. It consists of three layers:

- Batch layer: This layer processes large volumes of historical data offline. Gen AI can enrich this data by generating insights and predictions.
- Speed layer: This layer handles real-time data streams, enabling immediate responses to changes.
- Serving layer: This layer combines batch and real-time data into a unified view, powered by MongoDB for seamless queries and data access.

The kappa architecture

For businesses focused on real-time analytics, the kappa architecture simplifies operations by using a single stream for data processing. MongoDB excels as the operational speed layer in this setup, supporting high-speed, real-time data updates enhanced by gen AI. By choosing the right architecture and leveraging MongoDB’s capabilities, businesses can ensure their data platforms are future-ready.

A journey toward data modernization

Data modernization is a progressive journey, transforming businesses step by step into smarter, more agile systems. It begins with a basic operational data store, where read-heavy workloads are offloaded from legacy systems into MongoDB, boosting performance and accessibility. Next comes the enriched ODL, adding real-time analytics to turn raw data into actionable insights. Then, as needs grow, parallel writes enable MongoDB to handle write-heavy operations, enhancing speed and reliability. In the transition to the system of transaction, monolithic systems are replaced with agile microservices directly connected to MongoDB, simplifying operations and accelerating innovation. Finally, businesses reach the system of record, a domain-driven architecture where MongoDB provides unmatched scalability, flexibility, and efficiency. Each phase of this journey unlocks new opportunities, transforming data into a dynamic asset that powers innovation, operational excellence, and growth.

Figure 3. A conceptual model showcasing the joint implementation of the Kappa (Data in Motion) and Lambda (Data at Rest) frameworks on MongoDB Atlas, utilizing Stream Processing for real-time data and Online Archive/Federation features for historical data management.
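The serving-layer idea described above can be sketched in a few lines: a read merges a precomputed batch view with fresher speed-layer events into one unified answer, which is the role the article assigns to MongoDB. All names and numbers here are illustrative, not taken from any real deployment.

```python
# Lambda-style serving-layer read, sketched in plain Python.
# In practice both views would live in MongoDB collections; here they are
# in-memory structures so the merge logic is easy to follow.

batch_view = {  # output of a nightly batch job, e.g. materialized into MongoDB
    "acct-42": {"balance": 1500.00, "as_of": "2025-01-14T00:00:00Z"},
}

speed_layer_events = [  # real-time stream arrivals since the last batch run
    {"account": "acct-42", "amount": -200.00, "ts": "2025-01-14T09:15:00Z"},
    {"account": "acct-42", "amount": 75.00, "ts": "2025-01-14T11:02:00Z"},
]

def unified_view(account_id):
    """Serving-layer read: batch baseline plus any events that arrived since."""
    base = dict(batch_view[account_id])
    for event in speed_layer_events:
        if event["account"] == account_id:
            base["balance"] += event["amount"]
            base["as_of"] = event["ts"]
    return base

print(unified_view("acct-42"))
# 1500.00 - 200.00 + 75.00 = 1375.00, as of the latest event timestamp
```

In a kappa setup the batch view disappears and the same merge is expressed as a continuously updated materialization of the single event stream; the read path stays just as simple.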
The unified and intelligent future of data

As businesses embrace real-time data architectures and advanced AI capabilities, the potential for innovation is boundless. With solutions like MongoDB, organizations can seamlessly integrate and harness their data, driving operational excellence and delivering exceptional customer experiences. Now is the time to modernize, innovate, and unlock the full potential of your data. Discover how TCS and MongoDB are harmonizing technologies for the future. Start your data modernization journey today!

Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan.

January 15, 2025

AI-Powered Retail With Together AI and MongoDB

Generative AI (gen AI) is changing retail in fascinating ways. It’s providing new avenues for IT leaders at retailers to enhance customer experiences, streamline operations, and grow revenue in a fast-paced environment. Recently, we’ve been working closely with a fascinating organization in this space—Together AI. In this blog, we’ll explore how Together AI and MongoDB Atlas can dramatically accelerate the adoption of gen AI by combining the capabilities of both platforms to bring high-impact retail use cases to life.

Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.

Introduction to Together AI and MongoDB Atlas

From the first look, it’s impressive how well Together AI is designed for gen AI projects. It’s a powerful platform that lets developers train, fine-tune, and deploy open-source AI models with just a few lines of code. This is a critical component for retrieval-augmented generation (RAG). With RAG, AI can pull real-time, business-specific data from MongoDB Atlas, which means retailers get more reliable and relevant outputs. That’s crucial when dealing with data as dynamic as customer behavior or inventory movement from online and physical stores.

With its flexible data model, MongoDB Atlas is an ideal database engine for handling diverse data needs. It’s fully managed, multi-cloud, and exceptional at managing different data types, including the vector embeddings that power AI applications. One important feature is MongoDB Atlas Vector Search, which stores and indexes vector embeddings, making it simple to integrate with Together AI. This lets retailers generate timely, personalized responses to customer queries, creating a better experience all around.

Identifying retail use cases

With Together AI and MongoDB Atlas working together, the possibilities for retail are huge.
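Before walking through the use cases, here is a minimal sketch of the retrieval side of such a RAG setup: the shape of an Atlas Vector Search aggregation pipeline, built as a plain Python structure. The `$vectorSearch` stage is Atlas's real aggregation operator, but the index name, field names, and the placeholder embedding below are illustrative assumptions; in practice the query vector would come from an embedding model (e.g., one served by Together AI).

```python
# Shape of an Atlas Vector Search retrieval step for RAG, expressed as a
# plain aggregation pipeline. Index, collection, and field names are
# hypothetical examples, not taken from a real deployment.

def build_retrieval_pipeline(query_embedding, limit=5):
    return [
        {
            "$vectorSearch": {
                "index": "product_vector_index",   # hypothetical index name
                "path": "description_embedding",   # field holding the vectors
                "queryVector": query_embedding,
                "numCandidates": 20 * limit,       # oversample, then rank
                "limit": limit,
            }
        },
        # Return only what the LLM prompt needs, plus the similarity score.
        {
            "$project": {
                "name": 1,
                "description": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]

pipeline = build_retrieval_pipeline([0.12, -0.03, 0.57])  # toy 3-dim embedding
# Against a live cluster this would run as: db.products.aggregate(pipeline)
```

The retrieved documents are then passed as context to the model along with the user's question, which is the core of the RAG flow the use cases below rely on.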
Here are some of the use cases we’ve been exploring and testing with clients, each bringing measurable value to the table:

Product description generation

Product onboarding to a retail e-commerce portal is a time-consuming effort for many retailers. They need to ensure they’ve created a product description that matches the image, then deploy it to their e-commerce portal. For multilingual portals and multiple operating geographies, this accuracy challenge increases. With Together AI’s support for multimodal models (e.g., Llama 3.2) and MongoDB Atlas’s vector embeddings, we can create accurate product descriptions in multiple languages. Check out a demo app to see it in action.

Figure 1. Demo application for generating product descriptions.

Personalized product recommendations

Imagine being able to offer each customer exactly what they’re looking for, without them even asking. With Together AI’s retrieval and inference endpoints and MongoDB Atlas Vector Search, we can create highly personalized product recommendations. Analyzing individual preferences, browsing history, and past purchases becomes seamless, giving customers exactly what they need, possibly exactly when they need it.

Conversational AI-powered tools (a.k.a. chatbots)

We’re also deploying intelligent conversational tools that can understand complex questions, offer personalized assistance, and drive conversions. Together AI, paired with MongoDB Atlas, makes these bots responsive and relevant, so customers feel like they’re talking to a knowledgeable adviser rather than a chatbot. When real-time data informs the responses, customer experience is enhanced.

Dynamic pricing and promotions

Pricing in retail is often a moving target, and AI-driven insights help us optimize our approach. We’ve used Together AI and MongoDB Atlas to analyze market trends, competitor pricing, and customer demand to keep our pricing competitive and adjust promotions in real time.
It’s incredible how much more strategic we can be with AI’s help.

Inventory management and forecasting

This might be one of the most impactful use cases I’ve worked on—using AI to predict demand and optimize stock levels. With Together AI and MongoDB Atlas, it’s easier to balance inventory, reduce waste, and ensure the products customers want are always in stock. This leads to better efficiency and fewer out-of-stock scenarios.

Implementing retail use cases with Together AI and MongoDB Atlas

Let me share a concrete example that really brings these concepts to life.

Case study: Building a multilingual product-description-generation system

We recently worked on a solution to create a product-description-generation system for an e-commerce platform. The goal was to provide highly descriptive product information based on the images of the products from the product catalog. This use case really demonstrated the value of storing the data in MongoDB and using the multilanguage capabilities of Together AI’s inference engine.

- Embeddings and inference with Together AI: Together AI generated product descriptions based on images retrieved from the product catalog using Llama 3.2 vision models. This way, each product’s unique characteristics were considered, then rendered in multiple languages. These descriptions could then be embedded and stored in MongoDB Atlas via a simple API.
- Indexed embeddings with MongoDB Atlas Vector Search: Using MongoDB Atlas Vector Search, we created embeddings and then indexed them so they could be used to retrieve relevant data based on other matched product queries. This step made sure the product descriptions were not just accurate but also relevant to the images.
- Real-time data processing: By connecting this setup to a real-time product dataset, we ensured that product descriptions in multiple languages were always updated automatically.
So when a marketplace vendor or retailer uploads new images with distinct characteristics, they get up-to-date product descriptions in the catalog. This project showcased how Together AI and MongoDB Atlas could work together to deliver a solution that was reliable, highly efficient, and scalable. The feedback from users was overwhelmingly positive. They especially appreciated how intuitive and helpful the product descriptions were and how simple the whole product onboarding process could become for multilingual businesses spread across multiple geographical regions.

Figure 2. An example of a query and response flow for a RAG architecture using MongoDB and Together AI.

Looking at the business impacts

For a retail organization, implementing Together AI and MongoDB Atlas can streamline the approach to gen AI, creating an effective and immediate positive impact on the business in several ways:

- Reduced product onboarding time and costs: Retailers can onboard products faster and quickly make them available on their sales channels because of the ready-to-use tools and prebuilt integrations. This cuts down on the need for custom code and significantly lowers development costs.
- Increased flexibility and customization: MongoDB’s flexible document model and Together AI’s inference engine enable retailers to mold their applications to fit specific needs, such as back-office data processing, demand forecasting, and pricing, as well as customer-facing conversational AI.
- Seamless integration with existing systems: MongoDB Atlas integrates seamlessly with frameworks we’re already using, like LangChain and LlamaIndex. This has made it easier to adopt AI capabilities across various business units.
- Added support and expertise: The MongoDB AI Applications Program (MAAP) is especially helpful in beginning the journey into AI adoption across enterprises.
It offers not just architectural guidance but also hands-on support, so enterprises can implement AI projects with confidence and a well-defined road map.

Combining Together AI and MongoDB Atlas for a powerful approach to retail

Together AI and MongoDB Atlas are a powerful combination for anyone in the retail industry looking to make the most of gen AI. It is evident how they help unlock valuable use cases, from personalized customer experiences to real-time operational improvements. By adopting MongoDB Atlas with Together AI, retailers can innovate, create richer customer interactions, and ultimately gain a competitive edge. If you’re exploring gen AI for retail, you’ll find that this combination has a quick, measurable, and transformative impact.

Learn more about Together AI by visiting www.together.ai. For additional information, check out Together AI: Advancing the Frontier of AI With Open Source Embeddings, Inference, and MongoDB Atlas.

Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan.

January 13, 2025

Using Agentic RAG to Transform Retail With MongoDB

In the competitive world of retail and e-commerce, it’s more important than ever for brands to connect with customers in meaningful, personalized ways. Shoppers today expect relevant recommendations, instant support, and unique experiences that feel tailored just for them.

Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.

Enter retrieval-augmented generation (RAG): a powerful approach that leverages generative AI and advanced search capabilities to deliver precise insights on demand. For IT decision-makers, the key challenge lies in integrating operational data with unstructured information—which can span object stores (like Amazon S3 and SharePoint), internal wikis, PDFs, Microsoft Word documents, and more. Enterprises must unlock value from curated, reliable internal data sources that often hold critical yet hard-to-access information. By combining RAG’s capabilities with these data assets, retailers can find contextually accurate information. For example, they can seamlessly surface needed information like return policies, refund processes, shipment details, and product recalls, driving operational efficiency and enhancing customer experiences.

To provide the most relevant context to a large language model (LLM), traditional RAG—which has typically relied on vector search—needs to be combined with real-time data from an operational database, with the latest conversation captured from a customer relationship management system via a REST API call, or both. RAG has evolved to become agentic—that is, capable of understanding a user inquiry and determining which path to use and which repositories to access to answer the question. MongoDB Atlas and Dataworkz provide agentic RAG as a service, enabling retailers to combine operational data with relevant unstructured data to create transformational experiences for their customers.
MongoDB Atlas stores and unifies diverse data formats—such as customer purchases, inventory levels, and product descriptions—making them easily accessible. Dataworkz then transforms this data into vector embeddings, enabling a multistep agentic RAG pipeline to retrieve and create personalized, context-aware responses in real time. This is especially powerful in the context of customer support, product recommendations, and inventory management. When customers interact with retailers, Dataworkz dynamically retrieves real-time data from MongoDB Atlas and, where needed, combines it with unstructured information to generate personalized AI responses, enhancing the customer experience. This architecture improves engagement, optimizes inventory, and provides scalable, adaptable AI capabilities, ultimately driving a more efficient and competitive retail operation.

Reasons for using MongoDB Atlas and Dataworkz

MongoDB Atlas and Dataworkz work together to deliver agentic RAG as a service for a smarter, more responsive customer experience. Here’s a quick breakdown of how:

- Vector embeddings and smart search: The Dataworkz RAG builder enables anyone to build sophisticated retrieval mechanisms that turn words, phrases, or even customer behaviors into vector embeddings—essentially, numbers that capture their meaning in a way that’s easy for AI to understand—and store them in MongoDB Atlas. This makes it possible to search for content based on meaning rather than exact wording, so search results are more accurate and relevant.
- Scalable, reliable performance: MongoDB Atlas’s cloud-based, distributed setup is built to handle high-traffic retail environments, minimizing disruptions during peak shopping times.
- Deep context with Dataworkz’s agentic RAG as a service: Retailers can build agentic workflows powered by RAG pipelines that combine lexical and semantic search with knowledge graphs to fetch the most relevant data from unstructured, operational, and analytical data sources before generating AI responses. This combination gives e-commerce brands the power to personalize experiences at a vastly larger scale.

Figure 1. Reference architecture for customer support chatbots with Dataworkz and MongoDB Atlas.

Retail e-commerce use cases

So how does this all work in practice? Here are some real-world examples of how MongoDB Atlas and Dataworkz are helping e-commerce brands create standout experiences.

Building smarter customer-support chatbots

Today’s shoppers want quick, accurate answers, and RAG makes this possible. When a customer asks a chatbot, “Where’s my order?” RAG enables the bot to pull the latest order and shipping details stored in MongoDB Atlas. Even if the question is phrased differently—say, “I need my order status”—the RAG-powered vector search can interpret the intent and fetch the correct response. As a result, the customer gets the help they need without waiting on hold or navigating complex menus.

Personalizing product recommendations

Imagine a customer who’s shown interest in eco-friendly products. With MongoDB Atlas’s vector embeddings, a RAG-powered system can identify this preference and adjust recommendations accordingly. So when the customer returns, they see suggestions that match their style—like organic cotton clothing or sustainably sourced kitchenware. This kind of recommendation feels relevant and thoughtful, making the shopping experience more enjoyable and increasing the chances of a purchase.

Creating dynamic marketing content

Marketing thrives on fresh, relevant content. With MongoDB Atlas managing product data and Dataworkz generating personalized messages, brands can send out dynamic promotions that truly resonate.
For example, a customer who browsed outdoor gear might receive a curated email with top-rated hiking boots or seasonal discounts on camping equipment. This kind of targeted messaging feels personal, not pushy, building stronger customer loyalty.

Enhancing site search experiences

Traditional e-commerce searches often rely on exact keyword matches, which can lead to frustrating dead ends. But with MongoDB Atlas Vector Search and Dataworkz’s agentic RAG, search can be much smarter. For example, if a customer searches for “lightweight travel shoes,” the system understands that they’re looking for comfortable, portable footwear for travel, even if none of the product listings contain those exact words. This makes shopping smoother, more intuitive, and less of a guessing game.

Understanding trends in customer sentiment

For e-commerce brands, understanding how customers feel can drive meaningful improvements. With RAG, brands can analyze reviews, social media comments, and support interactions to capture sentiment trends in MongoDB Atlas. Imagine a brand noticing a spike in mentions of “too small” in product reviews for a new shoe release—this insight lets them quickly adjust sizing info on the product page or update their stock. It’s a proactive approach that shows customers they’re being heard.

Interactions that meet customers where they are

In essence, MongoDB Atlas and Dataworkz’s RAG models enable retailers to make e-commerce personalization and responsiveness smarter, more efficient, and easier to scale. Together, they help retailers deliver exactly what customers are looking for—whether it’s a personalized recommendation, a quick answer from a chatbot, or just a better search experience. In the end, it’s about meeting customers where they are, with the information and recommendations they need. With MongoDB and Dataworkz, e-commerce brands can create that kind of connection—making shopping easier, more enjoyable, and ultimately more memorable.
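The semantic site search described in this post can be sketched against MongoDB Atlas Vector Search using its `$vectorSearch` aggregation stage. The index name, field names, and collection below are illustrative assumptions, and in practice the query vector comes from an embedding model:

```python
# Sketch of a semantic product search using the $vectorSearch aggregation
# stage in MongoDB Atlas. Index and field names are assumptions for
# illustration; the query vector would come from an embedding model.

def product_search_pipeline(query_vector, limit=5):
    """Build an aggregation pipeline that finds the products whose
    description embeddings are closest to the query vector."""
    return [
        {
            "$vectorSearch": {
                "index": "product_vector_index",   # hypothetical index name
                "path": "description_embedding",   # field holding embeddings
                "queryVector": query_vector,
                "numCandidates": limit * 20,       # oversample for recall
                "limit": limit,
            }
        },
        # Keep only the fields the storefront needs, plus the match score.
        {"$project": {"name": 1, "price": 1,
                      "score": {"$meta": "vectorSearchScore"}}},
    ]

# With a live cluster you would run, e.g.:
#   results = db.products.aggregate(product_search_pipeline(embedding))
pipeline = product_search_pipeline([0.1] * 1536, limit=3)
```

A query like “lightweight travel shoes” is first embedded; the pipeline then returns products whose descriptions are semantically close, even when no listing contains those exact words.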
Learn more about Dataworkz on MongoDB by visiting dataworkz.com. The Dataworkz free tier is powered by MongoDB Atlas Vector Search. Want to learn more about why MongoDB is the best choice for supporting modern AI applications? Check out our on-demand webinar, “Comparing PostgreSQL vs. MongoDB: Which is Better for AI Workloads?” presented by MongoDB Field CTO Rick Houlihan.

December 23, 2024

Building Gen AI with MongoDB & AI Partners | November 2024

Unless you’ve been living under a rock, you know it’s that time of year again—re:Invent season! Last week, I was in Las Vegas for AWS re:Invent, one of our industry’s most important annual conferences. re:Invent 2024 was a whirlwind of keynote speeches, inspirational panels and talks, and myriad ways to spend time with colleagues and partners alike. And this year, MongoDB had its biggest re:Invent presence ever, alongside some of the most innovative players in AI. The headline? The MongoDB AI Application Program (MAAP). Capgemini, Confluent, IBM, QuantumBlack AI by McKinsey, and Unstructured joined MAAP, boosting the value customers receive from the program and cementing MongoDB’s position as a leader in driving AI innovation. We also announced that MongoDB is collaborating with Meta to support developers with Meta models and the end-to-end MAAP technology stack.

Figure 1: The MongoDB booth at re:Invent 2024

MongoDB’s re:Invent AI Showcase was another showstopper. As part of the AI Hub in the re:Invent expo hall, MongoDB and partners Arcee, Arize, Fireworks AI, and Together AI collaborated on engaging demos and presentations. Meanwhile, the “Building Your AI Stack” panel—which included leaders from MongoDB and MAAP partners Anyscale, Cohere, and Fireworks AI—featured an insightful discussion on building AI technologies, challenges with taking applications to production, and what’s next in AI. As at every re:Invent, networking opportunities abounded; I had so many interesting and fruitful conversations with partners, customers, and developers during the week’s many events, including those MongoDB sponsored—like the Cabaret of Innovation with Accenture, Anthropic, and AWS; the Galactic Gala with Cohere; and Tuesday’s fun AI Game Night with Arize, Fireworks AI, and Hasura.
Figure 2: Networking at the Galactic Gala

Whether building solutions or building relationships, MongoDB’s activities at re:Invent 2024 showcased the importance of collaboration to the future of AI. As we close out the year, I’d like to thank our amazing partners for their support—we look forward to more opportunities to collaborate in 2025! And if you want to learn more about MongoDB’s announcements at re:Invent 2024, please read this blog post by my colleague Oliver Tree.

Welcoming new AI and tech partners

In November, we also welcomed two new AI and tech partners that offer product integrations with MongoDB. Read on to learn more about each great new partner!

Braintrust

Braintrust is an end-to-end platform for building and evaluating world-class AI apps. “We're excited to partner with MongoDB to share how you can build reliable and scalable AI applications with vector databases,” said Ankur Goyal, CEO of Braintrust. “By combining Braintrust’s simple evaluation workflows with MongoDB Atlas, developers can build an end-to-end RAG application and iterate on prompts and models without redeploying their code.”

Langtrace

Langtrace is an open-source observability tool that collects and analyzes traces in order to help you improve your LLM apps. “We're thrilled to join forces with MongoDB to help companies trace, debug, and optimize their RAG features for faster production deployment and better accuracy,” said Karthik Kalyanaraman, Co-founder and CTO at Langtrace AI. “MongoDB has made it dead simple to launch a scalable vector database with operational data. Our collaboration streamlines the RAG development process by empowering teams with database observability, speeding up time to market and helping companies get real value to customers faster.”

But wait, there's more!
To learn more about building AI-powered apps with MongoDB, check out our AI Resources Hub and stop by our Partner Ecosystem Catalog to read about our integrations with MongoDB’s ever-evolving AI partner ecosystem.

December 12, 2024

AI-Powered Call Centers: A New Era of Customer Service

Customer satisfaction is critical for insurance companies. Studies have shown that companies with superior customer experiences consistently outperform their peers. In fact, McKinsey found that life and property/casualty insurers with superior customer experiences saw a significant 20% and 65% increase in Total Shareholder Return, respectively, over five years. A satisfied customer is a loyal customer. They are 80% more likely to renew their policies, directly contributing to sustainable growth. However, one major challenge faced by many insurance companies is the inefficiency of their call centers. Agents often struggle to quickly locate and deliver accurate information to customers, leading to frustration and dissatisfaction. This article explores how Dataworkz and MongoDB can transform call center operations. By converting call recordings into searchable vectors (numerical representations of data points in a multi-dimensional space), businesses can quickly access relevant information and improve customer service. We'll dig into how the integration of Amazon Transcribe, Cohere, and MongoDB Atlas Vector Search—as well as Dataworkz's RAG-as-a-service platform—is achieving this transformation. Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.

From call recordings to vectors: A data-driven approach

Customer service interactions are goldmines of valuable insights. By analyzing call recordings, we can identify successful resolution strategies and uncover frequently asked questions. In turn, making this information—which is often buried in audio files—accessible to agents lets them give customers faster and more accurate assistance. However, the vast volume and unstructured nature of these audio files make it challenging to extract actionable information efficiently.
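To build intuition for what “searchable vectors” means, here is a toy illustration of cosine similarity, the measure typically used to compare embeddings. This sketch is not part of the original solution, and real embeddings from a model such as Cohere’s have hundreds of dimensions:

```python
# Transcribed calls become vectors; "semantically similar" then just means
# the vectors point in nearly the same direction. Cosine similarity scores
# that directional closeness on a scale from -1 to 1.
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

sim_same = cosine_similarity([1.0, 2.0], [2.0, 4.0])   # parallel vectors, near 1.0
sim_orth = cosine_similarity([1.0, 0.0], [0.0, 1.0])   # orthogonal vectors, 0.0
```

A vector index such as Atlas Vector Search performs this kind of comparison efficiently across millions of stored vectors rather than one pair at a time.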
To address this challenge, we propose a pipeline that leverages AI and analytics to transform raw audio recordings into vectors, as shown in Figure 1:

- Storage of raw audio files: Past call recordings are stored in their original audio format.
- Processing of the audio files with AI and analytics services (such as Amazon Transcribe Call Analytics): speech-to-text conversion, summarization of content, and vectorization.
- Storage of vectors and metadata: The generated vectors and associated metadata (e.g., call timestamps, agent information) are stored in an operational data store.

Figure 1: Customer service call insight extraction and vectorization flow

Once the data is stored in vector format within the operational data store, it becomes accessible for real-time applications. This data can be consumed directly through vector search or integrated into a retrieval-augmented generation (RAG) architecture, a technique that combines the capabilities of large language models (LLMs) with external knowledge sources to generate more accurate and informative outputs.

Introducing Dataworkz: Simplifying RAG implementation

Building RAG pipelines can be cumbersome and time-consuming for developers who must learn yet another stack of technologies. Especially in this initial phase, where companies want to experiment and move fast, it is essential to leverage tools that abstract complexity and don’t require deep knowledge of each component, so teams can experiment with and realize the benefits of RAG quickly. Dataworkz offers a powerful and composable RAG-as-a-service platform that streamlines the process of building RAG applications for enterprises. To operationalize RAG effectively, organizations need to master five key capabilities:

- ETL for LLMs: Dataworkz connects with diverse data sources and formats, transforming the data to make it ready for consumption by generative AI applications.
- Indexing: The platform breaks down data into smaller chunks and creates embeddings that capture semantics, storing them in a vector database.
- Retrieval: Dataworkz ensures the retrieval of accurate information in response to user queries, a critical part of the RAG process.
- Synthesis: The retrieved information is then used to build the context for a foundational model, generating responses grounded in reality.
- Monitoring: With many moving parts in the RAG system, Dataworkz provides robust monitoring capabilities essential for production use cases.

Dataworkz's intuitive point-and-click interface (as seen in Video 1) simplifies RAG implementation, allowing enterprises to quickly operationalize AI applications. The platform offers flexibility and choice in data connectors, embedding models, vector stores, and language models. Additionally, tools like A/B testing ensure the quality and reliability of generated responses. This combination of ease of use, optionality, and quality assurance is a key tenet of Dataworkz's "RAG as a Service" offering.

Diving deeper: System architecture and functionalities

Now that we’ve looked at the components of the pre-processing pipeline, let’s explore the proposed real-time system architecture in detail. It comprises the following modules and functions (see Figure 2):

- Amazon Transcribe, which receives the audio coming from the customer’s phone and converts it into text.
- Cohere’s embedding model, served through Amazon Bedrock, which vectorizes the text coming from Transcribe.
- MongoDB Atlas Vector Search, which receives the query vector and returns a document that contains the most semantically similar FAQ in the database.
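As a rough sketch (not the actual Dataworkz or AWS APIs; the function names and stub data are invented for illustration), the real-time retrieval flow can be expressed as a small composition where the embedding model and vector store are injected:

```python
# Sketch of the real-time flow: transcribed speech is embedded, matched
# against stored FAQ vectors, and the best answer is surfaced to the
# operator. The embedder and vector store are passed in so each piece
# (Amazon Transcribe, Cohere on Bedrock, Atlas Vector Search) can be
# swapped in; everything below is stubbed for illustration.

def answer_for_utterance(utterance, embed, search_faqs):
    """embed: text -> vector; search_faqs: vector -> FAQs ranked by similarity."""
    query_vector = embed(utterance)
    matches = search_faqs(query_vector)
    return matches[0]["answer"] if matches else None

# Toy FAQ store and stand-ins for the managed services.
faq_store = [
    {"answer": "Provide the driver's name, date of birth, and license number.",
     "vector": [1.0, 0.0]},
    {"answer": "Structure, belongings, liability, and additional living expenses.",
     "vector": [0.0, 1.0]},
]

def stub_embed(text):
    # A real embedding model maps meaning to direction; this toy version
    # routes driver questions to one axis and home questions to the other.
    return [1.0, 0.0] if "driver" in text else [0.0, 1.0]

def stub_search(vector):
    # Rank by dot product, highest first (what a vector index does at scale).
    return sorted(faq_store,
                  key=lambda d: -sum(x * y for x, y in zip(vector, d["vector"])))

answer = answer_for_utterance("I need to add a new driver", stub_embed, stub_search)
```

Injecting the dependencies keeps the flow testable locally while the production system wires in the real transcription, embedding, and vector search services.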
Figure 2: System architecture and modules

Here are a couple of FAQs we used for the demo:

Q: “Can you explain the different types of coverage available for my home insurance?”
A: “Home insurance typically includes coverage for the structure of your home, your personal belongings, liability protection, and additional living expenses in case you need to temporarily relocate. I can provide more detailed information on each type if you'd like.”

Q: “What is the process for adding a new driver to my auto insurance policy?”
A: “To add a new driver to your auto insurance policy, I'll need some details about the driver, such as their name, date of birth, and driver's license number. We can add them to your policy over the phone, or you can do it through our online portal.”

Note that the question is reported just for reference, and it’s not used for retrieval. The actual question is provided by the user through the voice interface and then matched in real time with the answers in the database using Vector Search. This information is finally presented to the customer service operator in text form (see Figure 3). The proposed architecture is simple but very powerful, easy to implement, and effective. Moreover, it can serve as a foundation for more advanced use cases that require complex interactions, such as agentic workflows, and iterative and multi-step processes that combine LLMs and hybrid search to complete sophisticated tasks.

Figure 3: App interface, displaying what has been asked by the customer (left) and how the information is presented to the customer service operator (right)

This solution not only impacts human operator workflows but can also underpin chatbots and voicebots, enabling them to provide more relevant and contextual customer responses.

Building a better future for customer service

By seamlessly integrating analytical and operational data streams, insurance companies can significantly enhance both operational efficiency and customer satisfaction.
Our system empowers businesses to optimize staffing, accelerate inquiry resolution, and deliver superior customer service through data-driven, real-time insights. To embark on your own customer service transformation, explore our GitHub repository and take advantage of the Dataworkz free tier.

November 27, 2024

Better Digital Banking Experiences with AI and MongoDB

Interactive banking represents a new era in financial services where customers engage with digital platforms that anticipate, understand, and meet their needs in real time. This approach encompasses AI-driven technologies such as chatbots, virtual assistants, and predictive analytics that allow banks to enhance digital self-service while delivering personalized, context-aware interactions. According to Accenture’s 2023 consumer banking study, 44% of consumers aged 18-44 reported difficulty accessing human support when needed, underscoring the demand for more responsive digital solutions that help bridge this gap between customers and financial services. Generative AI technologies like chatbots and virtual assistants can fill this need by instantly addressing inquiries, providing tailored financial advice, and anticipating future needs. This shift has tremendous growth potential; the global chatbot market is expected to grow at a CAGR of 23.3% from 2023 to 2030, with the financial sector experiencing the fastest growth rate of 24.0%. This shift is more than just a convenience; it aims to create a smarter, more engaging, and intuitive banking journey for every user.

Simplifying self-service banking with AI

Navigating daily banking activities like transfers, payments, and withdrawals can often raise immediate questions for customers: “Can I overdraft my account?” “What will the penalties be?” or “How can I avoid these fees?” While the answers usually lie within the bank’s terms and conditions, these documents are often dense, complex, and overwhelming for the average user. At the same time, customers value their independence and want to handle their banking needs through self-service channels, but wading through extensive fine print isn't what they signed up for.
By integrating AI-driven advisors into the digital banking experience, banks can provide a seamless, in-app solution that delivers instant, relevant answers. This removes the need for customers to leave the app to sift through pages of bank documentation in search of answers, or worse, endure the inconvenience of calling customer service. The result is a smoother, more user-friendly interaction, where customers feel supported in their self-service journey, free from the frustration of navigating traditional, cumbersome information sources. The entire experience remains within the application, enhancing convenience and efficiency.

Solution overview

This AI-driven solution enhances the self-service experience in digital banking by applying retrieval-augmented generation (RAG) principles, which combine the power of generative AI with reliable information retrieval, ensuring that the chatbot provides accurate, contextually relevant responses. The approach begins by processing dense, text-heavy documents, like terms and conditions, often the source of customer inquiries. These documents are divided into smaller, manageable chunks, which are vectorized to create searchable data representations. Storing these vectorized chunks in MongoDB Atlas allows for efficient querying using MongoDB Atlas Vector Search, making it possible to instantly retrieve relevant information based on the customer’s question.

Figure 1: Detailed solution architecture

When a customer inputs a question in the banking app, the system quickly identifies and retrieves the most relevant chunks using semantic search. The AI then uses this information to generate clear, contextually relevant answers within the app, enabling a smooth, frustration-free experience without requiring customers to sift through dense documents or contact support.
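The document-preparation step described above can be sketched in a few lines of Python. The window and overlap sizes are illustrative assumptions; production pipelines often split on sentence or section boundaries instead of fixed character counts:

```python
# Sketch of the chunking step: split a long terms-and-conditions text into
# overlapping character windows before embedding. Sizes are assumptions.

def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character windows that overlap, so a
    clause falling on a boundary still appears whole in one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

terms = "Overdraft fees apply when... " * 100  # stand-in for a dense T&C document
chunks = chunk_text(terms)
# Each chunk would then be embedded and stored in MongoDB Atlas for
# retrieval with Atlas Vector Search.
```

The overlap means a sentence that straddles a window boundary is still fully contained in at least one chunk, which keeps retrieval from missing answers that fall on a split point.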
Figure 2: Leafy Bank mock-up chatbot in action

How MongoDB supports AI-driven banking solutions

MongoDB offers unique capabilities that empower financial institutions to build and scale AI-driven applications.

- Unified data model for flexibility: MongoDB’s flexible document model unifies structured and unstructured data, creating a consistent dataset that enhances the AI’s ability to understand and respond to complex queries. This model enables financial institutions to store and manage customer data, transaction history, and document content within a single system, streamlining interactions and making AI responses more contextually relevant.
- Vector search for enhanced querying: MongoDB Atlas Vector Search makes it easy to perform semantic searches on vectorized document chunks, quickly retrieving the most relevant information to answer user questions. This capability allows the AI to find precise answers within dense documents, enhancing the self-service experience for customers.
- Scalable integration with AI models: MongoDB is designed to work seamlessly with leading AI frameworks, allowing banks to integrate and scale AI applications quickly and efficiently. By aligning MongoDB Atlas with cloud-based LLM providers, banks can use the best tools available to interpret and respond to customer queries accurately, meeting demand with responsive, real-time answers.
- High performance and cost efficiency: MongoDB’s multi-cloud, developer-friendly platform allows financial institutions to innovate without costly infrastructure changes. It’s built to scale as data and AI needs grow, ensuring banks can continually improve the customer experience with minimal disruptions. MongoDB’s built-in scalability allows banks to expand their AI capabilities effortlessly, offering a future-proof foundation for digital banking.
Building future-proof applications

Implementing generative AI presents several advantages, not only for end users of interactive banking applications but also for financial institutions: an enhanced user experience encourages customer satisfaction, ensures retention, boosts reputation, and reduces customer turnover, while unlocking new opportunities for cross-selling and up-selling to increase revenue, drive growth, and elevate customer value. Moreover, adopting AI-driven initiatives lays the groundwork for businesses to develop innovative, creative, and future-proof applications that address customer needs and upgrade business applications with features that are shaping the industry and will continue to do so. Here are some examples:

- Summarize and categorize transactional information by powering applications with MongoDB’s Real-Time Analytics.
- Understand and find trends based on customer behavior that could positively impact fraud prevention, anti-money laundering (AML), and credit card applications (just to mention a few).
- Offer investing, budgeting, and loan assessments through an AI-powered conversational banking experience.

In today’s data-driven world, companies face increasing pressure to stay ahead of rapid technological advancements and ever-evolving customer demands. Now more than ever, businesses must deliver intuitive, robust, and high-performing services through their applications to remain competitive and meet user expectations. Luckily, MongoDB provides businesses with comprehensive reference architectures for building generative AI applications, an end-to-end technology stack that includes integrations with leading technology providers, professional services, and a coordinated support system through the MongoDB AI Applications Program (MAAP).
By building AI-enriched applications with the leading multi-cloud developer data platform, companies can leverage low-cost, efficient solutions through MongoDB’s flexible and scalable document model, which empowers businesses to unify real-time, operational, unstructured, and AI-related data, extending and customizing their applications to seize upcoming technological opportunities. Check out these additional resources to get started on your AI journey with MongoDB:

- How Leading Industries are Transforming with AI and MongoDB Atlas - E-book
- Our Solutions Library, where you can learn about different use cases for gen AI and other interesting topics applied to financial services and many other industries.

November 26, 2024

AI-Driven Noise Analysis for Automotive Diagnostics

Aftersales service is a crucial revenue stream for the automotive industry, with leading manufacturers executing repairs through their dealer networks. Traditional diagnostic methods, however, can be time-consuming, expensive, and imprecise, especially for complex engine issues. One global automotive giant, a MongoDB client, recently embarked on an ambitious project to revolutionize its diagnostic process. The project, which aimed to increase efficiency, customer satisfaction, and revenue throughput, involved the development of an AI-powered solution that could quickly analyze engine sounds and compare them to a database of known problems, significantly reducing diagnostic times.

Initial setbacks, then a fresh perspective

Despite the client team's best efforts, the project faced significant challenges and setbacks during the nine-month prototype phase. Though the team struggled to produce reliable results, they were determined to make the project a success. At this point, MongoDB introduced its client to Pureinsights, a specialized gen AI implementation and MongoDB AI Application Program partner, to rethink the solution and salvage the project. As new members of the project team, and as Pureinsights’ CTO and Lead Architect, respectively, we brought a fresh perspective to the challenge.

Figure 1: Before and after the AI-powered noise diagnostic solution

A pragmatic approach: Text before sound

Upon review, we discovered that the project had initially started with a text-based approach before being persuaded to switch to sound analysis. The Pureinsights team recommended reverting to text analysis as a foundational step before tackling the more complex audio problem.
This strategy involved:

- Collecting text descriptions of car problems from technicians and customers.
- Comparing these descriptions against a vast database of known issues already stored in MongoDB.
- Utilizing advanced natural language processing, semantic/vector search, and retrieval-augmented generation techniques to identify similar cases and potential solutions.

Our team tested six different models for cross-lingual semantic similarity, ultimately settling on Google's Gecko model for its superior performance across 11 languages.

Pushing boundaries: Integrating audio analysis

With the text-based foundation in place, we turned to audio analysis. Pureinsights developed an innovative approach to the project by combining our AI expertise with insights from advanced sound analysis research. We drew inspiration from groundbreaking models that had gained renown for their ability to identify cities solely from background noise in audio files. This blend of AI knowledge and specialized audio analysis techniques resulted in a robust, scalable system capable of isolating and analyzing engine sounds from various recordings. We adapted these sophisticated audio analysis models, originally designed for urban sound identification, to the specific challenges of automotive diagnostics. These learnings and adaptations are also applicable to future use cases for AI-driven audio analysis across various industries. This expertise was crucial in developing a sophisticated audio analysis model capable of:

- Isolating engine and car noises from customer or technician recordings.
- Converting these isolated sounds into vectors.
- Using these vectors to search the manufacturer's existing database of known car problem sounds.

At the heart of this solution is MongoDB’s powerful database technology. The system leverages MongoDB’s vector and document stores to manage over 200,000 case files.
Each "document" is more akin to a folder or case file containing: Structured data about the vehicle and reported issue Sound samples of the problem Unstructured text describing the symptoms and context This unified approach allows for seamless comparison of text and audio descriptions of customer engine problems using MongoDB's native vector search technology. Encouraging progress and phased implementation The solution's text component has already been rolled out to several dealers, and the audio similarity feature will be integrated in late 2024. This phased approach allows for real-world testing and refinement before a full-scale deployment across the entire repair network. The client is taking a pragmatic, step-by-step approach to implementation. If the initial partial rollout with audio diagnostics proves successful, the plan is to expand the solution more broadly across the dealer network. This cautious (yet forward-thinking) strategy aligns with the automotive industry's move towards more data-driven maintenance practices. As the solution continues to evolve, the team remains focused on enhancing its core capabilities in text and audio analysis for current diagnostic needs. The manufacturer is committed to evaluating the real-world impact of these innovations before considering potential future enhancements. This measured approach ensures that each phase of the rollout delivers tangible benefits in efficiency, accuracy, and customer satisfaction. By prioritizing current diagnostic capabilities and adopting a phased implementation strategy, the automotive giant is paving the way for a new era of efficiency and customer service in their aftersales operations. The success of this initial rollout will inform future directions and potential expansions of the AI-powered diagnostic system. A new era in automotive diagnostics The automotive giant brought industry expertise and a clear vision for improving their aftersales service. 
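A case file of the kind described above might be modeled as a single MongoDB document. All field names and values below are illustrative assumptions, with embedding vectors truncated for readability:

```python
# One diagnostic "case file" modeled as a single document: structured
# vehicle data, unstructured symptom text, and audio samples with their
# sound embeddings live side by side. All names and values are illustrative.
case_file = {
    "vehicle": {"make": "ExampleMotors", "model": "GT", "year": 2021},
    "reported_issue": "rattling noise on cold start",
    "symptom_text": ("Customer reports a metallic rattle for about 30 "
                     "seconds after starting the engine on cold mornings."),
    "symptom_text_vector": [0.12, -0.40, 0.77],   # text embedding (truncated)
    "audio_samples": [
        {
            "file_id": "rec-001",
            "duration_s": 14,
            "audio_vector": [0.05, 0.33, -0.21],  # sound embedding (truncated)
        },
    ],
}

# Text and audio vectors sit alongside the structured fields, so a single
# vector search query can compare either modality across all case files.
vector_fields = [k for k in case_file if k.endswith("_vector")]
vector_fields += [k for sample in case_file["audio_samples"]
                  for k in sample if k.endswith("_vector")]
```

Keeping both modalities in one document is what allows a technician's typed description and a recorded engine noise to be matched against the same body of known cases.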
MongoDB provided the robust, flexible data platform essential for managing and analyzing diverse, multi-modal data types at scale. We, at Pureinsights, served as the AI application specialist partner, contributing critical AI and machine learning expertise, and bringing fresh perspectives and innovative approaches. We believe our role was pivotal in rethinking the solution and salvaging the project at a crucial juncture. This synergy of strengths allowed the entire project team to overcome initial setbacks and develop a groundbreaking solution that combines cutting-edge AI technologies with MongoDB's powerful data management capabilities. The result is a diagnostic tool leveraging text and audio analysis to significantly reduce diagnostic times, increase customer satisfaction, and boost revenue through the dealer network. The project's success underscores several key lessons: The value of persistence and flexibility in tackling complex challenges The importance of choosing the right technology partners The power of combining domain expertise with technological innovation The benefits of a phased, iterative approach to implementation As industries continue to evolve in the age of AI and big data, this collaborative model—bringing together industry leaders, technology providers, and specialized AI partners—sets a new standard for innovation. It demonstrates how companies can leverage partnerships to turn ambitious visions into reality, creating solutions that drive business value while enhancing customer experiences. The future of automotive diagnostics—and AI-driven solutions across industries—looks brighter thanks to the combined efforts of forward-thinking enterprises, cutting-edge database technologies like MongoDB, and specialized AI partners like Pureinsights. As this solution continues to evolve and deploy across the global dealer network, it paves the way for a new era of efficiency, accuracy, and customer satisfaction in the automotive industry. 
This solution has the potential to not only revolutionize automotive diagnostics but also set a new standard for AI-driven solutions in other industries, demonstrating the power of collaboration and innovation. To deliver more solutions like this—and to accelerate gen AI application development for organizations at every stage of their AI journey—Pureinsights has joined the MongoDB AI Application Program (MAAP). Check out the MAAP page to learn more about the program and how MAAP ecosystem members like Pureinsights can help your organization accelerate time-to-market, minimize risks, and maximize the value of your AI investments.

September 27, 2024

Collaborating to Build AI Apps: MongoDB and Partners at Google Cloud Next '24

From April 9 to April 11, Las Vegas became the center of the tech world, as Google Cloud Next '24 took over the Mandalay Bay Convention Center—and the convention’s spotlight shone brightest on gen AI. Check out our AI resource page to learn more about building AI-powered apps with MongoDB. Between MongoDB’s big announcements with Google Cloud (which included an expanded collaboration to enhance building, scaling, and deploying gen AI applications using MongoDB Atlas Vector Search and Vertex AI), industry sessions, and customer meetings, we offered in-booth lightning talks with leaders from four MongoDB partners—LangChain, LlamaIndex, Patronus AI, and Unstructured—who shared valuable insights and best practices with developers who want to embed AI into their existing applications or build new-generation apps powered by AI. Developing next-generation AI applications involves several challenges, including handling complex data sources, incorporating structured and unstructured data, and mitigating scalability and performance issues in processing and analyzing them. The lightning talks at Google Cloud Next ‘24 addressed some of these critical topics and presented practical solutions. One of the most popular sessions was from Harrison Chase, co-founder and CEO at LangChain, an open-source framework for building applications based on large language models (LLMs). Harrison provided tips on fixing your retrieval-augmented generation (RAG) pipeline when it fails, addressing the most common pitfalls of fact retrieval, non-semantic components, conflicting information, and other failure modes. Harrison recommended developers use LangChain templates for MongoDB Atlas to deploy RAG applications quickly.
Meanwhile, LlamaIndex, an orchestration framework that integrates private and public data for building applications using LLMs, was represented by Simon Suo, co-founder and CTO, who discussed the complexities of advanced document RAG and the importance of using good data to perform better retrieval and parsing. He also highlighted MongoDB’s partnership with LlamaIndex, which allows for ingesting data into MongoDB Atlas as a vector database and retrieving the index from MongoDB Atlas via LlamaParse and LlamaCloud. Patronus AI and Unstructured were represented by Guillaume Nozière and Andrew Zane, respectively. Amid so many booths, activities, and competing programming, a range of developers from across industries showed up to these insightful sessions, where they could engage with experts, ask questions, and network in a casual setting. They also learned how our AI partners and MongoDB work together to offer complementary solutions that create a seamless gen AI development experience. We are grateful for the ongoing partnership of LangChain, LlamaIndex, Patronus AI, and Unstructured, and we look forward to expanding our collaboration to help our joint customers build the next generation of AI applications. To learn more about building AI-powered apps with MongoDB, check out our AI Resources Hub and stop by our Partner Ecosystem Catalog to read about our integrations with these and other AI partners.
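Simon's point about parsing and chunking quality can be made concrete with a toy fixed-size chunker with overlap. Real tooling such as LlamaParse does far more (structure-aware parsing of documents), so treat this purely as an illustration of the overlap idea that keeps context from being cut mid-thought at chunk boundaries:

```python
# Toy fixed-size chunker with overlap; real document parsers are far
# more sophisticated, but the overlap idea is the same.
def chunk(text, size=200, overlap=40):
    """Split text into windows of `size` characters, each sharing
    `overlap` characters with its predecessor so that content near a
    boundary still appears whole in at least one chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

pieces = chunk("a" * 500, size=200, overlap=40)
print([len(p) for p in pieces])  # window lengths; the tail is shorter
```

Chunks like these are what get embedded and ingested into the vector database; poor chunking at this stage degrades every retrieval downstream, which is the "good data in, better retrieval out" point from the talk.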

April 23, 2024

Go from 0 to 1 to Enterprise-Ready with MongoDB Atlas and LLMs

This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文. Update 9/5/2024: Since this post was originally published in June 2023, Atlas Vector Search has gone generally available! The fastest and easiest way to go from 0 to 1 with MongoDB Atlas and LLMs is to try the Atlas Vector Search Quick Start. See how companies such as Okta, Novo Nordisk, and Anywhere Real Estate have since gone from 1 to enterprise-ready production applications. Creating compelling, truly differentiated experiences for your customers with generative AI-enriched applications means grounding artificial intelligence in truth. That truth comes from your data, more specifically, your most up-to-date operational data. Whether you’re providing hyper-personalized experiences with advanced semantic search or producing user-prompted content and conversations, MongoDB Atlas unifies operational, analytical, and vector search data services to streamline embedding the power of LLMs and transformer models into your apps. Every day, developers are building the next groundbreaking, transformative generative AI-powered application. Commercial and open-source LLMs are advancing at breakneck speed, and the frameworks and tools to build around them are plentiful and democratize innovation. And yet taking these applications from prototype to enterprise-ready is the chasm development teams must cross. First, these large models can provide incorrect or uninformed answers because the data they have access to is dated. There are two options for solving uninformed answers: fine-tuning a large model, or providing it with long-term memory. However, doing so begets a second barrier: deploying an application around an informed LLM with the right security controls in place, and at the scale and performance users expect.
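The second option, long-term memory, amounts to retrieving fresh operational data at query time and presenting it to the model as prompt context. A minimal, framework-free sketch of that idea; the helper name and policy text are invented for illustration:

```python
# Framework-free sketch of "long-term memory" via prompt context (RAG).
# The document text and helper name below are invented for illustration.
def build_prompt(question, context_docs):
    """Ground the model in retrieved, up-to-date operational data by
    pasting it into the prompt, instead of fine-tuning the model."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    "What is our refund window?",
    ["Policy v7 (updated today): refunds accepted within 30 days."],
)
print(prompt)
```

In a real application the `context_docs` would come from a vector search over your operational data rather than a hard-coded list; the prompt assembly step, however, looks essentially like this regardless of framework.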
Developers need a data platform with the data model flexibility to adapt to the constantly changing unstructured and structured data that informs large models, without the hindrance of rigid schemas. While fine-tuning a model is an option, it’s cost-prohibitive in terms of time and computational resources. This means developers need to be able to present data as context to large models as part of prompts. They need to give these generative models long-term memory. We discuss a few examples of how to do so with various LLMs and generative AI frameworks below. Check out our AI Learning Hub to learn more about building AI-powered apps with MongoDB.

Five resources to get started with MongoDB Atlas and Large Language Models

MongoDB Atlas makes it seamless to integrate leading generative AI services and systems, such as the hyperscalers and open-source LLMs and frameworks. By combining document and vector embedding data stores in one place via Atlas Database and Atlas Vector Search (preview), developers can accelerate building generative AI-enriched applications that are grounded in the truth of operational data. Below are examples of how to work with popular LLM frameworks and MongoDB:

1. Get started with Atlas Vector Search (preview) and OpenAI for semantic search

This tutorial walks you through the steps of performing semantic search on a sample movie dataset with MongoDB Atlas. First, you’ll set up an Atlas Trigger to call an OpenAI API whenever a new document is inserted into your cluster, converting the document into a vector embedding. Then, you’ll perform a vector search query using Atlas Vector Search. There’s even a special bonus section for leveraging HuggingFace models. Read the tutorial.

2. Build a gen AI-enriched chat app with your proprietary data using LlamaIndex and MongoDB

LlamaIndex provides a simple, flexible interface to connect LLMs with external data.
This joint blog from LlamaIndex and MongoDB goes into more detail about why and how you might want to build your own chat app. The notebook attached to the blog provides a code walkthrough of how to query any PDF document using English-language queries. Read the blog.

3. See the docs for how to use Atlas Vector Search (preview) as a vector store with LangChain

As stated in the partnership announcement blog post, LangChain and MongoDB Atlas are a natural fit, demonstrated by the organic community enthusiasm that has led to several LangChain integrations for MongoDB. In addition to now supporting Atlas Vector Search as a vector store, there is already support for using MongoDB as a chat log history. Read the docs: python, javascript.

4. Generate predictions directly in MongoDB Atlas with MindsDB AI Collections

MindsDB is an open-source machine learning platform that brings automated machine learning to the database. In this blog you’ll generate predictions directly in Atlas with MindsDB AI Collections, giving you the ability to consume predictions as regular data, query those predictions, and accelerate development by simplifying deployment workflows. Read the blog.

5. Integrate HuggingFace transformer models into MongoDB Atlas with Atlas Triggers

HuggingFace is an AI community that makes it easy to build, train, and deploy machine learning models. Leveraging Atlas Triggers alongside HuggingFace lets you easily react to changes in the operational data that provides long-term memory to your models. Learn how to set up Triggers to automatically predict the sentiment of new documents in your MongoDB database and add the results as additional fields to your documents. See the GitHub repo.

Figure 1: The sample app architecture shows how external, or proprietary, data provides long-term memory to an LLM, and how data flows from a user's input to an LLM-powered response.
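As a taste of what the semantic search tutorial in resource 1 covers, the vector search query itself reduces to a single aggregation stage. The sketch below assembles such a pipeline in Python using the generally available `$vectorSearch` syntax (the preview-era syntax this post originally targeted differed); the index name, embedding path, and projected fields are placeholders for your own:

```python
# Sketch of an Atlas Vector Search query pipeline (GA `$vectorSearch` syntax).
# Index name, vector path, and projected fields are placeholders.
def build_vector_search_pipeline(query_vector, index_name="vector_index",
                                 path="plot_embedding", limit=5):
    """Return an aggregation pipeline: approximate nearest-neighbor
    search over stored embeddings, then a projection that surfaces
    the similarity score alongside a document field."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": limit * 20,  # oversample, then keep `limit`
                "limit": limit,
            }
        },
        {"$project": {"title": 1, "score": {"$meta": "vectorSearchScore"}}},
    ]

pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3])
# Against a live cluster this would run as, e.g.:
# results = db.movies.aggregate(pipeline)
```

The `query_vector` comes from the same embedding model used at ingest time (the OpenAI call wired up via the Atlas Trigger in the tutorial), which is what makes the stored and query embeddings comparable.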
From prototype to production with MongoDB for generative AI-enriched apps

MongoDB’s developer data platform, built on Atlas, provides a modern, optimized developer experience while also being battle-tested by thousands of enterprises globally to perform securely and at scale. Whether you are building the next big thing at a startup or an enterprise, Atlas enables you to:

- Accelerate building generative AI-enriched applications that are grounded in the truth of operational data.
- Simplify your app architecture by leveraging a single platform that lets you store app and vector data in the same place, react to changes in source data with serverless functions, and search across multiple data modalities to improve the relevance and accuracy of the responses your apps generate.
- Easily evolve your gen AI-enriched apps with the flexibility of the document model while maintaining a simple, elegant developer experience.
- Seamlessly integrate leading AI services and systems, such as the hyperscalers and open-source LLMs and frameworks, to stay competitive in dynamic markets.
- Build gen AI-enriched applications on a high-performance, highly scalable operational database that has had a decade of validation across a wide variety of AI use cases.

While these examples are the building blocks for something more innovative, MongoDB can help you go from concept to production to scale. Get started today by signing up for the MongoDB Atlas free tier and integrating with your preferred frameworks and LLMs. If you’re interested in working with us more closely, check out the MongoDB AI Innovators program, which enables artificial intelligence innovation and showcases cutting-edge solutions from startups, customers, and partners.

June 22, 2023