Vector Search


Binary Quantization & Rescoring: 96% Less Memory, Faster Search

We are excited to share that several new vector quantization capabilities are now available in public preview in MongoDB Atlas Vector Search: support for binary quantized vector ingestion, automatic scalar quantization, and automatic binary quantization with rescoring. Together with our recently released support for scalar quantized vector ingestion, these capabilities will empower developers to scale semantic search and generative AI applications more cost-effectively. For a primer on vector quantization, check out our previous blog post.

Enhanced developer experience with native quantization in Atlas Vector Search

Effective quantization methods, specifically scalar and binary quantization, can now be applied automatically in Atlas Vector Search. This makes it easier and more cost-effective for developers to use Atlas Vector Search across a wide range of applications, particularly those requiring more than a million vectors. With the new "quantization" index definition parameter, developers can keep full-fidelity vectors by specifying "none," or quantize vector embeddings by specifying the desired quantization type, "scalar" or "binary" (Figure 1); a minimal driver-level sketch appears at the end of this post. This native quantization capability supports vector embeddings from any model provider, as well as MongoDB's BinData float32 vector subtype.

Figure 1: New index definition parameters for specifying automatic quantization type in Atlas Vector Search

Scalar quantization, which converts each floating-point value into an integer, is generally used when it's crucial to maintain search accuracy on par with full-precision vectors. Binary quantization, which converts each floating-point value into a single bit (0 or 1), is better suited to scenarios where storage and memory efficiency are paramount and a slight reduction in search accuracy is acceptable. If you're interested in learning more about this process, check out our documentation.

Binary quantization with rescoring: Balance cost and accuracy

Compared to scalar quantization, binary quantization reduces memory usage even further, leading to lower costs and improved scalability, but also to a decline in search accuracy. To mitigate this, when "binary" is chosen for the "quantization" index parameter, Atlas Vector Search adds an automatic rescoring step: a subset of the top binary search results is re-ranked using their full-precision counterparts, ensuring that the final results remain highly accurate despite the initial compression. Empirical evidence shows that adding a rescoring step when working with binary quantized vectors can dramatically improve search accuracy, as shown in Figure 2 below.

Figure 2: Combining binary quantization and rescoring helps retain up to 95% search accuracy

And as Figure 3 shows, in our tests binary quantization reduced processing memory requirements by 96% while retaining up to 95% search accuracy and improving query performance.

Figure 3: Improvements in Atlas Vector Search with the use of vector quantization

It's worth noting that even though the quantized vectors are used for indexing and search, the full-fidelity vectors are still stored on disk to support rescoring. Retaining the full-fidelity vectors also enables developers to perform exact vector search for experimental, high-precision use cases, such as evaluating the search accuracy of quantized vectors produced by different embedding model providers. For more on evaluating the accuracy of quantized vectors, please see our documentation.
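To make the two-stage process concrete, here is a self-contained Python sketch of binary quantization with rescoring. It illustrates the general technique rather than Atlas Vector Search's internal implementation; the corpus size, dimensionality, and top-100 oversampling factor are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(10_000, 768)).astype(np.float32)  # full-fidelity vectors
query = rng.normal(size=768).astype(np.float32)

# Binary quantization: one bit per dimension (sign test)
docs_bits = docs > 0
query_bits = query > 0

# Stage 1: cheap candidate search via Hamming distance over the bits
hamming = (docs_bits != query_bits).sum(axis=1)
candidates = np.argsort(hamming)[:100]  # oversample the top 100 candidates

# Stage 2: rescoring -- re-rank candidates with the retained
# full-fidelity vectors using exact cosine similarity
def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

rescored = sorted(candidates, key=lambda i: -cosine(docs[i], query))
print(rescored[:10])  # final top-10 results
```

Because Hamming distance over packed bits is far cheaper than float arithmetic, stage 1 can scan a large index quickly, while stage 2 restores accuracy on only a small candidate set.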
So how can developers make the most of vector quantization? Here are some example use cases that can be made more efficient and scaled effectively with quantized vectors:

- Massive knowledge bases can be used efficiently and cost-effectively for analysis- and insight-oriented use cases, such as content summarization and sentiment analysis. Unstructured data like customer reviews, articles, audio, and videos can be processed and analyzed at a much larger scale, at a lower cost and faster speed.
- Using quantized vectors can enhance the performance of retrieval-augmented generation (RAG) applications. The efficient processing can support query performance from large knowledge bases, and the cost-effectiveness advantage can enable a more scalable, robust RAG system, which can result in better customer and employee experiences.
- Developers can easily A/B test different embedding models using multiple vectors produced from the same source field during prototyping. MongoDB's flexible document model lets developers quickly deploy and compare embedding models' results without the need to rebuild the index or provision an entirely new data model or set of infrastructure.
- The relevance of search results or context for large language models (LLMs) can be improved by incorporating larger volumes of vectors from multiple sources of relevance, such as different source fields (product descriptions, product images, etc.) embedded with the same or different models.

To get started with vector quantization in Atlas Vector Search, see the following developer resources:

- Documentation: Vector Quantization in Atlas Vector Search
- Documentation: How to Measure the Accuracy of Your Query Results
- Tutorial: How to Use Cohere's Quantized Vectors to Build Cost-effective AI Apps With MongoDB
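Here is the driver-level sketch mentioned earlier: a minimal example of creating a vector index with automatic quantization enabled, using PyMongo. The connection string, database, collection, field path, and dimension count are placeholders; consult the documentation above for the authoritative index definition format.

```python
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

collection = MongoClient("<ATLAS_CONNECTION_STRING>")["my_db"]["my_collection"]

model = SearchIndexModel(
    name="vector_index",
    type="vectorSearch",
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 1024,
                "similarity": "cosine",
                # "none" keeps full-fidelity vectors; "scalar" or "binary"
                # turns on automatic quantization (binary adds rescoring).
                "quantization": "binary",
            }
        ]
    },
)
collection.create_search_index(model=model)
```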

December 12, 2024

Announcing Hybrid Search Support for LlamaIndex

MongoDB is excited to announce enhancements to our LlamaIndex integration. By combining MongoDB's robust database capabilities with LlamaIndex's innovative framework for context-augmented large language models (LLMs), the enhanced MongoDB-LlamaIndex integration unlocks new possibilities for generative AI development. Specifically, it supports vector search (powered by Atlas Vector Search), full-text search (powered by Atlas Search), and hybrid search, enabling developers to blend precise keyword matching with semantic search for more context-aware applications, depending on their use case.

Building AI applications with LlamaIndex

LlamaIndex is one of the world's leading AI frameworks for building with LLMs. It streamlines the integration of external data sources, allowing developers to combine LLMs with relevant context from various data formats. This makes it ideal for building application features like retrieval-augmented generation (RAG), where accurate, contextual information is critical. LlamaIndex empowers developers to build smarter, more responsive AI systems while reducing the complexity of data handling and query management. Advantages of building with LlamaIndex include:

- Simplified data ingestion, with connectors that integrate structured databases, unstructured files, and external APIs, removing the need for manual processing or format conversion.
- Organization of data into structured indexes or graphs, significantly enhancing query efficiency and accuracy, especially when working with large or complex datasets.
- An advanced retrieval interface that responds to natural-language prompts with contextually enhanced data, improving accuracy in tasks like question answering, summarization, and data retrieval.
- Customizable APIs that cater to all skill levels: high-level APIs enable quick data ingestion and querying for beginners, while lower-level APIs give advanced users full control over connectors and query engines for more complex needs.

MongoDB's LlamaIndex integration

Developers can build powerful AI applications using LlamaIndex as a foundational AI framework alongside MongoDB Atlas as the long-term memory database. With MongoDB's developer-friendly document model and powerful vector search capabilities within MongoDB Atlas, developers can easily store and search vector embeddings for building RAG applications. And because of MongoDB's low-latency transactional persistence capabilities, developers can do much more with the MongoDB integration in LlamaIndex to build AI applications in an enterprise-grade manner.

LlamaIndex's flexible architecture supports customizable storage components, allowing developers to leverage MongoDB Atlas as both a powerful vector store and a key-value store. By using Atlas Vector Search capabilities, developers can:

- Store and retrieve vector embeddings efficiently (llama-index-vector-stores-mongodb)
- Persist ingested documents (llama-index-storage-docstore-mongodb)
- Maintain index metadata (llama-index-storage-index-store-mongodb)
- Store key-value pairs (llama-index-storage-kvstore-mongodb)

Figure adapted from Liu, Jerry and Agarwal, Prakul (May 2023). "Build a ChatGPT with your Private Data using LlamaIndex and MongoDB". Medium. https://medium.com/llamaindex-blog/build-a-chatgpt-with-your-private-data-using-llamaindex-and-mongodb-b09850eb154c

Adding hybrid and full-text search support

Developers may use different search approaches for different use cases.
Full-text search retrieves documents by matching exact keywords or their linguistic variations, making it efficient for quickly locating specific terms within large datasets, such as in legal document review where exact wording is critical. Vector search, on the other hand, finds content that is semantically similar, even when it does not contain the same keywords. Hybrid search combines full-text search with vector search to identify both exact matches and semantically similar content. This approach is particularly valuable in advanced retrieval systems and AI-powered search engines, enabling results that are both precise and aligned with the needs of the end user.

With this integration, it is simple for developers to try out powerful retrieval capabilities on their data and improve the accuracy of their AI applications. In the LlamaIndex integration, the MongoDBAtlasVectorSearch class is used for vector search. To enable full-text search, use VectorStoreQueryMode.TEXT_SEARCH in the same class; similarly, to use hybrid search, use VectorStoreQueryMode.HYBRID. To learn more, check out the GitHub repository. A short usage sketch appears at the end of this post.

With the MongoDB-LlamaIndex integration's support, developers no longer need to navigate the intricacies of implementing Reciprocal Rank Fusion (which scores each document by summing 1/(k + rank) across the individual vector and text result lists) or determine the optimal way to combine vector and text searches: we've taken care of the complexities for you. The integration also includes sensible defaults and robust support, ensuring that building advanced search capabilities into AI applications is easier than ever. This means MongoDB handles the intricacies of storing and querying your vectorized data, so you can focus on building!

We're excited for you to work with our LlamaIndex integration. Here are some resources to expand your knowledge on this topic:

- Check out how to get started with our LlamaIndex integration
- Build a content recommendation system using MongoDB and LlamaIndex with our helpful tutorial
- Experiment with building a RAG application with LlamaIndex, OpenAI, and our vector database
- Learn how to build with private data using LlamaIndex, guided by one of its co-founders
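Here is that usage sketch: a minimal, hedged example of switching query modes in the MongoDB-LlamaIndex integration. It assumes an Atlas cluster that already has the required vector and full-text search indexes, an embedding model configured via LlamaIndex Settings, and placeholder names (my_db, my_collection, vector_index, search_index); constructor parameter names follow recent versions of llama-index-vector-stores-mongodb and may differ in older releases.

```python
import pymongo
from llama_index.core import VectorStoreIndex
from llama_index.core.vector_stores.types import VectorStoreQueryMode
from llama_index.vector_stores.mongodb import MongoDBAtlasVectorSearch

client = pymongo.MongoClient("<ATLAS_CONNECTION_STRING>")
store = MongoDBAtlasVectorSearch(
    client,
    db_name="my_db",
    collection_name="my_collection",
    vector_index_name="vector_index",    # Atlas Vector Search index
    fulltext_index_name="search_index",  # Atlas Search index (text/hybrid)
)
index = VectorStoreIndex.from_vector_store(store)

# Pure semantic retrieval (the default)
vector_engine = index.as_query_engine()

# Keyword matching via Atlas Search
text_engine = index.as_query_engine(
    vector_store_query_mode=VectorStoreQueryMode.TEXT_SEARCH
)

# Hybrid: vector and full-text results fused with Reciprocal Rank Fusion
hybrid_engine = index.as_query_engine(
    vector_store_query_mode=VectorStoreQueryMode.HYBRID
)
print(hybrid_engine.query("What does the return policy cover?"))
```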

October 17, 2024

Vector Quantization: Scale Search & Generative AI Applications

This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文.

Update 12/12/2024: The upcoming vector quantization capabilities mentioned at the end of this blog post are now available in public preview:

- Support for ingestion and indexing of binary (int1) quantized vectors gives developers the flexibility to choose and ingest the type of quantized vectors that best fits their requirements.
- Automatic quantization and rescoring provides a native mechanism for scalar quantization and binary quantization with rescoring, making it easier for developers to implement vector quantization entirely within Atlas Vector Search.

View the documentation to get started.

We are excited to announce a robust set of vector quantization capabilities in MongoDB Atlas Vector Search. These capabilities will reduce vector sizes while preserving performance, enabling developers to build powerful semantic search and generative AI applications with more scale and at a lower cost. In addition, unlike relational or niche vector databases, MongoDB's flexible document model, coupled with quantized vectors, allows for greater agility in testing and deploying different embedding models quickly and easily. Support for scalar quantized vector ingestion is now generally available and will be followed by several new releases in the coming weeks. Read on to learn how vector quantization works, and visit our documentation to get started!

The challenges of large-scale vector applications

While the use of vectors has opened up a range of new possibilities, such as content summarization and sentiment analysis, natural language chatbots, and image generation, unlocking insights within unstructured data can require storing and searching through billions of vectors, which can quickly become infeasible. Vectors are effectively arrays of floating-point numbers representing unstructured information in a way that computers can understand (a deployment may hold anywhere from a few hundred to billions of such arrays), and as the number of vectors increases, so does the size of the index required to search over them. As a result, large-scale vector-based applications using full-fidelity vectors often have high processing costs and slow query times, hindering their scalability and performance.

Vector quantization for cost-effectiveness, scalability, and performance

Vector quantization, a technique that compresses vectors while preserving their semantic similarity, offers a solution to this challenge. Imagine converting a full-color image into grayscale to reduce storage space on a computer. This involves simplifying each pixel's color information by grouping similar colors into primary color channels, or "quantization bins," and then representing each pixel with a single value from its bin. The binned values are then used to create a new grayscale image that is smaller in size but retains most of the original details, as shown in Figure 1.

Figure 1: Illustration of quantizing an RGB image into grayscale

Vector quantization works similarly, shrinking full-fidelity vectors into fewer bits to significantly reduce memory and storage costs without compromising the important details. Maintaining this balance is critical, as search and AI applications need to deliver relevant insights to be useful. Two effective quantization methods are scalar (converting each floating-point value into an integer) and binary (converting each floating-point value into a single bit, 0 or 1).
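The following toy sketch illustrates the two methods on a single embedding. Real systems calibrate the scalar range from the data distribution; the simple min/max calibration here is just the most basic choice.

```python
import numpy as np

vec = np.array([0.12, -0.83, 0.45, -0.02, 0.91], dtype=np.float32)

# Scalar quantization: map each float into an int8 bucket
lo, hi = vec.min(), vec.max()
scalar_q = np.round((vec - lo) / (hi - lo) * 255 - 128).astype(np.int8)

# Binary quantization: keep one bit per dimension (sign test)
binary_q = (vec > 0).astype(np.uint8)

print(scalar_q)  # [  11 -128   60   -9  127]
print(binary_q)  # [1 0 1 0 1]
```

The int8 form uses a quarter of the memory of float32 per dimension, and the 1-bit form 1/32, which is where the storage and index-size savings described above come from.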
Current and upcoming quantization capabilities will empower developers to maximize the potential of Atlas Vector Search.

The most impactful benefit of vector quantization is increased scalability and cost savings through reduced computing resources and efficient processing of vectors. And when combined with Search Nodes (MongoDB's dedicated infrastructure for independent scalability through workload isolation, with memory-optimized infrastructure for semantic search and generative AI workloads), vector quantization can further reduce costs and improve performance, even at the highest volume and scale, unlocking more use cases.

"Cohere is excited to be one of the first partners to support quantized vector ingestion in MongoDB Atlas," said Nils Reimers, VP of AI Search at Cohere. "Embedding models, such as Cohere Embed v3, help enterprises see more accurate search results based on their own data sources. We're looking forward to providing our joint customers with accurate, cost-effective applications for their needs."

In our tests, compared to full-fidelity vectors, BSON-type vectors (MongoDB's JSON-like binary serialization format for efficient document storage) reduced storage size by 66% (from 41 GB to 14 GB). And as shown in Figures 2 and 3, the tests illustrate significant memory reduction (73% to 96% less) and latency improvements using quantized vectors: scalar quantization preserves recall performance, and binary quantization's recall performance is maintained with rescoring, a process of evaluating a small subset of the quantized outputs against full-fidelity vectors to improve the accuracy of the search results.

Figure 2: Significant storage reduction + good recall and latency performance with quantization on different embedding models

Figure 3: Remarkable improvement in recall performance for binary quantization when combined with rescoring

In addition, thanks to the reduced-cost advantage, vector quantization facilitates more advanced, multi-vector use cases that would have been too computationally taxing or cost-prohibitive to implement. For example, vector quantization can help users:

- Easily A/B test different embedding models using multiple vectors produced from the same source field during prototyping (see the short sketch at the end of this post). MongoDB's document model, coupled with quantized vectors, allows for greater agility at lower costs. The flexible document schema lets developers quickly deploy and compare embedding models' results without the need to rebuild the index or provision an entirely new data model or set of infrastructure.
- Further improve the relevance of search results or context for large language models (LLMs) by incorporating vectors from multiple sources of relevance, such as different source fields (product descriptions, product images, etc.) embedded with the same or different models.

How to get started, and what's next

Now, with support for the ingestion of scalar quantized vectors, developers can import and work with quantized vectors from their embedding model providers of choice (such as Cohere, Nomic, Jina, Mixedbread, and others) directly in Atlas Vector Search. Read the documentation and tutorial to get started.
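As a hedged sketch of that ingestion path, the snippet below stores a pre-quantized int8 vector from an embedding provider as a BSON BinData vector using PyMongo's vector helpers (available in recent driver versions). The collection and field names are placeholders, and the example values stand in for real provider output.

```python
from pymongo import MongoClient
from bson.binary import Binary, BinaryVectorDtype

collection = MongoClient("<ATLAS_CONNECTION_STRING>")["my_db"]["my_collection"]

# int8 values as returned by a provider that supports scalar quantization
# (e.g., Cohere's embed API with embedding_types=["int8"])
int8_embedding = [12, -87, 33, 0, 127]

collection.insert_one({
    "text": "example passage",
    "embedding": Binary.from_vector(int8_embedding, BinaryVectorDtype.INT8),
})
```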
And in the coming weeks, additional vector quantization features will equip developers with a comprehensive toolset for building and optimizing applications with quantized vectors:

- Support for ingestion of binary quantized vectors will enable further reduction of storage space, allowing for greater cost savings and giving developers the flexibility to choose the type of quantized vectors that best fits their requirements.
- Automatic quantization and rescoring will provide native capabilities for scalar quantization, as well as binary quantization with rescoring, in Atlas Vector Search, making it easier for developers to take full advantage of vector quantization within the platform.

With support for quantized vectors in MongoDB Atlas Vector Search, you can build scalable, high-performing semantic search and generative AI applications with flexibility and cost-effectiveness. Check out the documentation and tutorial to get started, and head over to our quick-start guide to get started with Atlas Vector Search today.
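Finally, here is the short sketch of the multi-vector A/B-testing pattern mentioned above: the flexible document model lets a single document carry embeddings from several models, so each can be indexed and compared without re-provisioning. Field and model names are illustrative.

```python
doc = {
    "description": "Lightweight waterproof hiking jacket",
    # Embeddings from two candidate models, truncated for brevity
    "embedding_model_a": [0.12, -0.83, 0.45],  # e.g., Cohere Embed v3
    "embedding_model_b": [0.07, 0.44, -0.19],  # e.g., an alternative model
}
```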

October 7, 2024

Quantização vetorial: pesquisa de escala e aplicativos de IA generativa

Update 12/12/2024: The upcoming vector quantization capabilities mentioned at the end of this blog post are now available in public preview: Support for ingestion and indexing of binary (int1) quantized vectors: gives developers the flexibility to choose and ingest the type of quantized vectors that best fits their requirements. Automatic quantization and rescoring: provides a native mechanism for scalar quantization and binary quantization with rescoring, making it easier for developers to implement vector quantization entirely within Atlas Vector Search. View the documentation to get started. Estamos muito satisfeitos em anunciar um conjunto robusto de recursos de quantização vetorial no MongoDB Atlas Vector Search . Esses recursos reduzirão o tamanho dos vetores e, ao mesmo tempo, preservarão o desempenho, permitindo que os desenvolvedores criem aplicativos avançados de pesquisa semântica e IA generativa com mais escala - e a um custo menor. Além disso, diferentemente dos bancos de dados vetoriais relacionais ou de nicho, o modelo de documento flexível do MongoDB, associado a vetores quantizados, permite maior agilidade para testar e implementar diferentes modelos de incorporação de forma rápida e fácil. O suporte à ingestão de vetores escalares quantizados já está disponível de forma geral e será seguido por várias novas versões nas próximas semanas. Continue lendo para saber como funciona a quantização de vetores e visite nossa documentação para começar! Os desafios dos aplicativos vetoriais de grande escala Embora o uso de vetores tenha aberto uma série de novas possibilidades , como resumo de conteúdo e análise de sentimentos, chatbots de linguagem natural e geração de imagens, o desbloqueio de insights em dados não estruturados pode exigir o armazenamento e a pesquisa em bilhões de vetores, o que pode se tornar inviável rapidamente. Os vetores são, na verdade, matrizes de números de ponto flutuante que representam informações não estruturadas de uma forma que os computadores possam entender (variando de algumas centenas a bilhões de matrizes) e, à medida que o número de vetores aumenta, também aumenta o tamanho do índice necessário para pesquisá-los. Como resultado, os aplicativos baseados em vetores em grande escala que usam vetores de fidelidade total geralmente têm altos custos de processamento e tempos de consulta lentos, o que prejudica sua escalabilidade e desempenho. Quantização vetorial para redução de custos, escalabilidade e desempenho A quantização de vetores, uma técnica que comprime vetores e, ao mesmo tempo, preserva sua similaridade semântica, oferece uma solução para esse desafio. Considere converter uma imagem totalmente digitalizada em escala de cinza para reduzir o espaço de armazenamento em um computador. Isso envolve a simplificação das informações de cores de cada pixel, agrupando cores semelhantes em canais de cores primárias ou "compartimentos de quantização," e, em seguida, representando cada pixel com um único valor de seu compartimento. Os valores binned são então usados para criar uma nova imagem em escala de cinza com tamanho menor, mas mantendo a maioria dos detalhes originais, conforme mostrado na Figura 1. Imagem 1: Ilustração da quantização de uma imagem GB em escala de cinza A quantização de vetores funciona de forma semelhante, diminuindo os vetores de fidelidade total em menos bits para reduzir significativamente os custos de memória e armazenamento sem comprometer os detalhes importantes. 
Manter esse equilíbrio é fundamental, pois os aplicativos de pesquisa e AI precisam fornecer insights relevantes para serem úteis. Dois métodos eficazes de quantização são o escalar (conversão de um ponto flutuante em um número inteiro) e o binário (conversão de um ponto flutuante em um único bit de 0 ou 1). Os recursos de quantização atuais e futuros capacitarão os desenvolvedores a maximizar o potencial do Atlas Vector Search. O benefício de maior impacto da quantização vetorial é o aumento da escalabilidade e da redução de custos por meio da redução de recursos de computação e do processamento eficiente de vetores. E quando combinada com o Search Nodes - a infraestrutura dedicada do MongoDB para escalabilidade independente por meio do isolamento da carga de trabalho e da infraestrutura otimizada para memória para pesquisa semântica e cargas de trabalho de IA generativas - a quantização vetorial pode reduzir ainda mais os custos e melhorar o desempenho, mesmo no volume e na escala mais altos, para desbloquear mais casos de uso. "A Cohere está satisfeita por ser um dos primeiros parceiros a apoiar a ingestão de vetores quantizados no MongoDB Atlas", disse Nils Reimer, VP de Search da AI da Cohere. “Modelos de incorporação, como o Cohere Embed v3, ajudam as empresas a ver resultados de pesquisa mais precisos com base em suas próprias fontes de dados. Estamos ansiosos para fornecer a nossos clientes em comum aplicativos precisos e econômicos para suas necessidades.” Em nossos testes, em comparação com os vetores de fidelidade total, os vetores do tipo BSON - o formato de serialização binária semelhante ao JSON do MongoDB para armazenamento eficiente de documentos - reduziram o tamanho do armazenamento em 66% (de 41 GB para 14 GB). E, conforme mostrado nas Figuras 2 e 3, os testes ilustram uma redução significativa de memória (73% a 96% menos) e melhorias de latência usando vetores quantizados, em que a quantização escalar preserva o desempenho de recuperação e o desempenho de recuperação da quantização binária é mantido com a restauração - um processo de avaliação de um pequeno subconjunto das saídas quantizadas em relação a vetores de fidelidade total para melhorar a precisão dos resultados da pesquisa. Figura 2: redução significativa do armazenamento + bom desempenho de recuperação e latência com quantização em diferentes modelos de incorporação Figura 3: Melhoria notável no desempenho de recuperação para quantização binária quando combinada com a reescalonamento Além disso, graças à vantagem de custo reduzido, a quantização vetorial facilita casos de uso de vetores múltiplos mais avançados que teriam sido muito computacionalmente taxativos ou proibitivos em termos de custo para serem implementados. Por exemplo, a quantização vetorial pode ajudar os usuários a: Fazer testes A/B facilmente com diferentes modelos de incorporação usando vários vetores produzidos a partir do mesmo campo de origem durante a criação de protótipos. O modelo de documento do MongoDB, juntamente com vetores quantizados, permite maior agilidade a custos mais baixos. O esquema flexível de documento permite que os desenvolvedores implementem e comparem rapidamente os resultados dos modelos incorporados sem a necessidade de reconstruir o índice ou provisionar um modelo de dados totalmente novo ou um conjunto de infraestruturas. 
For example, vector quantization can help users:

Easily A/B test different embedding models using multiple vectors produced from the same source field during prototyping. MongoDB’s document model, paired with quantized vectors, allows for greater agility at lower costs. The flexible document schema lets developers quickly deploy and compare the results of embedding models without needing to rebuild the index or provision an entirely new data model or set of infrastructure.

Further improve the relevance of search results or context for large language models (LLMs) by incorporating vectors from multiple sources of relevance, such as different source fields (product descriptions, product images, and so on) embedded within the same or different models.

How to get started, and what’s next

Now, with support for scalar quantized vector ingestion, developers can import and work with quantized vectors from their embedding model providers of choice (such as Cohere, Nomic, Jina, Mixedbread, and others) directly in Atlas Vector Search; a short PyMongo sketch follows this post. Read the documentation and tutorial to get started. And in the coming weeks, additional vector quantization capabilities will equip developers with a comprehensive toolset for building and optimizing applications with quantized vectors:

Support for binary quantized vector ingestion will further reduce storage footprint, enabling greater cost savings and giving developers the flexibility to choose the type of quantized vectors that best fits their requirements.

Automatic quantization and rescoring will provide native capabilities for scalar quantization, as well as binary quantization with rescoring, in Atlas Vector Search, making it easier for developers to take full advantage of vector quantization within the platform.

With support for quantized vectors in MongoDB Atlas Vector Search, you can build scalable, high-performance semantic search and generative AI applications with flexibility and cost-effectiveness. Check out the documentation and tutorial to get started, and head over to our quick start guide to begin using Atlas Vector Search today.
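As a rough illustration of the ingestion flow referenced above: recent PyMongo/bson releases expose Binary.from_vector and BinaryVectorDtype for writing vectors as MongoDB’s BinData subtype. The connection string, database, collection, and field names below are placeholders, and driver APIs vary by version, so treat this as a hedged sketch rather than canonical usage:

```python
from pymongo import MongoClient
from bson.binary import Binary, BinaryVectorDtype

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")  # placeholder URI
collection = client["media"]["movies"]  # hypothetical database and collection

# int8 values as returned by an embedding provider's scalar-quantized output
# (truncated here for brevity; a real embedding has hundreds of dimensions).
quantized_embedding = [12, -7, 88, -45, 3, 101, -2, 64]

collection.insert_one({
    "title": "Example movie",
    # Stored as the compact BinData vector subtype rather than a plain array.
    "embedding": Binary.from_vector(quantized_embedding, BinaryVectorDtype.INT8),
})
```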

October 7, 2024

MongoDB.local London 2024: Better Applications, Faster

Since we kicked off MongoDB’s series of 2024 events in April, we’ve connected with thousands of customers, partners, and community members in cities around the world, from Mexico City to Mumbai. Yesterday marked the nineteenth stop of the 2024 MongoDB.local tour, and we had a blast welcoming folks across industries to MongoDB.local London, where we discussed the latest technology trends, celebrated customer innovations, and unveiled product updates that make it easier than ever for developers to build next-gen applications.

Over the past year, MongoDB’s more than 50,000 customers have been telling us that their needs are changing. They’re increasingly focused on three areas:

Helping developers build faster and more efficiently

Empowering teams to create AI-powered applications

Moving from legacy systems to modern platforms

Across these areas, there’s a common need for a solid foundation: each requires a resilient, scalable, secure, and highly performant database.

The updates we shared at MongoDB.local London reflect these priorities. MongoDB is committed to ensuring that our products are built to exceed our customers’ most stringent requirements, and that they provide the strongest possible foundation for building a wide range of applications, now and in the future. Indeed, during yesterday’s event, Sahir Azam, MongoDB’s Chief Product Officer, discussed the foundational role data plays in his keynote address. He also shared the latest advancement from our partner ecosystem: an AI solution powered by MongoDB, Amazon Web Services, and Anthropic that makes it easier for customers to deploy gen AI customer care applications.

MongoDB 8.0: The best version of MongoDB ever

The biggest news at .local London was the general availability of MongoDB 8.0, which provides significant performance improvements and reduced scaling costs, and adds scalability, resilience, and data security capabilities to the world’s most popular document database. Architectural optimizations in MongoDB 8.0 have significantly reduced memory usage and query times, and MongoDB 8.0 offers more efficient batch processing than previous versions. Specifically, MongoDB 8.0 features 36% better read throughput, 56% faster bulk writes, and 20% faster concurrent writes during data replication. In addition, MongoDB 8.0 can handle higher volumes of time series data and can perform complex aggregations more than 200% faster, with lower resource usage and costs. Last (but hardly least!), Queryable Encryption now supports range queries, ensuring data security while enabling powerful analytics.

For more on MongoDB.local London’s product announcements, which are designed to accelerate application development, simplify AI innovation, and speed developer upskilling, please read on!

Accelerating application development

Improved scaling and elasticity for MongoDB Atlas

New enhancements to MongoDB Atlas’s control plane allow customers to scale clusters faster, respond to resource demands in real time, and optimize performance, all while reducing operational costs. First, our new granular resource provisioning and scaling features, including independent shard scaling and extended storage and IOPS on Azure, allow customers to optimize resources precisely where needed. Second, Atlas customers will experience faster cluster scaling, with up to 50% quicker scaling times achieved by scaling clusters in parallel by node type.
Finally, MongoDB Atlas users will enjoy more responsive auto-scaling, with a 5X improvement in responsiveness thanks to enhancements in our scaling algorithms and infrastructure. These enhancements are being rolled out to all Atlas customers, who should start seeing benefits immediately.

IntelliJ plugin for MongoDB

Announced in private preview, the MongoDB for IntelliJ Plugin is designed to enhance how developers work with MongoDB in IntelliJ IDEA, one of the most popular IDEs among Java developers. The plugin allows enterprise Java developers to write and test Java queries faster, receive proactive performance insights, and reduce runtime errors right in their IDE. By enhancing the database-to-IDE integration, JetBrains and MongoDB have partnered to deliver a seamless experience for their shared user base and unlock their potential to build modern applications faster. Sign up for the private preview here.

MongoDB Copilot Participant for VS Code (public preview)

Now in public preview, the new MongoDB Participant for GitHub Copilot integrates domain-specific AI capabilities directly with a chat-like experience in the MongoDB Extension for VS Code. The participant is deeply integrated with the MongoDB extension, allowing for the generation of accurate MongoDB queries (and exporting them to application code), describing collection schemas, and answering questions with up-to-date access to MongoDB documentation, all without requiring the developer to leave their coding environment. These capabilities significantly reduce the need for context switching, enabling developers to stay in their flow and focus on building innovative applications.

Multicluster support for the MongoDB Enterprise Kubernetes Operator

Ensure high availability, resilience, and scale for MongoDB deployments running in Kubernetes through added support for deploying MongoDB and Ops Manager across multiple Kubernetes clusters. Users now have the ability to deploy ReplicaSets, Sharded Clusters (in public preview), and Ops Manager across local or geographically distributed Kubernetes clusters for greater deployment resilience, flexibility, and disaster recovery. This approach enables multi-site availability, resilience, and scalability within Kubernetes, capabilities that were previously only available outside of Kubernetes for MongoDB. To learn more, check out the documentation.

MongoDB Atlas Search and Vector Search are now generally available via the Atlas CLI and Docker

The local development experience for MongoDB Atlas is now generally available. Use the MongoDB Atlas CLI and Docker to build with MongoDB Atlas in your preferred local environment, and easily access features like Atlas Search and Atlas Vector Search throughout the entire software development lifecycle. The Atlas CLI provides a unified, familiar terminal-based interface that allows you to deploy and build with MongoDB Atlas in your preferred development environment, locally or in the cloud. If you build with Docker, you can also now use Docker and Docker Compose to easily integrate Atlas into your local and continuous integration environments with the Atlas CLI. Avoid repetitive work by automating the lifecycle of your development and testing environments, and focus on building application features with full-text search, AI and semantic search, and more.

Simplifying AI innovation

Reduce costs and increase scale in Atlas Vector Search

We announced vector quantization capabilities in Atlas Vector Search. By reducing memory (by up to 96%) and making vectors faster to retrieve, vector quantization allows customers to build a wide range of AI and search applications at higher scale and lower cost. Generally available now, support for scalar quantized vector ingestion lets customers seamlessly import and work with quantized vectors from their embedding model providers of choice, directly in Atlas Vector Search. Coming soon, additional vector quantization features, including automatic quantization, will equip customers with a comprehensive toolset for building and optimizing large-scale AI and search applications in Atlas Vector Search.
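To ground the announcement above, here is a hedged sketch of defining a vector index with the quantization parameter and then querying it with PyMongo. The index name, field path, and dimension count are hypothetical, and SearchIndexModel with a vectorSearch type assumes a recent driver version; quantization itself is transparent at query time:

```python
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

client = MongoClient("mongodb+srv://<user>:<password>@<cluster>/")  # placeholder URI
collection = client["media"]["movies"]  # hypothetical database and collection

# Create a vector index that automatically quantizes float32 embeddings.
# "quantization" accepts "none", "scalar", or "binary"; with "binary",
# Atlas Vector Search also rescores top candidates against the
# full-fidelity vectors retained on disk.
collection.create_search_index(SearchIndexModel(
    name="vector_index",
    type="vectorSearch",
    definition={
        "fields": [{
            "type": "vector",
            "path": "embedding",
            "numDimensions": 1024,
            "similarity": "cosine",
            "quantization": "binary",
        }]
    },
))

# Query as usual; the quantized index is used for retrieval.
results = collection.aggregate([{
    "$vectorSearch": {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": [0.02, -0.91, 0.33],  # truncated example embedding
        "numCandidates": 100,
        "limit": 10,
    }
}])
```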
Additional integrations with popular AI frameworks

Ship your next AI-powered project faster with MongoDB, no matter your framework or LLM of choice. AI technologies are advancing rapidly, making it important to build and scale performant applications quickly, and to use your preferred stack as your requirements and the available technologies evolve. MongoDB’s enhanced suite of integrations with LangChain, LlamaIndex, Microsoft Semantic Kernel, AutoGen, Haystack, Spring AI, the ChatGPT Retrieval Plugin, and more makes it easier than ever to build the next generation of applications on MongoDB.

Advancing developer upskilling

New MongoDB Learning Badges

Faster to achieve and more targeted than a certification, MongoDB’s free Learning Badges show your commitment to continuous learning and prove your knowledge of a specific topic. Follow the learning path, gain new skills, and get a digital badge to show off on LinkedIn. Check out the two new gen AI learning badges!

Building gen AI Apps: Learn to create innovative gen AI apps with Atlas Vector Search, including retrieval-augmented generation (RAG) apps.

Deploying and Evaluating gen AI Apps: Take your apps from creation to full deployment, focusing on optimizing performance and evaluating results.

Learn more

To learn more about MongoDB’s recent product announcements and updates, check out our What’s New product announcements page and all of our blog posts about product updates. Happy building!

October 3, 2024
