Overview
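In this guide, you can learn how to use the MongoDB Vector Search feature in the Java driver. The Aggregates builders class provides the vectorSearch() helper method, which you can use to create a $vectorSearch pipeline stage. This pipeline stage allows you to perform a semantic search on your documents. A semantic search is a type of search that locates information that is similar in meaning, but not necessarily identical, to the search term or phrase that you provide.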
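To make "similar in meaning" more concrete, the following minimal sketch compares toy embedding vectors with cosine similarity, one common way to measure how close two embeddings are. It uses only the Java standard library and does not call the driver. The class name SemanticSimilaritySketch, the three-dimensional vectors, and their values are assumptions made up for illustration; real embeddings come from an embedding model and have many more dimensions.

```java
// Illustrative only: toy "embeddings" with made-up values, not driver code.
final class SemanticSimilaritySketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). A value of 1.0 means the
    // vectors point in the same direction; values near 0 mean they are unrelated.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Hypothetical vectors: a query phrase and two movie plots.
        double[] query = {0.9, 0.1, 0.0};
        double[] timeMachinePlot = {0.8, 0.2, 0.1};
        double[] cookingPlot = {0.0, 0.1, 0.9};

        System.out.println("time machine plot: " + cosine(query, timeMachinePlot));
        System.out.println("cooking plot:      " + cosine(query, cookingPlot));
    }
}
```

In this toy data, the plot whose vector points in nearly the same direction as the query receives the higher similarity, even though no words are compared.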
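Perform a Vector Search
To use this feature, you must create a vector search index and index your vector embeddings. To learn how to programmatically create a vector search index, see the MongoDB Search and Vector Search Indexes section of the Indexes guide. To learn more about vector embeddings, see How to Index Vector Embeddings for Vector Search in the Atlas documentation.
After you create a vector search index on your vector embeddings, you can reference this index in your pipeline stage, as shown in the following section.
Vector Search Example
The following example shows how to build an aggregation pipeline that uses the vectorSearch() and project() methods to compute a vector search score: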
// Create an instance of the BinaryVector class as the query vector
BinaryVector queryVector = BinaryVector.floatVector(
        new float[]{0.0001f, 1.12345f, 2.23456f, 3.34567f, 4.45678f});

// Specify the index name for the vector embedding index
String indexName = "mflix_movies_embedding_index";

// Specify the path of the field to search on
FieldSearchPath fieldSearchPath = fieldPath("plot_embedding");

// Limit the number of matches to 1
int limit = 1;

// Create a pre-filter to only search within a subset of documents
VectorSearchOptions options = exactVectorSearchOptions()
        .filter(gte("year", 2016));

// Create the vectorSearch pipeline stage
List<Bson> pipeline = asList(
        vectorSearch(
                fieldSearchPath,
                queryVector,
                indexName,
                limit,
                options),
        project(
                metaVectorSearchScore("vectorSearchScore")));
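Tip
Query Vector Type
The preceding example creates an instance of BinaryVector to serve as the query vector, but you can also create a List of Double instances. However, we recommend that you use the BinaryVector type to improve storage efficiency.
The following example shows how to run the aggregation and print the vector search score from the result of the preceding aggregation pipeline: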
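As a rough illustration of the storage difference mentioned in the tip, the following sketch estimates how many bytes a vector value occupies when encoded as a BSON array of doubles compared with a BSON binary value that packs 32-bit floats. It uses only the Java standard library and does not call the driver. The class and method names are hypothetical, and the byte counts assume the standard BSON layouts: each array element stores a type byte, its index as a string key, and an 8-byte double, while the binary vector stores a length, a subtype byte, a two-byte vector header, and 4 bytes per element.

```java
// Back-of-the-envelope estimate only; names are hypothetical, not driver API.
final class VectorStorageSketch {

    // Estimated bytes for a BSON array value that holds n doubles.
    static long doubleArrayBytes(int n) {
        long size = 4 + 1; // int32 length prefix + trailing 0x00 terminator
        for (int i = 0; i < n; i++) {
            size += 1                                  // element type byte
                    + Integer.toString(i).length() + 1 // index key as a C string
                    + Double.BYTES;                    // 8-byte double value
        }
        return size;
    }

    // Estimated bytes for a BSON binary value that holds n packed float32 values.
    static long float32VectorBytes(int n) {
        return 4                          // int32 payload length
                + 1                       // binary subtype byte
                + 2                       // vector header: dtype byte + padding byte
                + (long) n * Float.BYTES; // 4 bytes per element
    }

    public static void main(String[] args) {
        int dimensions = 5; // same length as the query vector in the example
        System.out.println("array of doubles: " + doubleArrayBytes(dimensions) + " bytes");
        System.out.println("float32 vector:   " + float32VectorBytes(dimensions) + " bytes");
    }
}
```

Under these assumptions, a five-element vector takes about 60 bytes as an array of doubles and about 27 bytes as a packed float32 vector, and the gap widens as the number of dimensions grows.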
Document found = collection.aggregate(pipeline).first();
double score = found.getDouble("vectorSearchScore").doubleValue();

System.out.println("vectorSearch score: " + score);
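The pipeline in this section combines three ideas: a pre-filter (year is at least 2016), an exact search over every remaining vector, and a limit of 1. The following sketch mimics that behavior in memory to show how the pieces interact. It is a conceptual illustration only and does not reflect how the server implements $vectorSearch. The ExactSearchSketch class, the Movie type, the sample data, and the choice of Euclidean distance are all assumptions made for this example; the similarity function of a real search comes from the index definition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Conceptual sketch only; it does not reflect how the server implements $vectorSearch.
final class ExactSearchSketch {

    // Hypothetical in-memory stand-in for a movie document.
    static final class Movie {
        final String title;
        final int year;
        final float[] plotEmbedding;

        Movie(String title, int year, float[] plotEmbedding) {
            this.title = title;
            this.year = year;
            this.plotEmbedding = plotEmbedding;
        }
    }

    // Assumption: Euclidean distance stands in for the index's similarity function.
    static double euclideanDistance(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Pre-filter on year, compare the query with every remaining vector,
    // then keep the `limit` closest matches.
    static List<Movie> search(List<Movie> movies, float[] query, int minYear, int limit) {
        List<Movie> candidates = new ArrayList<>();
        for (Movie movie : movies) {
            if (movie.year >= minYear) {
                candidates.add(movie);
            }
        }
        candidates.sort(Comparator.comparingDouble(
                (Movie movie) -> euclideanDistance(movie.plotEmbedding, query)));
        return new ArrayList<>(candidates.subList(0, Math.min(limit, candidates.size())));
    }

    // Made-up sample data with two-dimensional embeddings.
    static List<Movie> sampleMovies() {
        return Arrays.asList(
                new Movie("Exact match, too old", 2010, new float[]{1.0f, 0.0f}),
                new Movie("Close match", 2018, new float[]{0.9f, 0.1f}),
                new Movie("Distant match", 2020, new float[]{0.0f, 1.0f}));
    }

    public static void main(String[] args) {
        List<Movie> results = search(sampleMovies(), new float[]{1.0f, 0.0f}, 2016, 1);
        for (Movie movie : results) {
            System.out.println(movie.title + " (" + movie.year + ")");
        }
    }
}
```

With this sample data, the 2010 movie is removed by the pre-filter even though its vector matches the query exactly, so the search returns the closest remaining movie. This shows why a pre-filter changes which document ranks first, not only how many documents are scanned.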
Query an Auto-Embedding Index
You can automate vector generation for text searches by querying an auto-embedding MongoDB Vector Search index. To learn about how to create an auto-embedding index, see MongoDB Auto-Embedding Search Index Model.
The following example constructs a MongoDB Vector Search query that searches for semantic similarity to the phrase time travel in the plot field. The query uses an auto-embedding MongoDB Vector Search index on the plot field named auto_embedding_index:
List<Bson> pipeline = asList(
        vectorSearch(
                fieldPath("plot"),
                textQuery("time travel"),
                "auto_embedding_index",
                10L,
                approximateVectorSearchOptions(150L)
        ),
        project(
                fields(include("title", "plot"), excludeId())
        )
);

List<Document> results = collection.aggregate(pipeline).into(new ArrayList<>());

for (Document doc : results) {
    System.out.println("Title: " + doc.getString("title"));
    System.out.println("Plot: " + doc.getString("plot"));
    System.out.println("---");
}
Title: Manuel on the Island of Wonders
Plot: Manuel's fantasy travel through Time goes from Long Ago (Episode 1 - O jardim proibido / Le Jardin interdit) through Now (Episode 2 - O pique-nique dos sonhos / Le Pique-nique des rèves), ...
---
Title: 11 Minutes Ago
Plot: Traveling in 11-minute increments, a time-tumbler from 48-years in the future spends two years of his life weaving through a two-hour wedding reception.
---
Title: Time Freak
Plot: A neurotic inventor creates a time machine and gets lost traveling around yesterday.
---
Title: Timecrimes
Plot: A man accidentally gets into a time machine and travels back in time nearly an hour. Finding himself will be the first of a series of disasters of unforeseeable consequences.
---
Title: The Little Girl Who Conquered Time
Plot: A high-school girl acquires the ability to time travel.
---
Title: Time Traveller
Plot: A high-school girl acquires the ability to time travel.
---
Title: Je t'aime je t'aime
Plot: Recovering from an attempted suicide, a man is selected to participate in a time travel experiment that has only been tested on mice. A malfunction in the experiment causes the man to ...
---
Title: A.P.E.X.
Plot: A time-travel experiment in which a robot probe is sent from the year 2073 to the year 1973 goes terribly wrong thrusting one of the project scientists, a man named Nicholas Sinclair into a...
---
Title: The Ah of Life
Plot: Theoretical mathematician, Nigel Kline finds himself the subject of his own vertical time study. Entering into Einstein's relativity, three versions of Nigel face off with each other, weaving time and space in a world of fluid moments.
---
Title: About Time
Plot: At the age of 21, Tim discovers he can travel in time and change what happens and has happened in his own life. His decision to make his world a better place by getting a girlfriend turns out not to be as easy as you might think.
---
Note
When using an auto-embedding index, directly provide the text to search rather than a vector representation of that text.
For more information about auto-embedding MongoDB Vector Search indexes, see the MongoDB Auto-Embedding Search Index Model section of the Indexes guide.
API Documentation
To learn more about the methods and types mentioned in this guide, see the following API documentation: