We have been using the C#/.NET driver. Is there any plan to incorporate Vector Search into it?
Also, when querying a vector index, the query itself has to be passed as a vector, so we first need to get the query text embedded by a Hugging Face model. Is there support for that in the driver, or do we need to do it outside the driver?
Basically, the query itself has to be converted into embeddings. What is the recommended way to do that from C#/.NET?
Hello @Marmik_Shah, can you elaborate a bit on what you mean by incorporating Vector Search in the C# driver? Are you referring to the helper classes in the C# driver that make it easier to build aggregation pipelines?
We do not have specific support in the drivers for embedding your queries. Your application needs to send the content to the embedding endpoint first, then build the query with the returned vector and submit it with the driver.
My understanding is that I need to use the same embeddings as the ones used by the search index. How would I know which algorithm the vector search index uses?
Is there good documentation or a how-to on getting started with Vector Search from C#?
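The Atlas Vector Search index uses an underlying ANN (approximate nearest neighbor) algorithm, HNSW, to run an approximate search for the k nearest neighbors of the user's query vector among the indexed documents. The index does not create embeddings itself: it only indexes the vectors already stored in your documents, so the query vector must come from the same embedding model (and have the same number of dimensions) as those stored vectors.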
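To make "k nearest neighbors" concrete, here is a minimal sketch of the exact, brute-force search that an ANN index such as HNSW approximates. This is not how Atlas implements the index; the class and method names (ExactKnn, Cosine, Nearest) are hypothetical, and cosine similarity is assumed only for illustration, since the similarity function is whatever your index definition specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: exact k-nearest-neighbor search by scanning every vector.
// An ANN index avoids this full scan by examining a subset of candidates.
public static class ExactKnn
{
    // Cosine similarity between two vectors of equal length.
    // Note: a zero vector yields NaN because its norm is 0.
    public static double Cosine(double[] a, double[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same number of dimensions.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the indexes of the k stored vectors most similar to the query,
    // ordered from most to least similar.
    public static List<int> Nearest(IReadOnlyList<double[]> docs, double[] query, int k) =>
        Enumerable.Range(0, docs.Count)
            .OrderByDescending(i => Cosine(docs[i], query))
            .Take(k)
            .ToList();
}
```

The length check in Cosine is the reason the query and the stored vectors have to come from the same model: vectors from different models generally differ in dimension, and even when the dimensions match, their coordinates are not comparable.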
Here is an example of running a vector search query with a filter using the C# driver:
using System;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;
using MongoDB.Bson.Serialization.Conventions;
using MongoDB.Driver;
using MongoDB.Driver.Search;

public class vectorSearchFilterQuery
{
    // define connection to your Atlas cluster
    private const string MongoConnectionString = "<connection-string>";

    public static void Main(string[] args)
    {
        // map PascalCase C# properties to camelCase field names (Title -> title)
        var camelCaseConvention = new ConventionPack { new CamelCaseElementNameConvention() };
        ConventionRegistry.Register("CamelCase", camelCaseConvention, type => true);

        // connect to your Atlas cluster
        var mongoClient = new MongoClient(MongoConnectionString);

        // define namespace
        var moviesDatabase = mongoClient.GetDatabase("sample_mflix");
        var moviesCollection = moviesDatabase.GetCollection<EmbeddedMovie>("embedded_movies");

        // define vector embeddings to search (values truncated here)
        var vector = new[] {0.02421053,...};

        // define filter: 1955 < year < 1975
        var yearGtFilter = Builders<EmbeddedMovie>.Filter.Gt("year", 1955);
        var yearLtFilter = Builders<EmbeddedMovie>.Filter.Lt("year", 1975);

        // define options
        var options = new VectorSearchOptions<EmbeddedMovie>()
        {
            Filter = Builders<EmbeddedMovie>.Filter.And(yearGtFilter, yearLtFilter),
            IndexName = "vector_index",
            NumberOfCandidates = 150
        };

        // run query: return the 10 nearest matches, projecting title, plot, and year
        var results = moviesCollection.Aggregate()
            .VectorSearch(m => m.Embedding, vector, 10, options)
            .Project(Builders<EmbeddedMovie>.Projection
                .Include(m => m.Title)
                .Include(movie => movie.Plot)
                .Include(movie => movie.Year))
            .ToList();

        // print results
        foreach (var movie in results)
        {
            Console.WriteLine(movie.ToJson());
        }
    }
}

[BsonIgnoreExtraElements]
public class EmbeddedMovie
{
    [BsonIgnoreIfDefault]
    public string Title { get; set; }
    public string Plot { get; set; }
    public int Year { get; set; }

    // the stored vector lives in the plot_embedding field
    [BsonElement("plot_embedding")]
    public double[] Embedding { get; set; }
}
You generate the embeddings with an API call to an embedding model provider such as OpenAI or Cohere, or to any open-source model from a hub like Hugging Face. OpenAI text embeddings are a good place to get started.
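As a rough sketch of that step, the following calls OpenAI's REST embeddings endpoint from C# and returns the vector as a double[]. The class and method names (QueryEmbedder, EmbedAsync, ParseEmbedding) are hypothetical, the model name is something you supply, and other providers use different URLs and payload shapes, so treat this as a starting point under those assumptions rather than a definitive implementation.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

public static class QueryEmbedder
{
    // Assumption: OpenAI's embeddings endpoint. Swap this for your provider's URL.
    private const string Endpoint = "https://api.openai.com/v1/embeddings";

    // Extracts data[0].embedding from the JSON response body.
    public static double[] ParseEmbedding(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        return doc.RootElement
            .GetProperty("data")[0]
            .GetProperty("embedding")
            .EnumerateArray()
            .Select(value => value.GetDouble())
            .ToArray();
    }

    // Sends the query text to the embedding endpoint and returns its vector.
    // The model must be the one that produced the vectors stored in your collection.
    public static async Task<double[]> EmbedAsync(
        HttpClient http, string apiKey, string text, string model)
    {
        var payload = JsonSerializer.Serialize(new { input = text, model });
        using var request = new HttpRequestMessage(HttpMethod.Post, Endpoint)
        {
            Content = new StringContent(payload, Encoding.UTF8, "application/json")
        };
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);

        using var response = await http.SendAsync(request);
        response.EnsureSuccessStatusCode();
        return ParseEmbedding(await response.Content.ReadAsStringAsync());
    }
}
```

A call such as `var vector = await QueryEmbedder.EmbedAsync(http, apiKey, "<query text>", "<embedding-model-name>");` would then produce the array you pass as the query vector to `.VectorSearch(m => m.Embedding, vector, 10, options)`.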