Embedding in MongoDB Atlas

Instead of generating embeddings with the OpenAI API (for vector search), can we use MongoDB's embedded documents?
Reference: How to Do Semantic Search in MongoDB Using Atlas Vector Search | MongoDB
@Jeffery_Schmitz

When using Atlas Vector Search, the embedding refers to a vector (an array of floats) produced by an AI model. It must be created with an embedding model, for example one from OpenAI, Cohere, or an open-source model available on Hugging Face (see Building Generative AI Applications Using MongoDB: Harnessing the Power of Atlas Vector Search and Open Source Models | MongoDB), among others. The MTEB leaderboard lists the various embedding models available.
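
To make "a vector (array of floats)" concrete, here is a minimal sketch of how two such vectors can be compared. The three-dimensional vectors are made up for illustration (real models return hundreds or thousands of dimensions), and cosine similarity is used here as one common measure; the function name is my own, not part of any MongoDB API.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, invented for this example (not output of a real model).
query = [0.1, 0.9, 0.2]
doc_close = [0.2, 0.8, 0.1]  # points in roughly the same direction as query
doc_far = [0.9, 0.1, 0.0]    # points in a different direction

# The closer document scores higher than the distant one.
print(cosine_similarity(query, doc_close) > cosine_similarity(query, doc_far))
```

Texts with similar meaning should end up with vectors that score high against each other, which is what makes the search "semantic".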

Hi @Prakul_Agarwal

I am trying to implement vector search in Atlas. I have a collection named projects, and I want to generate embeddings for three of its fields. Can you please guide us on how to implement this?

Hi @Clima_Champions

For data already in a MongoDB collection, you need to iterate over the documents and add an embedding field to each one:

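# Embed the 'plot' field of the first 50 documents that have one.
for doc in collection.find({'plot': {"$exists": True}}).limit(50):
	doc['plot_embedding'] = generate_embedding(doc['plot'])
	# Write the document back with the new embedding field.
	collection.replace_one({'_id': doc['_id']}, doc)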

where the generate_embedding function would be something like this:

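import os

import openai  # openai.Embedding is the pre-1.0 interface of the openai package

# Read the API key from the environment rather than hard-coding it.
openai.api_key = os.getenv("OPENAI_API_KEY")

model = "text-embedding-ada-002"

def generate_embedding(text: str) -> list[float]:
	# Request an embedding for a single input string.
	resp = openai.Embedding.create(
		input=[text],
		model=model)

	# The response holds one entry per input; return its vector.
	return resp["data"][0]["embedding"]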
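
For a case like the projects collection with three fields mentioned above, one possible approach (a sketch, not the only option) is to join the fields into a single string and store one embedding per document. The field names title, description, and summary, the target field embedding, and the helper names below are all hypothetical; the embed argument is any function that maps text to a list of floats, such as generate_embedding.

```python
from typing import Callable


def build_embedding_text(doc: dict, fields: list[str]) -> str:
    """Join the non-empty values of the given fields, one per line."""
    parts = [str(doc[f]).strip() for f in fields if doc.get(f)]
    return "\n".join(parts)


def add_embeddings(
    collection,
    fields: list[str],
    embed: Callable[[str], list[float]],
    target_field: str = "embedding",  # hypothetical field name
) -> int:
    """Store one combined embedding per document; return how many were updated."""
    # Match documents that have at least one of the source fields.
    query = {"$or": [{f: {"$exists": True}} for f in fields]}
    updated = 0
    for doc in collection.find(query):
        text = build_embedding_text(doc, fields)
        if not text:
            continue  # nothing to embed for this document
        # $set adds or overwrites only the embedding field.
        collection.update_one(
            {"_id": doc["_id"]},
            {"$set": {target_field: embed(text)}},
        )
        updated += 1
    return updated
```

With a pymongo collection this would be called as add_embeddings(db.projects, ["title", "description", "summary"], generate_embedding). If each field needs to be searchable on its own, an alternative is to store a separate embedding field per source field instead of one combined vector.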

All of this is explained in the article linked above.

Let me know if there is a specific part I can help with.
