Instead of generating embeddings with the OpenAI API (for vector search), can we use MongoDB embedded documents?
Reference: How to Do Semantic Search in MongoDB Using Atlas Vector Search | MongoDB
@Jeffery_Schmitz
When using Atlas Vector Search, "embedding" refers to a vector (an array of floats) created by an AI model. It must be generated with an embedding model, for example one from OpenAI or Cohere, or an open-source model available on Hugging Face (see Building Generative AI Applications Using MongoDB: Harnessing the Power of Atlas Vector Search and Open Source Models | MongoDB), among others. The MTEB leaderboard lists the various embedding models available.
I am trying to implement vector search in Atlas. I have a collection named projects, and I want to generate embeddings for three of its fields. Can you please guide us on how to implement it?
For data already in a MongoDB collection, you need to iterate over the documents and add an embedding field to each one:
for doc in collection.find({'plot': {"$exists": True}}).limit(50):
    doc['plot_embedding'] = generate_embedding(doc['plot'])
    collection.replace_one({'_id': doc['_id']}, doc)
where the generate_embedding function would be something like this:
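import os

import openai

# Note: openai.Embedding.create is the interface of the openai package before 1.0.
openai.api_key = os.getenv("OPENAI_API_KEY")
model = "text-embedding-ada-002"

def generate_embedding(text: str) -> list[float]:
    # Request one embedding and return it as a list of floats.
    resp = openai.Embedding.create(
        input=[text],
        model=model)
    return resp["data"][0]["embedding"]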
All of this is explained in the article linked above.
Let me know if there is a specific part I can help with.
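For the three-field case asked about earlier, one option is to join the three fields into a single string and store one embedding per document. The following is only a minimal sketch under assumptions: the field names in FIELDS, the target field name "embedding", and the helper names build_embedding_text and add_embeddings are all hypothetical, and embed stands for any function that turns a string into a list of floats (such as generate_embedding). It uses update_one with $set so that only the new field is written.

```python
from typing import Any, Callable, Iterable

# Hypothetical field names; replace them with the three real fields in `projects`.
FIELDS = ("title", "description", "tags")


def build_embedding_text(doc: dict[str, Any], fields: Iterable[str] = FIELDS) -> str:
    """Join the non-empty values of the chosen fields into one labelled string."""
    parts = []
    for name in fields:
        value = doc.get(name)
        if value is None:
            continue
        # Flatten list-like fields (for example tags) into a comma-separated string.
        if isinstance(value, (list, tuple)):
            value = ", ".join(str(v) for v in value)
        value = str(value).strip()
        if value:
            parts.append(f"{name}: {value}")
    return "\n".join(parts)


def add_embeddings(
    collection: Any,
    embed: Callable[[str], list[float]],
    fields: Iterable[str] = FIELDS,
    target: str = "embedding",  # hypothetical name of the new vector field
) -> int:
    """Add one embedding per document and return how many documents were updated."""
    fields = tuple(fields)
    # Match documents that have at least one source field and no embedding yet.
    query = {
        "$or": [{name: {"$exists": True}} for name in fields],
        target: {"$exists": False},
    }
    updated = 0
    for doc in collection.find(query):
        text = build_embedding_text(doc, fields)
        if not text:
            continue  # nothing to embed for this document
        # $set writes only the embedding field and leaves the rest untouched.
        collection.update_one({"_id": doc["_id"]}, {"$set": {target: embed(text)}})
        updated += 1
    return updated
```

With pymongo, collection would be the projects collection object and the call would look like add_embeddings(collection, generate_embedding). Whether one combined embedding or a separate embedding per field works better depends on how you plan to query, so treat the single-vector choice here as an assumption rather than a recommendation.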