DeepSeek and the Future of LLMs: Why MongoDB’s LLM-agnostic Approach Matters

Han Heloir, Richmond Alake • 10 min read • Published Feb 01, 2025 • Updated Feb 01, 2025

AI Atlas Vector Search Python


Artificial intelligence continues to evolve at a rapid pace. The emergence of DeepSeek, a groundbreaking open-source large language model (LLM), is a testament to the speed of innovation among LLMs with reasoning capabilities.

Developed with a reported budget of under $6 million, DeepSeek-R1 rivals some of the most advanced commercial models, such as OpenAI's GPT-4. It marks a pivotal moment in the evolution of AI, especially in the contest between proprietary and open-source models.

This article explores what makes DeepSeek-R1 unique, how MongoDB’s LLM-agnostic approach complements it, and why this combination is key to staying ahead in the AI race.

What's covered:

Overview of DeepSeek and its comparison with commercial models like GPT-4
Highlights of DeepSeek's training methodology, open-source accessibility, competitive performance, and cost efficiency
Explanation of how MongoDB avoids vendor lock-in, leverages real-time data, and accelerates AI adoption
Step-by-step outline of implementing a retrieval-augmented generation (RAG) system using MongoDB
Guide to building a question-and-answer system by combining DeepSeek-R1 with MongoDB

All code presented in this article can be accessed in this GitHub repository and notebook.

What makes DeepSeek-R1 a game-changer?

DeepSeek-R1 stands out in the crowded AI landscape as an LLM that breaks many traditional molds. While commercial LLMs like OpenAI's GPT-4 and Google's Gemini dominate headlines and the AI stacks of many applications, DeepSeek-R1 introduces a cost-efficient, open-source alternative that is both accessible and competitive across several key benchmarks.

Here’s a closer look at why this model is transformative:

1. Reinforcement learning without supervised data

One of DeepSeek-R1's most striking innovations lies in its training methodology:

Reinforcement learning-based training: Unlike traditional LLMs that rely heavily on human-labeled datasets, DeepSeek-R1 was initially trained solely using reinforcement learning. This method allows the model to learn through trial and error, self-correcting its responses. It also minimizes reliance on expensive, large-scale, human-labeled datasets, drastically reducing training costs.
Hybrid approach in later stages: To enhance output quality, later training stages combined reinforcement learning with high-quality chain-of-thought (CoT) supervised data. This addition enabled the model to refine its reasoning and provide clearer, more structured answers.

2. Open-source accessibility

DeepSeek-R1 disrupts the status quo by being a fully open-source model:

Free access for all: Developers can download, customize, and deploy the model without licensing constraints.
Smaller, efficient variants: In addition to the main model, which has 671 billion parameters, smaller distilled versions (e.g., 1.5 billion parameters) allow for lightweight deployment on consumer-grade hardware like iPhones or M2 Macs.

3. Performance that rivals commercial giants

DeepSeek-R1 has demonstrated strong results across widely used benchmarks:

MATH-500 benchmark: DeepSeek-R1 scored 97.3%, slightly surpassing OpenAI's o1-1217, which scored 96.4%. This benchmark evaluates models on diverse high school-level mathematical problems requiring detailed reasoning.
AIME 2024: On the American Invitational Mathematics Examination (AIME) 2024 benchmark, DeepSeek-R1 scored 79.8%, edging out OpenAI's o1-1217, which scored 79.2%. This benchmark assesses advanced multi-step mathematical reasoning.
Codeforces benchmark: DeepSeek-R1 ranked in the 96.3rd percentile, closely trailing OpenAI's o1-1217 at the 96.6th percentile. This benchmark evaluates a model's coding and algorithmic reasoning capabilities.
SWE-bench Verified: DeepSeek-R1 scored 49.2%, slightly ahead of OpenAI's o1-1217, which scored 48.9%. This benchmark assesses a model's ability to resolve real-world software engineering tasks.
GPQA Diamond: On this benchmark of graduate-level science questions, DeepSeek-R1 scored 71.5%, while OpenAI's o1-1217 achieved 75.7%. It measures expert-level factual reasoning.
MMLU: On the Massive Multitask Language Understanding benchmark, which spans various disciplines, DeepSeek-R1 scored 90.8%, slightly below OpenAI's o1-1217, which scored 91.8%.

![Figure 1: Benchmark performance from DeepSeek-R1][1]

Figure 1: Benchmark performance from DeepSeek-R1
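
As a quick sanity check on the figures quoted above, the margins between the two models can be computed directly. The sketch below uses only the scores listed in this section; the dictionary layout and variable names are illustrative.

```python
# Scores quoted in this section: (DeepSeek-R1, OpenAI o1-1217), in percent.
scores = {
    "MATH-500": (97.3, 96.4),
    "AIME 2024": (79.8, 79.2),
    "Codeforces": (96.3, 96.6),
    "SWE-bench Verified": (49.2, 48.9),
    "GPQA Diamond": (71.5, 75.7),
    "MMLU": (90.8, 91.8),
}

# Positive margin means DeepSeek-R1 is ahead; negative means o1-1217 is ahead.
margins = {name: round(r1 - o1, 1) for name, (r1, o1) in scores.items()}

for name, margin in margins.items():
    leader = "DeepSeek-R1" if margin > 0 else "o1-1217"
    print(f"{name:<20} {margin:+.1f} points ({leader} ahead)")

# Count the benchmarks on which DeepSeek-R1 has the higher score.
r1_wins = sum(1 for m in margins.values() if m > 0)
```

Three of the six margins favor DeepSeek-R1, and every gap except GPQA Diamond is about one point or less.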

MongoDB’s LLM-agnostic approach: empowering AI with flexibility and scalability

As organizations increasingly adopt LLMs, the need for flexible, adaptable data solutions has become more critical than ever. MongoDB’s LLM-agnostic architecture provides a powerful foundation for building and scaling AI solutions, enabling businesses to navigate the rapidly evolving AI landscape with confidence.

1. Avoid vendor lock-in

The AI ecosystem evolves quickly, with new models and technologies emerging regularly. MongoDB Atlas supports integration with any LLM, whether it’s a proprietary model like GPT-4 or an open-source alternative. This allows organizations to:

Experiment freely: Test different models to identify the best fit for your use case.
Adapt to change: Transition to new technologies as they arise without being tied to a single vendor or incurring high switching costs.

This flexibility ensures that your AI strategy remains agile and future-ready, able to adapt to new opportunities and advancements in the field.
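
One way to picture this flexibility in application code is to keep the retrieval layer fixed and treat the model as a swappable component. The following is a minimal sketch, not a MongoDB API: `answer_question`, `Retriever`, `Generator`, and the stand-in retriever and model functions are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases: any function with these shapes can be plugged in.
Retriever = Callable[[str], List[Dict[str, str]]]
Generator = Callable[[str], str]


def answer_question(query: str, retrieve: Retriever, generate: Generator) -> str:
    """Build a prompt from retrieved documents and pass it to whichever LLM is supplied."""
    docs = retrieve(query)
    context = "\n".join(f"- {doc['title']}: {doc['plot']}" for doc in docs)
    prompt = f"Question: {query}\nContext:\n{context}\nAnswer:"
    return generate(prompt)


# Stand-ins for real components; in practice these would wrap a database query and model calls.
def fake_retriever(query: str) -> List[Dict[str, str]]:
    return [{"title": "Example Movie", "plot": "A placeholder plot."}]


def model_a(prompt: str) -> str:
    return f"[model-a] saw {len(prompt)} characters"


def model_b(prompt: str) -> str:
    return f"[model-b] saw {len(prompt)} characters"


# Swapping the model changes one argument; the retrieval code is untouched.
answer_a = answer_question("What should I watch?", fake_retriever, model_a)
answer_b = answer_question("What should I watch?", fake_retriever, model_b)
print(answer_a)
print(answer_b)
```

Because both models receive the same prompt built from the same retrieved data, comparing or replacing them does not require changes to the data layer.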

2. Leverage real-time data for smarter AI

Effective AI systems rely on diverse, dynamic datasets to deliver meaningful insights. MongoDB’s flexible schema and real-time capabilities enable organizations to:

Store and process diverse data: Seamlessly integrate structured, unstructured, and semi-structured data to fuel AI models.
Scale effortlessly: Handle high-demand workloads with the scalability required for AI-driven applications.

AI platforms often begin with simple use cases, such as chatbots or documentation search tools, but quickly expand to more complex, business-critical applications. MongoDB’s architecture goes beyond just vector search, offering capabilities like CRUD operations, full-text search, and geospatial processing to handle diverse needs. For example:

Time-series data powers predictive maintenance and anomaly detection in manufacturing.
Geospatial data optimizes delivery routes and enhances location-based services in logistics and retail.
Transactional data helps refine personalization and improve decision-making in customer-focused applications.

This comprehensive approach ensures that your AI applications remain adaptable to future needs, no matter how your use cases evolve.

The pace of innovation in AI is relentless, with new tools and models constantly pushing the boundaries of what’s possible. MongoDB’s design supports this evolution by providing a data foundation that grows alongside your ambitions. Whether you’re just starting your AI journey or scaling existing solutions, MongoDB’s ability to handle various data types and workloads ensures your applications remain relevant and capable.

3. Accelerating AI adoption: why timing matters

Although AI adoption is skyrocketing, many organizations are still in the early stages of their journey. This presents a unique opportunity to build your AI platforms strategically, leveraging modern tools and approaches from the ground up. Whether you’re developing proof-of-concept projects or scaling production applications, MongoDB's flexibility can help you accelerate development and maximize impact.

To succeed with AI, organizations need more than just a database—they need a unified data layer capable of managing diverse data sources, scaling with demand, and adapting to changing business needs. MongoDB provides this foundation, empowering you to:

Start small with focused use cases that demonstrate immediate value.
Expand seamlessly into more complex applications as your needs evolve.
Build systems that are not only powerful today but also adaptable for the future.

As AI continues to reshape industries, having the right tools to manage data and power intelligent applications will be key to staying ahead. MongoDB’s LLM-agnostic approach provides the flexibility, scalability, and future-proofing you need to turn your AI vision into reality.

Practical demo: integrating DeepSeek-R1 with MongoDB

![Figure 2: DeepSeek reference architecture with MongoDB][2]

This demo shows how to build a question-and-answer system that leverages an open-source movie dataset, MongoDB’s vector search capabilities, and DeepSeek-R1 for generating context-aware answers. Follow these steps:

Prerequisites

Python is installed on your system.
You have access to a MongoDB Atlas cluster.
The distilled DeepSeek-R1 model can be downloaded from Hugging Face, ideally onto a machine with a CUDA-capable GPU.

The steps below walk through a simplified but complete demo that uses sentence-transformers/all-MiniLM-L6-v2 for embedding generation and MongoDB for storage and retrieval in a retrieval-augmented generation pipeline:

Step 1: Install required libraries

Install the required libraries and set up secure environment variables (e.g. for your MongoDB Atlas connection URI):

    !pip install --quiet -U pymongo sentence-transformers datasets accelerate

    import os
    import getpass

    # Prompt for a secret without echoing it, then store it as an environment variable
    def set_env_securely(var_name, prompt):
        value = getpass.getpass(prompt)
        os.environ[var_name] = value

Step 2: Data loading

Load a movie dataset from Hugging Face. In this example, we use the MongoDB/embedded_movies dataset. We convert the dataset into a pandas DataFrame, remove any entries missing the plot, and drop the existing embeddings so we can create new ones.

    from datasets import load_dataset
    import pandas as pd

    # Load Dataset from Hugging Face
    dataset = load_dataset("MongoDB/embedded_movies")

    # Convert to a pandas DataFrame and drop records with missing plots
    dataset_df = pd.DataFrame(dataset['train'])
    dataset_df = dataset_df.dropna(subset=['fullplot'])
    print("\nNumber of missing values in each column after removal:")
    print(dataset_df.isnull().sum())

    # Drop the existing plot embeddings as new ones will be generated
    dataset_df = dataset_df.drop(columns=['plot_embedding'])
    dataset_df.head()

Step 3: Generating embeddings

Generate new embeddings for the movie plots using the Sentence-Transformers model `all-MiniLM-L6-v2`. These embeddings will later power our vector search.

    from sentence_transformers import SentenceTransformer

    # Load the embedding model
    embedding_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

    # Function to generate embeddings from text
    def generate_embedding(text):
        return embedding_model.encode([text])[0].tolist()

    # Generate and attach embeddings to the dataset
    dataset_df["embedding"] = dataset_df["fullplot"].apply(generate_embedding)
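
To make the role of these embeddings more concrete, the sketch below computes cosine similarity, the metric used by the vector index later in this tutorial, on toy three-dimensional vectors. The vectors and the `cosine_similarity` helper are illustrative only; real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and Atlas computes similarity on the server side.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for plot embeddings (made-up values).
query_vec = [1.0, 0.0, 1.0]
similar_plot = [2.0, 0.0, 2.0]    # Same direction as the query, different length
unrelated_plot = [0.0, 1.0, 0.0]  # Orthogonal to the query

print(cosine_similarity(query_vec, similar_plot))
print(cosine_similarity(query_vec, unrelated_plot))
```

Because cosine similarity depends only on direction, the first pair scores close to 1 even though the vectors differ in length, while the orthogonal pair scores 0.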

Step 4: Setting up MongoDB (operational and vector database)

MongoDB acts as both an operational and a vector database for the RAG system. MongoDB Atlas specifically provides a database solution that efficiently stores, queries and retrieves vector embeddings.

Creating a database and collection within MongoDB is made simple with MongoDB Atlas.

First, register for a MongoDB Atlas account. For existing users, sign into MongoDB Atlas.
Follow the instructions. Select Atlas UI as the procedure to deploy your first cluster.

Follow MongoDB’s steps to get the connection string from the Atlas UI. After setting up the database and obtaining the Atlas cluster connection URI, securely store the URI within your development environment.

    # Securely set MongoDB URI
    set_env_securely("MONGO_URI", "Enter your MONGO URI: ")

    import pymongo
    from pymongo.errors import PyMongoError

    def get_mongo_client(mongo_uri):
        """Establish and validate connection to MongoDB."""
        try:
            client = pymongo.MongoClient(mongo_uri, appname="devrel.showcase.rag.deepseek_rag_movies.python")
            # 'ping' raises an exception if the deployment cannot be reached
            ping_result = client.admin.command('ping')
        except PyMongoError as e:
            print(f"Connection to MongoDB failed: {e}")
            return None
        if ping_result.get('ok') == 1.0:
            print("Connection to MongoDB successful")
            return client
        print("Connection to MongoDB failed")
        return None

    MONGO_URI = os.environ['MONGO_URI']
    mongo_client = get_mongo_client(MONGO_URI)
    if mongo_client is None:
        raise RuntimeError("Could not connect to MongoDB; check MONGO_URI")

    DB_NAME = "movies_database"
    COLLECTION_NAME = "movies_collection"

    # Create (or get) the database and collection
    db = mongo_client[DB_NAME]
    collection = db[COLLECTION_NAME]

    # Clear any existing data in the collection
    collection.delete_many({})

Step 5: Data ingestion

Insert the processed movie documents (with new embeddings) into MongoDB:

    documents = dataset_df.to_dict('records')
    collection.insert_many(documents)
    print("Data ingestion into MongoDB completed")
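
For larger datasets, it can be convenient to insert documents in fixed-size batches, for example to report progress or to retry a single failed batch. The helper below is a sketch under that assumption; `batched` and the batch size of 500 are hypothetical choices, and the `collection.insert_many` call is left as a comment so the snippet runs without a database connection.

```python
from typing import Iterable, Iterator, List


def batched(items: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # Emit the final, possibly smaller, batch
        yield batch


# Stand-in documents; in the tutorial these would come from dataset_df.to_dict('records').
sample_documents = [{"title": f"Movie {i}"} for i in range(1201)]

batch_sizes = []
for batch in batched(sample_documents, batch_size=500):
    # collection.insert_many(batch)  # Uncomment when connected to MongoDB
    batch_sizes.append(len(batch))

print(batch_sizes)
```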

Step 6: Vector index creation

Create a vector search index on the embedding field in your MongoDB collection. This index uses cosine similarity on 384-dimensional embeddings, which matches the output size of `all-MiniLM-L6-v2`.

    import time
    from pymongo.operations import SearchIndexModel

    embedding_field_name = "embedding"  # Field name containing embeddings
    vector_search_index_name = "vector_index"

    def create_vector_index_definition(dimensions):
        return {
            "fields": [
                {
                    "type": "vector",
                    "path": embedding_field_name,
                    "numDimensions": dimensions,
                    "similarity": "cosine"
                }
            ]
        }

    def setup_vector_search_index(collection, index_definition, index_name="vector_index"):
        new_vector_search_index_model = SearchIndexModel(
            definition=index_definition,
            name=index_name,
            type="vectorSearch"
        )
        try:
            print(f"Creating index '{index_name}'...")
            result = collection.create_search_index(model=new_vector_search_index_model)
            # A fixed wait is a simplification; large collections may need longer to index
            print(f"Waiting for 30 seconds to allow index '{index_name}' to be created...")
            time.sleep(30)
            print(f"Finished waiting for index '{index_name}'.")
            return result
        except Exception as e:
            print(f"Error creating new vector search index '{index_name}': {str(e)}")
            return None

    DIMENSIONS = 384
    vector_index_definition = create_vector_index_definition(dimensions=DIMENSIONS)
    setup_vector_search_index(collection, vector_index_definition, vector_search_index_name)
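
The fixed 30-second wait above is a simplification; index builds can take more or less time depending on collection size. A more robust pattern is to poll until the index reports that it is queryable. The sketch below shows a generic polling helper; `wait_until` is a hypothetical name, and the commented usage assumes a PyMongo version that provides `collection.list_search_indexes` and that Atlas returns a `queryable` field in each index description.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool], timeout: float = 120.0, interval: float = 5.0) -> bool:
    """Call predicate repeatedly until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Assumed usage against Atlas (not executed here):
# def index_is_queryable():
#     indexes = list(collection.list_search_indexes(vector_search_index_name))
#     return bool(indexes) and indexes[0].get("queryable") is True
#
# if not wait_until(index_is_queryable):
#     print("Index is still building; try again later.")

# Self-contained demonstration with a stand-in predicate that succeeds on the third call.
calls = {"count": 0}


def ready_on_third_call() -> bool:
    calls["count"] += 1
    return calls["count"] >= 3


became_ready = wait_until(ready_on_third_call, timeout=1.0, interval=0.01)
print(became_ready, calls["count"])
```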

Step 7: Vector search function

Define a function that performs a vector search in MongoDB. Given a user query, the function generates an embedding, runs a search pipeline, and returns the top matching movie documents.

    def vector_search(user_query, top_k=150):
        """
        Perform a vector search in the MongoDB collection based on the user query.
        """
        query_embedding = generate_embedding(user_query)
        if query_embedding is None:
            # Return an empty list so callers can always iterate over the result
            print("Invalid query or embedding generation failed.")
            return []

        vector_search_stage = {
            "$vectorSearch": {
                "index": vector_search_index_name,
                "queryVector": query_embedding,
                "path": embedding_field_name,
                "numCandidates": top_k,  # Candidates considered before ranking
                "limit": 5,  # Return top 5 matches
            }
        }

        project_stage = {
            "$project": {
                "_id": 0,
                "fullplot": 1,
                "title": 1,
                "genres": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        }

        pipeline = [vector_search_stage, project_stage]
        results = collection.aggregate(pipeline)
        return list(results)
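
`$vectorSearch` also accepts an optional `filter`, so semantic search can be combined with metadata constraints. The sketch below builds such a pipeline as plain Python dictionaries; `build_vector_search_pipeline` and its parameters are hypothetical, and filtering on `genres` assumes that field has also been declared in the index definition with type `filter`, which the index created in Step 6 does not do.

```python
from typing import Any, Dict, List, Optional


def build_vector_search_pipeline(
    query_vector: List[float],
    index_name: str = "vector_index",
    path: str = "embedding",
    num_candidates: int = 150,
    limit: int = 5,
    filter_expr: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Return an aggregation pipeline for $vectorSearch with an optional pre-filter."""
    if limit > num_candidates:
        raise ValueError("limit must not exceed num_candidates")

    search: Dict[str, Any] = {
        "index": index_name,
        "queryVector": query_vector,
        "path": path,
        "numCandidates": num_candidates,  # Nearest neighbors considered during search
        "limit": limit,                   # Documents actually returned
    }
    if filter_expr is not None:
        # Assumes the filtered field is indexed with type "filter" (see lead-in)
        search["filter"] = filter_expr

    project = {"_id": 0, "title": 1, "genres": 1, "score": {"$meta": "vectorSearchScore"}}
    return [{"$vectorSearch": search}, {"$project": project}]


# Toy 384-dimensional query vector; in the tutorial this comes from generate_embedding().
toy_vector = [0.1] * 384
unfiltered = build_vector_search_pipeline(toy_vector)
filtered = build_vector_search_pipeline(toy_vector, filter_expr={"genres": {"$in": ["Action"]}})

print(sorted(filtered[0]["$vectorSearch"].keys()))
# results = list(collection.aggregate(filtered))  # Requires a live Atlas collection
```

Keeping `num_candidates` well above `limit` trades some latency for better recall, since more neighbors are examined before the top results are returned.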

Step 8: Semantic search example

Run a sample query to perform semantic search over the movie dataset:

    query = "What are some interesting action movies to watch that include business?"
    results = vector_search(query)

    print(f"\nTop 5 results for query '{query}':")
    for result in results:
        print(f"Title: {result['title']}, Score: {result['score']:.4f}")

    # Optionally, preview results as a DataFrame
    pd.DataFrame(results).head()

Step 9: Retrieval-augmented generation (RAG) with DeepSeek-R1

Integrate a distilled DeepSeek-R1 model (loaded from Hugging Face) to generate a context-aware answer by combining the user query with the retrieved documents. In this example, we load the model `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` and run a RAG query.

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    MODEL_NAME = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
    # Use the GPU when available; fall back to CPU (much slower) otherwise
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Load DeepSeek model and tokenizer (the tokenizer does not need a device)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
    model.to(device)

    def rag_query(user_query):
        # Retrieve matching documents using vector search
        retrieved_docs = vector_search(user_query)
        # Combine the query and search results into a context string
        combined_information = f"Query: {user_query}\nContinue to answer the query using the Search Results:\n{retrieved_docs}."

        # Tokenize input and move it to the same device as the model
        inputs = tokenizer(combined_information, return_tensors="pt").to(device)
        response = model.generate(**inputs, max_new_tokens=1000)
        # Skip special tokens so the decoded text is free of markers such as end-of-sequence
        return tokenizer.decode(response[0], skip_special_tokens=True)

    # Example query to generate a concise answer
    print(rag_query("What's a romantic movie that I can watch with my wife? Make your response concise"))
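
Two refinements are often useful around this step: formatting the retrieved documents into a compact context instead of interpolating the raw Python list, and separating the model's reasoning from its final answer. The sketch below assumes each retrieved document has `title` and `fullplot` keys (as projected in Step 7) and that the model wraps its reasoning in `<think>...</think>` tags, which is how DeepSeek-R1 models typically format their output; `format_context` and `split_reasoning` are hypothetical helper names.

```python
from typing import Dict, List, Tuple


def format_context(docs: List[Dict], max_plot_chars: int = 300) -> str:
    """Render retrieved movies as a numbered list with truncated plots."""
    lines = []
    for i, doc in enumerate(docs, start=1):
        plot = doc.get("fullplot", "")
        if len(plot) > max_plot_chars:
            plot = plot[:max_plot_chars].rstrip() + "..."
        lines.append(f"{i}. {doc.get('title', 'Untitled')}: {plot}")
    return "\n".join(lines)


def split_reasoning(generated_text: str) -> Tuple[str, str]:
    """Split model output into (reasoning, answer) around a closing </think> tag."""
    reasoning, tag, answer = generated_text.partition("</think>")
    if not tag:  # No reasoning block found; treat everything as the answer
        return "", generated_text.strip()
    return reasoning.replace("<think>", "").strip(), answer.strip()


# Stand-in data so the snippet runs without a database or model.
docs = [
    {"title": "Movie A", "fullplot": "A short plot."},
    {"title": "Movie B", "fullplot": "x" * 400},
]
context = format_context(docs)
print(context)

sample_output = "<think>The user wants romance.</think>Try Movie A."
reasoning, answer = split_reasoning(sample_output)
print(answer)
```

Truncating plots keeps the prompt within a small model's context budget, and returning only the text after the reasoning block gives users a concise answer while the reasoning remains available for debugging.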

In this tutorial, you learned how to:

Load and Clean Data: Use the Hugging Face dataset for movies, cleaning the data and generating new embeddings.
Set Up MongoDB: Connect to MongoDB Atlas, ingest data, and configure a vector search index.
Perform Semantic Search: Implement a vector search pipeline to retrieve relevant movie documents based on a query.
Integrate DeepSeek-R1: Leverage a DeepSeek-R1 model for retrieval-augmented generation to produce context-aware answers.

All code presented in this article can be accessed in this GitHub repository and notebook.

Frequently asked questions (FAQs)

1. What is DeepSeek-R1, and how does it compare to GPT-4?

DeepSeek-R1 is an open-source large language model (LLM) developed with a budget under $6 million. It rivals commercial models like OpenAI's GPT-4 by offering competitive performance across several benchmarks, cost efficiency, and open accessibility. Unlike GPT-4, DeepSeek-R1 is fully open-source, allowing developers to customize and deploy it without licensing constraints.

2. How does DeepSeek-R1 achieve cost efficiency compared to other LLMs?

DeepSeek-R1 was developed with a reported budget of under $6 million, significantly lower than the billions spent by companies like OpenAI and Google. Additionally, using the DeepSeek API costs approximately $2.19 per million output tokens, which is substantially less than OpenAI’s GPT-4 at $60 per million output tokens.

3. What makes MongoDB’s LLM-agnostic approach beneficial for businesses?

MongoDB’s LLM-agnostic architecture allows businesses to integrate and switch between different large language models without being tied to a single vendor. This flexibility enables organizations to experiment with various models, adapt to new technologies seamlessly, and leverage real-time, diverse datasets to enhance their AI-driven applications.

4. Can DeepSeek-R1 be deployed on consumer-grade hardware?

Yes, DeepSeek-R1 offers smaller, distilled versions with as few as 1.5 billion parameters, enabling lightweight deployment on consumer-grade hardware such as iPhones or M2 Macs. This makes it accessible for a wide range of applications without requiring significant computational resources.

5. How does the integration of MongoDB enhance the capabilities of DeepSeek-R1?

Integrating MongoDB with DeepSeek-R1 provides a robust data foundation that supports real-time, structured, and unstructured data retrieval. MongoDB’s multi-modal search capabilities, including vector similarity, full-text, and metadata filtering, enhance the accuracy and relevance of the AI’s responses, making applications more dynamic and contextually aware.
