DeepSeek and the Future of LLMs: Why MongoDB’s LLM-agnostic Approach Matters
Han Heloir, Richmond Alake • 10 min read • Published Feb 01, 2025 • Updated Feb 01, 2025
Once again, we observe how rapidly artificial intelligence evolves today. The emergence of DeepSeek, a groundbreaking open-source large language model (LLM), is a testament to the fast pace of innovation within the domain of LLMs with reasoning capabilities.
Developed with a modest budget of under $6 million, DeepSeek-R1 rivals some of the most advanced commercial models, such as OpenAI's GPT-4. It marks a pivotal moment in the evolution of AI, especially in the battle between proprietary and open-source models.
This article explores what makes DeepSeek-R1 unique, how MongoDB’s LLM-agnostic approach complements it, and why this combination is key to staying ahead in the AI race.
What's covered:
- Overview of DeepSeek and its comparison with commercial models like GPT-4
- Highlights of DeepSeek's training methodology, open-source accessibility, competitive performance, and cost efficiency
- Explanation of how MongoDB avoids vendor lock-in, leverages real-time data, and accelerates AI adoption
- Step-by-step outline of implementing a retrieval-augmented generation (RAG) system using MongoDB
- Guide to building a question-and-answer system by combining DeepSeek-R1 with MongoDB
DeepSeek-R1 stands out in the crowded AI landscape as a revolutionary LLM that breaks many traditional molds. While commercial LLMs like OpenAI's GPT-4 and Google's Gemini dominate headlines and many applications’ AI stack, DeepSeek-R1 introduces a cost-efficient, open-source alternative that is not only accessible but also competitive across several key benchmarks.
Here’s a closer look at why this model is transformative and what challenges it brings:
One of DeepSeek-R1's most striking innovations lies in its training methodology:
- Reinforcement learning-based training: Unlike traditional LLMs that rely heavily on human-labeled datasets, DeepSeek-R1 was initially trained solely using reinforcement learning. This method allows the model to learn through trial and error, self-correcting its responses. It also minimizes reliance on expensive, large-scale, human-labeled datasets, drastically reducing training costs.
- Hybrid approach in later stages: A hybrid approach was introduced to enhance the model's output quality, combining reinforcement learning with high-quality Chain of Thought (CoT) supervised data. This addition enabled the model to refine its reasoning and provide clearer, more structured answers.
DeepSeek-R1 disrupts the status quo by being a fully open-source model:
- Free access for all: Developers can download, customize, and deploy the model without licensing constraints.
- Smaller, efficient variants: In addition to the main model, which has 671 billion parameters, smaller distilled versions (e.g., 1.5 billion parameters) allow for lightweight deployment on consumer-grade hardware like iPhones or M2 Macs.
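To make "lightweight deployment" more concrete, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy. It assumes 2 bytes per parameter for 16-bit weights and 0.5 bytes for 4-bit quantization, and it ignores activations, the KV cache, and runtime overhead, so real requirements will be higher. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_parameters: float, bytes_per_parameter: float) -> float:
    """Approximate memory needed to hold the model weights, in GB (10**9 bytes)."""
    return num_parameters * bytes_per_parameter / 1e9


# Parameter counts taken from the text above.
MODELS = {
    "DeepSeek-R1 (671B)": 671e9,
    "Distilled variant (1.5B)": 1.5e9,
}

# Assumed storage precisions (not stated in the article).
PRECISIONS = {"16-bit": 2.0, "4-bit": 0.5}

for model_name, params in MODELS.items():
    for precision, nbytes in PRECISIONS.items():
        size = weight_memory_gb(params, nbytes)
        print(f"{model_name:26s} {precision:6s} ~{size:,.2f} GB")
```

Under these assumptions, the 1.5-billion-parameter variant needs only a few gigabytes for its weights, which is why it can fit on consumer hardware, while the full model needs well over a terabyte at 16-bit precision.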
DeepSeek-R1 has demonstrated exceptional results in independent benchmarks:
- MATH-500 benchmark: DeepSeek-R1 scored 97.3%, slightly surpassing OpenAI's o1-1217, which scored 96.4%. This benchmark evaluates models on diverse high school-level mathematical problems requiring detailed reasoning.
- AIME 2024: On the American Invitational Mathematics Examination (AIME) 2024 benchmark, DeepSeek-R1 scored 79.8%, edging out OpenAI's o1-1217, which scored 79.2%. This benchmark assesses advanced multi-step mathematical reasoning.
- Codeforces benchmark: DeepSeek-R1 achieved a percentile ranking of 96.3%, closely trailing OpenAI's o1-1217, which scored 96.6%. This benchmark evaluates a model's coding and algorithmic reasoning capabilities.
- SWE-bench Verified: DeepSeek-R1 scored 49.2%, slightly ahead of OpenAI's o1-1217, which scored 48.9%. This benchmark assesses a model's ability to resolve real-world software engineering tasks.
- GPQA Diamond: DeepSeek-R1 scored 71.5%, while OpenAI's o1-1217 achieved 75.7%. This benchmark measures the ability to answer graduate-level science questions written by domain experts.
- MMLU: On the Massive Multitask Language Understanding benchmark, which spans various disciplines, DeepSeek-R1 scored 90.8%, slightly below OpenAI's o1-1217, which scored 91.8%.
![Figure 1: Benchmark performance from DeepSeek-R1][1]
Figure 1: Benchmark performance from DeepSeek-R1
As organizations increasingly adopt LLMs, the need for flexible, adaptable data solutions has become more critical than ever. MongoDB’s LLM-agnostic architecture provides a powerful foundation for building and scaling AI solutions, enabling businesses to navigate the rapidly evolving AI landscape with confidence.
The AI ecosystem evolves quickly, with new models and technologies emerging regularly. MongoDB Atlas supports integration with any LLM, whether it’s a proprietary model like GPT-4 or an open-source alternative. This allows organizations to:
- Experiment freely: Test different models to identify the best fit for your use case.
- Adapt to change: Transition to new technologies as they arise without being tied to a single vendor or incurring high switching costs.
This flexibility ensures that your AI strategy remains agile and future-ready, able to adapt to new opportunities and advancements in the field.
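One way to picture the LLM-agnostic idea at the application level is a thin layer that treats every model as "a function from prompt to text," so the retrieval code never depends on a specific vendor. The sketch below is illustrative only: `ModelRegistry` and the two stub generators are hypothetical names, and a real backend would wrap a hosted API or a locally loaded model instead of returning a canned string.

```python
from typing import Callable, Dict

# Any callable that maps a prompt string to a completion string.
Generator = Callable[[str], str]


class ModelRegistry:
    """Maps model names to generator callables so models can be swapped by name."""

    def __init__(self) -> None:
        self._generators: Dict[str, Generator] = {}

    def register(self, name: str, generator: Generator) -> None:
        self._generators[name] = generator

    def generate(self, name: str, prompt: str) -> str:
        if name not in self._generators:
            raise KeyError(f"No model registered under '{name}'")
        return self._generators[name](prompt)


# Placeholder backends (assumptions for illustration, not real model calls).
def stub_open_model(prompt: str) -> str:
    return f"[open-model stub] {prompt}"


def stub_hosted_model(prompt: str) -> str:
    return f"[hosted-model stub] {prompt}"


registry = ModelRegistry()
registry.register("open", stub_open_model)
registry.register("hosted", stub_hosted_model)

# Switching models is a one-line configuration change.
ACTIVE_MODEL = "open"
print(registry.generate(ACTIVE_MODEL, "Recommend a heist movie."))
```

Because the data layer and the retrieval logic stay the same, experimenting with a different model only means registering another generator.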
Effective AI systems rely on diverse, dynamic datasets to deliver meaningful insights. MongoDB’s flexible schema and real-time capabilities enable organizations to:
- Store and process diverse data: Seamlessly integrate structured, unstructured, and semi-structured data to fuel AI models.
- Scale effortlessly: Handle high-demand workloads with the scalability required for AI-driven applications.
AI platforms often begin with simple use cases, such as chatbots or documentation search tools, but quickly expand to more complex, business-critical applications. MongoDB’s architecture goes beyond just vector search, offering capabilities like CRUD operations, full-text search, and geospatial processing to handle diverse needs. For example:
- Time-series data powers predictive maintenance and anomaly detection in manufacturing.
- Geospatial data optimizes delivery routes and enhances location-based services in logistics and retail.
- Transactional data helps refine personalization and improve decision-making in customer-focused applications.
This comprehensive approach ensures that your AI applications remain adaptable to future needs, no matter how your use cases evolve.
The pace of innovation in AI is relentless, with new tools and models constantly pushing the boundaries of what’s possible. MongoDB’s design supports this evolution by providing a data foundation that grows alongside your ambitions. Whether you’re just starting your AI journey or scaling existing solutions, MongoDB’s ability to handle various data types and workloads ensures your applications remain relevant and capable.
Although AI adoption is skyrocketing, many organizations are still in the early stages of their journey. This presents a unique opportunity to build your AI platforms strategically, leveraging modern tools and approaches from the ground up. Whether you’re developing proof-of-concept projects or scaling production applications, MongoDB's flexibility can help you accelerate development and maximize impact.
To succeed with AI, organizations need more than just a database—they need a unified data layer capable of managing diverse data sources, scaling with demand, and adapting to changing business needs. MongoDB provides this foundation, empowering you to:
- Start small with focused use cases that demonstrate immediate value.
- Expand seamlessly into more complex applications as your needs evolve.
- Build systems that are not only powerful today but also adaptable for the future.
As AI continues to reshape industries, having the right tools to manage data and power intelligent applications will be key to staying ahead. MongoDB’s LLM-agnostic approach provides the flexibility, scalability, and future-proofing you need to turn your AI vision into reality.
![Figure 2: DeepSeek reference architecture with MongoDB][2]
This demo shows how to build a question-and-answer system that leverages an open-source movie dataset, MongoDB’s vector search capabilities, and DeepSeek-R1 for generating context-aware answers. Before you start, make sure that:
- Python is installed on your system.
- You have access to a MongoDB Atlas cluster.
- The DeepSeek-R1 model is available via its API.
Below is the simplified and complete demo code integrating sentence-transformers/all-MiniLM-L6-v2 with MongoDB for embedding generation and retrieval-augmented generation:
Install the required libraries and set up secure environment variables (e.g. for your MongoDB Atlas connection URI):
```
!pip install --quiet -U pymongo sentence-transformers datasets accelerate
```

```python
import os
import getpass

# Function to securely get and set environment variables
def set_env_securely(var_name, prompt):
    value = getpass.getpass(prompt)
    os.environ[var_name] = value
```
Load a movie dataset from Hugging Face. In this example, we use the MongoDB/embedded_movies dataset. We convert the dataset into a pandas DataFrame, remove any entries missing the plot, and drop the existing embeddings so we can create new ones.
```python
from datasets import load_dataset
import pandas as pd

# Load Dataset from Hugging Face
dataset = load_dataset("MongoDB/embedded_movies")

# Convert to a pandas DataFrame and drop records with missing plots
dataset_df = pd.DataFrame(dataset['train'])
dataset_df = dataset_df.dropna(subset=['fullplot'])
print("\nNumber of missing values in each column after removal:")
print(dataset_df.isnull().sum())

# Drop the existing plot embeddings as new ones will be generated
dataset_df = dataset_df.drop(columns=['plot_embedding'])
dataset_df.head()
```
Generate new embeddings for the movie plots using the Sentence-Transformers model `all-MiniLM-L6-v2`. These embeddings will later power our vector search.

```python
from sentence_transformers import SentenceTransformer

# Load the embedding model
embedding_model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Function to generate embeddings from text
def generate_embedding(text):
    return embedding_model.encode([text])[0].tolist()

# Generate and attach embeddings to the dataset
dataset_df["embedding"] = dataset_df["fullplot"].apply(generate_embedding)
```
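Vector search ranks documents by how close their embeddings are to the query embedding. As a minimal illustration of that idea, the sketch below computes cosine similarity in plain Python on toy 3-dimensional vectors; the vectors are made up for the example, and real embeddings from `all-MiniLM-L6-v2` have 384 dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed values, for illustration only).
query_vec = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]   # points the same way, different length
orthogonal = [0.0, 1.0, 0.0]       # shares no direction with the query

print(cosine_similarity(query_vec, same_direction))  # close to 1.0
print(cosine_similarity(query_vec, orthogonal))      # 0.0
```

Cosine similarity depends only on direction, not magnitude, which is why the longer vector still scores as a near-perfect match.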
MongoDB acts as both an operational and a vector database for the RAG system. MongoDB Atlas specifically provides a database solution that efficiently stores, queries and retrieves vector embeddings.
Creating a database and collection within MongoDB is made simple with MongoDB Atlas.
- Follow the instructions. Select Atlas UI as the procedure to deploy your first cluster.
Follow MongoDB’s steps to get the connection string from the Atlas UI. After setting up the database and obtaining the Atlas cluster connection URI, securely store the URI within your development environment.
```python
# Securely set MongoDB URI
set_env_securely("MONGO_URI", "Enter your MONGO URI: ")
```

```python
import pymongo

def get_mongo_client(mongo_uri):
    """Establish and validate connection to MongoDB."""
    client = pymongo.MongoClient(mongo_uri, appname="devrel.showcase.rag.deepseek_rag_movies.python")
    ping_result = client.admin.command('ping')
    if ping_result.get('ok') == 1.0:
        print("Connection to MongoDB successful")
        return client
    else:
        print("Connection to MongoDB failed")
        return None

MONGO_URI = os.environ['MONGO_URI']
mongo_client = get_mongo_client(MONGO_URI)

DB_NAME = "movies_database"
COLLECTION_NAME = "movies_collection"

# Create (or get) the database and collection
db = mongo_client[DB_NAME]
collection = db[COLLECTION_NAME]

# Clear any existing data in the collection
collection.delete_many({})
```
Insert the processed movie documents (with new embeddings) into MongoDB:
```python
documents = dataset_df.to_dict('records')
collection.insert_many(documents)
print("Data ingestion into MongoDB completed")
```
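For larger datasets, it can be convenient to ingest documents in smaller batches so that progress can be reported and a failed batch can be retried on its own. The following is a minimal sketch of a batching helper; `batched` and the batch size are assumptions for illustration, and the `insert_many` loop is shown only as a comment because it needs a live collection.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


# Stand-in records; in the tutorial these would be the movie documents.
sample_documents = [{"title": f"Movie {i}"} for i in range(7)]

for batch_number, batch in enumerate(batched(sample_documents, 3), start=1):
    # With a real collection this would be: collection.insert_many(batch)
    print(f"Batch {batch_number}: {len(batch)} documents")
```

For a dataset the size of the one used here, a single `insert_many` call as shown above is sufficient; batching mostly helps with visibility and recovery on much larger loads.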
Create a vector search index on the `embedding` field in your MongoDB collection. This index uses cosine similarity on 384-dimensional embeddings.

```python
import time
from pymongo.operations import SearchIndexModel

embedding_field_name = "embedding"  # Field name containing embeddings
vector_search_index_name = "vector_index"

def create_vector_index_definition(dimensions):
    return {
        "fields": [
            {
                "type": "vector",
                "path": embedding_field_name,
                "numDimensions": dimensions,
                "similarity": "cosine"
            }
        ]
    }

def setup_vector_search_index(collection, index_definition, index_name="vector_index"):
    new_vector_search_index_model = SearchIndexModel(
        definition=index_definition,
        name=index_name,
        type="vectorSearch"
    )
    try:
        result = collection.create_search_index(model=new_vector_search_index_model)
        print(f"Creating index '{index_name}'...")
        print(f"Waiting for 30 seconds to allow index '{index_name}' to be created...")
        time.sleep(30)
        print(f"Index '{index_name}' is ready.")
        return result
    except Exception as e:
        print(f"Error creating new vector search index '{index_name}': {str(e)}")
        return None

DIMENSIONS = 384
vector_index_definition = create_vector_index_definition(dimensions=DIMENSIONS)
setup_vector_search_index(collection, vector_index_definition, vector_search_index_name)
```
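A fixed 30-second wait is a simplification: index builds can finish sooner or take longer depending on data size. As an alternative sketch, the helper below polls PyMongo's `list_search_indexes` until the index reports `queryable`. The function name `wait_until_queryable`, the timeout, and the polling interval are assumptions for illustration; the injectable `sleep` parameter exists only to make the helper easy to test.

```python
import time
from typing import Any, Callable


def wait_until_queryable(
    collection: Any,
    index_name: str,
    timeout_s: float = 120.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the search index status; return True once it is queryable, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        # list_search_indexes yields one description document per matching index.
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable") is True:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)


# Usage with the tutorial's collection (requires a live Atlas cluster):
# if wait_until_queryable(collection, "vector_index"):
#     print("Index is ready for queries.")
# else:
#     print("Timed out waiting for the index.")
```

Polling avoids both querying an index that is still building and waiting longer than necessary.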
Define a function that performs a vector search in MongoDB. Given a user query, the function generates an embedding, runs a search pipeline, and returns the top matching movie documents.
```python
def vector_search(user_query, top_k=150):
    """
    Perform a vector search in the MongoDB collection based on the user query.
    """
    query_embedding = generate_embedding(user_query)
    if query_embedding is None:
        return "Invalid query or embedding generation failed."

    vector_search_stage = {
        "$vectorSearch": {
            "index": vector_search_index_name,
            "queryVector": query_embedding,
            "path": embedding_field_name,
            "numCandidates": top_k,
            "limit": 5,  # Return top 5 matches
        }
    }

    project_stage = {
        "$project": {
            "_id": 0,
            "fullplot": 1,
            "title": 1,
            "genres": 1,
            "score": {"$meta": "vectorSearchScore"},
        }
    }

    pipeline = [vector_search_stage, project_stage]
    results = collection.aggregate(pipeline)
    return list(results)
```
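In the pipeline above, `numCandidates` controls how many nearest neighbors are considered during the approximate search, while `limit` controls how many are returned; `numCandidates` must not be smaller than `limit`. The sketch below separates pipeline construction from execution so those two knobs, plus an optional pre-filter, are explicit. `build_vector_search_pipeline` and the example filter are hypothetical, and a `filter` only works on fields that the index definition also declares with `"type": "filter"`, which the index created in this tutorial does not do.

```python
from typing import Any, Dict, List, Optional, Sequence


def build_vector_search_pipeline(
    query_vector: Sequence[float],
    index_name: str = "vector_index",
    path: str = "embedding",
    limit: int = 5,
    num_candidates: int = 150,
    pre_filter: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Build a $vectorSearch + $project aggregation pipeline as plain dictionaries."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if num_candidates < limit:
        raise ValueError("num_candidates must be greater than or equal to limit")

    search_stage: Dict[str, Any] = {
        "index": index_name,
        "path": path,
        "queryVector": list(query_vector),
        "numCandidates": num_candidates,
        "limit": limit,
    }
    if pre_filter:
        # Assumes the filtered field is indexed with type "filter".
        search_stage["filter"] = pre_filter

    project_stage = {
        "_id": 0,
        "title": 1,
        "genres": 1,
        "fullplot": 1,
        "score": {"$meta": "vectorSearchScore"},
    }
    return [{"$vectorSearch": search_stage}, {"$project": project_stage}]


# Toy query vector; a real one would come from generate_embedding(user_query).
example_pipeline = build_vector_search_pipeline(
    [0.1, 0.2, 0.3],
    limit=3,
    num_candidates=60,
    pre_filter={"genres": {"$eq": "Action"}},  # hypothetical filter
)
print(example_pipeline[0]["$vectorSearch"]["numCandidates"])
```

A higher `numCandidates` generally improves recall at the cost of latency, so it is worth tuning against your own data rather than treating 150 as a universal default.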
Run a sample query to perform semantic search over the movie dataset:
```python
query = "What are some interesting action movies to watch that include business?"
results = vector_search(query)

print(f"\nTop 5 results for query '{query}':")
for result in results:
    print(f"Title: {result['title']}, Score: {result['score']:.4f}")

# Optionally, preview results as a DataFrame
pd.DataFrame(results).head()
```
Integrate the DeepSeek-R1 model (loaded from Hugging Face) to generate a context-aware answer by combining the user query with the retrieved documents. In this example, we load the model `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` and run a RAG query.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"

# Load DeepSeek model and tokenizer; only the model needs to be moved to the GPU
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
model.to("cuda")

def rag_query(user_query):
    # Retrieve matching documents using vector search
    retrieved_docs = vector_search(user_query)
    # Combine the query and search results into a context string
    combined_information = f"Query: {user_query}\nContinue to answer the query using the Search Results:\n{retrieved_docs}."

    # Tokenize the prompt and move the resulting tensors to the GPU
    inputs = tokenizer(combined_information, return_tensors="pt").to("cuda")
    response = model.generate(**inputs, max_new_tokens=1000)
    return tokenizer.decode(response[0], skip_special_tokens=False)

# Example query to generate a concise answer
print(rag_query("What's a romantic movie that I can watch with my wife? Make your response concise"))
```
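The prompt above embeds the raw Python list of retrieved documents, which works but can be noisy for a small model. As a possible refinement, the sketch below formats the results into a compact, numbered context and strips the `<think>...</think>` reasoning block that DeepSeek-R1 models typically emit before their final answer. The helper names, the prompt wording, and the 500-character plot cutoff are all assumptions for illustration, not part of the original demo.

```python
import re
from typing import Any, Dict, List


def format_context(docs: List[Dict[str, Any]], max_plot_chars: int = 500) -> str:
    """Render retrieved movie documents as a numbered, human-readable context."""
    entries = []
    for position, doc in enumerate(docs, start=1):
        title = doc.get("title", "Unknown title")
        genres = ", ".join(doc.get("genres") or [])
        plot = (doc.get("fullplot") or "")[:max_plot_chars]  # truncate long plots
        entries.append(f"[{position}] {title} ({genres})\n{plot}")
    return "\n\n".join(entries)


def build_prompt(user_query: str, docs: List[Dict[str, Any]]) -> str:
    """Combine the search results and the user question into a single prompt."""
    return (
        "Answer the question using only the search results below.\n\n"
        f"Search results:\n{format_context(docs)}\n\n"
        f"Question: {user_query}\nAnswer:"
    )


def strip_reasoning(generated_text: str) -> str:
    """Remove <think>...</think> blocks so only the final answer remains."""
    return re.sub(r"<think>.*?</think>", "", generated_text, flags=re.DOTALL).strip()


# Stand-in data shaped like the documents returned by vector_search.
sample_docs = [
    {"title": "Example Movie", "genres": ["Romance", "Drama"], "fullplot": "Two strangers meet."},
]
print(build_prompt("What's a romantic movie to watch?", sample_docs))
print(strip_reasoning("<think>Consider the genres...</think>\nTry Example Movie."))
```

Keeping prompt construction in its own function also makes it easier to reuse the same retrieval step with a different model.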
In this tutorial, you learned how to:
- Load and Clean Data: Use the Hugging Face dataset for movies, cleaning the data and generating new embeddings.
- Set Up MongoDB: Connect to MongoDB Atlas, ingest data, and configure a vector search index.
- Perform Semantic Search: Implement a vector search pipeline to retrieve relevant movie documents based on a query.
- Integrate DeepSeek-R1: Leverage a DeepSeek-R1 model for retrieval-augmented generation to produce context-aware answers.
DeepSeek-R1 is an open-source large language model (LLM) developed with a budget under $6 million. It rivals commercial models like OpenAI's GPT-4 by offering competitive performance across several benchmarks, cost efficiency, and open accessibility. Unlike GPT-4, DeepSeek-R1 is fully open-source, allowing developers to customize and deploy it without licensing constraints.
DeepSeek-R1 was developed with a modest budget of under $6 million, significantly lower than the billions spent by companies like OpenAI and Google. Additionally, using the DeepSeek API costs approximately $2.19 per million output tokens, which is substantially less than OpenAI’s GPT-4 at $60 per million output tokens.
MongoDB’s LLM-agnostic architecture allows businesses to integrate and switch between different large language models without being tied to a single vendor. This flexibility enables organizations to experiment with various models, adapt to new technologies seamlessly, and leverage real-time, diverse datasets to enhance their AI-driven applications.
Yes, DeepSeek-R1 offers smaller, distilled versions with as few as 1.5 billion parameters, enabling lightweight deployment on consumer-grade hardware such as iPhones or M2 Macs. This makes it accessible for a wide range of applications without requiring significant computational resources.
Integrating MongoDB with DeepSeek-R1 provides a robust data foundation that supports real-time, structured, and unstructured data retrieval. MongoDB’s multi-modal search capabilities, including vector similarity, full-text, and metadata filtering, enhance the accuracy and relevance of the AI’s responses, making applications more dynamic and contextually aware.