Build a Local RAG Implementation with Atlas Vector Search

This tutorial demonstrates how to implement retrieval-augmented generation (RAG) locally, without the need for API keys or credits. To learn more about RAG, see Retrieval-Augmented Generation (RAG) with Atlas Vector Search.

Specifically, you perform the following actions:

Create a local Atlas deployment or deploy a cluster on the cloud.
Set up the environment.
Use a local embedding model to generate vector embeddings.
Create an Atlas Vector Search index on your data.
Use a local LLM to answer questions on your data.
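
The last three actions can be pictured with a toy, in-memory sketch. This is not the tutorial's implementation: `toy_embed` is a hypothetical stand-in for a local embedding model, the brute-force cosine ranking stands in for an Atlas Vector Search index, and the assembled prompt is what you would hand to a local LLM. All function and variable names are illustrative assumptions.

```python
import math
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocab(docs):
    """Assign each distinct word in the corpus a vector position."""
    vocab = {}
    for doc in docs:
        for word in tokenize(doc):
            vocab.setdefault(word, len(vocab))
    return vocab

def toy_embed(text, vocab):
    """Hypothetical stand-in for an embedding model: word counts."""
    vec = [0.0] * len(vocab)
    for word in tokenize(text):
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(question, docs, vocab, k=2):
    """Brute-force stand-in for a vector search: rank docs by similarity."""
    q = toy_embed(question, vocab)
    ranked = sorted(docs, key=lambda d: cosine(q, toy_embed(d, vocab)), reverse=True)
    return ranked[:k]

def build_prompt(question, context_docs):
    """Combine retrieved documents and the question into an LLM prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Illustrative corpus (made up for this sketch).
DOCS = [
    "Atlas Vector Search stores and queries vector embeddings.",
    "Ollama runs open-source models on your own machine.",
    "Local Atlas deployments are intended for testing only.",
]
VOCAB = build_vocab(DOCS)
```

Calling `build_prompt(question, retrieve(question, DOCS, VOCAB))` yields the text a generative model would receive. In the real tutorial, a local embedding model replaces `toy_embed` and an Atlas Vector Search query replaces `retrieve`.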

Tip

Work with a runnable version of this tutorial as a Python notebook.

Background

To complete this tutorial, you can either create a local Atlas deployment by using the Atlas CLI or deploy a cluster on the cloud. The Atlas CLI is the command-line interface for MongoDB Atlas; you can use it to work with Atlas from the terminal, including to create local Atlas deployments. To learn more, see Manage Local and Cloud Deployments from the Atlas CLI.

Note

Local Atlas deployments are intended for testing only. For production environments, deploy a cluster.

You also use the following open-source models in this tutorial:

Nomic Embed Text embedding model
Mistral 7B generative model

There are several ways to download and deploy LLMs locally. In this tutorial, you download Ollama and pull the open-source models listed above to perform RAG tasks.

The C# version of this tutorial uses the Microsoft.Extensions.AI.Ollama package to connect to these models and integrate them with Atlas Vector Search. If you prefer different models or a different framework, you can adapt this tutorial by replacing the Ollama model names with their equivalents for your preferred setup.
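
Whichever client package you use, a running Ollama instance serves the pulled models over a local HTTP API. The following is a minimal sketch of the two requests a RAG flow needs, using only the Python standard library. It assumes Ollama listens on its default address `http://localhost:11434` and that the models were pulled under the tags `nomic-embed-text` and `mistral`; the helper names are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434"

def embed_request(text, model="nomic-embed-text"):
    """Build the URL and JSON body for an embedding request."""
    return f"{OLLAMA_URL}/api/embed", {"model": model, "input": text}

def generate_request(prompt, model="mistral"):
    """Build the URL and JSON body for a non-streaming generation request."""
    return f"{OLLAMA_URL}/api/generate", {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a stream
    }

def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With Ollama running, `post_json(*embed_request("some text"))["embeddings"][0]` should return the embedding vector, and `post_json(*generate_request("some prompt"))["response"]` the generated text.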

The Go version of this tutorial uses the Go language port of LangChain, a popular open-source LLM framework, to connect to these models and integrate them with Atlas Vector Search. If you prefer different models or a different framework, you can adapt this tutorial by replacing the Ollama model names or LangChain library components with their equivalents for your preferred setup.

The Java version of this tutorial pulls the same two open-source models through Ollama, the Nomic Embed Text embedding model and the Mistral 7B generative model, and uses LangChain4j, a popular open-source LLM framework for Java, to connect to these models and integrate them with Atlas Vector Search. If you prefer different models or a different framework, you can adapt this tutorial by replacing the Ollama model names or LangChain4j library components with their equivalents for your preferred setup.

Some language versions of this tutorial instead use the following open-source models:

mxbai-embed-large-v1 embedding model
Mistral 7B generative model

There are several ways to download and deploy LLMs locally. In this tutorial, you download the Mistral 7B model by using GPT4All, an open-source ecosystem for local LLM development.
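
As a rough sketch of how these two models might be loaded in Python, the following assumes the `sentence-transformers` and `gpt4all` packages are installed. The Hugging Face model ID and the GPT4All model file name are assumptions, not values confirmed by this page, and the helper names are hypothetical. The third-party imports sit inside the loader functions so the rest of the code can be read and exercised without the models present.

```python
# Assumed identifiers; check them against your own downloads.
EMBEDDING_MODEL_ID = "mixedbread-ai/mxbai-embed-large-v1"
LLM_FILE = "mistral-7b-openorca.gguf2.Q4_0.gguf"

def load_embedder(model_id=EMBEDDING_MODEL_ID):
    """Load the embedding model (requires sentence-transformers)."""
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(model_id)

def get_embedding(embedder, text):
    """Embed one string and return a plain list of floats."""
    return embedder.encode(text).tolist()

def load_llm(model_file=LLM_FILE):
    """Load the generative model (requires gpt4all)."""
    from gpt4all import GPT4All
    return GPT4All(model_file)

def format_prompt(question, context_docs):
    """Combine retrieved documents and the question into one prompt."""
    context = "\n".join(context_docs)
    return f"Use this context to answer the question.\n{context}\nQuestion: {question}"

def answer(llm, question, context_docs, max_tokens=200):
    """Generate an answer grounded in the supplied context documents."""
    return llm.generate(format_prompt(question, context_docs), max_tokens=max_tokens)
```

A typical call sequence would be `embedder = load_embedder()`, `llm = load_llm()`, then `answer(llm, question, docs)` with `docs` taken from a vector search over embeddings produced by `get_embedding`.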

When working through this tutorial, you use an interactive Python notebook. This environment allows you to create and execute individual code blocks without running the entire file each time.
