
Build AI Agents with Vertex AI Agent Engine and Atlas

Vertex AI Agent Engine is a Google Cloud service that helps you build and scale AI agents in production. You can use the Agent Engine with MongoDB Atlas and your preferred framework to build AI agents for a variety of use cases, including agentic RAG.

The following tutorial demonstrates how you can use the Agent Engine with Atlas to build a RAG agent that can answer questions about sample data. It uses Atlas Vector Search with LangChain to implement the retrieval tools for the agent.

Before you begin, ensure that you have the following:

- An Atlas cluster and its connection string.
- A Google Cloud project with the Vertex AI API enabled.
- A Cloud Storage bucket to use as a staging bucket for the Agent Engine.

Create an interactive Python notebook by saving a file with the .ipynb extension in Google Colab. This notebook allows you to run Python code snippets individually, and you'll use it to run the code in this tutorial.

1

In your notebook environment, install the required packages:

!pip install --upgrade --quiet \
    "google-cloud-aiplatform[langchain,agent_engines]" \
    requests datasets pymongo langchain langchain-community langchain-mongodb \
    langchain-google-vertexai langchain-google-genai beautifulsoup4
2

Run the following code in your notebook to create the MongoDB collections and Atlas Vector Search indexes used to store and query your data for this tutorial. Replace <connection-string> with your cluster's connection string.

Note

Your connection string should use the following format:

mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net
from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

client = MongoClient("<connection-string>")  # Replace with your connection string
db = client["AGENT-ENGINE"]
star_wars_collection = db["sample_starwars_embeddings"]
star_trek_collection = db["sample_startrek_embeddings"]

# Define the index model for both collections
search_index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": 768,
                "similarity": "cosine"
            }
        ]
    },
    name="vector_index",
    type="vectorSearch"
)

# Create the indexes
star_wars_collection.create_search_index(model=search_index_model)
star_trek_collection.create_search_index(model=search_index_model)
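
The indexes build asynchronously and typically become queryable within a minute. Optionally, you can poll until they're ready before you load data (a minimal sketch, assuming PyMongo 4.5+ for the list_search_indexes helper):

import time

def wait_for_index(collection, index_name="vector_index", timeout=120):
    # Poll the index status until Atlas reports it as queryable
    start = time.time()
    while time.time() - start < timeout:
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable"):
            return
        time.sleep(5)
    raise TimeoutError(f"Index '{index_name}' is not queryable after {timeout}s")

wait_for_index(star_wars_collection)
wait_for_index(star_trek_collection)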

To learn more about creating an Atlas Vector Search index, see How to Index Fields for Vector Search.

3

Run the following code in your notebook, replacing the placeholder values with your Google Cloud project ID, region, and staging bucket:

import vertexai

PROJECT_ID = "<your-project-id>"  # Replace with your project ID
LOCATION = "<gcp-region>"  # Replace with your preferred region, e.g. "us-central1"
STAGING_BUCKET = "gs://<your-bucket-name>"  # Replace with your bucket

vertexai.init(project=PROJECT_ID, location=LOCATION, staging_bucket=STAGING_BUCKET)

4

Run the following code to scrape sample data from Wikipedia about Star Wars and Star Trek, convert the text into vector embeddings by using the text-embedding-005 model, and then store this data in the corresponding collections in Atlas:

import requests
from bs4 import BeautifulSoup
from pymongo import MongoClient
import certifi
from vertexai.language_models import TextEmbeddingModel

# Scrape the website content
def scrape_website(url):
    response = requests.get(url)
    soup = BeautifulSoup(response.text, 'html.parser')
    content = ' '.join([p.text for p in soup.find_all('p')])
    return content

# Split the content into chunks of 1000 characters
def split_into_chunks(text, chunk_size=1000):
    return [text[i:i+chunk_size] for i in range(0, len(text), chunk_size)]

# Convert the chunks into vector embeddings
def get_text_embeddings(chunks):
    model = TextEmbeddingModel.from_pretrained("text-embedding-005")
    embeddings = model.get_embeddings(chunks)
    return [embedding.values for embedding in embeddings]

# Write the chunks and their embeddings to Atlas
def write_to_mongoDB(embeddings, chunks, db_name, coll_name):
    client = MongoClient("<connection-string>", tlsCAFile=certifi.where())  # Replace placeholder with your Atlas connection string
    db = client[db_name]
    collection = db[coll_name]
    for i in range(len(chunks)):
        collection.insert_one({
            "chunk": chunks[i],
            "embedding": embeddings[i]
        })

# Process Star Wars data
content = scrape_website("https://en.wikipedia.org/wiki/Star_Wars")
chunks = split_into_chunks(content)
embeddings_starwars = get_text_embeddings(chunks)
write_to_mongoDB(embeddings_starwars, chunks, "AGENT-ENGINE", "sample_starwars_embeddings")

# Process Star Trek data
content = scrape_website("https://en.wikipedia.org/wiki/Star_Trek")
chunks = split_into_chunks(content)
embeddings_startrek = get_text_embeddings(chunks)
write_to_mongoDB(embeddings_startrek, chunks, "AGENT-ENGINE", "sample_startrek_embeddings")

Tip

You can view your data in the Atlas UI by navigating to the AGENT-ENGINE database and selecting the sample_starwars_embeddings and sample_startrek_embeddings collections.
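
You can also verify the writes from your notebook (a quick check; the exact counts depend on the length of the scraped articles):

print(db["sample_starwars_embeddings"].count_documents({}))
print(db["sample_startrek_embeddings"].count_documents({}))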

In this section, you define tools that the agent can use to query your collections using Atlas Vector Search, create a memory system to maintain conversation context, and then initialize the agent using LangChain.

1

Create the following two tools:

Run the following code to create a tool that uses Atlas Vector Search to query the sample_starwars_embeddings collection:

def star_wars_query_tool(query: str):
    """
    Retrieves vectors from a MongoDB database and uses them to answer a question related to Star Wars.

    Args:
        query: The question to be answered about Star Wars.

    Returns:
        A dictionary containing the response to the question.
    """
    from langchain.chains import ConversationalRetrievalChain
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
    from langchain.memory import ConversationBufferWindowMemory
    from langchain.prompts import PromptTemplate

    prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Do not return any answers from your own knowledge. Respond only in 2 or 3 sentences.

    {context}

    Question: {question}
    """
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Replace with your connection string to your Atlas cluster
    connection_string = "<connection-string>"

    embeddings = VertexAIEmbeddings(model_name="text-embedding-005")
    vs = MongoDBAtlasVectorSearch.from_connection_string(
        connection_string=connection_string,
        namespace="AGENT-ENGINE.sample_starwars_embeddings",
        embedding=embeddings,
        index_name="vector_index",
        embedding_key="embedding",
        text_key="chunk",
    )
    llm = ChatVertexAI(
        model_name="gemini-pro",
        convert_system_message_to_human=True,
        max_output_tokens=1000,
    )
    retriever = vs.as_retriever(
        search_type="mmr", search_kwargs={"k": 10, "lambda_mult": 0.25}
    )
    memory = ConversationBufferWindowMemory(
        memory_key="chat_history", k=5, return_messages=True
    )
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )
    response = conversation_chain({"question": query})
    return response
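
Optionally, call the tool directly to confirm that retrieval works before wiring it into the agent (a quick smoke test; the question is just an example, and the chain's output dictionary includes an answer key):

print(star_wars_query_tool("Who created Star Wars?")["answer"])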

Run the following code to create a tool that uses Atlas Vector Search to query the sample_startrek_embeddings collection:

def star_trek_query_tool(query: str):
    """
    Retrieves vectors from a MongoDB database and uses them to answer a question related to Star Trek.

    Args:
        query: The question to be answered about Star Trek.

    Returns:
        A dictionary containing the response to the question.
    """
    from langchain.chains import ConversationalRetrievalChain
    from langchain_mongodb import MongoDBAtlasVectorSearch
    from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
    from langchain.memory import ConversationBufferWindowMemory
    from langchain.prompts import PromptTemplate

    prompt_template = """Use the following pieces of context to answer the question at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer. Do not return any answers from your own knowledge. Respond only in 2 or 3 sentences.

    {context}

    Question: {question}
    """
    PROMPT = PromptTemplate(
        template=prompt_template, input_variables=["context", "question"]
    )

    # Replace with your connection string to your Atlas cluster
    connection_string = "<connection-string>"

    embeddings = VertexAIEmbeddings(model_name="text-embedding-005")
    vs = MongoDBAtlasVectorSearch.from_connection_string(
        connection_string=connection_string,
        namespace="AGENT-ENGINE.sample_startrek_embeddings",
        embedding=embeddings,
        index_name="vector_index",
        embedding_key="embedding",
        text_key="chunk",
    )
    llm = ChatVertexAI(
        model_name="gemini-pro",
        convert_system_message_to_human=True,
        max_output_tokens=1000,
    )
    retriever = vs.as_retriever(
        search_type="mmr", search_kwargs={"k": 10, "lambda_mult": 0.25}
    )
    memory = ConversationBufferWindowMemory(
        memory_key="chat_history", k=5, return_messages=True
    )
    conversation_chain = ConversationalRetrievalChain.from_llm(
        llm=llm,
        retriever=retriever,
        memory=memory,
        combine_docs_chain_kwargs={"prompt": PROMPT},
    )
    response = conversation_chain({"question": query})
    return response
2

You can use LangChain to create memory for your agent so that it can maintain conversation context across multiple prompts:

from langchain.memory import ChatMessageHistory

# Initialize session history
store = {}

def get_session_history(session_id: str):
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]
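
Each session ID maps to its own history object, so separate conversations don't share context. For example (a quick illustration; the session ID and message are arbitrary):

history = get_session_history("demo")
history.add_user_message("Hello!")
print(history.messages)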
3

Create the agent using LangChain. This agent uses the tools and memory system that you defined.

from vertexai.preview.reasoning_engines import LangchainAgent

# Specify the language model
model = "gemini-1.5-pro-001"

# Initialize the agent with your tools
agent = LangchainAgent(
    model=model,
    chat_history=get_session_history,
    model_kwargs={"temperature": 0},
    tools=[star_wars_query_tool, star_trek_query_tool],
    agent_executor_kwargs={"return_intermediate_steps": True},
)

To test the agent with a sample query:

from IPython.display import display, Markdown

# Test your agent
response = agent.query(
    input="Who was the antagonist in Star Wars and who played them?",
    config={"configurable": {"session_id": "demo"}},
)
display(Markdown(response["output"]))
The main antagonist in the Star Wars series is Darth Vader, a dark lord of the Sith. He was originally played by David Prowse in the original trilogy, and later voiced by James Earl Jones. In the prequel trilogy, he appears as Anakin Skywalker, and was played by Hayden Christensen.

In this section, you deploy your agent to the Vertex AI Agent Engine as a managed service. This allows you to scale your agent and use it in production without managing the underlying infrastructure.

1

Run the following code to configure and deploy the agent in the Vertex AI Agent Engine:

from vertexai import agent_engines

remote_agent = agent_engines.create(
    agent,
    requirements=[
        "google-cloud-aiplatform[agent_engines,langchain]",
        "cloudpickle==3.0.0",
        "pydantic>=2.10",
        "requests",
        "langchain-mongodb",
        "pymongo",
        "langchain-google-vertexai",
    ],
)
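
Deployment can take several minutes. When it completes, you can print the deployed agent's full resource name directly (a shortcut, assuming the returned object exposes a resource_name attribute):

print(remote_agent.resource_name)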
2

Run the following code to retrieve the project number associated with your project ID. You'll use the project number to construct the complete resource name for your deployed agent:

from googleapiclient import discovery
from IPython.display import display, Markdown
# Retrieve the project number associated with your project ID
service = discovery.build("cloudresourcemanager", "v1")
request = service.projects().get(projectId=PROJECT_ID)
response = request.execute()
project_number = response["projectNumber"]
print(f"Project Number: {project_number}")
# The deployment creates a unique ID for your agent that you can find in the output
3

Run the following code to use your agent. Replace the placeholder with your agent's full resource name:

Note

After deployment, your agent will have a unique resource name in the following format:

projects/<project-number>/locations/<gcp-region>/reasoningEngines/<unique-id>

from vertexai.preview import reasoning_engines

# Replace with your agent's full resource name from the previous step
REASONING_ENGINE_RESOURCE_NAME = "<resource-name>"

remote_agent = reasoning_engines.ReasoningEngine(REASONING_ENGINE_RESOURCE_NAME)

response = remote_agent.query(
    input="tell me about episode 1 of star wars",
    config={"configurable": {"session_id": "demo"}},
)
print(response["output"])

response = remote_agent.query(
    input="Who was the main character in this series",
    config={"configurable": {"session_id": "demo"}},
)
print(response["output"])
Star Wars: Episode I - The Phantom Menace was the first film installment released as part of the prequel trilogy. It was released on May 19, 1999. The main plot lines involve the return of Darth Sidious, the Jedi's discovery of young Anakin Skywalker, and the invasion of Naboo by the Trade Federation.
The main character in Star Wars is Luke Skywalker. He is a young farm boy who dreams of adventure and becomes a Jedi Knight. He fights against the evil Galactic Empire alongside his friends, Princess Leia and Han Solo.

You can also ask the agent about Star Trek using the same session:

response = remote_agent.query(
    input="what is episode 1 of star trek?",
    config={"configurable": {"session_id": "demo"}},
)
print(response["output"])
Episode 1 of Star Trek is called "The Man Trap". It was first aired on September 8, 1966. The story involves the Enterprise crew investigating the disappearance of a crew on a scientific outpost. It turns out that the crew members were killed by a creature that can take on someone else's form after it kills them.

You can also debug and optimize your agents by enabling tracing in the Agent Engine. Refer to the Vertex AI Agent Engine documentation for other features and examples.
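
For example, you can enable tracing when you construct the agent (a minimal sketch; enable_tracing exports execution traces for debugging and assumes that the Cloud Trace API is enabled in your project):

agent = LangchainAgent(
    model=model,
    tools=[star_wars_query_tool, star_trek_query_tool],
    enable_tracing=True,  # Export execution traces (assumes Cloud Trace is enabled)
)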

To learn more about the LangChain MongoDB integration, see Integrate Atlas Vector Search with LangChain.