How to Ingest Quantized Vectors
You can convert your embeddings to BSON BinData vectors of subtype float32 or int8.
Use Cases
We recommend the BSON BinData vector subtype for the following use cases:
You need to index quantized vector output from embedding models.
You have a large number of float vectors and want to reduce the storage and WiredTiger footprint (such as disk and memory usage) in mongod.
Benefits
The BinData vector format requires about three times less disk space in your cluster compared to arrays of elements. It also allows you to index your vectors with alternate types such as int8, reducing the memory needed to build the Atlas Vector Search index for your collection.
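The savings can be estimated from the BSON encoding alone. The following back-of-the-envelope sketch (pure standard-library Python; the helper names are illustrative, not part of any driver API) compares the encoded size of a 1024-dimension embedding stored as a BSON array of doubles versus as a BinData vector of float32 or int8 elements:

```python
def bson_array_bytes(num_dims):
    # A BSON array is a document: 4-byte length prefix, one element per
    # entry (1 type byte + decimal-string key + NUL + 8-byte double),
    # and a trailing 0x00 terminator.
    total = 4 + 1
    for i in range(num_dims):
        total += 1 + len(str(i)) + 1 + 8
    return total

def bindata_vector_bytes(num_dims, bytes_per_element):
    # A BinData element: 4-byte length prefix, 1 subtype byte, a 2-byte
    # vector header (dtype + padding), then the packed elements.
    return 4 + 1 + 2 + num_dims * bytes_per_element

dims = 1024
array_size = bson_array_bytes(dims)        # 13231 bytes
f32_size = bindata_vector_bytes(dims, 4)   # 4103 bytes
int8_size = bindata_vector_bytes(dims, 1)  # 1031 bytes
print(f"array: {array_size}, float32 BinData: {f32_size} "
      f"({array_size / f32_size:.1f}x smaller), int8 BinData: {int8_size}")
```

The float32 BinData form comes out roughly 3.2 times smaller than the double array, consistent with the "about three times" figure above, and int8 shrinks it by another factor of four.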
If you don't already have BinData vectors, you can convert your embeddings to this format by using any supported driver before writing your data to a collection. This page walks you through the steps for converting your embeddings to the BinData vector subtype.
Supported Drivers
The PyMongo Driver v4.10 or later supports converting vectors to the BSON BinData vector subtypes float32 and int8.
Prerequisites
To convert your embeddings to the BSON BinData vector subtype, you need the following:
An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later.
Ensure that your IP address is included in your Atlas project's access list.
An environment to run interactive Python notebooks such as Colab.
Access to an embedding model that supports byte vector output.
The following embedding model providers support int8 BinData vectors:

Embedding Model Provider | Embedding Model |
---|---|
Cohere | embed-english-v3.0 |
Nomic | nomic-embed-text-v1.5 |
Jina AI | jina-embeddings-v2-base-en |
Mixedbread AI | mxbai-embed-large-v1 |

You can use any of these embedding model providers to generate BinData vectors. Scalar quantization preserves recall for these models because they are trained to be quantization-aware. Therefore, recall degradation for scalar-quantized embeddings produced by these models is minimal, even at lower dimensions such as 384.
Procedure
The examples in this procedure use either new data or existing data and Cohere's embed-english-v3.0 model. The example for new data uses sample text strings, which you can replace with your own data. The example for existing data uses a subset of documents without any embeddings from the listingsAndReviews collection in the sample_airbnb database, which you can replace with your own database and collection (with or without any embeddings).
Select the tab based on whether you want to create BinData vectors for new data or for data you already have in your Atlas cluster.
Create an interactive Python notebook by saving a file with the
.ipynb
extension, and then perform the following steps in the
notebook. To try the example, replace the placeholders with valid
values.
Install the required libraries.
Run the following command to install the PyMongo Driver. If necessary, you can also install libraries from your embedding model provider. This operation might take a few minutes to complete.
pip install pymongo
You must install the PyMongo driver v4.10 or later.
Example
Install PyMongo and Cohere
pip --quiet install pymongo cohere
Load the data for which you want to generate BSON vectors in your notebook.
Example
Sample Data to Import
data = [
    "The Great Wall of China is visible from space.",
    "The Eiffel Tower was completed in Paris in 1889.",
    "Mount Everest is the highest peak on Earth at 8,848m.",
    "Shakespeare wrote 37 plays and 154 sonnets during his lifetime.",
    "The Mona Lisa was painted by Leonardo da Vinci.",
]
(Conditional) Generate embeddings from your data.
This step is required if you haven't yet generated embeddings from your data. If you've already generated embeddings, skip this step. To learn more about generating embeddings from your data, see How to Create Vector Embeddings.
Example
Generate Embeddings from Sample Data Using Cohere
Placeholder | Valid Value |
---|---|
<COHERE-API-KEY> | API key for Cohere. |
import cohere

api_key = "<COHERE-API-KEY>"
co = cohere.Client(api_key)

generated_embeddings = co.embed(
    texts=data,
    model="embed-english-v3.0",
    input_type="search_document",
    embedding_types=["float", "int8"]
).embeddings

float32_embeddings = generated_embeddings.float
int8_embeddings = generated_embeddings.int8
Generate the BSON vectors from your embeddings.
You can use the PyMongo driver to convert your native vector embedding to BSON vectors.
Example
Define and Run a Function to Generate BSON Vectors
from bson.binary import Binary, BinaryVectorDtype

def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)

# For all vectors in your collection, generate BSON vectors of float32 and int8 embeddings
bson_float32_embeddings = []
bson_int8_embeddings = []
for f32_emb, int8_emb in zip(float32_embeddings, int8_embeddings):
    bson_float32_embeddings.append(generate_bson_vector(f32_emb, BinaryVectorDtype.FLOAT32))
    bson_int8_embeddings.append(generate_bson_vector(int8_emb, BinaryVectorDtype.INT8))
Create documents with the BSON vector embeddings.
If you already have the BSON vector embeddings inside of documents in your collection, skip this step.
Example
Create Documents from the Sample Data
Placeholder | Valid Value |
---|---|
<FIELD-NAME-FOR-INT8-TYPE> | Name of field with int8 values. |
<FIELD-NAME-FOR-FLOAT32-TYPE> | Name of field with float32 values. |
def create_docs_with_bson_vector_embeddings(bson_float32_embeddings, bson_int8_embeddings, data):
    docs = []
    for i, (bson_f32_emb, bson_int8_emb, text) in enumerate(zip(bson_float32_embeddings, bson_int8_embeddings, data)):
        doc = {
            "_id": i,
            "data": text,
            "<FIELD-NAME-FOR-INT8-TYPE>": bson_int8_emb,
            "<FIELD-NAME-FOR-FLOAT32-TYPE>": bson_f32_emb,
        }
        docs.append(doc)
    return docs

documents = create_docs_with_bson_vector_embeddings(bson_float32_embeddings, bson_int8_embeddings, data)
Load your data into your Atlas cluster.
You can load your data from the Atlas UI and programmatically. To learn how to load your data from the Atlas UI, see Insert Your Data. The following steps and associated examples demonstrate how to load your data programmatically by using the PyMongo driver.
Connect to your Atlas cluster.
Placeholder | Valid Value |
---|---|
<ATLAS-CONNECTION-STRING> | Atlas connection string. To learn more, see Connect via Drivers. |

Example

import pymongo

MONGO_URI = "<ATLAS-CONNECTION-STRING>"

def get_mongo_client(mongo_uri):
    # Establish the connection
    if not mongo_uri:
        print("MONGO_URI not set in environment variables")
        return None
    return pymongo.MongoClient(mongo_uri)

Load the data into your Atlas cluster.
Placeholder | Valid Value |
---|---|
<DB-NAME> | Name of the database. |
<COLLECTION-NAME> | Name of the collection in the specified database. |

Example

client = pymongo.MongoClient(MONGO_URI)
db = client["<DB-NAME>"]
db.create_collection("<COLLECTION-NAME>")
col = db["<COLLECTION-NAME>"]
col.insert_many(documents)
Create the Atlas Vector Search index on the collection.
You can create Atlas Vector Search indexes by using the Atlas UI, Atlas CLI, Atlas Administration API, and MongoDB drivers. To learn more, see How to Index Fields for Vector Search.
Example
Create Index for the Sample Collection
Placeholder | Valid Value |
---|---|
<FIELD-NAME-FOR-INT8-TYPE> | Name of field with int8 values. |
<FIELD-NAME-FOR-FLOAT32-TYPE> | Name of field with float32 values. |
<INDEX-NAME> | Name of vector type index. |
from pymongo.operations import SearchIndexModel

vector_search_index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "<FIELD-NAME-FOR-FLOAT32-TYPE>",
            "similarity": "dotProduct",
            "numDimensions": 1024,
        },
        {
            "type": "vector",
            "path": "<FIELD-NAME-FOR-INT8-TYPE>",
            "similarity": "dotProduct",
            "numDimensions": 1024,
        }
    ]
}

search_index_model = SearchIndexModel(definition=vector_search_index_definition, name="<INDEX-NAME>", type="vectorSearch")
col.create_search_index(model=search_index_model)
Define a function to run the Atlas Vector Search queries.
The function to run Atlas Vector Search queries must perform the following actions:
Convert the query text to a BSON vector.
Define the pipeline for the Atlas Vector Search query.
Example
Placeholder | Valid Value |
---|---|
<FIELD-NAME-FOR-FLOAT32-TYPE> | Name of field with float32 values. |
<INDEX-NAME> | Name of vector type index. |
<NUMBER-OF-CANDIDATES-TO-CONSIDER> | Number of nearest neighbors to use during the search. |
<NUMBER-OF-DOCUMENTS-TO-RETURN> | Number of documents to return in the results. |
def run_vector_search(query_text, collection, path):
    query_text_embeddings = co.embed(
        texts=[query_text],
        model="embed-english-v3.0",
        input_type="search_query",
        embedding_types=["float", "int8"]
    ).embeddings

    if path == "<FIELD-NAME-FOR-FLOAT32-TYPE>":
        query_vector = query_text_embeddings.float[0]
        vector_dtype = BinaryVectorDtype.FLOAT32
    else:
        query_vector = query_text_embeddings.int8[0]
        vector_dtype = BinaryVectorDtype.INT8
    bson_query_vector = generate_bson_vector(query_vector, vector_dtype)

    pipeline = [
        {
            '$vectorSearch': {
                'index': '<INDEX-NAME>',
                'path': path,
                'queryVector': bson_query_vector,
                'numCandidates': <NUMBER-OF-CANDIDATES-TO-CONSIDER>,
                'limit': <NUMBER-OF-DOCUMENTS-TO-RETURN>
            }
        },
        {
            '$project': {
                '_id': 0,
                'data': 1,
                'score': { '$meta': 'vectorSearchScore' }
            }
        }
    ]

    return collection.aggregate(pipeline)
Run the Atlas Vector Search query.
You can run Atlas Vector Search queries programmatically. To learn more, see Run Vector Search Queries.
Example
from pprint import pprint

query_text = "tell me a science fact"
float32_results = run_vector_search(query_text, col, "<FIELD-NAME-FOR-FLOAT32-TYPE>")
int8_results = run_vector_search(query_text, col, "<FIELD-NAME-FOR-INT8-TYPE>")

print("results from float32 embeddings")
pprint(list(float32_results))
print("--------------------------------------------------------------------------")
print("results from int8 embeddings")
pprint(list(int8_results))
results from float32 embeddings
[{'data': 'Mount Everest is the highest peak on Earth at 8,848m.',
  'score': 0.4222325384616852},
 {'data': 'The Great Wall of China is visible from space.',
  'score': 0.4112812876701355},
 {'data': 'The Mona Lisa was painted by Leonardo da Vinci.',
  'score': 0.3871753513813019},
 {'data': 'The Eiffel Tower was completed in Paris in 1889.',
  'score': 0.38428616523742676},
 {'data': 'Shakespeare wrote 37 plays and 154 sonnets during his lifetime.',
  'score': 0.37546128034591675}]
--------------------------------------------------------------------------
results from int8 embeddings
[{'data': 'Mount Everest is the highest peak on Earth at 8,848m.',
  'score': 4.619598996669083e-07},
 {'data': 'The Great Wall of China is visible from space.',
  'score': 4.5106872903488693e-07},
 {'data': 'The Mona Lisa was painted by Leonardo da Vinci.',
  'score': 4.0036800896814384e-07},
 {'data': 'The Eiffel Tower was completed in Paris in 1889.',
  'score': 3.9345573554783186e-07},
 {'data': 'Shakespeare wrote 37 plays and 154 sonnets during his lifetime.',
  'score': 3.797164538354991e-07}]
Install the required libraries.
Run the following command to install the PyMongo Driver. If necessary, you can also install libraries from your embedding model provider. This operation might take a few minutes to complete.
pip install pymongo
You must install the PyMongo driver v4.10 or later.
Example
Install PyMongo and Cohere
pip --quiet install pymongo cohere
Define the functions to generate vector embeddings and convert embeddings to BSON-compatible format.
You must define functions that perform the following by using an embedding model:
Generate embeddings from your existing data if your existing data doesn't have any embeddings.
Convert the embeddings to BSON vectors.
Example
Function to Generate and Convert Embeddings
Placeholder | Valid Value |
---|---|
<COHERE-API-KEY> | API key for Cohere. |
import os
import pymongo
import cohere
from bson.binary import Binary, BinaryVectorDtype

# Specify your Cohere API key and embedding model
os.environ["COHERE_API_KEY"] = "<COHERE-API-KEY>"
cohere_client = cohere.Client(os.environ["COHERE_API_KEY"])

# Function to generate embeddings using Cohere
def get_embedding(text):
    response = cohere_client.embed(
        texts=[text],
        model='embed-english-v3.0',
        input_type='search_document'
    )
    embedding = response.embeddings[0]
    return embedding

# Function to convert embeddings to BSON-compatible format
def generate_bson_vector(vector, vector_dtype):
    return Binary.from_vector(vector, vector_dtype)
Connect to the Atlas cluster and retrieve existing data.
You must provide the following:
Connection string to connect to your Atlas cluster that contains the database and collection for which you want to generate embeddings.
Name of the database that contains the collection for which you want to generate embeddings.
Name of the collection for which you want to generate embeddings.
Example
Connect to Atlas Cluster for Accessing Data
Placeholder | Valid Value |
---|---|
<ATLAS-CONNECTION-STRING> | Atlas connection string. To learn more, see
Connect via Drivers. |
# Connect to your Atlas cluster
mongo_client = pymongo.MongoClient("<ATLAS-CONNECTION-STRING>")
db = mongo_client["sample_airbnb"]
collection = db["listingsAndReviews"]

# Filter to exclude null or empty summary fields
filter = { "summary": {"$nin": [None, ""]} }

# Get a subset of documents in the collection
documents = collection.find(filter).limit(50)

# Initialize the count of updated documents
updated_doc_count = 0
Generate, convert, and load embeddings into your collection.
Generate embeddings from your data using any embedding model if your data doesn't already have embeddings. To learn more about generating embeddings from your data, see How to Create Vector Embeddings.
Convert the embeddings to BSON vectors (as shown on line 7 in the following example).
Upload the embeddings to your collection on the Atlas cluster.
This operation might take a few minutes to complete.
Example
Generate, Convert, and Load Embeddings to Collection
1  for doc in documents:
2      # Generate embeddings based on the summary
3      summary = doc["summary"]
4      embedding = get_embedding(summary)  # Get float32 embedding
5
6      # Convert the float32 embedding to BSON format
7      bson_float32 = generate_bson_vector(embedding, BinaryVectorDtype.FLOAT32)
8
9      # Update the document with the BSON embedding
10     collection.update_one(
11         {"_id": doc["_id"]},
12         {"$set": {"embedding": bson_float32}}
13     )
14     updated_doc_count += 1
15
16 print(f"Updated {updated_doc_count} documents with BSON embeddings.")
Create the Atlas Vector Search index on the collection.
You can create Atlas Vector Search indexes by using the Atlas UI, Atlas CLI, Atlas Administration API, and MongoDB drivers in your preferred language. To learn more, see How to Index Fields for Vector Search.
Example
Create Index for the Collection
Placeholder | Valid Value |
---|---|
<INDEX-NAME> | Name of vector type index. |
from pymongo.operations import SearchIndexModel

vector_search_index_definition = {
    "fields": [
        {
            "type": "vector",
            "path": "embedding",
            "similarity": "dotProduct",
            "numDimensions": 1024,
        }
    ]
}

search_index_model = SearchIndexModel(definition=vector_search_index_definition, name="<INDEX-NAME>", type="vectorSearch")

collection.create_search_index(model=search_index_model)
The index should take about one minute to build. While it builds, the index is in an initial sync state. When it finishes building, you can start querying the data in your collection.
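Rather than waiting a fixed amount of time, you can poll the index status programmatically. The sketch below assumes the search index documents returned by PyMongo's Collection.list_search_indexes() include a queryable field, which Atlas sets to true once the index can serve queries:

```python
import time

def wait_for_index_ready(collection, index_name, timeout_s=300, poll_s=5):
    # Poll the named search index until Atlas reports it as queryable,
    # or raise if the timeout elapses first.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        indexes = list(collection.list_search_indexes(index_name))
        if indexes and indexes[0].get("queryable"):
            return True
        time.sleep(poll_s)
    raise TimeoutError(f"Index {index_name!r} not ready after {timeout_s}s")
```

For example, call wait_for_index_ready(collection, "<INDEX-NAME>") after create_search_index() and run queries only once it returns.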
Define a function to run the Atlas Vector Search queries.
The function to run Atlas Vector Search queries must perform the following actions:
Generate embeddings for the query text.
Convert the query text to a BSON vector.
Define the pipeline for the Atlas Vector Search query.
Example
Function to Run Atlas Vector Search Query
Placeholder | Valid Value |
---|---|
<INDEX-NAME> | Name of vector type index. |
<NUMBER-OF-CANDIDATES-TO-CONSIDER> | Number of nearest neighbors to use during the search. |
<NUMBER-OF-DOCUMENTS-TO-RETURN> | Number of documents to return in the results. |
def run_vector_search(query_text, collection, path):
    query_embedding = get_embedding(query_text)
    bson_query_vector = generate_bson_vector(query_embedding, BinaryVectorDtype.FLOAT32)

    pipeline = [
        {
            '$vectorSearch': {
                'index': '<INDEX-NAME>',
                'path': path,
                'queryVector': bson_query_vector,
                'numCandidates': <NUMBER-OF-CANDIDATES-TO-CONSIDER>,  # for example, 20
                'limit': <NUMBER-OF-DOCUMENTS-TO-RETURN>  # for example, 5
            }
        },
        {
            '$project': {
                '_id': 0,
                'name': 1,
                'summary': 1,
                'score': { '$meta': 'vectorSearchScore' }
            }
        }
    ]

    return collection.aggregate(pipeline)
Run the Atlas Vector Search query.
You can run Atlas Vector Search queries programmatically. To learn more, see Run Vector Search Queries.
Example
Run a Sample Atlas Vector Search Query
from pprint import pprint

query_text = "ocean view"
float32_results = run_vector_search(query_text, collection, "embedding")

print("results from float32 embeddings")
pprint(list(float32_results))
results from float32 embeddings
[{'name': 'Your spot in Copacabana',
  'score': 0.5468248128890991,
  'summary': 'Having a large airy living room. The apartment is well divided. '
             'Fully furnished and cozy. The building has a 24h doorman and '
             'camera services in the corridors. It is very well located, close '
             'to the beach, restaurants, pubs and several shops and '
             'supermarkets. And it offers a good mobility being close to the '
             'subway.'},
 {'name': 'Twin Bed room+MTR Mongkok shopping&My',
  'score': 0.527062714099884,
  'summary': 'Dining shopping conveniently located Mongkok subway E1, airport '
             'shuttle bus stops A21. Three live two beds, separate WC, 24-hour '
             'hot water. Free WIFI.'},
 {'name': 'Quarto inteiro na Tijuca',
  'score': 0.5222363471984863,
  'summary': 'O quarto disponível tem uma cama de solteiro, sofá e computador '
             'tipo desktop para acomodação.'},
 {'name': 'Makaha Valley Paradise with OceanView',
  'score': 0.5175154805183411,
  'summary': 'A beautiful and comfortable 1 Bedroom Air Conditioned Condo in '
             'Makaha Valley - stunning Ocean & Mountain views All the '
             'amenities of home, suited for longer stays. Full kitchen & large '
             "bathroom. Several gas BBQ's for all guests to use & a large "
             'heated pool surrounded by reclining chairs to sunbathe. The '
             'Ocean you see in the pictures is not even a mile away, known as '
             'the famous Makaha Surfing Beach. Golfing, hiking,snorkeling '
             'paddle boarding, surfing are all just minutes from the front '
             'door.'},
 {'name': 'Cozy double bed room 東涌鄉村雅緻雙人房',
  'score': 0.5149975419044495,
  'summary': 'A comfortable double bed room at G/F. Independent entrance. High '
             'privacy. The room size is around 100 sq.ft. with a 48"x72" '
             'double bed. The village house is close to the Hong Kong Airport, '
             'AsiaWorld-Expo, HongKong-Zhuhai-Macau Bridge, Disneyland, '
             'Citygate outlets, 360 Cable car, shopping centre, main tourist '
             'attractions......'}]
Your results might vary because the 50 documents selected from the sample_airbnb.listingsAndReviews namespace in step 3, and the embeddings generated for them, might differ in your environment.
For an advanced demonstration of this procedure on sample data using
Cohere's embed-english-v3.0
embedding model, see
this notebook.