
Models Overview

Voyage AI provides state-of-the-art embedding and reranking models. MongoDB's Embedding and Reranking API provides access to the latest Voyage AI models. This page describes the available models and when to use them.


For text embeddings, we recommend:

  • voyage-4-large for the best quality

  • voyage-4-lite for the lowest latency and cost

  • voyage-4 for a balance between quality and performance

  • A domain-specific model if your application is in one of the listed domains.

For other use cases, we recommend:

  • voyage-multimodal-3.5 for text, image, and video embeddings

  • voyage-context-4 for chunk-level and document-level retrieval tasks

  • rerank-2.5 for adding reranking to most applications

  • rerank-2.5-lite for adding reranking to latency-sensitive applications

Voyage AI provides the following text embedding models to capture the semantic meaning of text.

For details and example usage, see Text Embeddings.

Use the following models for most AI search and retrieval applications.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-4-large | 32,000 tokens | 1024 (default), 256, 512, 2048 | The best general-purpose and multilingual retrieval quality. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |
| voyage-4 | 32,000 tokens | 1024 (default), 256, 512, 2048 | Optimized for general-purpose and multilingual retrieval quality. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |
| voyage-4-lite | 32,000 tokens | 1024 (default), 256, 512, 2048 | Optimized for latency and cost. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |

Use the following models for specialized domains to achieve better accuracy.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-code-3 | 32,000 tokens | 1024 (default), 256, 512, 2048 | Optimized for code retrieval and documentation. To learn more, see the blog post. |
| voyage-finance-2 | 32,000 tokens | 1024 | Optimized for finance retrieval and RAG applications. To learn more, see the blog post. |
| voyage-law-2 | 16,000 tokens | 1024 | Optimized for legal retrieval and RAG applications. To learn more, see the blog post. |

Voyage also provides the following open-weight models.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-4-nano | 32,000 tokens | 512 (default), 128, 256 | Open-weight model available on Hugging Face. All embeddings created with the 4 series are compatible with each other. To learn more, see the blog post. |

The following older models are still accessible from the API, but we recommend using the new models for better quality and efficiency.

The latest models perform better than the legacy models in all aspects, such as quality, context length, latency, and throughput.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-3-large | 32,000 tokens | 1024 (default), 256, 512, 2048 | Previous generation of text embeddings for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |
| voyage-3.5 | 32,000 tokens | 1024 (default), 256, 512, 2048 | Previous generation of text embeddings optimized for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |
| voyage-3.5-lite | 32,000 tokens | 1024 (default), 256, 512, 2048 | Previous generation of text embeddings optimized for latency and cost. To learn more, see the blog post. |
| voyage-code-2 | 16,000 tokens | 1536 | Optimized for code retrieval (17% better than alternatives). Previous generation of code embeddings. To learn more, see the blog post. |

Voyage AI provides the following models that generate embeddings while incorporating surrounding context for improved retrieval accuracy.

For details and example usage, see Contextualized Chunk Embeddings.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-context-4 (in preview) | 120,000 tokens | 1024 (default), 256, 512, 2048 | Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality. |
| voyage-context-3 | 120,000 tokens | 1024 (default), 256, 512, 2048 | Contextualized chunk embeddings optimized for general-purpose and multilingual retrieval quality. To learn more, see the blog post. |

Voyage AI provides the following embedding models that process text, images, and video.

For details and example usage, see Multimodal Embeddings.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-multimodal-3.5 | 32,000 tokens | 1024 (default), 256, 512, 2048 | Rich multimodal embedding model that can vectorize interleaved text and visual data, such as screenshots of PDFs, slides, tables, figures, videos, and more. To learn more, see the blog post. |

The following older models are still accessible from the API, but we recommend using the new models for better quality and efficiency.

The latest models perform better than the legacy models in all aspects, such as quality, context length, latency, and throughput.

| Model | Context Length | Dimensions | Description |
| --- | --- | --- | --- |
| voyage-multimodal-3 | 32,000 tokens | 1024 | Processes text and images into unified embeddings. Supports images from 50,000 to 2 million pixels. To learn more, see the blog post. |

Voyage AI provides the following reranking models to refine your search results.

For details and example usage, see Rerankers.

| Model | Context Length | Description |
| --- | --- | --- |
| rerank-2.5 | 32,000 tokens | Highest accuracy. Recommended for most applications. To learn more, see the blog post. |
| rerank-2.5-lite | 32,000 tokens | Fast and cost-effective model optimized for latency-sensitive applications. To learn more, see the blog post. |

The following older models are still accessible from the API, but we recommend using the new models for better quality and efficiency.

The latest models perform better than the legacy models in all aspects, such as quality, context length, latency, and throughput.

| Model | Context Length | Description |
| --- | --- | --- |
| rerank-2 | 16,000 tokens | Our generalist second-generation reranker optimized for quality with multilingual support. To learn more, see the blog post. |
| rerank-2-lite | 8,000 tokens | Our generalist second-generation reranker optimized for both latency and quality with multilingual support. To learn more, see the blog post. |

Model pricing is usage-based, with charges billed to the Atlas account linked to the API key used for access. All current models include a free tier; the older models listed on this page do not.

Pricing is based on the number of tokens in your documents and queries. The free tier includes 200 million tokens for most models, and 50 million tokens for the following specialized models: voyage-finance-2, voyage-law-2, voyage-code-2.

| Model | Price per 1K tokens | Price per 1M tokens | Free tokens |
| --- | --- | --- | --- |
| voyage-4-large | $0.00012 | $0.12 | 200 million |
| voyage-4 | $0.00006 | $0.06 | 200 million |
| voyage-4-lite | $0.00002 | $0.02 | 200 million |
| voyage-code-3 | $0.00018 | $0.18 | 200 million |
| voyage-finance-2, voyage-law-2, voyage-code-2 | $0.00012 | $0.12 | 50 million |
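As a rough illustration of how the per-token prices and free tier combine, the following Python sketch estimates a text embedding bill. The function name `estimate_embedding_cost` and the `PRICING` mapping are hypothetical helpers, not part of any API. The sketch assumes that the free allowance is counted per model and is used up before any tokens are billed; check your Atlas billing details before relying on that.

```python
from decimal import Decimal

# (USD per 1M tokens, free-tier tokens), copied from the pricing table.
PRICING = {
    "voyage-4-large": (Decimal("0.12"), 200_000_000),
    "voyage-4": (Decimal("0.06"), 200_000_000),
    "voyage-4-lite": (Decimal("0.02"), 200_000_000),
    "voyage-code-3": (Decimal("0.18"), 200_000_000),
    "voyage-finance-2": (Decimal("0.12"), 50_000_000),
    "voyage-law-2": (Decimal("0.12"), 50_000_000),
    "voyage-code-2": (Decimal("0.12"), 50_000_000),
}


def estimate_embedding_cost(model: str, total_tokens: int) -> Decimal:
    """Estimate the USD cost of embedding `total_tokens` tokens with `model`.

    Assumption (not confirmed by the pricing text): the free allowance is
    tracked per model and consumed before any tokens are billed.
    """
    price_per_million, free_tokens = PRICING[model]
    billable_tokens = max(0, total_tokens - free_tokens)
    return Decimal(billable_tokens) * price_per_million / Decimal(1_000_000)
```

Under these assumptions, `estimate_embedding_cost("voyage-4", 300_000_000)` bills 100 million tokens at $0.06 per million, which comes to $6.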

The following table shows the pricing for older text embedding models. Free tokens are not offered for these models.

| Model | Price per 1K tokens | Price per 1M tokens | Free tokens |
| --- | --- | --- | --- |
| voyage-3-large | $0.00018 | $0.18 | 0 |
| voyage-3.5 | $0.00006 | $0.06 | 0 |
| voyage-3.5-lite | $0.00002 | $0.02 | 0 |

Pricing is based on the number of tokens in your documents and queries.

| Model | Price per 1K tokens | Price per 1M tokens | Free tokens |
| --- | --- | --- | --- |
| voyage-context-4 | $0.00012 | $0.12 | 200 million |

Pricing is based on text tokens and image pixels. The free tier includes 200 million text tokens and 150 billion pixels for multimodal models. Images are processed between 50,000 pixels (minimum) and 2 million pixels (maximum), with costs ranging from $0.00003 to $0.0012 per image. For pricing purposes, each video frame is considered an image.

Note

Images with fewer than 50,000 pixels are upscaled, processed, and charged as a 50,000-pixel image. Images containing over 2 million pixels are downsampled and charged as 2 million-pixel images.

| Model | Price per 1M tokens | Price per 1B pixels | Free tier |
| --- | --- | --- | --- |
| voyage-multimodal-3.5 | $0.12 | $0.60 | 200M tokens, 150B pixels |

| Image resolution | Number of pixels | Price per image | Price per 1K images |
| --- | --- | --- | --- |
| 200px × 200px | 40,000 | $0.00003 | $0.03 |
| 1000px × 1000px | 1 million | $0.0006 | $0.60 |
| 2000px × 2000px | 4 million | $0.0012 | $1.20 |
| 4000px × 4000px | 16 million | $0.0012 | $1.20 |

Example

The cost to vectorize a single input with 1,000 text tokens ($0.00012) and two 4 million-pixel images (2 × $0.0012) would be $0.00252.
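The arithmetic in this example can be sketched in Python as follows. The helper names `billed_pixels` and `multimodal_cost` are hypothetical. The sketch uses the voyage-multimodal-3.5 prices from this page, applies the 50,000-pixel minimum and 2 million-pixel maximum per image, and deliberately ignores the free tier.

```python
from decimal import Decimal

TOKEN_PRICE_PER_MILLION = Decimal("0.12")  # USD per 1M text tokens
PIXEL_PRICE_PER_BILLION = Decimal("0.60")  # USD per 1B pixels
MIN_PIXELS = 50_000       # smaller images are charged as 50,000 pixels
MAX_PIXELS = 2_000_000    # larger images are charged as 2 million pixels


def billed_pixels(width: int, height: int) -> int:
    """Clamp an image's pixel count to the billable range."""
    return min(max(width * height, MIN_PIXELS), MAX_PIXELS)


def multimodal_cost(text_tokens: int, image_sizes: list[tuple[int, int]]) -> Decimal:
    """Estimate the USD cost of one input, ignoring any free-tier allowance.

    `image_sizes` holds (width, height) pairs; each video frame counts as
    one image for pricing purposes.
    """
    token_cost = Decimal(text_tokens) * TOKEN_PRICE_PER_MILLION / Decimal(1_000_000)
    pixels = sum(billed_pixels(w, h) for w, h in image_sizes)
    pixel_cost = Decimal(pixels) * PIXEL_PRICE_PER_BILLION / Decimal(1_000_000_000)
    return token_cost + pixel_cost


# 1,000 text tokens plus two 2000px × 2000px (4 million-pixel) images.
assert multimodal_cost(1_000, [(2000, 2000), (2000, 2000)]) == Decimal("0.00252")
```

Using `Decimal` rather than floats keeps very small per-image amounts exact, which matters when many of them are summed.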

The following table shows the pricing for older multimodal models. Free tokens are not offered for these models.

| Model | Price per 1M tokens | Price per 1B pixels | Free tier |
| --- | --- | --- | --- |
| voyage-multimodal-3 | $0.12 | $0.60 | 0 |

Pricing is based on total processed tokens, calculated as (query tokens × number of documents) + sum of tokens in all documents. The free tier includes 200 million tokens for the latest reranker models.

| Model | Price per 1K tokens | Price per 1M tokens | Est. price per request* | Free tokens |
| --- | --- | --- | --- | --- |
| rerank-2.5 | $0.00005 | $0.05 | $0.0025 | 200 million |
| rerank-2.5-lite | $0.00002 | $0.02 | $0.001 | 200 million |

* Estimated price assumes 100 documents per request, with the sum of query tokens and tokens per document totaling 500.
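To make the token formula and the per-request estimate concrete, here is a small Python sketch. The function names `rerank_tokens` and `rerank_cost` are hypothetical, the prices are the per-million rates for the current rerankers on this page, and the free tier is ignored. The split of 100 query tokens and 400 tokens per document is only one assumed way to reach the 500-token total in the footnote.

```python
from decimal import Decimal

# USD per 1M processed tokens, from the reranker pricing on this page.
RERANK_PRICE_PER_MILLION = {
    "rerank-2.5": Decimal("0.05"),
    "rerank-2.5-lite": Decimal("0.02"),
}


def rerank_tokens(query_tokens: int, document_tokens: list[int]) -> int:
    """Total processed tokens: the query is counted once per document."""
    return query_tokens * len(document_tokens) + sum(document_tokens)


def rerank_cost(model: str, query_tokens: int, document_tokens: list[int]) -> Decimal:
    """Estimate the USD cost of one rerank request, ignoring the free tier."""
    total = rerank_tokens(query_tokens, document_tokens)
    return Decimal(total) * RERANK_PRICE_PER_MILLION[model] / Decimal(1_000_000)


# 100 documents of 400 tokens each with a 100-token query:
# 100 * 100 + 100 * 400 = 50,000 processed tokens per request.
documents = [400] * 100
assert rerank_tokens(100, documents) == 50_000
assert rerank_cost("rerank-2.5", 100, documents) == Decimal("0.0025")
```

Because the query is counted once per document, long queries scale the cost with the number of candidates, so trimming the candidate list before reranking reduces the bill on both terms of the formula.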

The following table shows the pricing for older rerankers. Free tokens are not offered for these models.

| Model | Price per 1K tokens | Price per 1M tokens | Est. price per request* | Free tokens |
| --- | --- | --- | --- | --- |
| rerank-2 | $0.00005 | $0.05 | $0.0025 | 0 |
| rerank-2-lite | $0.00002 | $0.02 | $0.001 | 0 |