You can access Voyage AI models by using the Embedding and Reranking API, which is available through MongoDB Atlas. Use the following methods to access the API:
REST API: for language-agnostic access.
Python client: official client for ease of use.
This page summarizes how to access the API. For full details about the API, including rate limits and usage tiers, see the API Reference.
API Keys
The Embedding and Reranking API uses API keys to monitor usage and manage permissions. To create and manage your model API keys, use the MongoDB Atlas UI. For instructions, see Model API Keys.
REST API
You can call the embedding service through the REST API by using cURL or by sending HTTP requests from any programming language.
Authentication is handled through the model API key, which you must include in the Authorization header of every API request as a Bearer token.
To learn more, see the full API specification.
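As a minimal sketch of the Bearer token scheme described above, the following Python snippet builds an embedding request without sending it. The endpoint URL is a placeholder, and the JSON field names input and model and the helper name build_embedding_request are assumptions for illustration. Confirm the exact values against the API specification.

```python
import json
import urllib.request

# Placeholder endpoint (assumption): replace with the URL from the API specification.
EMBEDDINGS_URL = "https://<api-host>/v1/embeddings"


def build_embedding_request(api_key, texts, model="voyage-4-large"):
    """Build a POST request that carries the model API key as a Bearer token."""
    # Assumed request body shape: a list of input strings plus a model name.
    body = json.dumps({"input": texts, "model": model}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Build the request only; nothing is sent over the network here.
request = build_embedding_request("<model-api-key>", ["Sample text to embed."])
print(request.get_method())
print(request.get_header("Authorization"))
```

To send the request, you could pass it to urllib.request.urlopen once the placeholder URL and API key are replaced with real values.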
Python Client
Install the official Python package using pip:
pip install --upgrade voyageai
Use the --upgrade or -U option to install the latest version of the package, which gives you access to the most recent features and bug fixes. For model-specific parameters, see the usage examples on each model page.
Important
You must use version 0.3.7 or later of the Python client library. This version adds support for the Embedding and Reranking API.
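One way to confirm that an environment meets this requirement is to compare the installed package version against 0.3.7. The following is a minimal sketch. The helper names parse_version and meets_minimum are hypothetical, and the simple parser ignores pre-release suffixes such as rc1.

```python
from importlib import metadata

MINIMUM_VERSION = (0, 3, 7)  # Minimum client version stated above


def parse_version(version):
    """Turn a version string such as '0.3.7' into a tuple of integers."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for char in piece:
            if not char.isdigit():
                break  # Stop at the first non-digit, e.g. the 'rc1' in '7rc1'
            digits += char
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def meets_minimum(version, minimum=MINIMUM_VERSION):
    """Return True if the version is at least the minimum."""
    return parse_version(version) >= minimum


try:
    installed = metadata.version("voyageai")
    print(f"voyageai {installed}: supported = {meets_minimum(installed)}")
except metadata.PackageNotFoundError:
    print("voyageai is not installed")
```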
The voyageai.Client class provides a synchronous interface to
invoke Voyage's API. Create a client object and use it to access
Voyage AI models.
Example
The following example shows how to initialize the client with custom settings and generate embeddings:
```python
import voyageai

# Initialize the client with custom settings
vo = voyageai.Client(
    api_key="<model-api-key>",  # Or use VOYAGE_API_KEY environment variable
    max_retries=3,              # Retry up to 3 times on rate limit errors
    timeout=30                  # Timeout after 30 seconds
)

# Generate embeddings
result = vo.embed(
    texts=["MongoDB is redefining what a database is in the AI era."],
    model="voyage-4-large"
)

print(f"Embedding dimension: {len(result.embeddings[0])}")
print(f"Total tokens used: {result.total_tokens}")
```
The following table describes the parameters you can pass when initializing the client:
| Parameter | Type | Required | Description |
|---|---|---|---|
| api_key | String | No | Model API key. Defaults to None. If None, the client searches for the API key in other locations, such as the VOYAGE_API_KEY environment variable. NOTE: The Python client automatically routes requests to the correct API endpoint based on the API key format. You can override this behavior by setting a custom base URL, described in the last row of this table. |
| max_retries | Integer | No | Maximum number of retries for each API request in case of rate limit errors or temporary server unavailability. Defaults to 0, which means the client does not retry. When retries are enabled, the client uses a wait-and-retry strategy and raises an exception once it reaches the maximum retry limit. |
| timeout | Integer | No | Maximum time in seconds to wait for a response from the API before aborting the request. Defaults to None, which means no timeout is enforced. If the specified timeout is exceeded, the request is terminated and a timeout exception is raised. |
| | String | No | Custom base URL for API requests. By default, the client automatically detects the correct endpoint based on the provided API key. |
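To illustrate what a wait-and-retry strategy with a retry limit looks like in general, here is a minimal sketch. It is not the client's actual implementation: the helper call_with_retries, the TransientError class, and the exponential backoff schedule are all assumptions made for illustration.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate limit or temporary server error."""


def call_with_retries(func, max_retries=0, base_delay=0.01, sleep=time.sleep):
    """Call func, retrying up to max_retries times on TransientError."""
    attempt = 0
    while True:
        try:
            return func()
        except TransientError:
            if attempt >= max_retries:
                raise  # Retry limit reached: surface the error to the caller
            # Assumed backoff schedule: the delay doubles with each retry
            sleep(base_delay * (2 ** attempt))
            attempt += 1


calls = {"count": 0}


def flaky_request():
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise TransientError("temporarily unavailable")
    return "ok"


print(call_with_retries(flaky_request, max_retries=3))  # prints "ok"
```

With max_retries left at 0, the first TransientError propagates immediately, which mirrors the no-retry default described in the table.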