Deploy a Federated Database Instance
This page describes how to deploy a federated database instance for accessing data in an HTTP data store.
Required Access
To deploy a federated database instance, you must have Project Owner access to the project.
Users with Organization Owner access must add themselves as a Project Owner to the project before deploying a federated database instance.
Prerequisites
Before you begin, you will need to:
Create a MongoDB Atlas account, if you do not have one already.
Format your data store using one of the supported data formats.
Note
If your file format is CSV or TSV, you must include a header row in your data. See CSV and TSV for more information.
Make your data store accessible over the public internet.
Important
If your HTTP data store is not accessible over HTTPS, you must use the JSON Editor to configure your data store. In your JSON configuration, you must set the stores.[n].allowInsecure setting to true.
Atlas Data Federation does not support HTTP data store URLs that require authentication.
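The HTTPS requirement above can be checked before you write the configuration. The following Python sketch reports whether any URL in a list uses plain HTTP, in which case stores.[n].allowInsecure would need to be true. The helper name needs_allow_insecure and the sample URLs are hypothetical, and the check looks only at the URL scheme without contacting the server.

```python
from urllib.parse import urlparse


def needs_allow_insecure(urls):
    """Return True if any URL uses plain HTTP rather than HTTPS.

    Hypothetical helper: it inspects only the URL scheme and never
    sends a request to the data store.
    """
    return any(urlparse(url).scheme.lower() == "http" for url in urls)


# Placeholder URLs for illustration only.
secure_urls = ["https://example.com/data/records.json"]
mixed_urls = ["https://example.com/a.csv", "http://example.com/b.csv"]

print(needs_allow_insecure(secure_urls))  # False
print(needs_allow_insecure(mixed_urls))   # True
```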
Procedure
To create a new Data Federation database using the Atlas CLI, run the following command:
atlas dataFederation create <name> [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataFederation create.
Select the cloud provider where Atlas Data Federation will process your queries against your federated database instance.
You can select AWS, Azure, or Google Cloud. Once your federated database instance is created, you can't change the cloud provider where Atlas Data Federation processes your queries.
Specify your data store.
Select the dataset for your federated database instance from the Data Sources section.
Click Add Data Sources to select your data store.
Choose HTTP(S) to configure a federated database instance for data in publicly accessible HTTP and HTTPS URLs.
Corresponds to the stores.[n].provider JSON configuration setting.
Enter a name for your HTTP data store in the HTTP(S) Store Name field.
Note
The data store's name must be unique within your federated database instance.
Corresponds to the stores.[n].name JSON configuration setting.
Enter the publicly accessible URL of the file where data is stored.
Atlas Data Federation supports JSON, BSON, CSV, TSV, Avro (gzipped or uncompressed), Parquet, and ORC file types.
Tip
Click Use Sample URL to add a sample HTTP data store.
For each additional HTTP data store that you want to add, click Add Another URL, then enter the HTTP data store URLs.
Corresponds to the stores.[n].urls JSON configuration setting.
Click Next to configure virtual databases and collections.
Create the virtual databases, collections, and views, and map them to your data store.
(Optional) Click the edit icon for the:
Database to edit the database name. Defaults to VirtualDatabase[n]. Corresponds to the databases.[n].name JSON configuration setting.
Collection to edit the collection name. Defaults to VirtualCollection[n]. Corresponds to the databases.[n].collections.[n].name JSON configuration setting.
View to edit the view name.
You can click:
Add Database to add databases and collections.
The icon associated with the database to add collections to the database.
The icon associated with the collection to add views on the collection. To create a view, you must specify:
The name of the view.
The pipeline to apply to the view.
The view definition pipeline cannot include the $out or the $merge stage. If the view definition includes nested pipeline stages such as $lookup or $facet, this restriction applies to those nested pipelines as well.
To learn more about views, see:
The icon associated with the database, collection, or view to remove it.
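The restriction on $out and $merge in view pipelines, including pipelines nested in $lookup and $facet, can be illustrated with a small recursive check. The following Python sketch is not part of Atlas. The function name find_forbidden_stages is hypothetical, it assumes the pipeline has already been parsed into a list of stage documents, and it descends only into the two nested stages named above.

```python
FORBIDDEN_STAGES = {"$out", "$merge"}


def find_forbidden_stages(pipeline):
    """Return forbidden stage names found in a pipeline, including nested ones."""
    found = []
    for stage in pipeline:
        for name, spec in stage.items():
            if name in FORBIDDEN_STAGES:
                found.append(name)
            elif name == "$lookup" and isinstance(spec, dict):
                # $lookup may carry a nested pipeline under "pipeline".
                found.extend(find_forbidden_stages(spec.get("pipeline", [])))
            elif name == "$facet" and isinstance(spec, dict):
                # $facet maps each output field to its own sub-pipeline.
                for sub_pipeline in spec.values():
                    found.extend(find_forbidden_stages(sub_pipeline))
    return found


# Illustrative pipeline: the $merge stage is hidden inside a $facet branch.
pipeline = [
    {"$match": {"status": "A"}},
    {"$facet": {
        "byType": [{"$group": {"_id": "$type"}}],
        "exported": [{"$merge": {"into": "archive"}}],
    }},
]

print(find_forbidden_stages(pipeline))  # ['$merge']
```

An empty result means the sketch found neither stage. It does not validate anything else about the view definition.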
Select HTTP from the dropdown in the Data Sources section.
Drag and drop the data store onto the collection to map it to that collection.
Corresponds to the databases.[n].collections.[n].dataSources JSON configuration setting.
Your configuration for an HTTP data store should look similar to the following:
{
  "stores" : [
    {
      "name" : "<string>",
      "provider": "<string>",
      "defaultFormat" : "<string>",
      "allowInsecure": <boolean>,
      "urls": ["<string>"]
    }
  ],
  "databases" : [
    {
      "name" : "<string>",
      "collections" : [
        {
          "name" : "<string>",
          "dataSources" : [
            {
              "storeName" : "<string>",
              "allowInsecure" : <boolean>,
              "urls" : ["<string>"],
              "defaultFormat" : "<string>",
              "provenanceFieldName": "<string>"
            }
          ]
        }
      ],
      "views" : [
        {
          "name" : "<string>",
          "source" : "<string>",
          "pipeline" : "<string>"
        }
      ]
    }
  ]
}
For more information on the configuration settings, see HTTP URL.
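To make the template above concrete, the following Python sketch assembles a configuration with the same field names and prints it as JSON. The function name build_http_config, the sample names, and the URL are hypothetical. The provider value "http" and the format string ".json" are assumptions that you should confirm against the HTTP URL reference before use.

```python
import json


def build_http_config(store_name, urls, database, collection,
                      default_format, allow_insecure=False):
    """Assemble a configuration dict using the field names in the template."""
    return {
        "stores": [
            {
                "name": store_name,
                "provider": "http",  # assumed value; confirm in the reference
                "defaultFormat": default_format,
                "allowInsecure": allow_insecure,
                "urls": list(urls),
            }
        ],
        "databases": [
            {
                "name": database,
                "collections": [
                    {
                        "name": collection,
                        "dataSources": [
                            {"storeName": store_name, "urls": list(urls)}
                        ],
                    }
                ],
            }
        ],
    }


config = build_http_config(
    store_name="httpStore",
    urls=["https://example.com/data/records.json"],  # placeholder URL
    database="VirtualDatabase0",
    collection="VirtualCollection0",
    default_format=".json",  # assumed format string
)
print(json.dumps(config, indent=2))
```

Building the document in code keeps the store name in stores and the storeName in dataSources identical, which is easy to get wrong when editing the JSON by hand.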
Define your HTTP data store.
Edit the JSON configuration settings shown in the UI for stores. Your stores configuration setting should resemble the following:
"stores" : [
  {
    "name" : "<string>",
    "provider" : "<string>",
    "allowInsecure": <boolean>,
    "urls" : ["<string>"],
    "defaultFormat" : "<string>"
  }
]
To learn more about these configuration settings, see stores.
Define your federated database instance virtual databases, collections, and views.
Edit the JSON configuration settings shown in the UI for databases. Your databases configuration setting should resemble the following:
"databases" : [
  {
    "name" : "<string>",
    "collections" : [
      {
        "name" : "<string>",
        "dataSources" : [
          {
            "storeName" : "<string>",
            "allowInsecure" : <boolean>,
            "urls" : ["<string>"],
            "defaultFormat" : "<string>",
            "provenanceFieldName": "<string>"
          }
        ]
      }
    ]
  }
]
To learn more about these configuration settings, see databases.
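Before saving a hand-edited configuration, a quick consistency check can catch simple mistakes. The following Python sketch checks that store names are unique, as required above, and that each dataSources entry names a defined store. The second check is an assumption about how storeName is meant to be used, and validate_config is a hypothetical helper, not an Atlas API.

```python
def validate_config(config):
    """Return a list of problems found in a stores/databases configuration."""
    problems = []
    store_names = [store.get("name") for store in config.get("stores", [])]

    # Store names must be unique within the federated database instance.
    seen = set()
    for name in store_names:
        if name in seen:
            problems.append(f"duplicate store name: {name}")
        seen.add(name)

    # Assumption: every dataSources entry should name a defined store.
    for database in config.get("databases", []):
        for collection in database.get("collections", []):
            for source in collection.get("dataSources", []):
                if source.get("storeName") not in seen:
                    problems.append(
                        f"{database.get('name')}.{collection.get('name')}: "
                        f"unknown store {source.get('storeName')!r}"
                    )
    return problems


# Illustrative configuration with a mismatched store name.
sample = {
    "stores": [{"name": "httpStore"}],
    "databases": [
        {
            "name": "db",
            "collections": [
                {"name": "c", "dataSources": [{"storeName": "otherStore"}]}
            ],
        }
    ],
}

print(validate_config(sample))  # ["db.c: unknown store 'otherStore'"]
```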