
create

On this page

  • Syntax
  • Parameters
  • Output
  • Examples
  • Verify Collection
  • Troubleshoot Errors

The create command creates a collection for existing stores or a view on a collection in the federated database instance storage configuration.

The wildcard "*" can be used with the create command in two ways:

  • As the name of the collection, to dynamically create collections that map to files and folders in the specified federated database instance store.

  • In the path parameter, to create a collection that maps to multiple files and folders in the specified file path on the federated database instance store.

The following sections describe how to create a collection for each type of federated database instance store, and how to create a view.

The Amazon S3, Azure Blob Storage, and Google Cloud Storage federated database instance stores share the same syntax and parameters, which appear once below. The Atlas cluster, HTTP, and Online Archive federated database instance stores each have their own syntax and parameters.

For views, you can create a standard view that runs an aggregation pipeline on another collection, or an SQL view that uses the $sql stage to select from a source collection through an SQL statement in the pipeline.

Syntax

db.runCommand({ "create" : "<collection-name>|*", "dataSources" : [{ "storeName" : "<store-name>", "path" : "<path-to-files-or-folders>", "defaultFormat" : "<file-extension>", "omitAttributes": <boolean> }]})
db.runCommand({ "create" : "<collection-name>|*", "dataSources" : [{ "storeName" : "<store-name>", "path" : "<path-to-files-or-folders>", "defaultFormat" : "<file-extension>", "omitAttributes": <boolean> }]})
db.runCommand({ "create" : "<collection-name>|*", "dataSources" : [{ "storeName" : "<store-name>", "path" : "<path-to-files-or-folders>", "defaultFormat" : "<file-extension>", "omitAttributes": <boolean> }]})
db.runCommand({ "create" : "<collection-name>"|"*", "dataSources" : [{ "storeName" : "<store-name>", "database" : "<atlas-database-name>", "collection" : "<atlas-collection-name>" | "collectionRegex": "<regex-pattern>" }]})
db.runCommand({ "create" : "<collection-name>", "dataSources" : [{ "storeName" : "<store-name>", "allowInsecure" : true|false, "urls" : [ "<url>" ], "defaultFormat" : "<file-extension>" }]})
db.runCommand({ "create" : "<collection-name>|*", "dataSources" : [{ "storeName" : "<store-name>", "datasetName" : "<online-archive-dataset-name>", "datasetPrefix": "<prefix-name>", "trimLevel" : <trim-number>, "maxDatasets": <maximum-number-of-datasets> }]})
db.runCommand({ "create" : "<view-name>", "viewOn" :" <collection-name>", "pipeline" : ["<stage1>","<stage2>",...] })
db.runCommand({ "create" : "<view-name>", "pipeline" : ["$sql": {"statement": "<SQL-statement>", "excludeNamespaces": true | false ]} })
Parameters

Parameters for S3, Azure, and Google Cloud:

Parameter
Type
Description
Necessity

<collection-name>|*

string

Either the name of the collection to which Data Federation maps the data contained in the federated database instance store or the wildcard "*" to dynamically create collections.

You can generate collection names dynamically from file paths by specifying * for the collection name and the collectionName() function in the dataSources.path field. By default, Atlas Data Federation creates up to 100 wildcard collections. You can customize the maximum number of wildcard collections that Atlas Data Federation automatically generates using the databases.[n].maxWildcardCollections parameter. Note that each wildcard collection can contain only one dataSource.

Required

dataSources

object

Array of objects where each object represents a federated database instance store in the stores array to map with the collection.

Required

dataSources.storeName

string

Name of a federated database instance store to map to the collection. The value must match the stores.[n].name in the stores array.

Required

dataSources.path

string

Path to the files and folders. Specify / to capture all files and folders from the prefix path. See Define Path for S3 Data for more information.

Required

dataSources.defaultFormat

string

Format that Data Federation defaults to if it encounters a file without an extension while querying the federated database instance store. The following values are valid:

.json, .json.gz, .bson, .bson.gz, .avro, .avro.gz, .orc, .tsv, .tsv.gz, .csv, .csv.gz, .parquet

If omitted, Data Federation attempts to detect the file type by processing a few bytes of the file.

Optional

dataSources.omitAttributes

boolean

Flag that specifies whether to omit the attributes (key and value pairs) that Atlas Data Federation adds to the collection. You can specify one of the following values:

  • false - to add the attributes

  • true - to omit the attributes

If omitted, defaults to false and Atlas Data Federation adds the attributes.

Optional
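For example, here is a minimal sketch, assuming the egS3Store store from the examples below and a hypothetical folder named rawData that contains files without extensions; it treats extensionless files as JSON and omits the attributes that Atlas Data Federation would otherwise add:

use sampleDB
db.runCommand({ "create" : "rawData", "dataSources" : [{ "storeName" : "egS3Store", "path" : "/json/rawData", "defaultFormat" : ".json", "omitAttributes" : true }]})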


Parameters for an Atlas cluster:

Parameter
Type
Description
Necessity

<collection-name>|*

string

Either the name of the collection to which Data Federation maps the data contained in the federated database instance store or the wildcard "*" to dynamically create collections.

You can generate collection names dynamically by specifying * for the collection name and omitting the dataSources.collection parameter.

For dynamically generated databases, you can generate wildcard collections by specifying * for the collection name and by omitting the following parameters:

  • dataSources.collection

  • dataSources.database

For wildcard (*) collections, you can use the dataSources.collectionRegex parameter to generate collections with names that match a regex pattern.

Required

dataSources

object

Array of objects where each object represents a federated database instance store in the stores array to map with the collection.

Required

dataSources.storeName

string

Name of a federated database instance store to map to the collection. The value must match the stores.[n].name in the stores array.

Required

dataSources.database

string

Name of the database that contains the collection in the Atlas cluster. You must omit this parameter to generate wildcard (*) collections for dynamically generated databases.

Conditional

dataSources.collection

string

Name of the collection in the Atlas database. For creating a wildcard (*) collection, you must omit this parameter.

Conditional

dataSources.collectionRegex

string

Regex pattern to use for creating the wildcard (*) collection. To learn more about the regex syntax, see the Go programming language documentation.

To use regex patterns for wildcard (*) collection names, you must not specify the dataSources.collection parameter.

Optional

Parameters for HTTP:

Parameter
Type
Description
Necessity

<collection-name>

string

The name of the collection to which Atlas Data Federation maps the data contained in the federated database instance store. You can't generate collection names dynamically by specifying *.

Required

dataSources

object

Array of objects where each object represents a federated database instance store in the stores array to map with the collection.

Required

dataSources.storeName

string

Name of a federated database instance store to map to the collection. The value must match the stores.[n].name in the stores array.

Required

dataSources.allowInsecure

boolean

Validates the scheme in the specified URLs. Value can be one of the following:

  • true to allow insecure HTTP scheme

  • false to only allow secure HTTPS scheme (default)

If true, Atlas Data Federation:

  • Does not verify the server's certificate chain and hostname.

  • Accepts any certificate with any hostname presented by the server.

WARNING: If you set this to true, your data might become vulnerable to a man-in-the-middle attack, which can compromise the confidentiality and integrity of your data. Set this to true only for testing and getting started with Atlas Data Federation.

If omitted, defaults to false.

Optional

dataSources.urls

array of strings or empty array

The URLs of the publicly accessible data files. You can't specify URLs that require authentication. Atlas Data Federation creates a partition for each URL. If empty or omitted, Atlas Data Federation uses the URLs from the store specified in the dataSources.storeName parameter.

Required

dataSources.defaultFormat

string

Format that Data Federation defaults to if it encounters a file without an extension while querying the federated database instance store. The following values are valid:

.json, .json.gz, .bson, .bson.gz, .avro, .avro.gz, .orc, .tsv, .tsv.gz, .csv, .csv.gz, .parquet

If omitted, Data Federation attempts to detect the file type by processing a few bytes of the file.

If included, the specified format only applies to the URLs in the dataSource.

Optional
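For example, here is a minimal sketch, assuming a hypothetical store named egHttpStore and a hypothetical plain-HTTP URL; it sets allowInsecure to true, which is appropriate only for testing:

use sampleDB
db.runCommand({ "create" : "insecureData", "dataSources" : [{ "storeName" : "egHttpStore", "allowInsecure" : true, "urls" : [ "http://example.com/data.json" ], "defaultFormat" : ".json" }]})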

Parameters for an Online Archive:

Parameter
Type
Description
Necessity

<collection-name>

string

The name of the collection to which Atlas Data Federation maps the data contained in the federated database instance store. To dynamically generate collection names, you must do the following:

  • Set the value of the <collection-name> field to *.

  • Provide values for the datasetPrefix and trimLevel fields.

  • Omit the datasetName field.

Required

dataSources

object

Array of objects where each object represents a federated database instance store in the stores array to map with the collection.

You can specify multiple dataSources for a wildcard collection only if all the dataSources for the collection map to online archive stores.

Required

dataSources.storeName

string

Name of a federated database instance store to map to the collection. The value must match the stores.[n].name in the stores array.

Required

datasetName

string

Name of the online archive dataset to map with the collection. The online archive datasetName is in the following format:

<version>$<type>$<subtype>$<clusterName>$<dbName>$<collectionName>$<snapshotId>

You can't specify the datasetName for wildcard collections. You can't specify the datasetName for non-wildcard collections if you specify datasetPrefix.

Conditional

datasetPrefix

string

Required for wildcard collections. Optional for non-wildcard collections.

Dataset name prefix to match against online archive dataset names.

If you specify this parameter for wildcard collections, Atlas Data Federation maps collections only to the datasets whose names begin with the datasetPrefix that you specify.

If you specify this parameter for non-wildcard collections, Atlas Data Federation maps the latest dataset (for the most recently captured snapshot) to the collection. You must omit datasetName to specify this parameter for non-wildcard collections.

Conditional

maxDatasets

int

For wildcard collections only.

Maximum number of datasets from which to dynamically generate collections for the data source. Value must be greater than 0. Atlas Data Federation returns datasets in reverse alphabetical order.

Optional

trimLevel

int

For wildcard collections only.

Number of $-delimited fields to trim from the left of the dataset name before mapping the remaining fields to a wildcard collection name. Value must be greater than 0. If omitted, defaults to 0.

Optional
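To illustrate trimLevel with a hypothetical dataset name, a trimLevel of 4 trims the first four $-delimited fields (v1, atlas, snapshot, and testCluster) from v1$atlas$snapshot$testCluster$sample_airbnb$listingsAndReviews$20220602T124437Z, so the wildcard collection is named sample_airbnb_listingsAndReviews_20220602T124437Z.

The following is a minimal sketch, assuming the adlStore store from the examples below, that maps the latest matching dataset to a non-wildcard collection with the hypothetical name latestListings; datasetName is omitted, as required when you specify datasetPrefix for a non-wildcard collection:

use sampleDB
db.runCommand({ "create" : "latestListings", "dataSources" : [{ "storeName" : "adlStore", "datasetPrefix" : "v1$atlas$snapshot$testCluster$sample_airbnb$listingsAndReviews" }]})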

Parameters for a standard view:

Parameter
Type
Description
Necessity

<view-name>

string

The name of the view. A view name must be unique. It cannot be the same as a collection name or any other view name in the same database.

Required

viewOn

string

The name of the source collection on which to create the view.

Required

pipeline

array of stages

The array of aggregation pipeline stages to use to create the view.

The view definition pipeline cannot include the $out or the $merge stage, even inside nested pipeline stages like $lookup or $facet.

Required

Parameters for an SQL view:

Parameter
Type
Description
Necessity

<view-name>

string

The name of the view. A view name must be unique. It cannot be the same as a collection name or any other view name in the same database.

Required

pipeline

array of stages

The aggregation pipeline stages to apply to the collection. For SQL views, the pipeline must start with a $sql stage that specifies a source collection in the SQL statement.

Required

Output

The command returns the following output if it succeeds. You can verify the results by running the commands in Verify Collection. If it fails, see Troubleshoot Errors below for recommended solutions.

{ ok: 1 }

Examples

The following examples use the sample airbnb data on an AWS S3 store with the following settings:

Store Name: egS3Store
Region: us-east-2
Bucket: test-data-federation
Prefix: json
Delimiter: /
Sample Dataset: airbnb

The following examples use the sample airbnb data on an Azure Blob Storage container with the following settings:

Store Name: egAzureStore
Prefix: sample
Delimiter: /
Sample Dataset: airbnb

The following examples use the sample airbnb data on a Google Cloud store with the following settings:

Store Name: egGCStore
Region: us-central1
Bucket: test-data-federation
Prefix: json
Delimiter: /
Sample Dataset: airbnb

The following examples use the sample_airbnb.listingsAndReviews collection from the sample dataset on the Atlas cluster with the following settings:

Store Name: egAtlasStore
Sample Dataset: sample_airbnb.listingsAndReviews

See Load Sample Data into Your Atlas Cluster to load the sample dataset into your Atlas cluster.

The following examples use the following URLs:

  • https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json

  • https://atlas-data-lake.s3.amazonaws.com/json/sample_weatherdata/data.json

The following examples use an online archive with the following settings:

Store Name: adlStore
Online Archive Name: v1$atlas$archive$testCluster$sample_airbnb$219eb1cb-20a6-4ce3-800a-aaefd6c227c6$66d512939b1fa57fe057aa22

The following command creates a collection named airbnb in the sampleDB database in the storage configuration.

The airbnb collection maps to the airbnb sample dataset in the json folder in the S3 store named egS3Store.

use sampleDB
db.runCommand({ "create" : "airbnb", "dataSources" : [{ "storeName" : "egS3Store", "path" : "/json/airbnb", "defaultFormat" : ".json" }]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb
db.runCommand({"storageGetConfig" : 1 })
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egS3Store",
      "provider" : "s3",
      "region" : "us-east-2",
      "bucket" : "test-data-federation",
      "delimiter" : "/",
      "prefix" : ""
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "airbnb",
        "dataSources" : [{
          "storeName" : "egS3Store",
          "path" : "/json/airbnb",
          "defaultFormat" : ".json"
        }]
      }]
    }]
  }
}

The airbnb collection maps to the airbnb sample dataset in the sample folder in the Azure store named egAzureStore.

use sampleDB
db.runCommand({ "create" : "airbnb", "dataSources" : [{ "storeName" : "egAzureStore", "path" : "/sample/airbnb", "defaultFormat" : ".json" }]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb

The airbnb collection maps to the airbnb sample dataset in the json folder in the Google Cloud store named egGCStore.

use sampleDB
db.runCommand({ "create" : "airbnb", "dataSources" : [{ "storeName" : "egGCStore", "path" : "/json/airbnb", "defaultFormat" : ".json" }]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb
db.runCommand({"storageGetConfig" : 1 })
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egGCStore",
      "provider" : "gcs",
      "region" : "us-central1",
      "bucket" : "test-data-federation",
      "delimiter" : "/",
      "prefix" : ""
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "airbnb",
        "dataSources" : [{
          "storeName" : "egGCStore",
          "path" : "/json/airbnb",
          "defaultFormat" : ".json"
        }]
      }]
    }]
  }
}

The airbnb collection maps to the listingsAndReviews sample collection in the sample_airbnb database on the Atlas cluster.

use sampleDB
db.runCommand({ "create" : "airbnb", "dataSources" : [{ "storeName" : "egAtlasStore", "database" : "sample_airbnb", "collection" : "listingsAndReviews" }]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb
db.runCommand({"storageGetConfig":1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egAtlasStore",
      "provider" : "atlas",
      "clusterName" : "myTestCluster",
      "projectId" : "<project-id>"
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "airbnb",
        "dataSources" : [{
          "storeName" : "egAtlasStore",
          "database" : "sample_airbnb",
          "collection" : "listingsAndReviews"
        }]
      }]
    }]
  }
}

The airbnb collection includes a partition for each URL specified in dataSources.urls. The allowInsecure flag is not set and defaults to false.

use sampleDB
db.runCommand({ "create" : "airbnb", "dataSources" : [{ "storeName" : "http-store", "urls": ["https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json","https://atlas-data-lake.s3.amazonaws.com/json/sample_weatherdata/data.json"], "defaultFormat" : ".json" }]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb
db.runCommand({"storageGetConfig":1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [
      {
        "name" : "http-store",
        "provider" : "http",
        "urls" : [
          "https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json",
          "https://atlas-data-lake.s3.amazonaws.com/json/sample_weatherdata/data.json"
        ],
        "defaultFormat" : ".json"
      }
    ],
    "databases" : [
      {
        "name" : "sampleDB",
        "collections" : [
          {
            "name" : "airbnb",
            "dataSources" : [
              {
                "storeName" : "http-store",
                "defaultFormat" : ".json",
                "urls" : [
                  "https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json",
                  "https://atlas-data-lake.s3.amazonaws.com/json/sample_weatherdata/data.json"
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}

The airbnb collection maps to the online archive dataset for the sample_airbnb.listingsAndReviews collection.

use sampleDB
db.runCommand({ "create" : "airbnb", "dataSources" : [{ "storeName" : "adlStore", "datasetName" : "v1$atlas$snapshot$testCluster$sample_airbnb$listingsAndReviews" }]})
{ "ok" : 1 }

The following examples show how to specify the wildcard "*" with the create command.

The following example uses the create command to dynamically create collections for the files in the path /json/ in the egS3Store federated database instance store. It uses the collectionName() function to name the collections after the filenames in the specified path.

use sampleDB
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "egS3Store", "path": "/json/{collectionName()}"}]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb
db.runCommand({"storageGetConfig" : 1 })
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egS3Store",
      "provider" : "s3",
      "region" : "us-east-2",
      "bucket" : "test-data-federation",
      "delimiter" : "/",
      "prefix" : ""
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "*",
        "dataSources" : [{
          "storeName" : "egS3Store",
          "path" : "/json/{collectionName()}"
        }]
      }]
    }]
  }
}

The following example uses the create command to create a collection named egCollection that maps to an Atlas Data Federation store named egS3Store. The egS3Store store contains the sample dataset, airbnb, in a folder named json.

use sampleDB
db.runCommand({ "create" : "egCollection", "dataSources" : [{ "storeName" : "egS3Store", "path": "/json/*"}]}})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
egCollection
db.runCommand({"storageGetConfig" : 1 })
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egS3Store",
      "provider" : "s3",
      "region" : "us-east-2",
      "bucket" : "test-data-federation",
      "delimiter" : "/",
      "prefix" : ""
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "egCollection",
        "dataSources" : [{
          "storeName" : "egS3Store",
          "path" : "/json/*"
        }]
      }]
    }]
  }
}

The following examples show how to specify a wildcard "*" with the create command.

The following example uses the create command to dynamically create collections for the files in the path /sample/ in the egAzureStore federated database instance store. It uses the collectionName() function to name the collections after the filenames in the specified path.

use sampleDB
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "egAzureStore", "path": "/json/{collectionName()}"}]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb

The following example uses the create command to create a collection named egCollection that maps to an Atlas Data Federation store named egAzureStore. The egAzureStore store contains the sample dataset, airbnb, in a folder named sample.

use sampleDB
db.runCommand({ "create" : "egCollection", "dataSources" : [{ "storeName" : "egAzureStore", "path": "/sample/*"}]}})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
egCollection

The following examples show how to specify the wildcard "*" with the create command.

The following example uses the create command to dynamically create collections for the files in the path /json/ in the egGCStore federated database instance store. It uses the collectionName() function to name the collections after the filenames in the specified path.

use sampleDB
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "egGCStore", "path": "/json/{collectionName()}"}]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
airbnb
db.runCommand({"storageGetConfig" : 1 })
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egGCStore",
      "provider" : "gcs",
      "region" : "us-central1",
      "bucket" : "test-data-federation",
      "delimiter" : "/",
      "prefix" : ""
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "*",
        "dataSources" : [{
          "storeName" : "egGCStore",
          "path" : "/json/{collectionName()}"
        }]
      }]
    }]
  }
}

The following example uses the create command to create a collection named egCollection that maps to an Atlas Data Federation store named egGCStore. The egGCStore store contains the sample dataset, airbnb, in a folder named json.

use sampleDB
db.runCommand({ "create" : "egCollection", "dataSources" : [{ "storeName" : "egGCStore", "path": "/json/*"}]}})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
egCollection
db.runCommand({"storageGetConfig" : 1 })
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egGCStore",
      "provider" : "gcs",
      "region" : "us-central1",
      "bucket" : "test-data-federation",
      "delimiter" : "/",
      "prefix" : ""
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "egCollection",
        "dataSources" : [{
          "storeName" : "egGCStore",
          "path" : "/json/*"
        }]
      }]
    }]
  }
}

These examples show how the wildcard "*" can be specified with the create command.

The following example uses the create command to dynamically create collections for the documents in the sample_airbnb database on the Atlas cluster named myTestCluster.

use sampleDB
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "egAtlasStore", "database": "sample_airbnb"}]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

db.runCommand({storageGetConfig:1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egAtlasStore",
      "provider" : "atlas",
      "clusterName" : "myTestCluster",
      "projectId" : "<project-id>"
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "*",
        "dataSources" : [{
          "storeName" : "egAtlasStore",
          "database" : "sample_airbnb"
        }]
      }]
    }]
  }
}
show collections
listingsAndReviews

The following example uses the create command to dynamically create collections whose names match a specified regex pattern in the sample_airbnb database on the Atlas cluster named myTestCluster.

use sampleDB
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "egAtlasStore", "database": "sample_airbnb", "collectionRegex" : "^list" }]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

db.runCommand({storageGetConfig:1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egAtlasStore",
      "provider" : "atlas",
      "clusterName" : "myTestCluster",
      "projectId" : "<project-id>"
    }],
    "databases" : [{
      "name" : "sampleDB",
      "collections" : [{
        "name" : "*",
        "dataSources" : [{
          "storeName" : "egAtlasStore",
          "database" : "sample_airbnb",
          "collectionRegex" : "^list"
        }]
      }]
    }]
  }
}
show collections
listingsAndReviews

The following example uses the create command to dynamically create collections for dynamically created databases.

use *
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "egAtlasStore" }]})
{ "ok" : 1 }

The following command shows that the collection was successfully created:

db.runCommand({storageGetConfig:1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [{
      "name" : "egAtlasStore",
      "provider" : "atlas",
      "clusterName" : "myTestCluster",
      "projectId" : "<project-id>"
    }],
    "databases" : [{
      "name" : "*",
      "collections" : [{
        "name" : "*",
        "dataSources" : [{
          "storeName" : "egAtlasStore"
        }]
      }]
    }]
  }
}

Wildcard "*" collections are not supported for HTTP federated database instance stores.

The following example uses the create command to dynamically create collections whose names match a specified prefix name for the online archive.

use sample
db.runCommand({ "create" : "*", "dataSources" : [{ "storeName" : "adlStore", "datasetPrefix": "v1$atlas$snapshot$testCluster$sample_airbnb$listingsAndReviews", "trimLevel": 4 }]})
{ ok: 1 }

The following command shows that the collection was successfully created:

show collections
sample_airbnb_listingsAndReviews_20220602T124437Z
sample_airbnb_listingsAndReviews_20220603T124603Z
sample_airbnb_listingsAndReviews_20220604T124420Z
sample_airbnb_listingsAndReviews_20220605T124454Z
sample_airbnb_listingsAndReviews_20220606T124419Z
sample_airbnb_listingsAndReviews_20220607T124515Z

The following command creates a collection named egCollection in the sampleDB database in the storage configuration. The egCollection collection maps to the following sample datasets:

  • airbnb dataset in the json folder in the S3 store named egS3Store

  • airbnb dataset in the sample_airbnb.listingsAndReviews collection on the Atlas cluster named myTestCluster

  • airbnb dataset in the URL https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json

Warning

You can't create a federated database instance that maps to both AWS S3 buckets and Azure Blob Storage containers. Atlas Data Federation doesn't support federated queries across different cloud providers.

use sampleDB
db.runCommand({ "create" : "egCollection", "dataSources" : [{ "storeName" : "egS3Store", "path" : "/json/airbnb" },{ "storeName" : "egAtlasStore", "database": "sample_airbnb", "collection": "listingsAndReviews" },{"storeName" : "egHttpStore", "urls": ["https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json"]}]})
{ "ok" : 1 }

The following commands show that the collection was successfully created:

show collections
egCollection
db.runCommand({"storageGetConfig":1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [
      {
        "name" : "egS3Store",
        "provider" : "s3",
        "region" : "us-east-2",
        "bucket" : "test-data-federation",
        "delimiter" : "/",
        "prefix" : ""
      },
      {
        "name" : "egAtlasStore",
        "provider" : "atlas",
        "clusterName" : "myTestCluster",
        "projectId" : "<project-id>"
      },
      {
        "name" : "egHttpStore",
        "provider" : "http",
        "urls" : ["https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json"]
      }
    ],
    "databases" : [
      {
        "name" : "sampleDB",
        "collections" : [{
          "name" : "egCollection",
          "dataSources" : [
            {
              "storeName" : "egS3Store",
              "path" : "/json/airbnb"
            },
            {
              "storeName" : "egAtlasStore",
              "database" : "sample_airbnb",
              "collection" : "listingsAndReviews"
            },
            {
              "storeName" : "egHttpStore",
              "urls" : ["https://atlas-data-lake.s3.amazonaws.com/json/sample_airbnb/listingsAndReviews.json"]
            }
          ]
        }]
      }
    ]
  }
}

The following command creates a view named listings on the airbnb collection in the sampleDB database with the name and property_type fields:

use sampleDB
db.runCommand({ "create" : "listings", "viewOn" : "airbnb", "pipeline" : [{$project: {"property_type":1, "name": 1}}] })
{ "ok" : 1 }
use sampleDB
db.runCommand({ "create" : "listings", "pipeline": [{$sql: {statement: "SELECT property_type, name FROM airbnb"} }] })
{ "ok" : 1 }

The listCollections and storageGetConfig commands return the following output:

db.runCommand({"listCollections":1})
{
  "ok" : 1,
  "cursor" : {
    "firstBatch" : [
      {
        "name" : "airbnb",
        "type" : "collection",
        "info" : {
          "readOnly" : true
        }
      },
      {
        "name" : "listings",
        "type" : "view",
        "info" : {
          "readOnly" : true
        }
      }
    ],
    "id" : NumberLong(0),
    "ns" : "egS3Store.$cmd.listCollections"
  }
}
db.runCommand({"storageGetConfig":1})
{
  "ok" : 1,
  "storage" : {
    "stores" : [
      {
        "name" : "egS3Store",
        "provider" : "s3",
        "region" : "us-east-2",
        "bucket" : "test-data-federation",
        "delimiter" : "/"
      }
    ],
    "databases" : [
      {
        "name" : "sampleDB",
        "collections" : [
          {
            "name" : "airbnb",
            "dataSources" : [
              {
                "storeName" : "egS3Store",
                "path" : "json/airbnb/*"
              }
            ]
          },
          {
            "name" : "*",
            "dataSources" : [
              {
                "storeName" : "egS3Store",
                "path" : "json/{collectionName()}"
              }
            ]
          }
        ],
        "views" : [
          {
            "name" : "listings",
            "source" : "airbnb",
            "pipeline" : "[{\"$project\":{\"property_type\":{\"$numberInt\":\"1\"},\"name\":{\"$numberInt\":\"1\"}}}]"
          }
        ]
      }
    ]
  }
}

Verify Collection

You can verify that the command successfully created the collection or view by running any of the following commands:

show collections
db.runCommand({ "storageGetConfig" : 1 })
db.runCommand({ "listCollections" : 1 })

Troubleshoot Errors

If the command fails, it returns one of the following errors:

Store Name Does Not Exist

{
"ok" : 0,
"errmsg" : "store name does not exist",
"code" : 9,
"codeName" : "FailedToParse"
}

Solution: Ensure that the specified storeName matches the name of a store in the stores array. You can run the listStores command to retrieve the list of stores in your federated database instance storage configuration.

Collection Name Already Exists

{
"ok" : 0,
"errmsg" : "collection name already exists in the database",
"code" : 9,
"codeName" : "FailedToParse"
}

Solution: Ensure that the collection name is unique. You can run the show collections command to retrieve the list of existing collections.

If the command fails, it returns the following error:

View Name Exists

{
"ok" : 0,
"errmsg" : "a view '<database>.<view>' already exists, correlationID = <1603aaffdbc91ba93de6364a>",
"code" : 48,
"codeName" : "NamespaceExists"
}

Solution: Ensure that the view name is unique. You can run the listCollections command to retrieve the list of existing views on a collection.
