View Atlas Data Lake Pipelines - Preview
You can view all of your Data Lake pipelines, or the details of a specific Data Lake pipeline in your project, through the Atlas UI, the Data Lake Pipelines API, or the Atlas CLI. You can also retrieve all of your completed Data Lake pipeline data ingestion jobs through the API or the Atlas CLI.
Procedure
To return all data lake pipelines for your project using the Atlas CLI, run the following command:
atlas dataLakePipelines list [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataLakePipelines list.
View Details of an Atlas Data Lake Pipeline
To return the details for the specified data lake pipeline for your project using the Atlas CLI, run the following command:
atlas dataLakePipelines describe <pipelineName> [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataLakePipelines describe.
View All Available Schedules for an Atlas Data Lake Pipeline
To return all available schedules for the specified data lake pipeline using the Atlas CLI, run the following command:
atlas dataLakePipelines availableSchedules list [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataLakePipelines availableSchedules list.
View All Available Backup Snapshots for an Atlas Data Lake Pipeline
To return all available backup snapshots for the specified data lake pipeline using the Atlas CLI, run the following command:
atlas dataLakePipelines availableSnapshots list [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataLakePipelines availableSnapshots list.
View Atlas Data Lake Pipeline Runs
To return all data lake pipeline runs for your project using the Atlas CLI, run the following command:
atlas dataLakePipelines runs list [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataLakePipelines runs list.
View Details of an Atlas Data Lake Pipeline Run
To return the details for the specified data lake pipeline run for your project using the Atlas CLI, run the following command:
atlas dataLakePipelines runs describe <pipelineRunId> [options]
To learn more about the command syntax and parameters, see the Atlas CLI documentation for atlas dataLakePipelines runs describe.
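The Atlas CLI commands above can also be driven from a script. The following is a minimal sketch, not an official client: the helper names `build_atlas_command` and `run_atlas_command` are hypothetical, and the `--pipeline`, `--projectId`, and `--output json` flags are assumptions that you should confirm against the Atlas CLI documentation for each command.

```python
import json
import subprocess
from typing import List, Optional


def build_atlas_command(
    subcommand: List[str],
    project_id: Optional[str] = None,
    pipeline: Optional[str] = None,
) -> List[str]:
    """Build the argument list for an `atlas dataLakePipelines` command.

    `subcommand` holds the words that follow `dataLakePipelines`,
    for example ["list"] or ["runs", "describe", "<pipelineRunId>"].
    """
    argv = ["atlas", "dataLakePipelines", *subcommand]
    if pipeline is not None:
        argv += ["--pipeline", pipeline]  # assumed flag name
    if project_id is not None:
        argv += ["--projectId", project_id]  # assumed flag name
    argv += ["--output", "json"]  # assumed flag; asks for JSON output
    return argv


def run_atlas_command(argv: List[str]):
    """Run the command and parse its standard output as JSON.

    Requires an installed and authenticated Atlas CLI on the PATH.
    Raises subprocess.CalledProcessError if the command fails.
    """
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

For example, `run_atlas_command(build_atlas_command(["list"]))` would run the list command and return the parsed result. Building the argument list separately from running it keeps the command easy to inspect before anything executes.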
To retrieve all your Data Lake pipelines for a project through the API, send a GET request to the Data Lake pipelines endpoint. To learn more about the pipelines endpoint syntax and parameters for retrieving all of your Data Lake pipelines, see Return All Data Lake Pipelines from One Project.
To retrieve one of your Data Lake pipelines through the API, send a GET request to the Data Lake pipelines endpoint with the name of the Data Lake pipeline that you want to retrieve. To learn more about the pipelines endpoint syntax and parameters for retrieving one of your Data Lake pipelines, see Return One Data Lake Pipeline.
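As an illustration of the two GET requests described above, here is a minimal Python sketch that uses only the standard library. Everything specific in it is an assumption to check against the API reference: the base URL, the `/groups/{groupId}/pipelines` path, the versioned `Accept` header, and HTTP digest authentication with a programmatic API key pair. The function names `pipelines_url` and `atlas_get` are hypothetical.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import quote

# Assumed values; confirm them in the Atlas Administration API reference.
BASE_URL = "https://cloud.mongodb.com/api/atlas/v2"
ACCEPT = "application/vnd.atlas.2023-01-01+json"


def pipelines_url(group_id: str, pipeline_name: Optional[str] = None) -> str:
    """Return the URL for all pipelines in a project, or for one pipeline by name."""
    url = f"{BASE_URL}/groups/{quote(group_id, safe='')}/pipelines"
    if pipeline_name is not None:
        # Percent-encode the name so spaces and slashes stay inside one path segment.
        url += "/" + quote(pipeline_name, safe="")
    return url


def atlas_get(url: str, public_key: str, private_key: str):
    """Send an authenticated GET request and return the decoded JSON body."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, url, public_key, private_key)
    opener = urllib.request.build_opener(
        urllib.request.HTTPDigestAuthHandler(password_mgr)
    )
    request = urllib.request.Request(url, headers={"Accept": ACCEPT})
    with opener.open(request) as response:
        return json.load(response)
```

Calling `atlas_get(pipelines_url(group_id), public_key, private_key)` would request every pipeline in the project, and passing a pipeline name to `pipelines_url` would request just that pipeline.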
View Atlas Data Lake Pipeline Runs
To retrieve all the completed Data Lake pipeline data ingestion jobs for a project through the API, send a GET request to the Data Lake runs endpoint. To learn more about the API syntax and options for the runs endpoint, see Return All Data Lake Pipeline Runs from One Project.
To retrieve the details of one of your completed Data Lake pipeline data ingestion jobs through the API, send a GET request to the Data Lake runs endpoint with the unique identifier of the completed Data Lake pipeline data ingestion job that you want to retrieve. To learn more about the API syntax and options for the runs endpoint, see Return One Data Lake Pipeline Run.
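The runs requests differ from each other only in whether the run's unique identifier is appended to the path. The sketch below shows that URL construction. The base URL and the `/groups/{groupId}/pipelines/{pipelineName}/runs` path are assumptions to verify against the API reference, and `runs_url` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import quote

# Assumed base URL; confirm it in the Atlas Administration API reference.
BASE_URL = "https://cloud.mongodb.com/api/atlas/v2"


def runs_url(group_id: str, pipeline_name: str, run_id: Optional[str] = None) -> str:
    """Return the URL for all runs of a pipeline, or for one run by its identifier."""
    url = (
        f"{BASE_URL}/groups/{quote(group_id, safe='')}"
        f"/pipelines/{quote(pipeline_name, safe='')}/runs"
    )
    if run_id is not None:
        url += "/" + quote(run_id, safe="")
    return url
```

The resulting URL can be sent as a GET request with whichever HTTP client and Atlas API credentials you already use.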
Log in to MongoDB Atlas.
Go to Atlas Data Lake in the Atlas UI.
If it's not already displayed, select the organization that contains your project from the Organizations menu in the navigation bar.
If it's not already displayed, select your project from the Projects menu in the navigation bar.
In the sidebar, click Data Lake under the Deployment heading.
View Data Lake pipelines.
The page displays all the Data Lake pipelines in the project. For each Data Lake pipeline, the service also displays the following information:
Column Name | Description |
---|---|
Pipeline Name | Name of your Data Lake pipeline. Each pipeline can produce multiple datasets. You can expand the name to view the datasets in the pipeline. |
Data Source | Source for the data in the pipeline datasets. For data from a collection on the Atlas cluster, this column shows the cluster name, the database name, and the collection name separated by the pipe character (\|). |
Data Size | Size of data for each dataset. |
Last Run Time | Date and time when the pipeline ran to ingest data for each dataset. |
Status | Status of the pipeline. Value can be one of the following for a pipeline: |
Frequency | Frequency at which cluster data is ingested and stored for querying. |
Actions | Actions you can take for each pipeline. You can click one of the following: |