Review Alert Conditions

On this page

  • Host Alerts
  • Query Targeting Alerts
  • Cloud Backup Alerts
  • Replica Set Alerts
  • Sharded Cluster Alerts
  • App Services Alerts
  • Serverless Alerts
  • User Alerts
  • Project Alerts
  • Billing Alerts
  • Service Account Alerts
  • Federation Alerts
  • Encryption at Rest Alerts
  • Maintenance Window Alerts
  • MongoDB Support Access Grant Alerts
  • Atlas Stream Processing Alerts

This page describes the conditions for which you can trigger alerts. You specify conditions and thresholds when configuring alerts. To learn more, see Alerts Workflow.

Note

M0 Free clusters and M2/M5 Shared clusters only trigger alerts related to the metrics supported by those clusters. See Atlas M0 (Free Cluster), M2, and M5 Limits for complete documentation on M0/M2/M5 alert and metric limitations.

The conditions in this section apply if you select Host as the alert target when configuring the alert. You can apply the condition to all hosts or to a specific type of host, such as primaries or config servers.

Important

During live migration, Atlas disables host alerts.

Atlas triggers certain host alerts based on cluster monitoring, so these alerts are subject to variations in granularity. To learn more, see Monitoring Data Storage Granularity.

Host has index suggestions

Raised if Performance Advisor has index suggestions for the host.

If the query targeting ratio for a host is greater than 8000 and if Performance Advisor determines that the host benefits from one or more indexes to improve performance of inefficient queries, this alert triggers and directs you to create the suggested indexes.

This alert is only available for M10+ clusters, and is enabled by default for M10+ clusters that have Performance Advisor enabled. This alert does not trigger for clusters where Performance Advisor is disabled.

The following alert conditions measure the rate of asserts for a MongoDB process, as collected from the MongoDB serverStatus command's asserts document. You can view asserts through cluster monitoring.

Asserts: Msg is

Raised if the rate of message asserts meets the specified threshold. Message asserts are internal server errors. Stack traces are logged for these.

Asserts: Regular is

Raised if the rate of regular asserts meets the specified threshold.

Asserts: User is

Raised if the rate of errors generated by users meets the specified threshold.

Asserts: Warning is

Raised if the rate of warnings meets the specified threshold.
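As a rough illustration of how a rate can be derived from the cumulative counters in the asserts document, the sketch below compares two serverStatus samples. The function names, the sampling interval, and the counter-reset handling are assumptions for illustration; this page does not describe how Atlas computes these rates internally.

```python
ASSERT_KEYS = ("msg", "regular", "user", "warning")


def counter_rate(previous, current, seconds):
    """Return the per-second rate between two cumulative counter samples."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = current - previous
    if delta < 0:
        # Assumption: a smaller value means the process restarted and the
        # counter began again from zero, so count only the new value.
        delta = current
    return delta / seconds


def assert_rates(previous_doc, current_doc, seconds):
    """Compute a per-second rate for each assert type in two sample documents."""
    return {
        key: counter_rate(previous_doc[key], current_doc[key], seconds)
        for key in ASSERT_KEYS
    }
```

For example, if the user counter moves from 100 to 220 over a 60-second interval, the sketch reports 2 user asserts per second, which you would then compare against the threshold configured for Asserts: User is.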

You can configure alerts for the following cluster events. View the activity feed to review all auto-scaling events that occurred.

For each event in this section, to receive alerts, you must first configure an alert to notify you or members of your organization of this type of auto-scaling event.

To learn how Atlas scales a cluster up or down, see Configure Auto-Scaling.

Auto-scaling: Compute auto-scaling initiated for base tier

Raised if Atlas starts compute auto-scaling for any of the operational nodes in your dedicated cluster. Atlas can scale disk capacity as part of this event.

Auto-scaling: Compute auto-scaling initiated for analytics tier

Raised if Atlas starts compute auto-scaling for any of the analytics nodes in your dedicated cluster. Atlas can scale disk capacity as part of this event.

Auto-scaling: Compute auto-scaling down didn't initiate for base tier due to storage requirements

Raised if Atlas couldn't start compute auto-scaling for any of the operational nodes in your dedicated cluster as the configured storage size isn't supported by the target cluster tier.

Auto-scaling: Compute auto-scaling down didn't initiate for analytics tier due to storage requirements

Raised if Atlas couldn't start compute auto-scaling for any of the analytics nodes in your dedicated cluster as the configured storage size isn't supported by the target cluster tier.

Auto-scaling: Compute auto-scaling didn't initiate for base tier due to maximum configured cluster tier

Raised if Atlas couldn't scale up an operational node because your cluster reached a maximum cluster tier configured for auto-scaling.

Auto-scaling: Compute auto-scaling didn't initiate for analytics tier due to maximum configured cluster tier

Raised if Atlas couldn't scale up an analytics node because your cluster reached a maximum cluster tier configured for auto-scaling.

Auto-scaling: Compute auto-scaling didn't initiate for base tier due to insufficient oplog size

Raised if Atlas couldn't scale up an operational node due to insufficient oplog size. To learn more, see Set Minimum Oplog Window.

Auto-scaling: Compute auto-scaling didn't initiate for analytics tier due to insufficient oplog size

Raised if Atlas couldn't scale up an analytics node due to insufficient oplog size. To learn more, see Set Minimum Oplog Window.

Auto-scaling: Disk auto-scaling initiated

Raised if Atlas starts auto-scaling disk capacity.

Auto-scaling: Disk auto-scaling didn't initiate due to the cluster reaching maximum available disk size

Raised if Atlas couldn't scale up the disk size because the cluster has reached maximum available disk size.

Auto-scaling: Disk auto-scaling didn't initiate due to insufficient oplog size

Raised if Atlas couldn't scale up the disk size because the cluster's oplog size isn't sufficient.

The following alert conditions measure the amount of CPU and memory used by Atlas Search processes. You can view Atlas Search metrics through cluster monitoring.

Atlas Search: Index Replication Lag is

Raised if the approximate number of milliseconds that Atlas Search is behind in replicating changes from the oplog of mongod is above or below the threshold.

Atlas Search: Index Size on Disk is

Raised if the total size of all Atlas Search indexes on disk in bytes is above or below the threshold.

Atlas Search: Max Number of Lucene Docs is

Raised if the upper bound on the number of Lucene docs used to store Atlas Search indexes for a given replica set or shard is above the threshold.

Atlas Search: Mongot stopped replication

Raised on dedicated search nodes if the replication is interrupted by the Atlas Search mongot process due to high disk utilization.

Atlas Search: Number of Error Queries is

Raised if the number of queries for which Atlas Search is unable to return a response is above or below the threshold.

Atlas Search: Number of Index Fields is

Raised if the total number of unique fields present in the Atlas Search index is above or below the threshold.

Atlas Search: Number of Successful Queries is

Raised if the number of queries for which Atlas Search successfully returned a response is above or below the threshold.

Atlas Search: Total Number of Queries is

Raised if the number of queries submitted to Atlas Search is above or below the threshold.

Atlas Search Opcounter: Delete is

Raised if the total number of documents or fields (specified in the index definition) removed per second is above or below the threshold.

Atlas Search Opcounter: Getmore is

Raised if the total number of getmore commands run on all Atlas Search queries per second is above or below the threshold.

Atlas Search Opcounter: Insert is

Raised if the total number of documents or fields (specified in the index definition) that Atlas Search indexes per second is above or below the threshold.

Atlas Search Opcounter: Update is

Raised if the total number of documents or fields (specified in the index definition) that Atlas Search updates per second is above or below the threshold.

Insufficient disk space to support rebuilding search indexes

Raised if your cluster doesn't have enough free disk space to support rebuilding your Atlas Search indexes.

Search Memory: Resident is

Raised if the total bytes of resident memory occupied by the Atlas Search process is above or below the threshold.

Search Memory: Shared is

Raised if the total bytes of shared memory occupied by the Atlas Search process is above or below the threshold.

Search Memory: Virtual is

Raised if the total bytes of virtual memory occupied by the Atlas Search process is above or below the threshold.

Search Process: CPU (Kernel) % is

Raised if the percentage of time the CPU spent servicing operating system calls for the Atlas Search process is above the threshold.

Search Process: CPU (User) % is

Raised if the percentage of time the CPU spent servicing the Atlas Search process is above the threshold.

Search Process: Disk space used is

Raised if the total bytes of disk space used by the Atlas Search process is above the threshold.

Note

If you apply the condition to all hosts, it applies to dedicated Search Nodes as well.

Search Process: Ran out of memory

Raised if the search process (mongot) runs out of memory. If the search process runs out of memory, indexing and queries fail.

The following alert conditions measure the average execution time of reads, writes, or commands for a MongoDB process, as collected from the MongoDB serverStatus command's opLatencies document. You can view these metrics through cluster monitoring.

Average Execution Time: Commands is

Raised if the average execution time for command operations meets your specified threshold.

Average Execution Time: Reads is

Raised if the average execution time for read operations meets your specified threshold.

Average Execution Time: Writes is

Raised if the average execution time for write operations meets your specified threshold.
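The sketch below shows one way to derive an average execution time from two samples of an opLatencies sub-document (reads, writes, or commands). It relies on the latency field being a cumulative total in microseconds and the ops field being a cumulative operation count. The function name and the choice to return milliseconds are assumptions for illustration, not a description of how Atlas calculates the metric.

```python
def average_latency_ms(previous, current):
    """Average latency in milliseconds for operations between two samples.

    Each argument is one opLatencies sub-document, for example
    {"latency": 1600000, "ops": 300}.
    """
    ops = current["ops"] - previous["ops"]
    if ops <= 0:
        # No new operations (or a counter reset): nothing to average.
        return 0.0
    latency_us = current["latency"] - previous["latency"]
    return latency_us / ops / 1000.0
```

With 200 new operations that added 600,000 microseconds of latency between samples, the result is 3 milliseconds per operation.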

The following alert conditions measure the rate of database operations on a MongoDB process since the process last started, as collected from the MongoDB serverStatus command's opcounters document. You can view opcounters through cluster monitoring.

Opcounter: Cmd is

Raised if the rate of commands performed meets the specified threshold.

Opcounter: Delete is

Raised if the rate of deletes meets the specified threshold.

Opcounter: Getmores is

Raised if the rate of getmore operations to retrieve the next cursor batch meets the specified threshold.

Tip

See also:

To learn more, see Cursor Batches in the MongoDB manual.

Opcounter: Insert is

Raised if the rate of inserts meets the specified threshold.

Opcounter: Query is

Raised if the rate of queries meets the specified threshold.

Opcounter: Update is

Raised if the rate of updates meets the specified threshold.

The following alert conditions measure the rate of database operations on MongoDB secondaries, as collected from the MongoDB serverStatus command's opcountersRepl document. You can view these metrics on the Opcounters - Repl chart, accessed through cluster monitoring.

Opcounter: Repl Cmd is

Raised if the rate of replicated commands meets the specified threshold.

Opcounter: Repl Delete is

Raised if the rate of replicated deletes meets the specified threshold.

Opcounter: Repl Insert is

Raised if the rate of replicated inserts meets the specified threshold.

Opcounter: Repl Update is

Raised if the rate of replicated updates meets the specified threshold.

You might set alerts for the scan and order operations for a MongoDB process.

Operations: Scan and Order is

Raised if the average rate per second of queries that return sorted results but can't perform the sort operation using an index exceeds your specified threshold.

Note

How It's Measured

MongoDB reports scan and order operations using the metrics.operation.scanAndOrder field that the serverStatus command returns.

Logical Size is

Raised if the total size of the data and indexes is outside the specified threshold.

Applicable for Atlas Free Clusters Only

The following alert conditions measure memory for a MongoDB process, as collected from the MongoDB serverStatus command's mem document. You can view these metrics on the Atlas Memory and Non-Mapped Virtual Memory charts, accessed through cluster monitoring.

Memory: Computed is

Raised if the size of virtual memory that is not accounted for by memory-mapping meets the specified threshold. If this number is very high (multiple gigabytes), it indicates that excessive memory is being used outside of memory mapping.

Tip

See also:

To learn how to use this metric, view the Non-Mapped Virtual Memory chart and click the chart's i icon.

Memory: Resident is

Raised if the size of the resident memory meets the specified threshold. It is typical over time, on a dedicated database server, for the size of the resident memory to approach the amount of physical RAM on the box.

Memory: Virtual is

Raised if the size of virtual memory for the mongod process meets the specified threshold. You can use this alert to flag excessive memory outside of memory mapping.

Tip

See also:

To learn more, click the Memory chart's i icon.

System Memory: Available is

Raised if the amount of available system memory drops below the specified threshold.

System Memory: Max Available is

Raised if the maximum amount of available system memory drops below the specified threshold.

System Memory: Max Used is

Raised if the maximum system memory usage value meets the specified threshold.

System Memory: Used is

Raised if the total system memory used minus buffers, cached, and free memory meets the specified threshold.

The following alert condition measures connections to a MongoDB process, as collected from the MongoDB serverStatus command's connections document. You can view this metric on the Atlas Connections chart, accessed through cluster monitoring.

Connections is

Raised if the number of active connections to the host meets the specified average.

Connections % of configured limit is

Raised if the number of open connections to the host exceeds the specified percentage.
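To make the percentage condition concrete, the sketch below computes connection usage from the current and available fields of the serverStatus connections document. Treating current plus available as the configured limit is an assumption made for this example; this page does not state how Atlas determines the limit.

```python
def connections_percent(connections):
    """Percentage of the connection limit in use.

    Assumption: the limit equals current + available.
    """
    current = connections["current"]
    limit = current + connections["available"]
    if limit == 0:
        return 0.0
    return 100.0 * current / limit


def exceeds_limit_percent(connections, threshold_percent):
    """True if usage is strictly above the given percentage."""
    return connections_percent(connections) > threshold_percent
```

A host with 300 open connections and 1,200 still available is at 20% under this assumption, so a threshold of 15% would be exceeded and a threshold of 20% would not.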

The following alert conditions measure operations waiting on locks, as collected from the MongoDB serverStatus command's globalLock document. You can view these metrics on the Atlas Queues chart, accessed through cluster monitoring.

Queues: Readers is

Raised if the number of operations waiting on a read lock meets the specified average.

Queues: Total is

Raised if the number of operations waiting on a lock of any type meets the specified average.

Queues: Writers is

Raised if the number of operations waiting on a write lock meets the specified average.

The following alert condition measures the rate of page faults for a MongoDB process, as collected from the MongoDB serverStatus command's extra_info.page_faults field.

Page Faults is

Raised if the rate of page faults (whether or not an exception is thrown) meets the specified threshold. You can view this metric on the Atlas Page Faults chart, accessed through cluster monitoring.

The following alert conditions measure the number of cursors for a MongoDB process, as collected from the MongoDB serverStatus command's metrics.cursor document. You can view these metrics on the Atlas Cursors chart, accessed through cluster monitoring.

Cursors: Open is

Raised if the number of cursors the server is maintaining for clients meets the specified average.

Cursors: Timed Out is

Raised if the number of timed-out cursors the server is maintaining for clients meets the specified average.

The following alert conditions measure throughput for a MongoDB process, as collected from the MongoDB serverStatus command's network document. You can view these metrics on a host's Network chart, accessed through cluster monitoring.

Network: Bytes In is

Raised if the number of bytes sent to MongoDB meets the specified threshold.

Network: Bytes Out is

Raised if the number of bytes sent from MongoDB meets the specified threshold.

Network: Num Requests is

Raised if the number of requests sent to MongoDB meets the specified average.

The following alert conditions apply to the MongoDB process's oplog. You can view these metrics on the following charts, accessed through cluster monitoring:

  • Oplog GB/Hour

  • Replication Headroom

  • Replication Lag

  • Replication Oplog Window

The following alert conditions apply to the oplog:

Oplog Data Per Hour is

Raised when the amount of data per hour being written to a primary's oplog meets the specified threshold.

Replication Headroom is

Raised when the difference between the sync source member's oplog window and the replication lag time on the secondary meets the specified threshold.

Replication Lag is

Raised if the approximate amount of time that the secondary is behind the primary meets the specified threshold. Atlas calculates replication lag using the approach described in Check the Replication Lag in the MongoDB manual.

Replication Oplog Window is

Raised if the approximate amount of time available in the primary's replication oplog meets the specified threshold.
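The Replication Headroom condition above is a simple difference, which the following sketch spells out. The function name and the use of seconds as the unit are assumptions for illustration.

```python
def replication_headroom(sync_source_oplog_window_secs, secondary_lag_secs):
    """Headroom = sync source's oplog window minus the secondary's lag."""
    return sync_source_oplog_window_secs - secondary_lag_secs
```

For example, a sync source with a 2-hour oplog window (7,200 seconds) and a secondary that is 300 seconds behind leaves 6,900 seconds of headroom. As the lag approaches the oplog window, the headroom approaches zero.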

The following alert conditions apply to database storage, as collected for a MongoDB process by the MongoDB dbStats command. For details on how Atlas handles reaching database storage limits, refer to the FAQ page. These conditions are based on the summed total of all databases on the MongoDB process:

Note

Atlas retrieves database metrics every 20 minutes by default but adjusts frequency when necessary to reduce the impact on database performance.

DB Data Size is

Raised if the approximate size of all documents (and their paddings) meets the specified threshold.

DB Storage is

Raised if the allocated storage meets the specified threshold. This alert condition can be viewed on a host's DB Storage chart, accessed through cluster monitoring.

The following alert conditions apply to the MongoDB process's WiredTiger storage engine, as collected from the MongoDB serverStatus command's wiredTiger.cache and queues.execution documents.

You can view these metrics on the following charts, accessed through cluster monitoring:

  • Cache Activity

  • Cache Usage

  • Tickets Available

The following are the alert conditions that apply to WiredTiger:

Cache: Bytes Read Into Cache is

Raised when the number of bytes read into the WiredTiger cache meets the specified threshold.

Cache: Bytes Written From Cache is

Raised when the number of bytes written from the WiredTiger cache meets the specified threshold.

Cache: Dirty Bytes is

Raised when the number of dirty bytes in the WiredTiger cache meets the specified threshold.

Cache: Used Bytes is

Raised when the number of used bytes in the WiredTiger cache meets the specified threshold.

Tickets Available: Reads is

Raised if the number of read tickets available to the WiredTiger storage engine meets the specified threshold.

Tickets Available: Writes is

Raised if the number of write tickets available to the WiredTiger storage engine meets the specified threshold.

For clusters running on MongoDB version 7.0 and later, don't use the number of tickets as a metric for overload alerts. Starting in MongoDB version 7.0, Atlas dynamically adjusts the number of tickets. Instead, use the number of queued readers and writers as an overload metric.

The following alert conditions measure usage on your Atlas server clusters:

Note

Currently, Atlas uses a single partition for data, index, and journal files. Even though the alerts reference individual partitions, they point to the same metric.

Note

All hardware metrics have burst reporting equivalents with distinct configurable alerts. To learn more, see Burst Reporting.

Disk Queue depth on Data Partition is

Raised if the average length of the queue of requests issued to the data partition that MongoDB uses exceeds the specified threshold.

Disk read IOPS on Data Partition is

Raised if the average number of disk read operations per second exceeds the specified threshold.

Disk read latency on Data Partition is

Raised if the amount of latency on disk read operations exceeds the specified threshold.

Disk space % used on Data Partition is

Raised if the percentage of disk space used on any partition that contains the MongoDB collection's data exceeds the specified threshold.

To find possible solutions for this alert, see Alert Resolutions.

Disk write IOPS on Data Partition is

Raised if the average number of disk write operations per second exceeds the specified threshold.

Disk write latency on Data Partition is

Raised if the amount of latency on disk write operations exceeds the specified threshold.

Max disk queue depth on Data Partition is

Raised if the maximum average length of the queue of requests issued to the data partition that MongoDB uses exceeds the specified threshold.

Max disk read IOPS on Data Partition is

Raised if the maximum average number of disk read operations per second exceeds the specified threshold.

Max disk read latency on Data Partition is

Raised if the maximum amount of latency on disk read operations exceeds the specified threshold.

Max disk space % used on Data Partition is

Raised if the maximum percentage of disk space used on any partition that contains the MongoDB collection's data exceeds the specified threshold.

Max disk write IOPS on Data Partition is

Raised if the maximum average number of disk write operations per second exceeds the specified threshold.

Max disk write latency on Data Partition is

Raised if the maximum amount of latency on disk write operations exceeds the specified threshold.

Max System Network In is

Raised if the maximum number of bytes sent to MongoDB meets the specified threshold.

Max System Network Out is

Raised if the maximum number of bytes sent from MongoDB meets the specified threshold.

System: CPU (Steal) % is

Applicable when the EC2 cluster credit balance is exhausted.

Raised if the percentage by which the CPU usage exceeds the guaranteed baseline CPU credit accumulation rate exceeds the specified threshold. CPU credits are units of CPU utilization that you accumulate. The credits accumulate at a constant rate to provide a guaranteed level of performance. These credits can be used for additional CPU performance. When the credit balance is exhausted, only the guaranteed baseline of CPU performance is provided, and the amount of excess is shown as steal percent.

Note

Atlas triggers this alert only for AWS EC2 clusters that support Burstable Performance. Currently, these are M10 and M20 cluster types.

System: CPU (User) % is

Raised if the CPU usage of the processes on the node, normalized by the number of CPUs, exceeds the specified threshold. This value is scaled to a range of 0-100%.

System: Max CPU (Steal) % is

Raised if the maximum percentage by which the CPU usage exceeds the guaranteed baseline CPU credit accumulation rate exceeds the specified threshold.

System: Max CPU (User) % is

Raised if the maximum CPU usage of the processes on the node, normalized by the number of CPUs, exceeds the specified threshold.

System Network In is

Raised if the average rate of physical bytes received per second by the eth0 network interface reaches the specified threshold.

System Network Out is

Raised if the average rate of physical bytes transmitted per second by the eth0 network interface reaches the specified threshold.

Restarts in Last Hour is

Raised if the number of times a host restarts within the previous hour exceeds the specified threshold.

Host is Down

Raised if Atlas is unable to reach a host for several minutes.

Important

You should only configure this alert if you depend on secondary reads. For more information on secondary reads, see Query using Pre-Defined Replica Set Tags and Read Preference.

This alert is generally triggered by one of the following conditions:

  • The cluster has experienced a failure and is being auto-healed.

  • The cluster could not be reached because of a network issue.

MongoDB Atlas checks that the downtime did not occur because of your actions, such as rolling index builds. If MongoDB Atlas confirms that the downtime was not intentional, MongoDB Atlas attempts to replace the affected node. If failures occur, Atlas clusters maintain node availability for both reads and writes as long as a majority of nodes are running. To learn more, see How does MongoDB Atlas deliver high availability?.

The following alert conditions apply to swap space usage:

Swap Usage: Free is

Raised if the amount of available swap space drops below the specified threshold.

Swap Usage: Max Free is

Raised if the maximum amount of available swap space drops below the specified threshold.

Swap Usage: Max Used is

Raised if the maximum total amount of swap space in use reaches the specified threshold.

Swap Usage: Used is

Raised if the total amount of swap space in use reaches the specified threshold.

The following alert conditions apply to sort operations:

Sort: Spill to disk during sort is

Raised if the number of writes to disk caused by $sort stages meets the specified threshold.

The following host conditions do not apply to Atlas. Atlas will not generate alerts for the following conditions:

  • Accesses Not In Memory: Total is

  • Background Flush Average is

  • B-tree: accesses is

  • B-tree: hits is

  • B-tree: misses is

  • B-tree: miss ratio is

  • Cursors: Client Cursors Size is

  • Effective Lock % is

  • Journaling Commits in Write Lock is

  • Journaling MB is

  • Journaling Write Data Files MB is

  • Memory: Mapped is

  • Page Fault Exceptions Thrown: Total is

The following alerts apply to indexes on your collections. Either alert might indicate a missing or inefficient index.

Tip

See also:

To learn more about indexing to improve performance, see Indexing Strategies.

Query Targeting: Scanned / Returned

Raised if the ratio of index keys scanned to documents returned meets or exceeds the specified threshold.

Query Targeting: Scanned Objects / Returned

Raised if the ratio of documents scanned to documents returned meets or exceeds the specified threshold.

The change streams cursors that the Atlas Search process (mongot) uses to keep Atlas Search indexes updated can contribute to the query targeting ratio and trigger query targeting alerts if the ratio is high.
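The two query targeting conditions share the same arithmetic: a scanned count divided by the number of documents returned. The sketch below shows that calculation with hypothetical function names. How it handles a query that returns no documents is an assumption, since this page does not cover that case.

```python
def targeting_ratio(scanned, returned):
    """Ratio of scanned items (index keys or documents) to documents returned."""
    if returned == 0:
        # Assumption: avoid dividing by zero by reporting the scanned count.
        return float(scanned)
    return scanned / returned


def meets_or_exceeds(scanned, returned, threshold):
    """True if the targeting ratio is at or above the configured threshold."""
    return targeting_ratio(scanned, returned) >= threshold
```

A query that scans 50,000 documents to return 5 has a ratio of 10,000, which would meet a threshold of 1,000. A ratio close to 1 suggests that queries are well served by indexes.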

The following alerts apply to Cloud Backup snapshots.

Backup restore failed

Raised when a restore fails.

Backup restore succeeded

Raised when a restore succeeds.

Fallback snapshot failed

Raised when a fallback snapshot fails.

Fallback snapshot taken

Raised when a regular backup fails, but Atlas was able to take a fallback snapshot.

Last snapshot too old

Raised when too much time has passed since the last successful snapshot.

Snapshot download request failed

Raised when a download request fails.

Snapshot schedule fell behind

Raised when a snapshot hasn't been taken over the configured period.

Snapshot taken successfully

Raised when a snapshot was taken successfully.

The following alert conditions apply to replica sets:

Number of elections in last hour is > X

Raised when the number of elections that have occurred in the last hour exceeds the user-specified value of X. The value of X is set when you create the alert. This alert might indicate that the cluster's replication is not in a healthy state, as evidenced by constant elections.
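The election-count condition can be read as a sliding one-hour window, as in the sketch below. The function names and the representation of elections as a list of timestamps are assumptions for illustration; this page does not describe how Atlas tracks elections.

```python
from datetime import datetime, timedelta


def elections_in_last_hour(election_times, now):
    """Count elections that occurred within the hour ending at `now`."""
    cutoff = now - timedelta(hours=1)
    return sum(1 for t in election_times if cutoff < t <= now)


def should_alert(election_times, now, x):
    """True if the count in the last hour is strictly greater than X."""
    return elections_in_last_hour(election_times, now) > x
```

With elections 5, 20, and 59 minutes ago, plus older ones outside the window, the count is 3. An alert configured with X set to 2 would trigger, while one with X set to 3 would not.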

Replica set elected a new primary

Raised when a replica set elects a new primary.

Replica set has no primary

Raised when a replica set does not have a primary. Specifically, when none of the members of a replica set have a status of PRIMARY, the alert triggers. For example, this condition might arise when a set has an even number of voting members resulting in a tie.

If Atlas collects data during an election, this alert might send a false positive. To prevent such false positives, set the alert configuration's after waiting interval (in the configuration's Send to section).

To find possible solutions for this alert, see Alert Resolutions.

The following alert condition applies to sharded clusters:

Cluster is missing an active mongos

Raised if Atlas cannot reach any mongos for the cluster.

The following alert conditions apply to Atlas App Services.

An overall request rate limit has been hit

Raised when the number of concurrent requests exceeds the limit. This alert indicates that an app might be making an unexpectedly high number of requests.

Auth Login Fail is

Raised if the number of failed client login requests per second meets the specified threshold.

Endpoints Compute Time is

Raised if the HTTPS endpoints compute time per second meets the specified threshold.

Endpoints Egress Bytes is

Raised if the HTTPS endpoints data egress bytes per second meets the specified threshold.

Failed Requests - Endpoints is

Raised if the number of HTTPS endpoints requests that fail per second meets the specified threshold.

Failed Requests - GraphQL is

Raised if the number of GraphQL requests that fail per second meets the specified threshold. (GraphQL support for Atlas App Services is deprecated. To learn more, see the Atlas App Services documentation.)

Failed Requests - Overall is

Raised if the number of total requests that fail per second meets the specified threshold.

Failed Requests - SDK (Functions) is

Raised if the number of SDK Function requests that fail per second meets the specified threshold.

Failed Requests - Sync is

Raised if the number of failed Atlas Device Sync requests per second meets the specified threshold.

Failed Requests - Triggers is

Raised if the number of Triggers requests that fail per second meets the specified threshold.

GraphQL Compute Time is

Raised if the GraphQL compute time per second meets the specified threshold. (GraphQL support for Atlas App Services is deprecated. To learn more, see the Atlas App Services documentation.)

GraphQL Egress Bytes is

Raised if the GraphQL data egress bytes per second meets the specified threshold. (GraphQL support for Atlas App Services is deprecated. To learn more, see the Atlas App Services documentation.)

GraphQL Request Duration P95 is

Raised if the 95th percentile of duration in milliseconds for GraphQL requests meets the specified threshold. (GraphQL support for Atlas App Services is deprecated. To learn more, see the Atlas App Services documentation.)

HTTP Endpoint Request Duration P95 is

Raised if the 95th percentile of duration in milliseconds for HTTPS endpoint requests meets the specified threshold.

MQL Request Duration P95 is

Raised if the 95th percentile of duration in milliseconds for MQL requests meets the specified threshold.

Overall Compute Time is

Raised if the overall compute time per second meets the specified threshold.

Overall Egress Bytes is

Raised if the overall data egress bytes per second meets the specified threshold.

SDK Functions Compute Time is

Raised if the SDK Functions compute time per second meets the specified threshold.

SDK Functions Egress Bytes is

Raised if the SDK Functions data egress bytes per second meets the specified threshold.

SDK Functions Request Duration P95 is

Raised if the 95th percentile of duration in milliseconds for SDK function requests meets the specified threshold.

SDK MQL Compute Time is

Raised if the SDK MQL compute time per second meets the specified threshold.

SDK MQL Egress Bytes is

Raised if the SDK MQL data egress bytes per second meets the specified threshold.

Session Ended - Sync is

Raised if the number of sessions ended per second during Atlas Device Sync meets the specified threshold.

Sync Client Bootstrap Time is

Raised if the 95th percentile of the bootstrap time for the Atlas Device Sync client meets the specified threshold.

Sync Client Uploads that failed is

Raised if the number of uploads that failed per second on the Atlas Device Sync client meets the specified threshold.

Sync Client Uploads that are invalid

Raised if the number of invalid uploads per second on the Atlas Device Sync client meets the specified threshold.

Sync Current Oplog Lag Sum is

Raised if the approximate amount of time that Atlas Device Sync is behind the MongoDB oplog meets the specified threshold.

Sync Egress Bytes is

Raised if the Atlas Device Sync data egress bytes per second meets the specified threshold.

Sync Num Unsyncable Docs % is

Raised if the number of App Services unsyncable documents meets the specified threshold.

Triggers Compute Time is

Raised if the triggers compute time per second meets the specified threshold.

Triggers Current Oplog Lag Sum is

Raised if the approximate amount of time that App Services triggers are behind the MongoDB oplog meets the specified threshold.

Triggers Egress Bytes is

Raised if the triggers data egress bytes per second meets the specified threshold.

Triggers Request Duration P95 is

Raised if the 95th percentile of duration in milliseconds for triggers meets the specified threshold.
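The P95 conditions above compare the 95th percentile of request durations against your threshold. As a rough illustration only, the sketch below computes a nearest-rank 95th percentile over a list of durations in milliseconds. The function names are hypothetical, the nearest-rank method is an assumption (this page does not state how Atlas computes the percentile), and "meets the threshold" is treated here as greater than or equal to.

```python
def p95(durations_ms):
    """Nearest-rank 95th percentile of a non-empty list of durations."""
    if not durations_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(durations_ms)
    # 1-based nearest rank: ceil(0.95 * n), done in integer arithmetic
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_threshold(durations_ms, threshold_ms):
    """Assumption: the alert condition is 'P95 >= threshold'."""
    return p95(durations_ms) >= threshold_ms
```

With this definition, a handful of very slow requests does not move the P95 unless they make up more than about 5% of the samples.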

The following alert conditions apply to Serverless instances:

Serverless metric outside threshold

Raised if any of the following conditions apply:

  • The number of open connections to the host exceeds 80% of the total open connections allowed.

  • The approximate size of all documents (and their paddings) and the index exceeds 0.75 terabytes.

  • The read processing units (RPUs) per second exceed 250K for 30 minutes or more. This condition realerts every 12 hours.

  • The read processing units (RPUs) per second exceed 1 million for 5 minutes or more. This condition realerts every 2 hours.
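The RPU conditions in the list above depend on a value staying over a limit for a minimum duration, not on a single spike. The following minimal sketch shows one way to express that check, along with the 80% connection check. The function names and the sample format are hypothetical, and this is not how Atlas evaluates the metric internally.

```python
from datetime import datetime, timedelta


def sustained_above(samples, threshold, min_duration):
    """Return True if values stay above `threshold` for at least `min_duration`.

    `samples` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    run_start = None  # timestamp where the current above-threshold run began
    for ts, value in samples:
        if value > threshold:
            if run_start is None:
                run_start = ts
            if ts - run_start >= min_duration:
                return True
        else:
            run_start = None  # any dip at or below the threshold resets the run
    return False


def connections_exceeded(open_connections, max_connections):
    """True if open connections exceed 80% of the allowed maximum."""
    return open_connections * 100 > max_connections * 80
```

For example, `sustained_above(samples, 250_000, timedelta(minutes=30))` models the first RPU condition and `sustained_above(samples, 1_000_000, timedelta(minutes=5))` models the second.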

Total Read Units is

Raised if the total read processing units (RPUs) per second exceeds the specified threshold.

Total Write Units is

Raised if the total write processing units (WPUs) per second exceeds the specified threshold.

The following alert conditions apply to Atlas users.

Organization users do not have multi-factor authentication enabled

Raised when one or more users in an organization do not have multi-factor authentication enabled.

User had their role changed

Raised when an Atlas user's project or organization roles have changed.

User joined the organization

Raised when a new user joins the Atlas organization.

User joined the project

Raised when a new user joins the Atlas project.

User left the organization

Raised when a user leaves the Atlas organization.

User left the project

Raised when a user leaves the Atlas project.

The following alert conditions apply to your Atlas project.

Users awaiting approval to join project

Raised if there are users who have asked to join the project. A user can ask to join a project when first registering for Atlas.

Users do not have multi-factor authentication enabled

Raised if the project or organization has users who have not set up multi-factor authentication.

The following alert conditions apply to Atlas billing. You can configure billing alerts from the Atlas UI at the organization level or the project level.

To configure organization-level alerts:

1
  1. If it's not already displayed, select your desired organization from the Organizations menu in the navigation bar.

  2. Click the Organization Settings icon next to the Organizations menu.

    The Organization Settings page displays.

2

Click Alerts in the sidebar.

The Organization Alerts page displays.


To configure project-level alerts:

1
  1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

  2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

  3. Do one of the following steps:

    • Click the Project Alerts icon in the navigation bar.

    • Next to the Projects menu, expand the Options menu, click Project Settings, and click Alerts in the sidebar.

    The Project Alerts page displays.


Note

All amounts billed are in USD.

Amount billed ($) yesterday is above the threshold

Raised if the organization or project's last daily amount billed exceeds your configured threshold. Atlas does not account for any credits applied for the previous day when calculating the billed amount.

This condition applies to both organizations and projects.

Credit card is about to expire

Raised if the credit card on file is about to expire. The alert is triggered at the beginning of the month that the card expires. Atlas enables this alert when a credit card is added for the first time.

This condition applies to both organizations and projects.

Current bill ($) for any single project is above the threshold

Raised if the monthly total for any project within the organization exceeds your configured threshold for all projects. When the current pending invoice closes, this alert resets.

This alert condition applies to organizations only.
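To make the per-project condition concrete, here is a small sketch that applies one threshold to every project's month-to-date total. The function names and the dictionary of totals are hypothetical, and amounts are assumed to be in USD.

```python
def projects_over_threshold(project_totals, threshold_usd):
    """Return the names of projects whose monthly total exceeds the threshold."""
    return sorted(
        name for name, total in project_totals.items() if total > threshold_usd
    )


def single_project_alert_raised(project_totals, threshold_usd):
    """The condition holds as soon as any one project is over the threshold."""
    return bool(projects_over_threshold(project_totals, threshold_usd))
```

Because the same threshold applies to all projects, one large project can raise the alert even when the other projects are well under the limit.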

Current bill ($) for the organization is above the threshold

Raised if the monthly total for the organization exceeds your configured threshold. When the current pending invoice closes, this alert resets.

This alert condition applies to organizations only.

The following alert conditions apply to Atlas service accounts. You can configure these alerts from the Atlas UI at the organization level.

Service Account Secrets are about to expire

Raised if a secret for any of your service accounts expires within seven days, or within the number of days you specify when you configure this alert. When all expiring secrets are removed or have expired, this alert resets.

This alert condition applies to organizations only.
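The expiry window described above can be sketched as a simple date comparison. In this illustration, `expiring_secrets` and the mapping of secret names to expiry times are hypothetical and not part of any Atlas API. The default window of seven days comes from the condition description.

```python
from datetime import datetime, timedelta, timezone


def expiring_secrets(secrets, now, window_days=7):
    """Return names of secrets that expire within the window but have not expired yet.

    `secrets` maps a secret name to its expiry datetime.
    """
    window = timedelta(days=window_days)
    return sorted(
        name for name, expires_at in secrets.items()
        if now < expires_at <= now + window
    )
```

An empty result corresponds to the reset case: every expiring secret has either been removed or has already expired.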

Service Account Secrets have expired

Raised if a secret for any of your service accounts has expired. To generate a new secret, see Update Programmatic Access to an Organization. When all expired secrets are removed, this alert resets.

This alert condition applies to organizations only.

Organization's IdP certificate is about to expire

Raised when an IdP certificate associated with an organization for which you have the Organization Owner role expires within 14 days. Atlas sends this alert daily until you acknowledge it.

Note

Atlas creates this alert automatically when you map an organization to an IdP. If you remove the mapping, Atlas deletes all instances of this alert.

The following alert conditions apply to projects using Encryption at Rest using Customer Key Management.

AWS encryption key elapsed time since last rotation is above (n) days

Raised if the AWS Customer Master Key (CMK) used by the Atlas project has been active for more than the configured number of days (90 by default).

To modify the alert threshold:

  1. In Atlas, go to the Project Alerts page.

    1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

    2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

    3. Do one of the following steps:

      • Click the Project Alerts icon in the navigation bar.

      • Next to the Projects menu, expand the Options menu, click Project Settings, and click Alerts in the sidebar.

      The Project Alerts page displays.

  2. Click Alert Settings.

If you configure the alert threshold (90 days by default) to be greater than the AWS KMS CMK automatic rotation period, Atlas won't create the alert because AWS automatically rotates your CMK before the key reaches the threshold.

This alert resets automatically if you rotate the project CMK. For documentation on how to rotate your project CMK, see Rotate your AWS Customer Master Key.
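The key-age check behind this condition amounts to comparing the days since the last rotation with the configured threshold. The sketch below is illustrative only. The function names and the `auto_rotation_days` parameter are hypothetical, and the actual automatic rotation period depends on your AWS KMS configuration.

```python
from datetime import date


def rotation_alert_due(last_rotated, today, threshold_days=90):
    """True if the key has been active for more than `threshold_days` days."""
    return (today - last_rotated).days > threshold_days


def alert_can_fire(threshold_days, auto_rotation_days):
    """If automatic rotation happens at or before the threshold, the key
    never gets old enough for the alert to be raised."""
    return auto_rotation_days > threshold_days
```

Rotating the key moves `last_rotated` forward, so `rotation_alert_due` returns `False` again, which mirrors the automatic reset.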

Azure encryption key elapsed time since last rotation is above (n) days

Raised if the Azure Key Vault Key Identifier used by the Atlas project has been active for more than the configured number of days (90 by default).

To modify the alert threshold:

  1. In Atlas, go to the Project Alerts page.

    1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

    2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

    3. Do one of the following steps:

      • Click the Project Alerts icon in the navigation bar.

      • Next to the Projects menu, expand the Options menu, click Project Settings, and click Alerts in the sidebar.

      The Project Alerts page displays.

  2. Click Alert Settings.

This alert resets automatically if you rotate the project Key Identifier. For documentation on how to rotate your project Key Identifier, see About Rotating Your Azure Key Identifier.

GCP encryption key elapsed time since last rotation is above (n) days

Raised if the GCP Key Version Resource ID used by the Atlas project has been active for more than the configured number of days (90 by default).

To modify the alert threshold:

  1. In Atlas, go to the Project Alerts page.

    1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

    2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

    3. Do one of the following steps:

      • Click the Project Alerts icon in the navigation bar.

      • Next to the Projects menu, expand the Options menu, click Project Settings, and click Alerts in the sidebar.

      The Project Alerts page displays.

  2. Click Alert Settings.

This alert resets automatically if you rotate the project Key Version Resource ID.

To learn how to rotate your project Key Version Resource ID, see Rotate your GCP Key Version Resource ID.

Encryption at Rest KMS network access denied

Raised if the KMS credentials for your cloud provider are invalid due to network access restrictions.

To modify or remove the alert:

  1. In Atlas, go to the Project Alerts page.

    1. If it's not already displayed, select the organization that contains your desired project from the Organizations menu in the navigation bar.

    2. If it's not already displayed, select your desired project from the Projects menu in the navigation bar.

    3. Do one of the following steps:

      • Click the Project Alerts icon in the navigation bar.

      • Next to the Projects menu, expand the Options menu, click Project Settings, and click Alerts in the sidebar.

      The Project Alerts page displays.

  2. Click Alert Settings.

This alert is enabled by default for all new projects.

The following alert conditions apply to projects with configured maintenance windows.

Note

You can only configure maintenance window alerts if a project has an active maintenance window.

Maintenance is scheduled

Raised 72 hours prior to scheduled maintenance for a project.

Maintenance no longer needed

Raised if scheduled maintenance is no longer needed for a project.

Maintenance started

Raised when maintenance starts for a project.

Maintenance has been auto-deferred

Raised if maintenance has been deferred.

Granted additional access to MongoDB support

Raised when MongoDB support staff has infrastructural access. You can view the access grant type and the expiration date of the granted event.

Revoked additional access from MongoDB support

Raised when MongoDB support staff no longer has infrastructural access. You can view the access grant type.

The following alert conditions apply to projects running Stream Processing Instances.

Stream Processor State is failed

Raised if a target stream processor exits with a failed state.
