Manage System Alerts
System alerts are internal health checks that monitor Ops Manager itself, including its backing databases, Backup Daemons, and backed-up deployments. Ops Manager runs these health checks every five minutes.
To view the list of system alerts:
Click the Admin link at the top of the Ops Manager UI.
Click the Alerts tab.
Click the Open Alerts link under System Alerts.
Disabled system alerts are grayed out.
If you have the Global Owner or Global Monitoring Admin role, you can modify notification settings or disable a system alert.
System Alert Components
Each system alert consists of three components:
Component | Examples |
---|---|
A condition that triggers the alert | |
A list of recipients of the alert | |
A method by which the alert is sent | |
When the alert is enabled and its trigger condition is met, Ops Manager sends an alert to the specified recipients using the specified medium for that alert. For a list of the notification options, see the Select the alert recipients and delivery methods step in the Modify Notification Settings for a System Alert procedure on this page.
By default, Ops Manager enables all alerts and sends the alerts to the email address specified in the Admin Email Address field in the Ops Manager configuration options.
Available System Alerts
Ops Manager provides the following system alerts:
Alert Processing
Backup
Alert Type | Alert Message | Description |
---|---|---|
OPLOG_TTL_RESIZE | Sent when the Backup Daemon has fallen so far behind in applying oplog entries that Ops Manager has extended the period of time it stores the oplog entries. By default, Ops Manager stores oplog entries in the Oplog Store for 24 hours. If the Daemon has not yet applied an oplog entry an hour before its expiration, Ops Manager extends the storage period for another three hours. Ops Manager can continue to extend the storage period up to 14 days. If you receive this alert:
| |
THEFT_FAILED | Sent when a backup job migration to a new Backup Daemon fails. The backup job continues to run on the original Backup Daemon. For more information on moving jobs, see Jobs. |
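The retention arithmetic described for OPLOG_TTL_RESIZE can be sketched as follows. This is only an illustration of the numbers stated above (24-hour default, one-hour margin, three-hour extension, 14-day cap), not Ops Manager's actual implementation. The function name `next_ttl` and its parameters are hypothetical.

```python
from datetime import timedelta

# Values taken from the OPLOG_TTL_RESIZE description above.
DEFAULT_TTL = timedelta(hours=24)   # default retention in the Oplog Store
CHECK_MARGIN = timedelta(hours=1)   # extension is considered one hour before expiry
EXTENSION = timedelta(hours=3)      # each extension adds three hours
MAX_TTL = timedelta(days=14)        # retention is never extended past this


def next_ttl(current_ttl: timedelta, oldest_unapplied_age: timedelta) -> timedelta:
    """Return the retention period after one check (hypothetical helper).

    oldest_unapplied_age is how long the oldest oplog entry that the
    Backup Daemon has not yet applied has been stored.
    """
    # An unapplied entry within one hour of expiring triggers an extension.
    if oldest_unapplied_age >= current_ttl - CHECK_MARGIN:
        return min(current_ttl + EXTENSION, MAX_TTL)
    return current_ttl
```

Under these assumptions, an unapplied entry that is 23 hours old with the default retention yields a new retention period of 27 hours.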
Backup Daemon
Alert Type | Alert Message | Description |
---|---|---|
DAEMON_DOWN | Sent when the Backup Daemon has not pinged Ops Manager for more than 15 minutes. | |
DAEMON_UP | ||
LOW_HEAD_FREE_SPACE | Sent when the disk partition on which the local copy of a backed-up replica set is stored has less than 1 GB of free space remaining. Follow the Modify Notification Settings for a System Alert procedure to change this space limit. | |
LOW_HEAD_FREE_SPACE_PERCENT | Sent when the disk partition on which the local copy of a backed-up replica set is stored has less than 10 percent of free space remaining. Follow the Modify Notification Settings for a System Alert procedure to change this percentage. | |
SUFFICIENT_HEAD_FREE_SPACE |
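To check ahead of time whether a partition would cross the two default free-space limits above, a minimal sketch such as the following may help. It is not how Ops Manager evaluates the alerts. The helper names are hypothetical, and it assumes that 1 GB means 1024³ bytes, which the text above does not specify.

```python
import shutil

GIB = 1024 ** 3  # assumption: "1 GB" is treated as 1024**3 bytes


def head_space_alerts(total_bytes, free_bytes,
                      min_free_bytes=1 * GIB, min_free_percent=10.0):
    """Return the alert types that the given disk figures would match."""
    alerts = []
    if free_bytes < min_free_bytes:
        alerts.append("LOW_HEAD_FREE_SPACE")
    if total_bytes > 0 and 100.0 * free_bytes / total_bytes < min_free_percent:
        alerts.append("LOW_HEAD_FREE_SPACE_PERCENT")
    return alerts


def check_path(path):
    """Evaluate the partition that holds `path` using the default limits."""
    usage = shutil.disk_usage(path)
    return head_space_alerts(usage.total, usage.free)
```

For example, `check_path("/data/head")` (a hypothetical directory) returns an empty list when the partition has enough free space under both limits.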
Blockstore
Alert Type | Alert Message | Description |
---|---|---|
BALANCER_OFF | ||
BALANCER_ON | Sent when a sharded blockstore is running the sharded cluster balancer. You should disable the balancer on a sharded blockstore. To disable the balancer, see Disable the Balancer. | |
INSIDE_SPACE_USED_THRESHOLD | ||
OUTSIDE_SPACE_USED_THRESHOLD | Sent when the disk space the blockstore uses exceeds the configured threshold setting. The default threshold is 85% of the total disk capacity on which the blockstore is stored. You can change the mms.alerts.OutsideSpaceUsedThreshold.maximumSpaceUsedPercent value in the Ops Manager configuration. |
Cron Job
Cron Job Status
Database Process
Alert Type | Alert Message | Description |
---|---|---|
BACKING_DATABASE_PROCESS_DOWN | Sent when Ops Manager cannot connect to a backing database and run the ping command. | |
BACKING_DATABASE_PROCESS_NO_STARTUP_WARNINGS | ||
BACKING_DATABASE_PROCESS_STARTUP_WARNINGS | ||
BACKING_DATABASE_PROCESS_UP | ||
BACKINGDB_DEFAULTRW_CONCERN_VERIFICATION_FAILED | Sent when the read concern for the backing databases is not "local" and the write concern is not w: "majority". |
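To inspect the default read and write concern of a backing database yourself, one option is to compare the output of the MongoDB `getDefaultRWConcern` command with the expected values. The sketch below assumes the result document contains `defaultReadConcern.level` and `defaultWriteConcern.w`. The helper `rw_concern_mismatches` is hypothetical, and it treats a missing field as a mismatch, which may be stricter than the check Ops Manager performs.

```python
EXPECTED_READ_LEVEL = "local"
EXPECTED_WRITE_W = "majority"


def rw_concern_mismatches(defaults: dict) -> list:
    """Return descriptions of settings that differ from the expected defaults.

    `defaults` is assumed to be the document returned by the
    getDefaultRWConcern admin command.
    """
    read_level = (defaults.get("defaultReadConcern") or {}).get("level")
    write_w = (defaults.get("defaultWriteConcern") or {}).get("w")

    mismatches = []
    if read_level != EXPECTED_READ_LEVEL:
        mismatches.append(f"read concern level is {read_level!r}")
    if write_w != EXPECTED_WRITE_W:
        mismatches.append(f"write concern w is {write_w!r}")
    return mismatches


# Possible usage with PyMongo (not executed here):
#   defaults = client.admin.command("getDefaultRWConcern")
#   print(rw_concern_mismatches(defaults))
```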
Log
Modify Notification Settings for a System Alert
Select the alert recipients and delivery methods.
In the Send to section, configure notifications. To add notifications or recipients, click Add and select from the options listed below. To test a notification, click the test link that appears after you configure the notification and ensure that the service you are testing receives the message.
The notification methods that you can configure depend on the scope of the alert:
- Project alerts: Apply to one or more individual Organizations and Projects only.
- Global alerts: Apply to all Organizations and Projects.
- System alerts: Apply to the health of Ops Manager and its backing databases.
The alert notification methods are as follows:
Notification Method | Project | Global | System | Description |
---|---|---|---|---|
Ops Manager Project | | | | Sends the alert by email or text message to users with specific roles in the Project. |
Ops Manager Organization | | | | Sends the alert by email or text message to users with specific roles in the Organization. |
Ops Manager User | | | | Sends the alert to an Ops Manager user, either by email or text message. |
Ops Manager Team | | | | Sends the alert to an Ops Manager team, either by email or text message. |
SNMP Host | | | | Specify the hostname that will receive the v2c trap on standard port 162. The MIB file for SNMP is available for download. |
Email | | | | Sends the alert to a specified email address. |
HipChat | | | | Sends the alert to a HipChat room message stream. Enter the HipChat room name and API token. |
Slack | | | | Sends the alert to a Slack channel in the authorized Slack workspace for the Organization. To learn more about Bot users in Slack, see the Slack documentation. |
PagerDuty | | | | Sends the alert to a PagerDuty account. Enter only the PagerDuty integration key. Define escalation rules and alert assignments directly in PagerDuty. Acknowledge PagerDuty alerts from the PagerDuty dashboard. All new PagerDuty keys use the Events API v2. If you have an Events API v1 key, you can continue to use that key with Ops Manager. |
Webhook | | | | Sends an HTTP POST request to an endpoint for programmatic processing. The request body contains a JSON document that uses the same format as the Ops Manager API Alerts resource. To configure this option, configure the Webhook settings on the Project Settings page. To use this method at the Global level:
Ops Manager adds a request header called
If you specify a key in the Webhook Secret field,
MongoDB Ops Manager adds the |
Datadog | | | | Sends the alert to a Datadog account as a Datadog event. When the alert first opens, Ops Manager sends the alert as an "error" event. Subsequent updates are sent as "info" events. When the alert closes, Ops Manager sends a "success" event. If prompted, enter your Datadog API key under API Key and click Validate Datadog API Key. Find your Datadog API key in your Datadog account. |
Administrators | | | | Sends the alert to the email address specified in the Admin Email Address field in the Ops Manager configuration options. |
Global Alerts Summary Email | | | | Sends a summary email of all global alerts to the specified email address. |
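A webhook endpoint typically verifies the request before processing the JSON body. The sketch below shows one way a receiver might do this when a Webhook Secret is configured. The table above does not state the header name or the signing scheme, so the use of a base64-encoded HMAC-SHA1 digest of the request body is an assumption here. The function names and the `eventTypeName` field in the usage example are illustrative only.

```python
import base64
import hashlib
import hmac
import json


def sign_body(secret: str, body: bytes) -> str:
    """Compute a base64-encoded HMAC-SHA1 of the raw request body.

    Assumption: the sender signs the body this way; confirm the actual
    scheme in the Ops Manager documentation before relying on it.
    """
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_and_parse(secret: str, body: bytes, received_signature: str) -> dict:
    """Reject the request if the signature does not match, else parse the JSON."""
    expected = sign_body(secret, body)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(expected, received_signature):
        raise ValueError("webhook signature mismatch")
    return json.loads(body)


# Usage with made-up values:
#   body = b'{"eventTypeName": "DAEMON_DOWN"}'
#   alert = verify_and_parse("my-secret", body, signature_from_request_header)
```

Verifying against the raw bytes, before any JSON parsing, keeps the comparison independent of key order or whitespace in the body.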