Background
A common industry practice for synthetic monitoring is to trigger an alert when a monitor fails multiple times in a row across all locations. New Relic Synthetics does not currently provide an out-of-the-box alerting option for this kind of monitor failure alert.
Multi-location Failure Alerting
Synthetics does have a built-in alert for multi-location failures. It triggers when a monitor, or set of monitors, fails from a specified number of locations in a row. More details are available in the New Relic documentation.
Concurrent Monitor Failure Alerting
Although no out-of-the-box alert triggers when a monitor fails x out of y times, we can accomplish this with NRQL alerting.
NRQL Alerting Option
Follow these steps to set up a NRQL alert condition that supports concurrent monitor failure alerting:
Create a NRQL alert condition
Follow the steps in our docs to create a NRQL alert condition in the same account where the Synthetics monitor is running.
NRQL alert configuration
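Configure the NRQL alert to use the following NRQL, replacing [Monitor_name] with the name of your monitor (keep the single quotes, since NRQL string values must be quoted):
SELECT count(*) FROM SyntheticCheck WHERE monitorName = '[Monitor_name]' AND result = 'FAILED'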
Note
- To evaluate failures separately for each location, add 'FACET location' to the end of the NRQL.
- To track monitor failures across all locations, keep using the NRQL above (no facet).
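The two query variants above can be generated with a small helper. This is only a sketch: the function name `build_failure_query` and the monitor name "Checkout flow" are made up for illustration, and the helper simply rejects names containing a single quote rather than attempting to escape them.

```python
def build_failure_query(monitor_name: str, per_location: bool = False) -> str:
    """Return the NRQL for the alert condition of a single monitor."""
    if "'" in monitor_name:
        # Simplification for this sketch: quote escaping is not handled.
        raise ValueError("monitor name must not contain a single quote")

    query = (
        "SELECT count(*) FROM SyntheticCheck "
        f"WHERE monitorName = '{monitor_name}' AND result = 'FAILED'"
    )
    if per_location:
        # One signal per location instead of one signal for all locations.
        query += " FACET location"
    return query


if __name__ == "__main__":
    # Hypothetical monitor name, used only for illustration.
    print(build_failure_query("Checkout flow"))
    print(build_failure_query("Checkout flow", per_location=True))
```

With the facet, each location is evaluated as its own signal, so the threshold applies per location rather than to the combined failure count.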
Threshold configurations
Set the “when the” drop-down to “sum of query results is” and “above”. The condition is then evaluated against the total number of failed checks in the time window.
Critical values
Enter the number of check failures that must occur, and the time window in which they must occur, before the condition is triggered.
Important considerations
Consider how often the monitor runs and the number of locations it runs from. If a monitor runs every minute from 3 locations, this equates to 3 checks per minute. Adjusting the number of failures and the time window changes the threshold chart. We recommend adjusting these values until the chart shows clear spikes when the monitor fails.
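The arithmetic behind this can be sketched as follows. The function names are hypothetical, and the sketch assumes every location runs the check exactly once per period. Because the condition uses “above” (a strict comparison), the value to enter is one less than the number of failures that should trigger the alert.

```python
import math


def expected_checks(locations: int, period_minutes: float, window_minutes: float) -> int:
    """Checks expected in one time window, assuming one run per location per period."""
    runs_per_location = math.floor(window_minutes / period_minutes)
    return locations * runs_per_location


def above_threshold_for(min_failures: int) -> int:
    """Value to enter so that an 'above' condition fires at min_failures failures."""
    if min_failures < 1:
        raise ValueError("min_failures must be at least 1")
    return min_failures - 1


if __name__ == "__main__":
    print(expected_checks(locations=3, period_minutes=1, window_minutes=1))  # 3
    print(expected_checks(locations=3, period_minutes=1, window_minutes=5))  # 15
    # To alert only when every check in a 5-minute window fails:
    print(above_threshold_for(15))  # 14
```

A window shorter than the monitor period yields zero expected checks in this model, which is a hint that the window needs to be widened.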
Advanced signal settings
Select “custom static value” and set it to 0. This ensures that the alert auto-resolves once failures no longer occur.
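To see why filling gaps with 0 lets the alert resolve, here is a deliberately simplified model of a “sum of query results above threshold” evaluation. It is not New Relic's actual evaluation engine: the function `evaluate`, the per-minute input format, and the use of `None` to represent a minute with no data are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence


def evaluate(
    failures_per_minute: Sequence[Optional[int]],
    window: int,
    threshold: int,
    fill_value: int = 0,
) -> List[bool]:
    """Return, for each minute, whether the windowed sum is above the threshold.

    None marks a minute with no data; it is replaced by fill_value,
    mimicking a static gap-fill setting.
    """
    filled = [fill_value if v is None else v for v in failures_per_minute]
    states = []
    for i in range(len(filled)):
        start = max(0, i - window + 1)
        # Strict comparison, matching an "above" condition.
        states.append(sum(filled[start : i + 1]) > threshold)
    return states


if __name__ == "__main__":
    # Two minutes of failures from 3 locations, then no failure data at all.
    series = [0, 3, 3, None, None, None]
    print(evaluate(series, window=2, threshold=4))
    # [False, False, True, False, False, False]
```

In this model the violation opens at the third minute and clears as soon as the filled zeros pull the windowed sum back under the threshold.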