In the past, the Synthetics alerting mechanism was limited: you would get an alert as soon as your monitor failed from a single location. When there were network disruptions between a location and your application, this could cause “flapping”, where the alert was raised and cleared multiple times. Now, with NRQL Alerts, you can set things up so that an alert is only sent after “X” failures within “Y” minutes.
Requirement: NRQL Alerts
For this to work, you need access to NRQL Alerts, which is available with your Alerts subscription. To get started, go into your Alerts Policy and add a new condition of type NRQL.
Since we are going to look at Synthetics failures, you want a query like this:
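As a rough sketch, a query along these lines counts the failed checks for one monitor. The monitor name `My Monitor` is a placeholder I am assuming for illustration, so substitute your own. The query also assumes the standard `SyntheticCheck` attributes `monitorName` and `result`.

```nrql
SELECT count(*) FROM SyntheticCheck WHERE monitorName = 'My Monitor' AND result = 'FAILED'
```

Each failed check adds 1 to the count, which is what the threshold settings further down add up.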
Once the query is in, you should see a graph appear that shows whether you have had any recent failures. My monitor failed recently, so I can see the blip on the graph just after 1:30 PM.
Here is where we supply the “X” failures and “Y” minutes. This is what you need to set:
- Change the first dropdown to “sum of query results”, so that Alerts keeps adding 1 to the total every time you get a failure.
- Change the threshold dropdown to “above”, since we normally expect to have 0 failures.
- Enter your value of “X” failures (3 in my example) and “Y” minutes (15 minutes in my example).
Finally, give your condition a name. This name will be used in the Alert Notifications.
In my testing, the alert notification looked like this. Note that “Multi Location failed 3 times in 15 minutes” is the name I gave my condition.
When you click on View Incident Details, you see a screen like this that shows the SUM of the number of failures:
If you click the button that says “Go to SyntheticCheck Overview”, it takes you to a query. Note that this query does not show the SUM; it shows the TIMESERIES of the individual errors.
If you want the list of the exact ERROR messages, you could change your query:
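One possible form is sketched below. It uses the same placeholder monitor name `My Monitor` and assumes that the `error` and `locationLabel` attributes on `SyntheticCheck` hold the failure message and the location. The one-hour window is just an example.

```nrql
SELECT timestamp, locationLabel, error FROM SyntheticCheck WHERE monitorName = 'My Monitor' AND result = 'FAILED' SINCE 1 hour ago
```

This lists each failed check as its own row, so you can see which location failed and why.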
Or you could navigate to a dashboard that has details on the recent failures (I added a filter for my specific monitor that is going haywire).
Some things to consider:
- Consider how frequently your monitor runs and the number of locations compared with the “Y” minutes. If your monitor runs every 5 minutes from just 1 location, you would get a maximum of 1 failure per 5 minutes, or about 3 in a 15-minute window. Do the math quickly when setting this up, and test, test, test!
- Also remember to change the dropdown to “sum of query results”.
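If you would rather check the math from the first point against real data, one option is to count how many checks actually ran in your “Y” window, split by location. This is only a sketch: `My Monitor` is a placeholder name, and the 15 minutes should match your own “Y”.

```nrql
SELECT count(*) FROM SyntheticCheck WHERE monitorName = 'My Monitor' FACET locationLabel SINCE 15 minutes ago
```

The total across all locations is the most failures you could possibly see in that window, so your “X” needs to be comfortably below it.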
NRQL Alerts are brand new, and we know our customers are brilliant at using our products in interesting ways. What have you discovered you can do with NRQL Alerts and Synthetics?