When it’s time for a Synthetics monitor to run, the job is sent to the minion location and added to a queue. If that queue is long (I’ve seen waits of more than 60 seconds), the job can end up running against a site even though the monitor has since been disabled. This feature idea is to implement one of the following options:
Have the monitor job check back on the monitor status to see whether it’s still okay to run before it performs its check.
If that is too costly, let the monitor job run as it does today, but if it hits a failure, check whether the monitor is still active before raising the alert.
This would help prevent false alerts about a site being down. As it is today, we call the API to disable a site monitor, wait 1 minute, and then turn off the site. Occasionally a job for that monitor is still sitting in the minion location queue and eventually runs against the site we’ve turned off. Even though we received an ACK from the NR API that the monitor is disabled, we end up with a monitor alert sent to PagerDuty, waking up innocent teammates.
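To make the two options concrete, here is a minimal sketch of how I picture them. It is only an in-memory illustration: `Monitor`, `QueuedJob`, `run_with_precheck`, and `run_with_failure_check` are names I made up for this post, not anything from the New Relic API or the minion code, and I'm assuming the job can cheaply re-read the monitor's current enabled/disabled status.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Monitor:
    """Hypothetical stand-in for a Synthetics monitor record."""
    name: str
    enabled: bool = True


@dataclass
class QueuedJob:
    """A check that was queued at a minion location while the monitor was enabled."""
    monitor: Monitor
    check: Callable[[], bool]  # returns True when the site responds


def run_with_precheck(job: QueuedJob, alerts: List[str]) -> str:
    """Option 1: re-read the monitor status before running the check."""
    if not job.monitor.enabled:
        return "skipped"  # monitor was disabled while the job sat in the queue
    if job.check():
        return "passed"
    alerts.append(job.monitor.name)
    return "alerted"


def run_with_failure_check(job: QueuedJob, alerts: List[str]) -> str:
    """Option 2: run as today, re-read the status only when the check fails."""
    if job.check():
        return "passed"
    if not job.monitor.enabled:
        return "suppressed"  # failure on a disabled monitor: no alert
    alerts.append(job.monitor.name)
    return "alerted"


if __name__ == "__main__":
    monitor = Monitor("site-a")
    job = QueuedJob(monitor, check=lambda: False)  # the site is already off
    monitor.enabled = False  # disabled while the job was still queued
    alerts: List[str] = []
    print(run_with_precheck(job, alerts))
    print(run_with_failure_check(job, alerts))
    print(alerts)
```

In both variants a job that outlives its monitor's enabled state never reaches the alerting step. Option 2 only pays for the extra status lookup on failures, which is why I'd expect it to be the cheaper of the two.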
Other links to this issue:
My initial forum post: Disabled Synthetic site monitor still throws an alert
My support case Request: #312228
New Relic Edit
We take feature ideas seriously and our product managers review every one when plotting their roadmaps. However, there is no guarantee this feature will be implemented. This post ensures the idea is put on the table and discussed though. So please vote and share your extra details with our team.