We have been trying the new Alerts feature in combination with a Synthetics Ping test of our login page. We integrated this with PagerDuty, and it would be great except for one really annoying problem:
From time to time, for whatever reason, a ping to our app login page might fail from one of 5 locations. This does not at all indicate that the page is down. It could be that the AWS location sending the ping has trouble reaching our site, or that the single request was dropped on our end.
Again, these are not reasons to wake up the on-call tech at 3am (which is exactly what PagerDuty is for). However, there is currently no way to specify how many of the checks need to fail before an alert is sent. For example, we may want ALL of the locations to fail twice before sending an alert. That would be 30 minutes of downtime and not ideal, but since we combine New Relic with MANY other monitoring and alerting tools, we need a way to set the “volume” of these alerts.
A basic default for us would be to ONLY send an alert if ALL locations fail in the same testing interval.
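To make the requested rule concrete, here is a minimal sketch in Python of the logic described above. It is not New Relic's API or configuration format. The function name `should_alert`, the `consecutive` parameter, and the location names are all hypothetical and used only for illustration.

```python
from typing import Dict, List


def should_alert(history: List[Dict[str, bool]], consecutive: int = 1) -> bool:
    """Decide whether to page, given per-interval ping results.

    history: one dict per testing interval, oldest first, mapping a
             (hypothetical) location name to True if its ping succeeded.
    consecutive: how many of the most recent intervals must have failed
                 from ALL locations before an alert is sent.
    """
    if consecutive < 1:
        raise ValueError("consecutive must be at least 1")
    if len(history) < consecutive:
        return False  # not enough data yet to justify an alert

    recent = history[-consecutive:]
    # An interval counts as "down" only if it has results and every
    # location in it failed.
    return all(
        len(results) > 0 and not any(results.values())
        for results in recent
    )


# One location failing is treated as noise, not an outage.
one_bad = {"loc_a": False, "loc_b": True, "loc_c": True}
all_bad = {"loc_a": False, "loc_b": False, "loc_c": False}

print(should_alert([one_bad]))                          # single-location failure
print(should_alert([all_bad]))                          # all locations, one interval
print(should_alert([one_bad, all_bad], consecutive=2))  # needs two bad intervals
print(should_alert([all_bad, all_bad], consecutive=2))
```

With `consecutive=1` this is the "basic default" above; raising it to 2 gives the "ALL locations fail twice" behavior, at the cost of a longer delay before the page goes out.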
If this is currently possible, please show me how. If not, please put in a feature request.
New Relic edit
- I want this, too
- I have more info to share (reply below)
- I have a solution for this
We take feature ideas seriously, and our product managers review every one when plotting their roadmaps. There is no guarantee this feature will be implemented, but this post ensures the idea is put on the table and discussed. So please vote and share your extra details with our team.