We’d like to be able to assign two rates on monitors:
- The rate that is there now would apply whenever the last job was successful.
- The new rate would apply whenever the last job failed.
So, for example, I have a monitor that runs every 12 hours. If it fails, I’d like the job to run every 30 minutes so that when it is fixed, I don’t have to wait up to 12 hours to see whether the issue is resolved and for the alert incident to be auto-closed. As long as it keeps failing, the job would run at the 30-minute rate; once it succeeds again, it would go back to the 12-hour rate.
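To make the request concrete, here is a minimal sketch in Python of the scheduling rule I have in mind. It is only an illustration: the names `NORMAL_INTERVAL`, `FAILURE_INTERVAL`, and `next_run` are hypothetical and are not part of any New Relic API, and the two intervals are just the ones from my example.

```python
from datetime import datetime, timedelta

# Hypothetical settings, taken from the example above (not a New Relic API).
NORMAL_INTERVAL = timedelta(hours=12)     # rate used after a successful job
FAILURE_INTERVAL = timedelta(minutes=30)  # rate used after a failed job


def next_run(last_run: datetime, last_succeeded: bool) -> datetime:
    """Return when the monitor should run next, based on the last result."""
    interval = NORMAL_INTERVAL if last_succeeded else FAILURE_INTERVAL
    return last_run + interval
```

With this rule, a monitor that failed at midnight would be checked again at 00:30, and would keep being checked every 30 minutes until a run succeeds, after which it would fall back to the 12-hour schedule.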
There is a Re-check button on each synthetic failure report that allows an on-demand recheck, but we’d like an automated way. Here are some reasons we’d like to see this:
- Sometimes when one monitor fails, a few (and occasionally many) other monitors fail as well because of the same underlying issue. We don’t want to have to drill down into each failure to click the Re-check button.
- Sometimes an issue will be resolved after a delay (e.g. due to server restart, waiting for caches to expire). Rather than having to remember or set a reminder to run the re-check some minutes in the future, we can just know that it will automatically be checked again fairly soon, and until it is resolved.
New Relic Edit
We take feature ideas seriously, and our product managers review every one when plotting their roadmaps. However, there is no guarantee this feature will be implemented. This post does ensure that the idea is put on the table and discussed, so please vote and share your extra details with our team.