It’s funny you should mention and link that feature idea: as you can see, I’ve been on there already, and you commented underneath as well a while back. It’s been a feature request, and seemingly on your company’s roadmap, for years now. Is there any update/ETA in the six months since you last commented? Is there any reason why it’s taking so long to be looked into by the product team? Our management and our clients have been requesting this for over a year now, so that we can receive accurate availability reports that do not reflect maintenance windows and remove false positive alerts from our environment during maintenance windows. I would humbly request an update, and for the product team to give this serious attention this year. We specifically need it for APM and Infra alerts. Thanks.
Oh I’m sorry @jgimpelman - I didn’t scroll down in that Feature Idea thread when I was grabbing that link.
We try to be transparent when it comes to feature ideas; however, we on the support team don’t have a huge amount of visibility into them.
We submit feature ideas and the product management team prioritise them. It’s important to understand that we receive a huge number of feature requests here in the explorers hub, in support tickets, and through other channels (such as direct from customer -> sales rep).
It would be an impossible task to implement every feature request, though with that said the votes you add here in the explorers hub do add to the visibility these get by product management (as my team work to escalate highly voted features).
Because of the large number of requests and shifting priorities, roadmaps change: higher priority features come along and overtake some of the less ‘important’ features on that roadmap. That’s the main reason we can’t guarantee feature timelines.
In this case, I don’t have a timeline right now - but I can work to escalate the feature idea to the product teams.
@RyanVeitch I would also greatly appreciate this feature. My company (Onespan) are making up workarounds right now to deal with the lack of this feature. Any sort of push to get this done would be greatly appreciated. I’ll also forward my request to other contacts to see if this might get put on a higher priority.
Hey @Tyler.Lauzon Thanks for your input - I’ll add your +1
It looks like it is still a feature request … any plan to include it in one of the next releases?
It is a requirement for integration with our CD system: we want to set a monitor downtime whenever we need to for a deployment, but without an API no automation is possible.
@paolo.dellarocca1 - Thanks for posting your use case. I don’t know where this sits on the roadmap, but I’ll get your +1 and use case added for you.
This would also assist our CI/CD pipelines. I want to be able to set monitor downtime just before I push a release so that I do not get alerts. Any update on this request?
Hey @MWilliams3 - you could use the GraphQL endpoints to set Muting Rules, which will disable notifications for the Alert conditions you set, so that you do not hear about monitor failures that you expect.
Muting rules differ from maintenance windows in that the monitor will still run, and will still fail. Alert incidents will still be created, so you have the full context of what happened during your release while the condition was muted. But while all that is happening, you will not receive notifications until you unmute the conditions.
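For anyone wanting to try this from a pipeline, here is a minimal sketch of what the NerdGraph request body for creating a muting rule could look like. The `alertsMutingRuleCreate` mutation and its field names are written from memory of the NerdGraph docs, and the account ID, policy ID, and rule name are made-up placeholders, so please check everything against the API explorer before relying on it. The sketch only builds the payload; it does not send anything.

```python
import json

# Assumed mutation shape; verify the field names in the NerdGraph API explorer.
CREATE_MUTING_RULE = """
mutation ($accountId: Int!, $rule: AlertsMutingRuleInput!) {
  alertsMutingRuleCreate(accountId: $accountId, rule: $rule) {
    id
  }
}
"""


def build_muting_rule_payload(account_id, policy_id, name):
    """Build the JSON body for a muting rule that targets one alert policy."""
    rule = {
        "name": name,
        "enabled": True,
        "condition": {
            "operator": "AND",
            "conditions": [
                {
                    "attribute": "policyId",
                    "operator": "EQUALS",
                    "values": [str(policy_id)],
                }
            ],
        },
    }
    return {
        "query": CREATE_MUTING_RULE,
        "variables": {"accountId": account_id, "rule": rule},
    }


if __name__ == "__main__":
    # Hypothetical IDs, for illustration only.
    payload = build_muting_rule_payload(1234567, 7654321, "Mute during release")
    print(json.dumps(payload, indent=2))
```

The resulting body would be POSTed to the NerdGraph endpoint (`https://api.newrelic.com/graphql`) with a user API key in the request headers. The rule ID that comes back is what you would need later to delete or disable the rule.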
We really need the ability to schedule a downtime (maintenance window) via an API. Without it we cannot fully automate our CI/CD pipeline as we will trigger alerts during a deployment.
Needing to schedule a downtime manually defeats the purpose of using an automated CI/CD pipeline.
We need something as simple as a start downtime now at the beginning of the deployment and stop the downtime now at the end of the deployment.
The muting suggestion is not a proper solution to the problem unfortunately.
Hey @christofer.gendreau - Can I ask for clarification on why Muting Rules is not suitable for you?
I’ll get your +1 added for this feature request, but I’m curious whether there’s a way to make Muting Rules work for you in the meantime.
So as I understand it, a muting rule just mutes notifications but the alert still occurs, which would affect our SLAs.
Next, it would seem to be a maintenance nightmare to continuously keep the muting rules updated as we add/remove hosts, synthetics, alerts, etc. If we could just tell NR through an API to disable/enable all alerts without needing to know the IDs, host names, etc., then we would not have to maintain the list of what we want to disable/enable.
However, admittedly, I am not very familiar with the Muting Rules and using NerdGraph so there may be a way to handle what I want and I just need to dig in more. That being said, a REST endpoint to handle all of this seems to be the most appropriate way to use maintenance windows.
So, if there is a way to mute all monitors and alerts on an account AND have them not affect our SLAs, I would be very interested to understand that.
You’re right! Muting rules allow the incidents to still take place, they just remove notifications.
This is, as far as I understand it, a design decision made to ensure users can still see the effects of work completed during the muted period. Though, if your SLAs are driven by number of incidents/violations, I can totally see why this would then be a challenge for you.
As for maintaining a list of all entities to mute, that can be worked around. As my colleague Sean pointed out here:
Based on Sean’s example you can see how to target all policies, but you can adjust that to target all conditions in a particular policy, depending on how your policies/conditions are set up.
This is all great feedback though, so thank you for that! I’ll get your thoughts added in for the REST endpoint for Synthetics Maintenance Windows.
Thank you for the possible temporary workaround for silencing all alerts at once. This is a step in the right direction. We would, however, still prefer the ability to enter a true maintenance window on demand. I will continue to monitor the site for further communication on the subject.
No worries! Thanks for the detailed feedback, it’s super helpful for us to get that back to the product teams!
Hi there, do you know if it is possible to set a duration for a muting rule? I would like to be able to create a muting rule that lasts for 10 minutes. This would be a safeguard for the rule to be auto removed in case something terrible happens in the build pipeline that causes the pipeline to fail before the call to delete the muting rule is executed.
Not yet! Scheduling muting rules to enable / disable at pre-set times is on the roadmap, but that functionality is not there yet.
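In the meantime, one pipeline-side safeguard is to wrap the deployment so that the delete call runs even when a step fails. Below is a minimal sketch of that idea using a Python context manager. `create_rule` and `delete_rule` are hypothetical callables standing in for whatever API calls your pipeline makes; they are not part of any New Relic library.

```python
from contextlib import contextmanager


@contextmanager
def muted(create_rule, delete_rule):
    """Create a muting rule, then always delete it, even if the body raises.

    create_rule and delete_rule are hypothetical callables that wrap whatever
    API calls your pipeline uses; create_rule must return the rule ID.
    """
    rule_id = create_rule()
    try:
        yield rule_id
    finally:
        # Runs on success and on failure of the wrapped block.
        delete_rule(rule_id)


if __name__ == "__main__":
    # Stand-ins for real API calls, so the sketch runs on its own.
    active = set()

    def fake_create():
        active.add("rule-1")
        return "rule-1"

    def fake_delete(rule_id):
        active.discard(rule_id)

    try:
        with muted(fake_create, fake_delete):
            raise RuntimeError("deployment step failed")
    except RuntimeError:
        pass

    print(len(active))  # prints 0: the rule was removed despite the failure
```

This only covers failures the pipeline process can react to. If the build agent itself is killed, the cleanup never runs, so a server-side duration would still be the more robust option.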
Only just noticed the replies. Muting rules won’t work for us as we use the synthetic statistics in reporting against our uptime. Having alerts trigger but muted would lower the overall percentage and look bad.
Thanks for that clarification @MWilliams3 - I know you mentioned you’re hoping to integrate into your CI/CD pipeline with this. If you are able to trigger an API call via your deployment tooling, it may be better then for you to disable the monitors and re-enable afterwards. With the monitors not running, they won’t fail, and your statistics shouldn’t be impacted.
Again, not ideal, an API for monitor downtime would absolutely be better for you - but maybe that can work for you for now.
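As a rough illustration of the disable/re-enable approach, here is a sketch that builds (but does not send) a request to change a monitor's status. It assumes the Synthetics REST API (v3) accepts a PATCH on a monitor with a `status` field, and the `Api-Key` header name is also an assumption, so please confirm the endpoint, the header, and the allowed values against the current docs. The monitor ID and key are placeholders.

```python
import json
import urllib.request

# Assumed endpoint of the Synthetics REST API (v3); confirm before use.
BASE_URL = "https://synthetics.newrelic.com/synthetics/api/v3/monitors"


def build_status_request(monitor_id, status, api_key):
    """Build (but do not send) a PATCH request that sets a monitor's status."""
    if status not in ("ENABLED", "DISABLED"):
        raise ValueError("status must be ENABLED or DISABLED")
    body = json.dumps({"status": status}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/{monitor_id}",
        data=body,
        method="PATCH",
        # Header name is an assumption; check which key header your account uses.
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Placeholder monitor ID and key, for illustration only.
    req = build_status_request("example-monitor-id", "DISABLED", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

In a deployment script you would send one such request with `urllib.request.urlopen(req)` before the release and another with `"ENABLED"` afterwards, ideally in a `try`/`finally` so the monitor is switched back on if the release fails.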
I work for a large NR customer. We have thousands of NR synthetics and regularly need to add downtime windows. We have automation tooling for most tasks, but this remains one of our dark areas where we have to manually add downtime during maintenance tasks, which is both time-consuming and error-prone. It would be a great value-add if this feature was implemented.
Thanks for adding in your use case @jessup - this kind of info is crucial in our feature ideas for the development teams to understand demand for these features. I’ll get this added internally for you.