What are Warning Violations?
When you create an alert condition, you can set up a critical threshold and a warning threshold. A critical threshold, when breached, will open a critical violation, while a warning threshold will open a warning violation. Simple, right?
Here’s where the difference comes in, though – a critical violation can open an incident, while a warning violation will never open an incident (see this documentation). Since notifications are tied to incident open/incident close events, you will never get a notification from a warning violation.
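To make the difference concrete, here is a minimal Python sketch of the behavior described above. It is only a toy model, not New Relic code: the `Condition` class, the `evaluate` function, and the 90%/80% threshold values are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Condition:
    """Hypothetical stand-in for an alert condition (not a New Relic type)."""
    name: str
    critical: float                  # critical threshold, in percent
    warning: Optional[float] = None  # optional warning threshold, in percent


def evaluate(condition: Condition, value: float) -> Tuple[Optional[str], bool]:
    """Return (violation_level, can_notify) for a measured value.

    Only a critical violation can open an incident, and notifications
    are tied to incidents, so a warning violation never notifies.
    """
    if value > condition.critical:
        return ("critical", True)
    if condition.warning is not None and value > condition.warning:
        return ("warning", False)
    return (None, False)


disk = Condition("Disk Fullness Alert", critical=90.0, warning=80.0)
print(evaluate(disk, 85.0))  # ('warning', False)
print(evaluate(disk, 95.0))  # ('critical', True)
```

At 85% the model records a warning violation but nothing can be sent, which is exactly the gap the rest of this post works around.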
But what if you actually want to get notified on warning violations? I will share a couple of ways you can make this happen.
Important caveat!
For these methods to work, your alert policy’s Incident Preference must be set to either By condition or By condition and entity (read more about Incident Preference settings in this article). With either setting, each condition opens its own incidents, so a notification is sent whenever a separate condition is violated.
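Here is a rough sketch of why the Incident Preference matters. It models an incident as a grouping key over violations. The function names and the host name are hypothetical, and I am assuming the remaining preference is the per-policy rollup ("By policy"), where every violation in a policy lands in a single incident.

```python
from typing import Iterable, Tuple

# A violation is modeled as (policy, condition, entity).
Violation = Tuple[str, str, str]


def incident_key(preference: str, violation: Violation) -> tuple:
    """Return the key under which a violation is grouped into an incident."""
    policy, condition, entity = violation
    if preference == "By policy":  # assumed name of the per-policy rollup
        return (policy,)
    if preference == "By condition":
        return (policy, condition)
    if preference == "By condition and entity":
        return (policy, condition, entity)
    raise ValueError(f"unknown incident preference: {preference!r}")


def count_incidents(preference: str, violations: Iterable[Violation]) -> int:
    """Count the distinct incidents that a set of violations would open."""
    return len({incident_key(preference, v) for v in violations})


violations = [
    ("Server Alert Policy", "Disk Fullness Alert", "host-1"),
    ("Server Alert Policy", "Disk Fullness Alert - Warning", "host-1"),
]
print(count_incidents("By policy", violations))     # 1
print(count_incidents("By condition", violations))  # 2
```

With the per-policy rollup, the second condition's violation joins an incident that is already open, so no new notification goes out. With By condition, each condition gets its own incident and its own notification.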
For these examples I will be using a disk fullness alert condition in Infrastructure. For this imaginary alert condition, the critical threshold is set to If disk is over 90% full and the warning threshold is set to If disk is over 80% full. The imaginary alert condition will be called “Disk Fullness Alert” and is part of an alert policy called “Server Alert Policy.” Keep in mind, however, that this will work with any alert condition you might have set up.
First method
For the simplest use case, if you just want the same people to be notified when a warning threshold is breached, you can use the following method (applied to the example mentioned above):
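- Create a new alert condition called “Disk Fullness Alert - Warning” in the same alert policy (“Server Alert Policy” in our example), so that it uses the same notification channels.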
- Use the warning threshold value as the critical threshold in the new alert condition. For our example, the critical threshold would be set to If disk is over 80% full.
Now when the “warning threshold” is breached (actually a critical threshold now), a notification will be sent that the Disk Fullness Alert - Warning condition has been violated.
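The two steps above boil down to copying a condition and promoting its warning value to the critical slot. Here is a small sketch of that transformation. The `Condition` class and `warning_clone` helper are hypothetical and only model the idea; in practice you would create the condition in the UI.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for an alert condition (not a New Relic type)."""
    name: str
    critical: float                  # critical threshold, in percent
    warning: Optional[float] = None  # optional warning threshold, in percent


def warning_clone(original: Condition) -> Condition:
    """Build the extra condition whose critical threshold is the old warning."""
    if original.warning is None:
        raise ValueError("original condition has no warning threshold")
    return replace(
        original,
        name=f"{original.name} - Warning",
        critical=original.warning,  # the warning value becomes critical
        warning=None,               # the clone needs no warning of its own
    )


disk = Condition("Disk Fullness Alert", critical=90.0, warning=80.0)
clone = warning_clone(disk)
print(clone.name, clone.critical)  # Disk Fullness Alert - Warning 80.0
```

The original condition is left untouched, so the 90% critical threshold keeps working as before.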
Second method
This method is a bit more involved, but it alerts a separate group of folks when the warning threshold gets violated. It works great when you have less-senior team members acting as your first line of defense, but it can also be used to notify all the same people who would be notified for critical violations.
- Create a new alert policy called “Server Alert Policy - Warnings.”
- Assign notification channels to this alert policy which correspond to your less-senior team members. Ensure that your original alert policy has notification channels assigned to it that correspond to your more senior team members.
- Create duplicates of the original alert conditions in this policy. Going back to our example, you would create an alert condition called “Disk Fullness Alert.”
- Set up thresholds using the values of the warning thresholds from the original conditions as critical thresholds. For our example case, you would use If disk is over 80% full as your critical threshold.
Now when the “warning threshold” (actually a critical threshold now) is breached, a separate group of folks will be notified and the notification will include the policy name, Server Alert Policy - Warnings. If the problem gets worse and breaches the critical threshold (which, if you remember, is If disk is over 90% full), your more senior team members will get called into action to help with the problem.
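To illustrate the routing, here is a toy model of the two policies. The channel names (`senior-oncall`, `first-line`) and the `notified_channels` function are made up for this sketch, and each policy is assumed to notify every one of its channels whenever one of its conditions breaches its critical threshold.

```python
from typing import Dict, List, Tuple

# Each policy maps to its channels and its conditions' critical thresholds.
POLICIES: Dict[str, dict] = {
    "Server Alert Policy": {
        "channels": ["senior-oncall"],                # hypothetical channel
        "conditions": {"Disk Fullness Alert": 90.0},
    },
    "Server Alert Policy - Warnings": {
        "channels": ["first-line"],                   # hypothetical channel
        "conditions": {"Disk Fullness Alert": 80.0},  # old warning value
    },
}


def notified_channels(disk_percent: float) -> List[Tuple[str, str, str]]:
    """Return (channel, policy, condition) for every notification sent."""
    notifications = []
    for policy_name, policy in POLICIES.items():
        for condition_name, critical in policy["conditions"].items():
            if disk_percent > critical:
                for channel in policy["channels"]:
                    notifications.append((channel, policy_name, condition_name))
    return notifications


print(notified_channels(85.0))
# [('first-line', 'Server Alert Policy - Warnings', 'Disk Fullness Alert')]
print(len(notified_channels(95.0)))  # 2
```

At 85% only the first-line group hears about it. At 95% both thresholds are breached, so the senior group is pulled in as well.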
I hope this helps you set up your New Relic Alerting system so that you get exactly the notifications you need, sent to exactly the channels where you need them! As always, comments and questions are welcome.