Error Critical Alerting Not Working


Here is a screenshot of the alert. Clearly the ‘critical’ threshold has been met, but I am not seeing a corresponding alert. The last modified date is recent because I disabled and then re-enabled it, hoping that would fix the problem. It did not. Any help is much appreciated!

After reading Relic Solution: Alert Incident Preferences are the Key to Consistent Alert Notifications, the only thing I can see that might have prevented alerts from firing was having the incident preference set to By Policy. I have since changed it, but it would be nice to have it confirmed that this was the reason no alert was fired. Thanks again!

Hello @nexkey, it does look like the notifications weren’t sent out because the incident preference was set to By Policy. This means that the first time a condition is violated under a policy, an incident is opened and you get notified. If any subsequent conditions or entities violate under the same policy, they are rolled up under the same incident, but you will not get any additional notifications for these new violations. To receive new notifications with the By Policy incident preference, all violations under that incident must first be closed.

Now that you’ve changed the incident preference to By condition and entity, you should receive notifications for every entity violation in your policy.
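To make the roll-up behavior described above concrete, here is a minimal Python sketch. It is a toy model built only from the description in this thread, not New Relic's implementation: the `Violation` tuple, the function names, and the sample data are all hypothetical, and it assumes a notification goes out only when a new incident opens.

```python
from collections import namedtuple

# Toy model of incident roll-up; an illustration, not New Relic's implementation.
Violation = namedtuple("Violation", ["policy", "condition", "entity"])


def incident_key(violation, preference):
    """Return the grouping key that decides which incident a violation joins."""
    if preference == "By Policy":
        # Everything under the policy rolls up into one incident.
        return (violation.policy,)
    if preference == "By Condition and Entity":
        # Each condition/entity pair gets its own incident.
        return (violation.policy, violation.condition, violation.entity)
    raise ValueError(f"unknown incident preference: {preference}")


def count_notifications(violations, preference):
    """Count notifications, assuming one is sent only when a new incident opens."""
    open_incidents = set()
    notifications = 0
    for violation in violations:
        key = incident_key(violation, preference)
        if key not in open_incidents:
            open_incidents.add(key)
            notifications += 1
    return notifications


# Hypothetical violations, all opened while earlier ones are still open.
sample = [
    Violation("prod", "Error percentage", "app-1"),
    Violation("prod", "Apdex (low)", "app-1"),
    Violation("prod", "Error percentage", "app-2"),
]

print(count_notifications(sample, "By Policy"))                # 1
print(count_notifications(sample, "By Condition and Entity"))  # 3
```

With By Policy, the second and third violations join the already open incident and stay silent, which matches the missed notifications reported at the top of the thread.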

@zahrasiddiqa thanks for looking into it. However I have one more question / clarification that I think you can probably help with as well.

Scrolling through the alert policy screen, I see that some of the alerts have “1 open incidents”; however, when I click on them, it takes me to a screen that says “0 open incidents match…”. Below are a couple of screenshots showing what I mean. It would be great to know what I am doing wrong or how to clear these “open” incidents. Thanks again!

The confusion here is really easy to understand. Incidents get named based on the first violation to open, triggering incident creation. Because of the roll-up strategy you previously had set, this condition is associated with an incident with a name that doesn’t contain the name of the condition in question. When you click the 1 open incident button, the UI pre-fills the search bar for you, which doesn’t prove helpful in this situation. Let me talk to our UI engineers (I work on the alerting evaluation side) and see if this is something they’re aware of.

In the meantime, if you’re having trouble finding open violations, you can use the events search tab and sort by close date. Violations that have no closed-at timestamp (that is, violations that are still open) are sorted to the top when you order by closedAt with direction DESC:

https://alerts.newrelic.com/accounts/2152832/events/violations?offset=0&direction=DESC&orderBy=closedAt&text=Error%20percentage
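The ordering that URL asks for (`orderBy=closedAt`, `direction=DESC`, with open violations on top) can be sketched locally in Python. This is only an illustration of the sort order: the record layout, the function name, and the timestamps are made up, and it does not call any New Relic API.

```python
from datetime import datetime

# Hypothetical violation records; closed_at is None while a violation is open.
violations = [
    {"condition": "Error percentage", "closed_at": datetime(2019, 1, 2, 9, 30)},
    {"condition": "Thread count", "closed_at": None},
    {"condition": "Apdex (low)", "closed_at": datetime(2019, 1, 3, 14, 0)},
    {"condition": "Disk free", "closed_at": None},
]


def order_by_closed_at_desc(rows):
    """Open violations first, then the most recently closed ones."""
    return sorted(
        rows,
        # The boolean puts open rows ahead of closed ones once reversed;
        # datetime.min is a placeholder so None never gets compared.
        key=lambda row: (row["closed_at"] is None, row["closed_at"] or datetime.min),
        reverse=True,
    )


for row in order_by_closed_at_desc(violations):
    print(row["condition"], row["closed_at"])
```

Any row printed with `None` in the second column is a violation that is still open.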

@parrott, thanks for the explanation. Makes some sense. Also, thanks for poking the UI team on my behalf.

I guess my suggestion would be to make the noisiest level, By Condition and Entity, the default incident preference. I know that is a PM decision, but that's just my two cents. Looking at the events tab, I have had many violations go unalerted because I was using the default By Policy setting.

Thanks again for all your assistance; however, I am still not totally satisfied (yet). It appears that the only non-closed violations are a thread count and a disk free alert. There is no sign of the open error percentage or apdex (low) incident shown on the alert policy page.

Happy to help get frustrating things resolved. I passed this post on to them to have a look at. We typically don’t provide progress updates in the forum (just to set expectations) but I can assure you that the single best person to be able to effect any changes is aware now.

I’ll get @RyanVeitch to get your #feature-ideas recorded around changing the default incident preference. You’re right, it’s a PM-level decision but @NateHeinrich does read the forum posts. :slight_smile:

I completely understand your ongoing confusion. Because the condition in question has violations that are associated with an open incident, the UI is telling you that there is 1 open incident associated with the condition, even though those violations are now closed. It’s not an obvious thing, I totally agree.

As a sidebar, if you’d like to get this incident closed you can either manually close the currently open violations in the UI or you can disable the conditions that still have open violations, which will cause all associated violations to also close, thereby closing the incident.
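The relationship described in the last two paragraphs can be sketched as a small Python model. It rests on assumptions taken only from this thread: an incident stays open while any of its violations is open, and every condition with a violation in an open incident shows an open-incident count. The function names and sample data are hypothetical.

```python
def incident_is_open(violations):
    """Assumed rule: an incident is open while any of its violations is open."""
    return any(v["closed_at"] is None for v in violations)


def conditions_flagged_open(violations):
    """Conditions that would show an open incident under the assumed rule."""
    if not incident_is_open(violations):
        return set()
    # Every condition in the incident is flagged, even if its own
    # violations have already closed.
    return {v["condition"] for v in violations}


def close_all(violations, closed_at):
    """Close every still-open violation, as a manual close would."""
    return [dict(v, closed_at=v["closed_at"] or closed_at) for v in violations]


# One hypothetical rolled-up incident; timestamps are placeholders.
incident = [
    {"condition": "Error percentage", "closed_at": "10:05"},
    {"condition": "Thread count", "closed_at": None},
]

print(conditions_flagged_open(incident))                       # both conditions
print(conditions_flagged_open(close_all(incident, "11:00")))   # set()
```

In this model the error percentage condition keeps its open-incident badge only because the thread count violation in the same incident is still open; closing that violation clears both.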

Ahh totally. Much clearer now. Thank you for all the clarifications and nudges to the right individuals. Much appreciated!

I'm taking the time to make some modifications to my whole alerting setup now.

Hey @nexkey - I just got your feature idea filed here for a change to default incident preferences :slight_smile:

@parrott - Thanks for helping out and tagging me here :man_technologist:t2:

I am also facing the same issue. I set the incident preference to By condition and entity, but I still only get email alerts some of the time, not every time. I close the alerts whenever I receive a notification, but I still only get notified occasionally.

Hey,
I have created an alert policy with an NRQL condition, but whenever the critical threshold is hit or the condition is met, I am not getting any alert notification for it. I am using Users (my email) as the notification channel. Could anybody please help me see where I am going wrong and how I can solve this issue?

Hello @db.choudhary, could you confirm that the alert policy's incident preference is set to By Condition and Entity?

If it is set correctly, could you send a link to the alert condition for further review?

hi,
Here is my alert condition:


I had already set the incident preference to By Condition and Entity. I am also not getting an incident for this.

Hi @zahrasiddiqa ,
here is the link:
https://alerts.newrelic.com/accounts/2226706/policies/726966/conditions/13364489/edit?selectedField=thresholds

Hey @db.choudhary, the alert condition is set to violate when the query returns a value above 0 consistently for 5 minutes. I don’t see this occurring. You might want to change the threshold to “at least once in”, as this is more sensitive: a single value above 0 within the 5 minutes is enough to violate.
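The difference between the two threshold modes can be shown with a short Python sketch. It assumes one query result per minute over a 5-minute window; the function names and sample values are hypothetical, and this is not how New Relic evaluates conditions internally.

```python
def violates_for_at_least(window, threshold):
    """Every value in the window must be above the threshold."""
    return bool(window) and all(value > threshold for value in window)


def violates_at_least_once_in(window, threshold):
    """A single value above the threshold anywhere in the window is enough."""
    return any(value > threshold for value in window)


# Hypothetical query results, one per minute for 5 minutes.
window = [0, 3, 0, 0, 1]

print(violates_for_at_least(window, 0))      # False: some minutes returned 0
print(violates_at_least_once_in(window, 0))  # True: at least one minute was above 0
```

A query that only occasionally returns a value above 0 would never violate in the first mode, which would be consistent with no incident being opened here.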

At last review this feature was not accepted to be added to the roadmap. This request has been closed.