Alert condition hit critical and didn't send alert

Hi, we have a Pro plan, but the portal doesn't seem to let me create a support ticket.

Anyway, the following alert condition was triggered at the critical level, but we never received a notification via Slack: 549929. We didn't have any incidents open at the time, so we would expect this to trigger a notification?

This is a critical policy so we’d like to know why the alert wasn’t sent and what the risk is of this happening again.

Thanks,

Stephen

Hi @stephen.wilkinson - You'll need to talk to Sales regarding ticketed support, as we're not sure of the requirements for it and there have been some changes to the support tiers.

Regarding the missed notification for the critical violation, could you provide a link to the alert incident that did not send a notification? It sounds like an incident may not have been opened either; if that's the case, please provide a link to the alert condition and let us know the approximate date/time this violation should have opened one.

It didn't raise an incident either. I also double-checked to make sure there wasn't a stale, already-open incident:

@dkoyano We're seeing the same issue. Critical violations are present but no incidents are generated. A similar alert of ours triggered with no issues early last week, on 9/28.

Checking whether this is a known issue; the New Relic version in our environment is 1.20.2.

Hi @stephen.wilkinson - Can you provide a link to the alert condition that didn't open a violation? Note that only users who have access to your account will be able to view this link.

Here’s the link: https://one.newrelic.com/launcher/nrai.launcher?platform[accountId]=697050&pane=eyJuZXJkbGV0SWQiOiJjb25kaXRpb24tYnVpbGRlci11aS5jb25kaXRpb24tZWRpdCIsIm5hdiI6IlBvbGljaWVzIiwicG9saWN5SWQiOiIxNTQ5OTI5IiwiY29uZGl0aW9uSWQiOjIyNTQ3MTQ2LCJyZW5kZXJlZEluUGFuZSI6dHJ1ZX0=&sidebars[0]=eyJuZXJkbGV0SWQiOiJucmFpLm5hdmlnYXRpb24tYmFyIiwibmF2IjoiUG9saWNpZXMifQ==&state=4361e2ee-372a-4a9e-9e0c-07fedc3eb158

And a permalink if that’s easier: https://onenr.io/0NgR7gJXnRo

Any updates here?


Hi Stephen:

Here is the query output for that time period for that condition ID. There were 29 incidents opened and closed in the window between 9/28/21 00:00:00 and 9/30/21 00:00:00; some stayed open longer than others.

One thing I notice is that this is a NRQL alert condition using count(). New Relic used to insert synthetic zeroes as a result for some queries, but ever since Streaming Alerts for NRQL conditions was introduced, synthetic zeroes are no longer inserted (the rules for when they were inserted were obtuse and opaque, so now you get exactly what the query returns).

Take a look at this article, which explains why count() and uniqueCount() will never return a value of 0. The article also goes into a couple of possible solutions depending on your use case; these are new features that were also released with the Streaming Alerts platform.

The condition needs a value below 10 (which can include 0) to close, but because the query returns NULL instead of 0, a previous incident can remain open and future violations will roll up into it. Notifications are only sent when an incident opens, closes, or is acknowledged; violations themselves do not send notifications. This means the previous incident would need to have closed for this violation to trigger a new incident and a new notification.

There were times when the query dropped to 0/NULL. If that happened while an incident was open, the violation would have rolled up into that open incident, and the incident could then have closed once the data moved off the NULL and returned a number below 10 and above 0. This also means the incident may not have still been open when you looked.
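
For reference, a minimal sketch of the kind of count()-based signal being described; the appName value, error filter, and 5-minute window here are hypothetical, not taken from the actual condition. When no matching events arrive in an aggregation window, count() returns nothing (NULL) rather than 0, so a TIMESERIES view of the same query shows gaps instead of zero points:

    // Condition-style query: empty windows yield NULL, not 0
    FROM Transaction SELECT count(*) WHERE appName = 'example-app' AND error IS true

    // Diagnostic view over the window in question: empty buckets appear as gaps,
    // which is why a "below 10" close threshold is never evaluated for them
    FROM Transaction SELECT count(*) WHERE appName = 'example-app' AND error IS true SINCE '2021-09-28 00:00:00' UNTIL '2021-09-30 00:00:00' TIMESERIES 5 minutes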

What I would recommend with this condition (and possibly others like it) is to set up a Loss of Signal (LoS) duration equal to the threshold duration, and configure it to close all open violations when met. This will result in violations that close automatically, as you'd expect them to.

Once you have made this change please let us know if the issue persists.

@slim Looking at this image, your issue is similar but in the opposite direction of Stephen's: for you, a 0 value is considered a violation. You will need Loss of Signal set to open a new violation.

The alert condition here is also a NRQL alert condition using count(). New Relic used to insert synthetic zeroes as a result for some queries, but ever since Streaming Alerts for NRQL conditions was introduced, synthetic zeroes are no longer inserted (the rules for when they were inserted were obtuse and opaque, so now you get exactly what the query returns).

Take a look at this article, which explains why count() and uniqueCount() will never return a value of 0. The article also goes into a couple of possible solutions depending on your use case; these are new features that were also released with the Streaming Alerts platform.

What I would recommend with this condition (and possibly others like it) is to set up a Loss of Signal (LoS) for the same duration as the threshold, and configure it to Open new "lost signal" violation when met. This will result in violations opening when the query stops returning numeric values and drops to 0/NULL.

There is one caveat, which is covered in the article I linked as well: there needs to be a signal to lose, in order for LoS behavior to kick in. That is, there must be a numeric data point that then drops to NULL.
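
As a rough check of that caveat, you could run the condition's signal query over a recent period and confirm that numeric data points existed before the gap. This is only a sketch, assuming a Transaction-based count() signal like the one above, with a hypothetical appName:

    // When did the signal last report before the suspected gap? If no matching
    // events arrived at all during the lookback, there was no signal to lose,
    // and Loss of Signal will not trigger.
    FROM Transaction SELECT latest(timestamp) WHERE appName = 'example-app' AND error IS true SINCE 1 day ago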

Hello, I have the same issue, but I don't think it is correlated with LoS, because around 100 events are sent to New Relic per minute with no data gaps. I tried creating another condition with the smallest possible time window and swapping the conditions.
Here is the query:

    FROM Transaction SELECT percentage(count(*), WHERE error IS true) AS 'Success Rate' WHERE appName = '#appName#' AND name NOT LIKE 'transName%' AND name NOT LIKE 'trasnName%' FACET capture(name, r'OtherTransaction/Go/(?P<Name>.*)')

It is still not sending any alerts, while a lot of alerts should have been sent.

PS: Everything was fine until just now, when there was an incident and somehow New Relic did not send any alert. I checked and there is no open violation. Is there a problem with New Relic?
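
For reference, the query as posted computes the percentage of transactions where error IS true, i.e. an error rate, even though it is aliased 'Success Rate'. A success-rate form, if that is what the alias intends, would invert the filter; a minimal sketch keeping the placeholders from the post:

    // Inverted filter, if a success rate is what the alias is meant to describe
    FROM Transaction SELECT percentage(count(*), WHERE error IS false) AS 'Success Rate' WHERE appName = '#appName#' AND name NOT LIKE 'transName%' AND name NOT LIKE 'trasnName%' FACET capture(name, r'OtherTransaction/Go/(?P<Name>.*)')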

Hey there @kevin.juniawan,

Thank you for reaching out. Are you still experiencing trouble with this? I want to make sure that you were not experiencing this during a brief interruption:

New Relic Unavailable for the US Region

Resolved

Between 17:00 UTC and 23:00 UTC on July 28, customers in the US region may have experienced an interruption of services for the New Relic UI, delays for data and chart updates, and missing alert notifications. We investigated and resolved a service interruption with our Cloud Service Provider and impacted services have returned to normal operation.

July 28, 17:28 PST

Hello, thank you for reaching out. Since Saturday 8:00 AM GMT+7, notifications have been running normally.

Edited: my alert wasn't working at 17:40 on 29 July 2022; could that be correlated with the July 28 incident or not?

We had this happen again on this alert: Log in to New Relic

It's set to close after 1 day, and the previous error was well before that. We should have been alerted on the morning of the 27th and the 28th. This is on our quoting API, which is high traffic, so there's no way it would have been due to no results in the query. The critical status covered over 24 hours, too.

This is one of our critical indicators of performance regressions after release, so if it can't be relied on, I'm not sure your platform is fit for purpose.

Hello @stephen.wilkinson,

Welcome back to the community, it is good to see you again!

Thank you for reaching out and supplying us with a link. Our support engineers are looking over this and will reach out on this post soon with more information or extra steps. We appreciate your patience as we look into this further. Please let us know if you have any other questions in the meantime!


@stephen.wilkinson Hi there! I was just checking in to see if your issue got resolved? Thanks!