Alert with background throughput condition never triggers

We have an alert condition which should trigger if the background throughput of an application is below 900 calls for at least 5 minutes. In the preview everything looks fine, but the alert actually never triggers. In this screenshot the violation was less than 5 minutes, but we also had cases where the throughput was below 900 calls for multiple hours.

Our other alerts work fine, the problem only occurs with background throughput.

The alert condition id is 22929769.

Hi @lukas12 could you send a permalink to your condition in question and we can take a look? Also if you have the info handy, the timeframe and timezone that you believe was the last time the alert should’ve triggered an incident would help… But if you don’t have that info a condition link will work for first steps!
Another thing worth checking, depending on how the policy is set up, is your incident preferences, to make sure they are configured properly. I’ll attach a link that goes over them in a bit of detail. (The main thing is that if you have multiple conditions in a policy and your incident preference is set to "By policy", issues can occur there.)

helpful links
https://docs.newrelic.com/docs/alerts/new-relic-alerts/configuring-alert-policies/specify-when-new-relic-creates-incidents

Hi @sarnce , thanks for helping out.
Permalink to my condition: https://onenr.io/0NgR7eNDaQo
The last time the alert should’ve triggered was on December 23, 08:31pm UTC+1.
Incident preference is set to “By condition and signal”.

Hi @lukas12 I wanted to confirm that you are still having this issue. It looks like there are incidents opening and closing here as recently as Jan 7th. And when I looked in the notification logs we have 200, successful : TRUE response codes for that webhook, condition id and time period.

Let us know. Thanks!

Hi @sduvall , that’s correct. On Jan 7th, conditions using “background throughput” triggered for the first time. We would still like to know why the condition didn’t trigger on December 23, 08:31pm UTC+1. This is an important alert and we want to make sure that it works reliably.

Thanks!

@lukas12 I can understand that. Can you please provide a permalink to the query where you show that the data violated the threshold for that day? Then I can look into it further. Thanks

@sduvall Here you go: https://onenr.io/0oqQaLXoJj1

@lukas12 Thank you for providing that. Here is the query that chart is using: https://onenr.io/0a2wdb5N7jE

You can see the data at that time drops to null, so likely the app is not reporting. It looks like it is below the threshold because it is stored as a 0 as it is being ingested, but the null is still a null, and null is not less than 900; it is a different data type. To be alerted in this case you would need Loss of Signal, which is not available on out-of-the-box APM and mobile alert conditions but is available with NRQL alert conditions.
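To make the null-versus-threshold point concrete, here is a minimal Python sketch. It is only an illustration of the comparison described above, not New Relic's actual evaluation code; the function name `violates_below` is made up for this example, and `None` stands in for a null query result.

```python
from typing import Optional

THRESHOLD = 900  # calls, taken from the condition described in this thread


def violates_below(value: Optional[float], threshold: float = THRESHOLD) -> bool:
    """Return True only when a numeric value is below the threshold.

    Hypothetical helper: None models a null result, which is never
    treated as "less than" the threshold.
    """
    if value is None:
        # No data point at all: there is nothing to compare, so no violation.
        return False
    return value < threshold


print(violates_below(850))   # True: a real value below 900
print(violates_below(0))     # True: a real zero is still a number
print(violates_below(None))  # False: null is not "less than 900"
```

A reported value of 0 and a missing value look the same on a chart that fills gaps with 0, but only the first one can violate a "below 900" threshold.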

If you want to be notified when the app stops reporting as well as when it drops below the threshold, you can set up a NRQL alert condition with the same or a similar query and set Loss of Signal to open a violation after the same duration as the threshold.

More information on this:

The alert condition is based on a NRQL alert condition using count(). New Relic used to insert synthetic zeroes as a result for some queries, but ever since Streaming Alerts for NRQL conditions was introduced, synthetic zeroes are no longer inserted (the rules for when they were inserted were obtuse and opaque, so now you get exactly what the query returns).

If you have a threshold that requires a value of “0” in order to open or close a violation, it’s possible your query may not ever return a “0”. Rather, it may be returning the absence of a value, or null. A null response cannot be compared against a numeric threshold.

Our alerting system processes the results of your NRQL query through a strict order of operations. We first execute the FROM/WHERE portion of the query and determine whether any results match that criteria. If no results match that criteria, then the SELECT statement is not executed at all. This is when your query will return null. (The preview chart may show that your query is returning “0”, but the preview chart is not beholden to this order of operations.)
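The order of operations described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real alerting pipeline: the function `run_count_query`, the event dictionaries, and the `appName` field are all hypothetical and exist only for illustration.

```python
from typing import Dict, List, Optional


def run_count_query(events: List[Dict[str, str]], app_name: str) -> Optional[int]:
    """Model a count() query evaluated in two strict steps.

    Hypothetical sketch: returns None (null) when nothing matches,
    instead of returning 0.
    """
    # Step 1: the FROM/WHERE portion selects the matching events.
    matching = [e for e in events if e.get("appName") == app_name]

    # Step 2: if nothing matched, the SELECT portion is never executed,
    # so there is no count at all.
    if not matching:
        return None

    # Only now does the SELECT count() run, on the matching events.
    return len(matching)


events = [{"appName": "checkout"}, {"appName": "checkout"}]
print(run_count_query(events, "checkout"))  # 2
print(run_count_query(events, "billing"))   # None, not 0
```

In this model a count() can never come back as 0, because a count only exists once at least one event has matched.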

In order to open or close a violation on a complete absence of data, you will need to configure a loss of signal threshold, and check the box to either open a new violation or close existing violations when the query doesn’t return any results.
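As a rough illustration of how a threshold and a loss of signal setting complement each other, here is a small Python model. It is a deliberate simplification (one aggregated value per minute, with `None` meaning the query returned nothing) and does not reflect how New Relic implements its timers; `should_open_violation` and its parameters are made-up names for this sketch.

```python
from typing import List, Optional


def should_open_violation(
    window: List[Optional[float]],
    threshold: float,
    open_on_signal_loss: bool,
) -> bool:
    """Decide whether a window of per-minute values should open a violation.

    Hypothetical model: the window length stands in for both the threshold
    duration and the loss of signal duration.
    """
    if not window:
        return False

    # Threshold rule: every minute has a real value and it is below the threshold.
    below = all(v is not None and v < threshold for v in window)

    # Loss of signal rule: no minute in the window produced any value.
    lost = all(v is None for v in window)

    return below or (open_on_signal_loss and lost)


five_minutes_of_nulls = [None] * 5
print(should_open_violation(five_minutes_of_nulls, 900, open_on_signal_loss=False))  # False
print(should_open_violation(five_minutes_of_nulls, 900, open_on_signal_loss=True))   # True
print(should_open_violation([850.0] * 5, 900, open_on_signal_loss=False))            # True
```

With the loss of signal option off, five minutes of nulls never open a violation, which matches what happened on December 23 in this thread.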