Big News For Alerts and Applied Intelligence

Hi @rishav.dhar

As far as I know, appName should work fine to narrow the scope down to a particular entity.

This really sounds like a malfunction. I would recommend opening a support ticket on this, since that is the quickest path to get it in front of the engineering team who owns this functionality.

Until then, I would suggest trying out a combo facet: FACET appName, appId. In case there are duplicate application names, that explicitly narrows the scope down to a single application, since appId values are unique. This may not work any better than using only FACET appName, but it would be a valuable detail to include in the support ticket.
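
To illustrate why the combo facet helps when names collide, here is a minimal Python sketch of the grouping behavior. It is a toy model, not New Relic code: the event rows, the app names and ids, and the `facet_count` helper are all hypothetical and exist only for this example.

```python
from collections import Counter

# Hypothetical sample events; the names and ids are made up for illustration.
events = [
    {"appName": "checkout", "appId": 101},
    {"appName": "checkout", "appId": 101},
    {"appName": "checkout", "appId": 202},  # a second app sharing the same name
    {"appName": "search", "appId": 303},
]


def facet_count(rows, *attrs):
    """Count rows per unique combination of the given attributes."""
    return Counter(tuple(row[a] for a in attrs) for row in rows)


# Grouping by name alone merges the two "checkout" apps into one bucket.
by_name = facet_count(events, "appName")

# Grouping by name and id keeps them apart, because the ids differ.
by_name_and_id = facet_count(events, "appName", "appId")
```

With these rows, grouping by name alone gives two buckets, while grouping by name and id gives three, so each application gets its own signal.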

1 Like

We are using NRQL queries to retrieve IssuesActivated and IssuesClosed across the accounts.
We are also running NRQL queries on “Incidents” within a subaccount. We do this in our self-healing scripts. Will we be impacted?
Thanks,
Sagar

@sagar.thirumala

Since almost all of Alerts is affected, if you’re using NRQL alerts you will likely be affected by at least one part of these changes. However, only one of these changes requires you to change the queries in your alert conditions, and that’s only if you’re using “sum of query results” thresholds (we’re introducing Sliding Windows Aggregation to replace those).

Beyond that, the changes mostly involve how we present incidents, how we manage notification channels, and the removal of all non-NRQL alert conditions.

1 Like

Regarding Process Running condition, would you please share any info on how to use NRQL condition to replace it?

1 Like

Hi @mtsou

Regarding Process Running condition, would you please share any info on how to use NRQL condition to replace it?

Sure!

Say I have a process with a display name of processA, and I want to make sure it’s running on my server named my-favorite-server. I would use a query like this in my NRQL alert condition:

SELECT count(*) FROM ProcessSample WHERE processDisplayName = 'processA' AND hostname = 'my-favorite-server'

If the count is 0 for some period of time, it indicates that the process stopped on the host.

To finish, I would set up Loss of Signal to open a new violation after some period with no signal (the duration is up to you, maybe 10 minutes). This is needed because the count(*) function usually won’t return 0 values (take a look at this article, where I explain why that is).

This is a very simple example, but you can expand on this to cover whatever you need.
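
Here is a rough Python sketch of why a threshold alone is not enough and Loss of Signal is needed. It is a toy model under stated assumptions, not how New Relic actually evaluates conditions: the 60-second window, the sample timestamps, and both helper functions are hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 60  # assumed aggregation window, for illustration only


def windowed_counts(sample_times, window=WINDOW_SECONDS):
    """Return {window_start: count} only for windows that contain samples."""
    return dict(Counter((t // window) * window for t in sample_times))


def signal_lost(sample_times, now, expiration_seconds):
    """True if no sample arrived within expiration_seconds before `now`."""
    if not sample_times:
        return True
    return now - max(sample_times) > expiration_seconds


# The process reports every 20 seconds for two minutes, then stops.
samples = list(range(0, 120, 20))
counts = windowed_counts(samples)

# Windows after the process stops are simply absent; there is no 0 to compare
# against a threshold, so a separate timer on the gap is what detects the stop.
still_within_expiration = signal_lost(samples, now=400, expiration_seconds=600)
past_expiration = signal_lost(samples, now=800, expiration_seconds=600)
```

In this model the empty windows never show up as a count of 0, which is why the gap since the last sample is what has to be checked.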

2 Likes

Hey everyone!

I just posted a more focused announcement specifically about Outlier alert conditions. Please take a look at this link!

1 Like

@Fidelicatessen Thanks for the additional details. I have a specific question about AIOps queries. We have queries like “SELECT title, issueId from IssueActivated …” on the master account. Will these queries be impacted?

1 Like

Awesome @Fidelicatessen - just got back from vacation to see lots of shiny green boxes across our many client accounts - please pass our thanks to your teams, awesome job!

1 Like

@sagar.thirumala

Note that this list of new features and EOLs pertains only to Alerts and Applied Intelligence. This is not a query that would work with a NRQL alert, since there is no aggregator (e.g. average, min, max).

Since this is not a query that is used in a NRQL alert condition, none of the changes listed in the original post should apply to it.

1 Like

@Fidelicatessen Thanks for the updates. Where can I find the details around the scripting changes that will be necessary?

1 Like

@thomas.murphy

We endeavor to keep our API docs up-to-date, and have made the Terraform provider a top priority, so that should be up-to-date as well.

All of these features are still months out, so it’s not yet clear exactly what changes you’ll need to make, but following these pages should help!

API docs for Alerts:

Terraform provider for New Relic:

1 Like

@Fidelicatessen Love seeing NRQL alerts getting linked to their entities for health. I am seeing a little oddity: on the explorer dashboard the entity is marked red during an incident, but when I go into the Browser UI there are no “Open Violations” that link back to the ‘incident’. I’ll open a support ticket on it, but wanted to provide that feedback here too.

2 Likes

Thanks for your response, much appreciated. I’ve delved into this a little more and found myself to be wrong: all FACET appName entities are now covered by their NRQL alert conditions. Absolutely stellar result!

The only thing remaining is that the NRQL alert conditions themselves are not rendered on the app’s APM > Alert conditions page, as shown in the screenshot attached. This seems like an anomaly, so I have raised ticket #474806 accordingly.

2 Likes

What will the recommended NRQL for HNR alerts be? We might want to get an early start using that, because the existing “Don’t trigger alerts for hosts that perform a clean shutdown” checkbox doesn’t seem to catch Amazon EKS nodes that scale down during auto-scaling or an EKS node refresh.

You can already get an effective HNR alert, assuming your entity is emitting telemetry data to New Relic. Here’s how I would do it if I had a host running Infrastructure:

SELECT count(*) FROM SystemSample FACET hostname

Set Loss of Signal to 5 minutes to match the standard 5-minute HNR default, and set the threshold to look for values below 1, so that violations will close on their own after the host starts reporting again.

This will also track each host separately, since the query uses FACET hostname.
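
As a rough illustration of the per-host tracking, here is a small Python sketch. It only models the idea, it is not New Relic's implementation: the hostnames, the timestamps, and the `hosts_with_lost_signal` helper are hypothetical, and the 300-second default mirrors the 5 minutes suggested above.

```python
def hosts_with_lost_signal(last_seen, now, expiration_seconds=300):
    """Return sorted hostnames whose latest sample is older than the expiration."""
    return sorted(
        host for host, ts in last_seen.items() if now - ts > expiration_seconds
    )


# Hypothetical hostnames mapped to the time (in seconds) of their last sample.
last_seen = {"host-a": 1000, "host-b": 650, "host-c": 990}

# host-b has been silent for 350 seconds, so only that facet is flagged.
lost = hosts_with_lost_signal(last_seen, now=1000)

# Once host-b reports again, nothing is flagged any more.
last_seen["host-b"] = 1010
lost_after_recovery = hosts_with_lost_signal(last_seen, now=1020)
```

Each host is evaluated on its own, so one silent host does not affect the others, and the flag clears once that host reports again.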

2 Likes

Ok, I was expecting something new in NRQL land, but simplicity is often the best. Thanks for clarifying.

1 Like

Would it not be above 1, instead of below?

@rishav.dhar

Would it not be above 1, instead of below?

No, because “above 1” indicates that the host is still reporting. A result below 1 would indicate that the host stopped reporting.

NOTE: To open a violation on this, however, you would also need to set up a Loss of Signal, since (as this article explains) the result of a count(*) function would never reach 0 in this case.

1 Like

Are all the above features live already?

1 Like

It seems that it is not live yet: