Today I’d like to talk about NR AI Analytics Events, which will be released soon. We have been working on these for some time, and they have been in beta, so you may have already discovered them. Specifically, I’m talking about NrAiIncident and NrAiSignal. We have more events of this type planned, so you will eventually see NrAiNotification and NrAiIssue as well.
These events primarily carry alerts metadata, so that you can use NRQL to directly query information about the behavior of your alerts.
NrAiIncident shows details from every incident open and close. Keep in mind that this is the newer definition of “incident,” so this will be the most granular form of alert. Read more about the terminology at this link. Documentation for this event type can be found here.
NrAiSignal shows details from every NRQL alert condition and every signal on your account, for every aggregation window that passes. This data is posted immediately after each aggregation window is aggregated and evaluated, so it will show you exactly what New Relic Alerts is seeing. Documentation for this event type can be found here.
NrAiNotification (not available yet) will show details from every alert notification that is sent.
NrAiIssue (not available yet) will show details from every Issue on your account, and will have separate records for open, acknowledge, and close events.
I’m glad you asked! Let’s look at some possible use-cases where these event types would be valuable.
By querying NrAiIncident and scoping to conditionName, we can see how often incidents are opening, and how many, per alert condition. This can help to pinpoint alert conditions that may be contributing to an overly noisy alerts environment. The basic query for this looks like this:
SELECT count(*) FROM NrAiIncident FACET conditionName SINCE 1 month ago TIMESERIES 24 hours
This can show us when any specific alert condition had a spike of incidents, but can also highlight noisy alert conditions that may need their thresholds desensitized. You could, alternatively, use conditionId if preferred. You could also expand or contract either the time frame of the query (SINCE 1 month ago) or the granularity of the data points (TIMESERIES 24 hours), depending on your use-case.
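If you find yourself adjusting the facet, time frame, and granularity often, a small helper can assemble the query text for you. The following is only a sketch: `incident_count_query` and its parameters are hypothetical names made up for illustration, and the function only builds an NRQL string. It does not call any New Relic API.

```python
import re

# Hypothetical helper: it only assembles NRQL text and never contacts New Relic.
_ATTRIBUTE = re.compile(r"^[A-Za-z_][A-Za-z0-9_.]*$")
_DURATION = re.compile(r"^\d+ (minute|hour|day|week|month)s?$")


def incident_count_query(facet="conditionName", since="1 month", bucket="24 hours"):
    """Build the incident-count query with a chosen facet, time frame, and bucket size."""
    if not _ATTRIBUTE.match(facet):
        raise ValueError(f"unexpected facet attribute: {facet!r}")
    for value in (since, bucket):
        if not _DURATION.match(value):
            raise ValueError(f"unexpected duration: {value!r}")
    return (
        f"SELECT count(*) FROM NrAiIncident FACET {facet} "
        f"SINCE {since} ago TIMESERIES {bucket}"
    )


# The defaults reproduce the basic query; the second call narrows the window
# to one week and the granularity to one hour, faceted by condition ID.
print(incident_count_query())
print(incident_count_query(facet="conditionId", since="1 week", bucket="1 hour"))
```

The resulting strings can be pasted into the query builder as-is.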
Similar to the query posted just above, you can find any “hot spots” among your entities by faceting on entity.guid and then looking up that GUID to see which entity (or entities) is opening a lot of incidents:
SELECT count(*) FROM NrAiIncident FACET entity.guid SINCE 1 month ago TIMESERIES 24 hours
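Once you have the results, ranking the GUIDs by total count makes the hot spots easy to spot. Here is a minimal sketch, assuming you have exported the results as a list of rows with an `entity.guid` key and a `count` key (one row per entity per time bucket). That row shape, the sample data, and the `top_entities` name are all assumptions for illustration, not a New Relic export format.

```python
from collections import Counter


def top_entities(rows, limit=5):
    """Sum incident counts per entity GUID and return the busiest ones first."""
    totals = Counter()
    for row in rows:
        totals[row["entity.guid"]] += row["count"]
    return totals.most_common(limit)


# Made-up sample rows, one per (entity, day) bucket.
sample = [
    {"entity.guid": "guid-a", "count": 3},
    {"entity.guid": "guid-b", "count": 12},
    {"entity.guid": "guid-a", "count": 4},
    {"entity.guid": "guid-c", "count": 1},
]

print(top_entities(sample, limit=2))  # [('guid-b', 12), ('guid-a', 7)]
```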
Have you ever wondered why your alert condition is failing to open incidents? One of the first things you can do when this happens is to check NrAiSignal to see exactly what is being evaluated. You’d do that by using this query:
SELECT aggregatedDataPointsCount, signalValue FROM NrAiSignal WHERE event = 'value' AND conditionId = '123456'
If you replace 123456 with the ID of the condition you’re interested in, this will show you how many data points are getting aggregated for each aggregation window, and the value that is being evaluated by the system. If you do not see any results for this, it indicates that your condition is failing to aggregate any data and that you may need to change your aggregation method or delay.
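To make that check repeatable, you could run the returned rows through a short summary function. This is a rough sketch, assuming the rows have been exported as dictionaries keyed by the two selected attributes, `aggregatedDataPointsCount` and `signalValue`. The export shape, the sample values, and the `summarize_signal` name are hypothetical.

```python
def summarize_signal(rows):
    """Describe what the condition evaluated, or flag that nothing was aggregated."""
    if not rows:
        # No rows means no aggregation window produced data for this condition.
        return (
            "No aggregation windows found: the condition is not aggregating "
            "any data, so its aggregation method or delay may need to change."
        )
    counts = [row["aggregatedDataPointsCount"] for row in rows]
    values = [row["signalValue"] for row in rows]
    return (
        f"{len(rows)} windows evaluated; data points per window ranged from "
        f"{min(counts)} to {max(counts)}; signal values ranged from "
        f"{min(values)} to {max(values)}."
    )


# Made-up sample rows standing in for query results.
sample_rows = [
    {"aggregatedDataPointsCount": 4, "signalValue": 0.5},
    {"aggregatedDataPointsCount": 6, "signalValue": 2.0},
]

print(summarize_signal(sample_rows))
print(summarize_signal([]))
```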
Take a look at the Alerts & AI → Analyze → Overview page in the New Relic UI. All of the charts on that page were made using queries to NrAiIncident. You can view the query for any of those widgets by clicking on the ellipsis (...) and selecting
I encourage you to start using these events right now. We have a bit more work to do before they are ready to release fully (non-beta), but they can already help to meet some of your use cases!
If you come up with more use-cases that can be met by querying these events, please post your ideas below.