Relic Solution: Extending the functionality of NRQL alert conditions beyond a single minute

alerts
nrql
levelup
nrql-alerting

#1

Let’s peek under the hood for a moment

Let’s talk for a moment about how New Relic Alerts evaluates your data. Once per minute, every alert condition looks at the data stream it’s given and evaluates it numerically against the condition’s threshold. One minute later, the data is again evaluated. Each minute is evaluated discretely (on a pass/fail scale), without regard to any data before or after that single minute.

If your threshold has a time window longer than 1 minute, the evaluation system builds a model of the data (e.g. “for at least 15 minutes” keeps track of each minute’s pass/fail result over a rolling 15 minutes). Once the alerts evaluation system sees enough fail results in a row, it opens a violation.

Keep in mind that each minute is evaluated discretely. The evaluation system does not look at any of the minutes in the past, other than to develop the pass/fail model over a rolling time window.
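
The per-minute model described above can be sketched in a few lines of Python. This is only an illustration of the idea, not New Relic's actual implementation: the function name, the "above threshold" comparison, and the sample values are all assumptions, and the way a violation closes is simplified.

```python
from collections import deque


def evaluate_minutes(values, threshold, window_minutes):
    """Return the 0-based minute indexes at which a violation would open.

    Each value is judged on its own (it fails if it is above the threshold).
    Only the pass/fail results are remembered, over a rolling window.
    """
    recent = deque(maxlen=window_minutes)  # rolling pass/fail history
    in_violation = False
    opened_at = []
    for minute, value in enumerate(values):
        failed = value > threshold  # discrete check for this minute only
        recent.append(failed)
        window_full = len(recent) == window_minutes
        if window_full and all(recent):
            if not in_violation:
                opened_at.append(minute)
                in_violation = True
        else:
            # Simplification: any passing minute ends the violation.
            in_violation = False
    return opened_at


if __name__ == "__main__":
    # Hypothetical data for "above 5 for at least 3 minutes".
    samples = [1, 6, 7, 2, 6, 6, 6, 6, 1]
    print(evaluate_minutes(samples, threshold=5, window_minutes=3))
```

Notice that the raw values are never compared with each other; the only memory is the list of pass/fail results.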

One minute at a time, got it

If you think about this for a moment, you might see how NRQL queries using percentile or stddev are a lot less useful than they seem, when used in an alert condition. After all, if you calculate the standard deviation over an hour (or 24 hours), that can be meaningful. But stddev(duration), or percentile(duration,95) calculated over only 60 seconds is less meaningful.

Whoa! So … how do I set up an alert condition to monitor standard deviation over the past 24 hours?

Since the alert evaluation system only looks at a single, discrete minute at a time, but NRQL queries in Insights are much more flexible, you just need to figure out a way to wrangle a standard Insights NRQL query (which can perform functions over longer periods than a single minute) into an alert condition. Here is one way you can accomplish that.

  1. Set up a cron job to run a script once per minute (since the alerts evaluation system expects to see a data point every minute). Alternatively, you can use a Synthetics Scripted API Test Monitor to run a script for you once each minute.
  2. In the script, use the Insights Query API to run the exact NRQL query you want. As an example, SELECT stddev(someAttribute) FROM SomeEventType SINCE 24 hours ago.
  3. The script should then parse the JSON that is returned and extract the value or values that are important.
  4. Next, the script should re-package the important values as a JSON object.
  5. Finally, the script would use the Insights Event API to insert the JSON as a custom event.
  6. Once the cron job and script are up and running, set up a NRQL alert condition to monitor the attribute in the custom event that is of interest.
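
Steps 3 and 4 are the part that is easiest to get wrong, so here is a minimal Python sketch of just the parse and re-package logic, with the HTTP calls left out. The response shape (`{"results": [{...}]}`), the key name `stddev`, and the event and attribute names are assumptions for illustration; check them against the JSON your own query returns.

```python
import json


def extract_value(response_body, key):
    """Step 3: pull one numeric value out of a Query API style response.

    Assumes the shape {"results": [{key: number}]}, which may not match
    what your query returns.
    """
    payload = json.loads(response_body)
    return payload["results"][0][key]


def build_event(event_type, attribute, value):
    """Step 4: re-package the value as a flat custom event."""
    return {"eventType": event_type, attribute: value}


if __name__ == "__main__":
    # Hypothetical response body; the key name "stddev" is an assumption.
    sample = '{"results": [{"stddev": 0.42}]}'
    value = extract_value(sample, "stddev")
    event = build_event("LongWindowStat", "stddev24h", value)
    print(json.dumps(event))
```

With names like these, the NRQL alert condition in step 6 might query something along the lines of SELECT latest(stddev24h) FROM LongWindowStat.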

With this method, you get exactly the value you are looking for inserted as a custom event into Insights once per minute, which allows the alerts evaluation system to evaluate, for example, a 24-hour calculation of standard deviation, or the 95th percentile over the past 12 hours, or the count of events over the past 30 minutes – anything you can write a standard NRQL query in Insights for, you can now set up an alert condition to monitor!

I hope this helps you better understand how the alerts evaluation system works, and gives you a way to expand its functionality. Let us know if you come up with other ways to do this!


#2

This is awesome! I’ve actually been working on a script to this effect for a while. I polished it up and posted it below in the hope that it can help others who go the scripting route. My script uses Python 3 and the Requests library.

I made mine for a use case where someone might want to alert on the rate of change of an average (i.e. alert whenever the rate of change is greater than X amount). Here’s the query I used as an example:

SELECT average(duration)/1000, stddev(duration)/1000 from SyntheticCheck since 8 minutes ago until 5 minutes ago COMPARE WITH 13 minutes ago where monitorName = 'Test Monitor'

My SINCE clause is a bit complicated because I had to build in my own evaluation offset to ensure latency wouldn’t prevent me from getting complete results. Using a longer time period (like 24 hours) would probably render that moot.

Script:

#!/usr/bin/env python3
import requests
import urllib.parse

#This is the file where all your account/query/API key info is.
import request_config

#This bit builds our Query API call.
account_id = request_config.stuff['account_id']
nrql_query = urllib.parse.quote_plus(request_config.stuff['query'])

q_headers = {'X-Query-Key': request_config.stuff['query_key'], 'Content-Type': 'application/json'}

def build_url():
	q_url = 'https://insights-api.newrelic.com/v1/accounts/{}/query?nrql={}'.format(account_id, nrql_query)
	return q_url

#This is where we make the Query API call.
r = requests.get(build_url(), headers=q_headers)
#Stop here if the Query API returned an error status.
r.raise_for_status()

#This is where we parse the content of the call.
results = r.json()

#This is where we do math.
avg = results['current']['results'][0]
prev_avg = results['previous']['results'][0]

diff = avg['result'] - prev_avg['result']

#This bit builds our Insert API call.
i_url = 'https://insights-collector.newrelic.com/v1/accounts/{}/events'.format(account_id)
events = {
	"eventType": "MyCustomEvent",
	"latestAverage": avg['result'],
	"delta": diff
}
headers = {
	"X-Insert-Key": request_config.stuff['insert_key']
}

#This is where we make the Insert API call.
r = requests.post(i_url, headers = headers, json = events)

print(r.status_code)

And here is the request_config file where you put in your account, API, and query information:

stuff = {"account_id" : [Your_Account_ID], 
		"query" : "SELECT function() FROM EventType WHERE attribute = 'value' SINCE 24 hours ago",
		"query_key" : "[Your_Query_API_Key]",
		"insert_key" : "[Your_Insert_API_Key]"}

Fun fact: you may have noticed my query uses average and standard deviation functions. My next goal was to tweak the math so that I also calculate whether the current average is within one deviation of the previous average.
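
The check described above could look roughly like the sketch below. It is not part of the posted script: the function name is made up, the numbers are placeholders, and using the previous window's standard deviation (rather than the current one) is an assumption about what "within one deviation" should mean.

```python
def within_one_deviation(current_avg, previous_avg, previous_stddev):
    """True if current_avg lies within one standard deviation of previous_avg."""
    return abs(current_avg - previous_avg) <= previous_stddev


if __name__ == "__main__":
    # Hypothetical numbers standing in for the parsed query results.
    print(within_one_deviation(1.30, 1.10, 0.25))  # difference of about 0.20
    print(within_one_deviation(1.60, 1.10, 0.25))  # difference of about 0.50
```

The boolean could be added to the custom event as one more attribute, so that an alert condition can key off it.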

Hope this helps! I’d love to see other use cases people have scripted.


#3

Hi,
It didn’t work for me. I want to set up an alert when a backup file is NOT created within 24 hours on Windows, using the nri-flex integration. I set up the task as you mentioned, running every 5 minutes, and my PowerShell script is below:

$directory | Get-ChildItem | Where-Object { $_.PsIsContainer -eq $false -and $_.Name -like $filePattern } | Measure-Object | Select-Object -Property @{ expression = { $directory }; name = "directoryName" },@{ expression = { $filePattern }; name = "filePattern" },@{ expression = { $_.Count }; name = "fileCount" } | ConvertTo-Json

as well as a query:
PS C:\Program Files\New Relic\newrelic-infra\custom-integrations> Invoke-WebRequest -Uri https://insights-api.newrelic.com/v1/accounts/2240461/query?nrql=SELECT%20*%20FROM%20fileTestlookup%20SINCE%2010%20minutes%20AGO -Headers @{"X-Query-Key"="xxxxxxxxxxxxxxx"} -ContentType "application/json" -Method GET

but nothing is happening. Can you give me a push in the right direction?

Thank you in advance


#4

I got it working for a 3-minute period. Now testing for 24 hours. :crossed_fingers:


#5

Hey @ababichenko, Let us know how you get on with it :slight_smile:


#6

Hey, yes, it’s working. It wasn’t that bad once I learned more about it. Thank you all :slight_smile:


#7

Fantastic! Thanks for confirming @ababichenko