Relic Solution: Alerting on COMPARE WITH queries using Synthetics API Test Monitors

Introduction:

I often see customers trying to compare a specific moment in time to its counterpart an hour, day, or week earlier. This helps to show whether whatever we are querying is behaving ‘normally’. Looking at a timeseries chart is helpful, and seeing spikes can be helpful too, but those spikes can often be worrying even when they are normal (I’ll let you decide if normal spikes are good or bad). Comparing a like-for-like time of day can help to ease those worries: yes, that did spike, but that seems about right for this app at that time.

We can do that by using a query like:

SELECT count(*) FROM PageView SINCE 1 HOUR AGO COMPARE WITH 1 WEEK AGO

This shows how many page views we have had in the past hour compared to the same one-hour period 1 week ago.

The problem we run into is that COMPARE WITH queries are not supported in NRQL Alerting. So if there is a large drop in the query results, we can’t get notified.

Baseline Alerts can do this to a degree with a measure of deviation from the norm, but if you want to trigger an incident off of a specific percentage difference in a queryable value, you may be stuck.

Prerequisites:

Below are some requirements for this to work for you:

  • You have a Query API key
  • The comparative data you are looking for is queryable in NR1
  • You have a Synthetics subscription sufficient to use an API test monitor

How to:

A fairly simple Synthetics API test is all it takes.

Important caveat: I do not consider myself an elegant coder. This is a script that works, but I’m sure there are many better ways to handle this use case.

To start, we make a call to the Query API. Our docs site has a great Query API example, so I started with that. In the callback function you can see that the example sends the response body to the console log. Let’s take a look at that; the structure we see is:

{
   "current":{
      "results":[
         {
            "count":265
         }
      ],
      "beginTimeSeconds":0,
      "endTimeSeconds":0
   },
   "previous":{
      "results":[
         {
            "count":284
         }
      ],
      "beginTimeSeconds":0,
      "endTimeSeconds":0
   }
}

Obviously there is more after that, but the important parts are here. Our script needs to parse through to get current -> results -> count and previous -> results -> count.

This is easy enough to achieve by running:

currentValue = jsonBody.current.results[0].count; 
previousValue = jsonBody.previous.results[0].count;

note: ‘count’ above may not be the right method for you, read on for the config notes at the bottom of this post.

Once we get those bits of info, I pass them into an analyse() function that simply checks whether the current value is an increase or decrease over the previous value. The assert Node module then fails the script if it is a decrease of more than a set threshold.

It’s really that easy. There are certainly flaws with this - any Synthetics locational issues may lead to this failing and sending an alert falsely. But there is just as much chance of a false positive/negative issue with the alerts platform itself. We work hard on ensuring platform stability, but this is the world of software, things can occasionally go wrong. If the monitor does fail erroneously, check https://status.newrelic.com to see if there is anything happening on our side to explain that :slight_smile:

Config Notes:

There are a few variables in the script that you will need to edit.

  • var myAccountID = 'put your account ID here';

  • var myQueryKey = 'put your query API key here';

  • var encodedQuery = encodeURIComponent('SELECT count(*) FROM PageView SINCE 10 MINUTES AGO COMPARE WITH 1 WEEK AGO');

    • Put your own query in here.
  • var decreaseThreshold = -20, increaseThreshold = 20;

    • These are the thresholds at which the monitor will fail. By default the script won’t fail on an increase, but you can uncomment those lines in the analyse function to enable that
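For reference, here is a rough sketch of how the first three variables might be combined into the request. buildQueryOptions is a hypothetical helper, not something from the gist, and the endpoint and X-Query-Key header reflect my understanding of the Insights Query API, so please check them against the docs example before relying on this.

```javascript
var myAccountID = 'put your account ID here';
var myQueryKey = 'put your query API key here';
var encodedQuery = encodeURIComponent('SELECT count(*) FROM PageView SINCE 10 MINUTES AGO COMPARE WITH 1 WEEK AGO');

// Hypothetical helper: builds the options object for the Query API call.
// Assumption: Insights Query API URL format and X-Query-Key header.
function buildQueryOptions(accountID, queryKey, query) {
  return {
    uri: 'https://insights-api.newrelic.com/v1/accounts/' + accountID + '/query?nrql=' + query,
    headers: {
      'X-Query-Key': queryKey,
      'Accept': 'application/json'
    }
  };
}

var options = buildQueryOptions(myAccountID, myQueryKey, encodedQuery);
console.log(options.uri);

// Inside a Synthetics API test, the options would then be passed to the
// request client, for example: $http.get(options, callback);
```

Keeping the query as a readable string and encoding it with encodeURIComponent means you can paste in a query straight from the query builder without hand-escaping spaces.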

Earlier I mentioned that count may not be the right method for you in the section of the script that parses the current and previous values. This attribute is named after the query function. In my test case the query is SELECT count(*) FROM... - so in this case the attribute is named count. If the query is SELECT average(duration) FROM ... OR SELECT max(duration) FROM... then your attribute name would match that query function. You would parse that like so:

currentValue = jsonBody.current.results[0].average;
previousValue = jsonBody.previous.results[0].average;

or

currentValue = jsonBody.current.results[0].max;
previousValue = jsonBody.previous.results[0].max;
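If you expect to swap queries often, one option is to pass the attribute name in rather than hard-coding it. The sketch below assumes the response shape shown earlier in this post; extractValues and sampleBody are hypothetical names used only for illustration.

```javascript
// Hypothetical helper: pulls the current and previous values for a given
// attribute name ('count', 'average', 'max', ...) out of the parsed response.
function extractValues(jsonBody, attributeName) {
  var current = jsonBody.current.results[0][attributeName];
  var previous = jsonBody.previous.results[0][attributeName];
  if (typeof current !== 'number' || typeof previous !== 'number') {
    // Most likely the attribute name does not match the query function.
    throw new Error('Attribute "' + attributeName + '" not found in query results');
  }
  return { currentValue: current, previousValue: previous };
}

// Trimmed-down copy of the sample response from earlier in the post.
var sampleBody = {
  current: { results: [{ count: 265 }] },
  previous: { results: [{ count: 284 }] }
};

var values = extractValues(sampleBody, 'count');
console.log(values.currentValue, values.previousValue);
```

Throwing when the attribute is missing makes a mismatched name show up as a clear script error instead of a confusing comparison against undefined.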

Alerting on this data:

Right now the script just uses assert to fail with the message “There has been a significant increase” or “There has been a significant decrease”, so alerting is handled only via a standard Synthetics failure alert condition that you need to configure against this monitor. You could, alternatively, write a NRQL query looking at, for example:

SELECT count(*) FROM SyntheticCheck WHERE result = 'FAILED' AND error like '%there has been a significant%'

This lets you alert on the number of times this failure has occurred. That is useful when the drop or increase is only brief: requiring the monitor to fail more than 4 times would help confirm the problem was real, not just a blip.

Where can I get the script?

Right here: https://gist.github.com/ryanv94/6f0785fcbd4a3e0b1eed4e77ace7517c

Go ahead and copy this into an API test monitor in your own account, change the variables described in the config notes, and set up an alert condition on it.

Feel free to add comments below with improvements to this script, or any other suggestions you may have :smiley:

Hi @RyanVeitch - A useful solution to a problem that I had been considering. One tip that I learned recently is to use encodeURIComponent to URL-encode any strings. It makes the string easier to read in the script :wink:

Another thing I too often forget about :smiley: I just updated my script to use encodeURIComponent! Thanks for the tip, Stefan :+1: :star:
