Relic Solution: How to Use the Infrastructure Alerts REST API to Its Maximum Potential - Part 1: Exclusion Filtering

When you build Infrastructure Alerts conditions, you’ll have criteria that the conditions need to satisfy to be useful to you. The UI, however, has certain limitations that can keep you from satisfying those criteria. If you run into these limitations, you can use the Infrastructure Alerts REST API to get around them.

In a series of posts I will go over each workaround individually. Even more exciting is that these workarounds can be combined in any way you think is necessary, which is why the posts sometimes repeat the same information. Before starting, please have a look at the Infrastructure Alerts REST API documentation to make sure you have a general familiarity with the technology I’ll be talking about.

Our first example pertains to exclusion filtering. Exclusion filtering is similar to a SQL NOT IN, NOT LIKE, or != clause. In the Infrastructure UI you are able to create filter sets defined by certain attributes, and you can then create alert conditions based on them.

However, the Infrastructure Alerts UI cannot exclude a subset of hosts from a filter set unless you create a new attribute for the hosts you don’t want to select. Say a small subset of a large and useful filter set consists of hosts with ‘DB’ somewhere in their hostnames. If all you want to do is select the whole filter set and exclude that handful of DB hosts, you can write a query which uses AND hostname NOT LIKE '%DB%' and then put this query into a POST API call. Here is an example query you can keep in mind:

SELECT (memoryUsedBytes/memoryTotalBytes)*100 FROM SystemSample WHERE `environment` = 'PROD' AND `hostname` NOT LIKE '%DB%'

Here is what the same query might look like in a POST API call. I will break it up to label its anatomy and then show you the completed call. Here’s where curl is invoked and the alert condition is given a general classification:

curl -X POST 'https://infra-api.newrelic.com/v2/alerts/conditions' \
     -H 'X-Api-Key:{admin_api_key}' -i \
     -H 'Content-Type: application/json' \
     -d \
'{
   "data":{
      "type":"infra_metric",
      "name":"Non-DB Memory Usage Percent",
      "enabled":true,
      "policy_id":{policy_id},

Memory usage percent is calculated in "select_value" instead of being hard-coded. When we set "event_type" to "SystemSample", the query automatically FACETs by hostname because hostname is the domain of SystemSample:

      "select_value":"(memoryUsedBytes/memoryTotalBytes)*100",
      "event_type":"SystemSample",

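Here’s where you can use exclusion filtering with NOT LIKE. The tricky part here is that the whole JSON body sits inside single quotes on the shell command line, so each and every single quote inside it must be written as '\'' (close the string, add an escaped quote, reopen the string). Backticks do not have to be escaped: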

      "where_clause":"(`environment` = '\''PROD'\'' AND `hostname` NOT LIKE '\''%DB%'\'')",

Next, the threshold will be violated if a value goes above what we have set. A "time_function" of "all" represents a threshold time function of "for at least", whereas a "time_function" of "any" represents a threshold time function of "at least once in". The critical threshold below is the equivalent of "above 95 for at least 10 minutes", while the warning threshold is the equivalent of "above 90 for at least 20 minutes":

      "comparison":"above",
      "critical_threshold":{
         "value":95,
         "duration_minutes":10,
         "time_function":"all"
      },
      "warning_threshold":{
         "value":90,
         "duration_minutes":20,
         "time_function":"all"
      }
   }
}'
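
As a rough illustration of the difference between the two time functions, here is a simplified shell model. It is only a sketch under stated assumptions: the `violates` function is a made-up helper, the values are whole-number percentages with one reading per minute, and it is not how New Relic actually evaluates conditions.

```shell
#!/bin/sh
# Simplified model of the two threshold time functions (illustrative only;
# not New Relic's evaluation logic). Each value is one whole-number reading.

# violates TIME_FUNCTION THRESHOLD VALUE...
# Succeeds (exit 0) when the window would violate an "above" threshold.
violates() {
  time_function=$1
  threshold=$2
  shift 2
  total=0
  above=0
  for value in "$@"; do
    total=$((total + 1))
    if [ "$value" -gt "$threshold" ]; then
      above=$((above + 1))
    fi
  done
  case $time_function in
    all) [ "$total" -gt 0 ] && [ "$above" -eq "$total" ] ;;  # "for at least"
    any) [ "$above" -gt 0 ] ;;                               # "at least once in"
    *)   return 2 ;;                                         # unknown time function
  esac
}

# One dip below 95 clears an "all" threshold but still trips an "any" threshold.
if violates all 95 96 97 94 99; then echo "all: violated"; else echo "all: ok"; fi
if violates any 95 96 97 94 99; then echo "any: violated"; else echo "any: ok"; fi
```

With the sample readings above, the "all" check is not violated because of the single reading of 94, while the "any" check is.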

Here is the completed API call:

curl -X POST 'https://infra-api.newrelic.com/v2/alerts/conditions' \
     -H 'X-Api-Key:{admin_api_key}' -i \
     -H 'Content-Type: application/json' \
     -d \
'{
   "data":{
      "type":"infra_metric",
      "name":"Non-DB Memory Usage Percent",
      "enabled":true,
      "policy_id":{policy_id},
      "event_type":"SystemSample",
      "select_value":"(memoryUsedBytes/memoryTotalBytes)*100",
      "where_clause":"(`environment` = '\''PROD'\'' AND `hostname` NOT LIKE '\''%DB%'\'')",
      "comparison":"above",
      "critical_threshold":{
         "value":95,
         "duration_minutes":10,
         "time_function":"all"
      },
      "warning_threshold":{
         "value":90,
         "duration_minutes":20,
         "time_function":"all"
      }
   }
}'
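
If you want to check what the shell makes of the '\'' sequences in the where_clause, you can try the fragment on its own. This is just a small sketch; the `where_clause` variable name is chosen for illustration.

```shell
#!/bin/sh
# Inside a single-quoted shell string, the sequence '\'' closes the string,
# adds one literal single quote, and reopens the string.
where_clause='(`environment` = '\''PROD'\'' AND `hostname` NOT LIKE '\''%DB%'\'')'

# Print the fragment as the shell passes it on, with plain single quotes.
printf '%s\n' "$where_clause"
```

The printed text contains ordinary single quotes around PROD and %DB%, which is what ends up in the JSON body.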

Your condition "name", "policy_id", "select_value", "where_clause", and threshold values will probably all be different from what you see here.
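
If the quote escaping is hard to keep straight, one possible alternative is to write the JSON body to a file with a quoted heredoc and pass it to curl with `-d @file`. The sketch below assumes a few things that are not part of the original call: the `POLICY_ID` value, the `__POLICY_ID__` marker, and the `SEND` and `ADMIN_API_KEY` environment variables are all hypothetical names used for illustration.

```shell
#!/bin/sh
# Hypothetical placeholder value; substitute your own policy ID.
POLICY_ID=123456

payload_file=$(mktemp)

# The quoted delimiter ('EOF') makes the heredoc fully literal, so the single
# quotes and backticks in the where_clause need no shell escaping.
cat > "$payload_file" <<'EOF'
{
   "data":{
      "type":"infra_metric",
      "name":"Non-DB Memory Usage Percent",
      "enabled":true,
      "policy_id":__POLICY_ID__,
      "event_type":"SystemSample",
      "select_value":"(memoryUsedBytes/memoryTotalBytes)*100",
      "where_clause":"(`environment` = 'PROD' AND `hostname` NOT LIKE '%DB%')",
      "comparison":"above",
      "critical_threshold":{
         "value":95,
         "duration_minutes":10,
         "time_function":"all"
      },
      "warning_threshold":{
         "value":90,
         "duration_minutes":20,
         "time_function":"all"
      }
   }
}
EOF

# Fill in the policy ID (the __POLICY_ID__ token is a made-up marker).
payload=$(sed "s/__POLICY_ID__/$POLICY_ID/" "$payload_file")
printf '%s\n' "$payload" > "$payload_file"

# Dry run by default: only print the body. Set SEND=yes and ADMIN_API_KEY
# (both hypothetical variable names) to actually post the condition.
if [ "${SEND:-no}" = "yes" ]; then
  curl -X POST 'https://infra-api.newrelic.com/v2/alerts/conditions' \
       -H "X-Api-Key:$ADMIN_API_KEY" -i \
       -H 'Content-Type: application/json' \
       -d @"$payload_file"
else
  cat "$payload_file"
fi

rm -f "$payload_file"
```

Running it without `SEND=yes` only prints the body, which makes it easy to confirm the where_clause reads the way you expect before creating a real condition.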

Here are the other workaround posts which are coming up:

Part 2: Compound Alert Conditions
Part 3: FACET more than 500 hosts
Part 4: Cloud Integration Metrics & Evaluation Offset

You do realize that you’re proposing literal workarounds as a solution to problems in your own product? This is quite ridiculous. Just fix the problem! Don’t make paying customers jump through hoops.

Hey @rdeknijf - You’re right! I absolutely agree with you, this is a workaround to a limitation. We do have Feature Ideas logged for that, but of course our engineering teams need to prioritise the features they implement. I can’t say that exclusion filtering is a feature that is on a roadmap for the near term, but please know that our product team is aware of that limitation.

Gene’s goal with this post is to help with achieving exclusion filtering while we wait for it to be built into the product. I’ll definitely get a +1 added to the feature request on your behalf though.

Feel free to DM me if you’d like to chat further about either this, or more generally about our feature idea process.

-Ryan.

I fell in the pothole, but then again I am a terrible reader. You have to remove the “filter” components from the “Get” results and then add the “where_clause”. Great article, but it’s painful to follow. Just a suggestion: I prefer a more linear approach. Do ‘A’, do ‘B’, do ‘C’. But I understand everyone is different.

Thanks for the feedback @reopelle.scott

A deeper explanation of what can be done in select and the where clause is vital.

I want to count running processes in the select. Can I use count(processDisplayName)?

I want to use an IN in the where clause to select a tag on my EC2 instance. How do I format the array of tags?

I’m going to create a support case for this but it seems like documentation explaining what is or isn’t supported in the select and where clause would be really helpful.

The count() function would only give you a count of the rows containing a value for processDisplayName, and not how many unique running processes there are. I’d consider using a narrow window (with the “SINCE” and “UNTIL” clauses) and use the uniqueCount() function instead. That will return a number that shows you how many different entries there are in a particular attribute, over the time range you specify.

Hope that helps!
~ @jlangdon

Hi @atittle

From what you’re describing, it sounds like you might want to use a Process Running alert condition. I’m not positive this would meet your needs, but your description made me think it might fit, so I wanted to throw it out there.

The best way to format your where_clause is to build it in Insights first. That way you can not only make sure your syntax is correct, but also test that you’re only picking up the entities you are concerned with.

I hope this helps!

I found the Process Running alert helpful!

It’s also nice to know the where_clause field accepts NRQL WHERE clause syntax. My problem was trying to use the wrong condition!

Thanks!
