Relic Solution: Getting currently open incidents with NRQL

Customers often ask how to use NRQL to get a list of incidents (formerly violations) that are currently open. Unfortunately, this is not straightforward: New Relic records incidents as NrAiIncident events, but because NRDB is a “write-once” data store, it is not possible to update the state of an incident from open to closed. Instead, you end up with separate open and close events for each incident; I have not figured out how to write an NRQL query that returns all incidents for which there is an open event without a corresponding close event.

The New Relic REST API, however, has an endpoint that returns a list of violations; that endpoint accepts an only_open parameter to return violations that are currently open. Presumably customers want to use NRQL rather than the API to get this data so they can display it on a dashboard.

The following Synthetics API script calls the REST API to get a list of open violations, then sends each violation to NRDB as a custom event (OpenViolation). If you set the script to run once per hour, you can use NRQL to query the data for the past hour (for example, SELECT * FROM OpenViolation SINCE 1 hour ago) to get a list of currently open violations.

// Replace the placeholder text inside the quotes with your own values
const ACCOUNT_ID = 'YOUR_ACCOUNT_ID';
const REST_API_KEY = 'YOUR_REST_API_KEY';
const INGEST_KEY = 'YOUR_INGEST_LICENSE_KEY';

var options = {
  url: 'https://api.newrelic.com/v2/alerts_violations.json?only_open=true',
  headers: {
    'Api-Key': REST_API_KEY,
    'Content-Type': 'application/json'
  }
};

// Get open violations
$http.get(options, callback);

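function callback(error, response, body) {
  // Fail the monitor if the REST API call did not succeed
  if (error) throw error;
  if (response.statusCode !== 200) {
    throw new Error('REST API returned status ' + response.statusCode);
  }

  // The body may arrive as a raw JSON string; parse it if so
  var data = (typeof body === 'string') ? JSON.parse(body) : body;

  var postOptions = {
    url: 'https://insights-collector.newrelic.com/v1/accounts/' + ACCOUNT_ID + '/events',
    headers: {
      'Api-Key': INGEST_KEY,
      'Content-Type': 'application/json'
    }
  };

  // Convert each violation to a custom NRDB event
  var events = data.violations.map(function(violation) {
    return {
      eventType: 'OpenViolation',
      label: violation.label,
      policyName: violation.policy_name,
      conditionName: violation.condition_name,
      priority: violation.priority,
      duration: violation.duration,
      entityProduct: violation.entity.product,
      entityType: violation.entity.type,
      entityName: violation.entity.name
    };
  });

  // Nothing to send if there are no open violations
  if (events.length === 0) return;

  // Send all events in a single POST and fail the monitor on error
  postOptions.body = JSON.stringify(events);
  $http.post(postOptions, function(error, response, body) {
    if (error) throw error;
    if (response.statusCode !== 200) {
      throw new Error('Event API returned status ' + response.statusCode);
    }
  });
}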

To use the script, replace the placeholders for ACCOUNT_ID, REST_API_KEY, and INGEST_KEY with the values for your account.

Note that if you have a large number of open violations, the API results will be paginated; I will leave it to you to modify the script to support pagination.
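As a starting point for pagination, here is a rough sketch rather than a tested implementation. It assumes the REST API accepts a page query parameter and that an empty page means there are no more results. getAllViolations and getPage are names made up for this example; getPage stands in for whatever wraps your $http.get call.

```javascript
// Sketch: collect violations from every page, then pass the full list to `done`.
// `getPage(page, cb)` is a hypothetical helper; cb receives (error, violations).
function getAllViolations(getPage, done) {
  var all = [];

  function next(page) {
    getPage(page, function(error, violations) {
      if (error) return done(error);
      // Assumption: an empty (or missing) page means there are no more results
      if (!violations || violations.length === 0) return done(null, all);
      all = all.concat(violations);
      next(page + 1);
    });
  }

  next(1);
}

// In a Synthetics script, getPage might wrap $http.get like this (untested):
// function getPage(page, cb) {
//   var pageOptions = {
//     url: options.url + '&page=' + page,
//     headers: options.headers,
//     json: true
//   };
//   $http.get(pageOptions, function(error, response, body) {
//     cb(error, body && body.violations);
//   });
// }
```

You would then build and POST the events from the combined list once getAllViolations calls done.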

Also, the script puts all the OpenViolation events in an array and POSTs them with a single API call. If you have a large number of open violations, you may exceed the maximum payload size permitted by the event API; I will also leave it to you to deal with that issue.
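One possible way to deal with the payload size is to split the array into batches and POST each batch separately. The sketch below is a minimal illustration, not a definitive implementation: chunkEvents and MAX_BYTES are names invented here, and the 1 MB figure is an assumption you should check against the Event API documentation.

```javascript
// Assumption: keep each POST body under roughly 1 MB; verify the real limit in the Event API docs
var MAX_BYTES = 1000000;

// Split events into batches whose JSON encoding stays within maxBytes.
// An event that is larger than maxBytes on its own still gets a batch to itself.
function chunkEvents(events, maxBytes) {
  var batches = [];
  var current = [];
  var currentBytes = 2; // the enclosing "[" and "]"

  events.forEach(function(event) {
    // +1 for a separating comma (slightly overestimates, which is the safe direction)
    var size = Buffer.byteLength(JSON.stringify(event), 'utf8') + 1;
    // Start a new batch if adding this event would exceed the limit
    if (current.length > 0 && currentBytes + size > maxBytes) {
      batches.push(current);
      current = [];
      currentBytes = 2;
    }
    current.push(event);
    currentBytes += size;
  });

  if (current.length > 0) batches.push(current);
  return batches;
}
```

Inside the callback, you would loop over chunkEvents(events, MAX_BYTES) and POST JSON.stringify(batch) for each batch instead of posting the whole array at once.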


Thanks, @philweber, this is a comprehensive guide to better understand a crucial metric.

Given that half the function of a monitoring tool is to alert on the data it ingests (ingesting being the other half), do you think more can be done directly within New Relic’s user interface to surface this information?

To put it another way, say I’m an enterprise with multiple (sub-)accounts and the Product Support Manager wants a widget with a read-out of the current number of open alerts to quantify their team’s workload. At that point, does it seem reasonable to use a REST API script, which has to be filled in with multiple keys for every single account, to pull together that number?

Almost goes without saying, this guide remains useful either way. Just wish these workarounds weren’t needed in the first place.

Just wish these workarounds weren’t needed in the first place

Hi, @rishav.dhar: No argument here. I do not design the product; I have to take what they give us and make it work, just as you do.

There is a way to view open incidents in the UI, and an API to get the data programmatically. The workaround is only needed if you must get the data via NRQL.