Compound alert condition alerting

Hello,
Where I work, we have 800+ servers, and each server is its own New Relic entity. We do not group servers; instead, we post parameters to New Relic and monitor them per server. We have an alert policy with a dozen conditions. In addition to each server’s alert conditions, we’d like to create higher-level alerts based on certain combinations of conditions. For example, I’d like to alert on a combination like the following:

“condition1 is violated, condition5 is within a certain range, and condition7 has monitored values in a certain range.”

I would call them “super-conditions” that involve several regular alert conditions. Is this possible?
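
To make the request concrete, here is a minimal sketch in plain Python of the kind of evaluation I have in mind. It does not use any New Relic API. The `ConditionState` type, the `super_condition` function, and the numeric ranges are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ConditionState:
    violated: bool  # whether the regular alert condition is currently violated
    value: float    # latest monitored value behind that condition


def in_range(value: float, low: float, high: float) -> bool:
    """Inclusive range check."""
    return low <= value <= high


def super_condition(states: Dict[str, ConditionState]) -> bool:
    """Hypothetical compound rule for one server.

    True only when condition1 is violated AND the values behind
    condition5 and condition7 fall inside the (assumed) ranges below.
    """
    return (
        states["condition1"].violated
        and in_range(states["condition5"].value, 70.0, 90.0)  # assumed range
        and in_range(states["condition7"].value, 0.0, 5.0)    # assumed range
    )


# Example state for a single server (made-up values).
server_states = {
    "condition1": ConditionState(violated=True, value=99.0),
    "condition5": ConditionState(violated=False, value=80.0),
    "condition7": ConditionState(violated=False, value=2.5),
}

if super_condition(server_states):
    print("super-condition triggered for this server")
```

The point is that the combined rule looks at both the violated/not-violated state of some conditions and the raw monitored values of others, evaluated per server.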

Hi @greg.gillis

Currently there isn’t an easy way to do this. We call these “compound conditions,” although I do like the ring of “super-conditions” :slight_smile: This is a topic we discuss internally now and again, but I don’t have a date or commitment I can share at this time.

That said, we are hard at work on a totally revamped incident and issue lifecycle that you’ll hear more about soon. You’ll get access to a feature we call “Decisions,” which allows you to group incidents based on time, context, and connectivity. In addition, you can raise or lower severity when this event correlation occurs.

This, in combination with a new Workflow capability, will allow you to route your incidents using a flexible filtering approach, similar to a NRQL WHERE clause.

So, putting the two together, you could:

  1. Create a handful of conditions with warning thresholds
  2. Create a Workflow that routes warning Issues to low priority destinations like Slack
  3. Create a Decision that both correlates (groups) Issues from that handful of conditions when they occur close together in time and raises their severity to critical.
  4. Create a Workflow that routes critical issues to high priority destinations like PagerDuty.
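
As a rough illustration of how the four steps above fit together, here is a plain-Python simulation. It is not the Decisions or Workflow API. The `Issue` type, the condition names, the five-minute window, and the rule "two or more distinct conditions on the same entity" are all assumptions made for the sketch, and it only models correlation by time and entity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    condition: str      # name of the condition that opened the issue
    entity: str         # server (entity) the issue belongs to
    opened_at: float    # open time in seconds
    severity: str = "warning"


# Hypothetical settings standing in for a Decision's configuration.
CORRELATED_CONDITIONS = {"cpu_high", "memory_high", "disk_latency"}
WINDOW_SECONDS = 300
MIN_DISTINCT_CONDITIONS = 2


def apply_decision(issues: List[Issue]) -> List[Issue]:
    """Step 3: escalate issues that co-occur on one entity within the window."""
    candidates = [i for i in issues if i.condition in CORRELATED_CONDITIONS]
    for issue in candidates:
        # Distinct conditions open on the same entity close in time.
        nearby = {
            other.condition
            for other in candidates
            if other.entity == issue.entity
            and abs(other.opened_at - issue.opened_at) <= WINDOW_SECONDS
        }
        if len(nearby) >= MIN_DISTINCT_CONDITIONS:
            issue.severity = "critical"
    return issues


def route(issue: Issue) -> str:
    """Steps 2 and 4: pick a destination by severity."""
    return "pagerduty" if issue.severity == "critical" else "slack"


issues = apply_decision([
    Issue("cpu_high", "server-001", opened_at=0),
    Issue("memory_high", "server-001", opened_at=100),
    Issue("cpu_high", "server-002", opened_at=50),
])
for issue in issues:
    print(issue.entity, issue.condition, issue.severity, route(issue))
```

In this sketch the two issues on `server-001` are escalated and routed to the high-priority destination, while the lone issue on `server-002` stays a warning.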

I know this isn’t the same as compound conditions, but it is similar and might help with the problem you’re trying to solve. You can do this today if you have access to our Incident Intelligence product. Otherwise, you’ll need to wait a couple of months for an announcement that should open up access to this solution for you.

I think it would be very easy for your developers to add an extra field to the condition-create UI, called “suppress these conditions”. The field would list every existing condition in that policy, each with a checkbox. Then, when, say, Host-not-Reporting fires from the infrastructure agent, any open conditions on its suppression list would close themselves immediately.
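
Roughly, the behavior I am picturing looks like the sketch below. This is plain Python, not anything that exists in New Relic today. The `SUPPRESSES` mapping, the condition names, and the `on_condition_fired` function are hypothetical.

```python
from typing import Dict, Set

# Hypothetical per-policy setting: for each condition, the conditions
# that were checked in its "suppress these conditions" list.
SUPPRESSES: Dict[str, Set[str]] = {
    "host_not_reporting": {"cpu_high", "memory_high"},
}


def on_condition_fired(fired: str, open_conditions: Set[str]) -> Set[str]:
    """Return the open conditions after `fired` opens.

    Any open condition on the fired condition's suppression list is closed.
    """
    suppressed = SUPPRESSES.get(fired, set())
    return (open_conditions - suppressed) | {fired}


still_open = on_condition_fired("host_not_reporting", {"cpu_high", "disk_full"})
print(sorted(still_open))
```

Here `cpu_high` closes because it is on the suppression list for `host_not_reporting`, while `disk_full` stays open because it was not checked.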

Hi @greg.gillis,

Thank you for your feedback :slightly_smiling_face:

It appears our engineering team is already working on revamping the incident and issue lifecycle as @Fidelicatessen mentioned, so be on the lookout for updates in the near future!

If I can assist with anything else, please let me know.