
Alerts based on conditions where the threshold is triggered on no data

alerts
thresholds

#1

We are using a plugin for RabbitMQ. When RabbitMQ has problems, the plugin stops reporting data to New Relic. I have conditions that fire on thresholds when message rates are too low. The problem is that when RabbitMQ is unresponsive or frozen, the plugin provides no data to set a threshold on.

How can I set a condition that is triggered when there are no data events or metrics of a particular kind for X number of minutes?


#2

Hi @rguess, there is currently no way to alert on whether a RabbitMQ server has stopped using the plugin, and the plugin is developed by a third party. However, we do have a new RabbitMQ on-host integration that works with our Infrastructure agent. You could set that up, then use a Host Not Reporting alert condition to monitor it.

Alternatively, you could use a Synthetics Ping Monitor to ping the IP address of the server and alert if it’s unavailable :)


#3

That would not help us as we do not pay for your infrastructure offering. The host is not unresponsive, only RabbitMQ is.

In Insights, all we need is a check box when setting up alerts: “If no events or metrics in X minutes, use value Y.” This way we could treat, for instance, the absence of RabbitMQ message rate reporting for 3 minutes as a message rate of 0 in that time period, and we could set up alerts on it.


#4

Hi, @rguess: …except that plugin data is not sent to Insights. :)


#5

I understand that. What I am saying is that it would help if Insights allowed you to treat no data for X minutes as data with a specific value. In my case, if there is no message rate for RabbitMQ for 3 straight minutes, then consider that to be a zero message rate. NR could put a row of data in the system for that metric with the value of zero, or NRQL could have the option of treating gaps in data as if they had a certain value.


#6

What if NRQL just allowed you to retrieve the datetime stamp of the last metric of a particular type? Then alerts could be created if the last data point was older than X minutes.


#7

NRQL does allow that:

SELECT latest(timestamp)...

But NRQL Alerts require a query that returns a number; I don’t know of any way to calculate the difference between latest(timestamp) and now.

And NRQL does not work with metric data (such as that generated by plugins), so even if NRQL could do what you are asking, you would not be able to alert on plugin data.


#8

Is there something like datediff and getdate (those are T-SQL) in NRQL? It would be datediff(minute, timestamp, getdate()) in T-SQL.


#9

Unfortunately, no. There is a feature request:


#10

boom! feature request!

… the “works every time” way to end any discussion in the NR forums.


#11

What else would you suggest?


#12

That feature requests not be ignored.


#13

Hi @rguess -

I’m definitely hearing your frustration, and I want to say that it’s completely understandable. I know it’s hard when you are trying to solve a problem and keep running into roadblocks.

Please know that we aren’t in the business of using “Feature Idea” as a way to shut down conversation. In fact, we try hard to at least identify workarounds whenever we can. I understand why it might feel like we’re blocking and tackling, but that’s not what we’re doing. In this case, we just can’t think of a workaround for the situation, but we can certainly keep the thread open in case there are others - customers or Relics - who have thoughts.

I also know that it’s frustrating to have voiced a Feature Idea that never gets implemented. We get at least one feature idea every single day, just here in the community. That’s in addition to the ones identified by our other teams. I’m sure you understand that we can’t possibly address them all - it’s a delicate balancing act.

Thanks for your patience with us, and please let me know if there is anything else I can share.


#14

When was the last time a feature request for NRQL was implemented?


#15

Hey @rguess -

Not sure, frankly. I’d have to go digging for some of that data. That said, the Insights product manager is pretty active here in the community. You can see the work she’s highlighted in the review of her activity in the community:


#16

Couldn’t you just set up an alert based on the total number of messages that you can query:
select count(*) from EventType … If that is below a threshold for X minutes then fire off an alarm.


#17

Thanks for posting @Brandon.Dyer - that’s a potential workaround, though not a full solution.

The problem arises when no data is reported. Depending on how reporting stops, the metric you are alerting on can go from a count(*) of 1000 straight to nothing, with no gradual decline.

And depending on the source of the data, when reporting stops there may be no data at all for the alerts evaluator to work on, rather than a reported value of 0. This is different from a metric count falling below a predefined threshold. So while a count(*) condition may be helpful, it won’t account for all cases.

Essentially the problem is that Null Data != 0.
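
To make the Null Data != 0 point concrete, here is a minimal Python sketch of a "below threshold" check. It is only an illustration and not how the New Relic alerts evaluator is implemented; `violates_below` and its `fill_value` parameter are hypothetical names.

```python
from typing import Optional, Sequence


def violates_below(window: Sequence[float], threshold: float,
                   fill_value: Optional[float] = None) -> bool:
    """Return True if every point in the window is below the threshold.

    An empty window means no data arrived. Without a fill value there is
    nothing to compare, so no violation can open.
    """
    if not window:
        if fill_value is None:
            return False  # null data: nothing to evaluate
        window = [fill_value]  # hypothetical "treat no data as this value" option
    return all(point < threshold for point in window)


print(violates_below([1000, 950], 100))        # False: healthy traffic
print(violates_below([0, 0, 0], 100))          # True: explicit zeros were reported
print(violates_below([], 100))                 # False: no data, so nothing fires
print(violates_below([], 100, fill_value=0))   # True: the gap is treated as zero
```

The third call is the case discussed in this thread: the data source went silent, so a count(*) style condition has nothing to compare against the threshold.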


#18

I wrote a service that pulls metric data for the RabbitMQ plugin from the REST API. If there are no new metric values for X minutes, my service writes events to the Insights collector endpoint. Now I have something I can create alerts on. Silly, but it works.
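
For anyone wanting to try the same approach, a rough Python sketch of the staleness check and the event payload might look like this. It is not the actual service described above: the event type `RabbitPluginGap`, the `lastSeen` attribute, and the 180-second limit are assumptions, and both fetching the plugin metric from the REST API and posting the payload to the collector are left out.

```python
import json
from typing import Optional

STALE_AFTER_SECONDS = 180  # assumption: "X minutes" is 3 minutes


def is_stale(last_seen: Optional[float], now: float,
             stale_after: float = STALE_AFTER_SECONDS) -> bool:
    """True when no datapoint was ever seen or the newest one is too old."""
    if last_seen is None:
        return True
    return now - last_seen > stale_after


def build_gap_event(last_seen: Optional[float], now: float) -> dict:
    """Build a custom event describing the gap (hypothetical schema)."""
    event = {"eventType": "RabbitPluginGap", "timestamp": int(now)}
    if last_seen is not None:
        event["lastSeen"] = int(last_seen)  # epoch seconds of the last datapoint
    return event


if __name__ == "__main__":
    now = 1_000_000.0          # fixed epoch value so the example is repeatable
    last_seen = now - 240      # last plugin datapoint was 4 minutes ago
    if is_stale(last_seen, now):
        # This JSON array is what the service would send to the collector.
        print(json.dumps([build_gap_event(last_seen, now)]))
```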


#19

You might consider having it report data to Insights every minute, regardless of the result. A pattern that works well is to report a data point a couple of times a minute with a value of 0 for the attribute your condition targets, and to report a value > 0 when you see what you consider a failure scenario. This lets you create a NRQL condition that targets the presence of failure rather than the absence of success, which protects you against some potential false-positive results.
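
As a rough illustration of that pattern, a reporter could build every heartbeat event along these lines. This is a sketch under assumptions: the event type `RabbitHeartbeat`, the `failure` attribute, and the `min_rate` cutoff are made-up names and values, and sending the event is left out.

```python
from typing import Optional


def build_heartbeat(message_rate: Optional[float], now: float,
                    min_rate: float = 1.0) -> dict:
    """Build an event for every check, whether or not something is wrong.

    `failure` is 0 on a healthy check and 1 when the rate is missing or
    below `min_rate`, so a condition can target the presence of failure.
    """
    failed = message_rate is None or message_rate < min_rate
    return {
        "eventType": "RabbitHeartbeat",  # hypothetical event type
        "timestamp": int(now),
        "failure": 1 if failed else 0,
    }


print(build_heartbeat(250.0, 1_000_000.0))  # healthy check, failure is 0
print(build_heartbeat(None, 1_000_000.0))   # plugin gave no rate, failure is 1
```

A NRQL condition could then look at something like the maximum of `failure` for that event type and open a violation when it is above 0.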

I’m happy to expand on why targeting the presence of failure works better than targeting the absence of success if you (or anyone else) would find it helpful!


#20

I am reporting the seconds elapsed since the last datapoint seen from the plugin. That way I can set conditions for alerting.
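
A minimal sketch of that idea, with assumed names (`RabbitPluginFreshness` and `secondsSinceLast` are placeholders, not the names used in the real service):

```python
def seconds_since_last(last_seen: float, now: float) -> int:
    """Whole seconds since the last plugin datapoint, never negative."""
    return max(0, int(now - last_seen))  # clamp in case of clock skew


def build_elapsed_event(last_seen: float, now: float) -> dict:
    """Event reported on every run, carrying the age of the newest datapoint."""
    return {
        "eventType": "RabbitPluginFreshness",  # hypothetical event type
        "timestamp": int(now),
        "secondsSinceLast": seconds_since_last(last_seen, now),
    }


print(build_elapsed_event(999_760.0, 1_000_000.0)["secondsSinceLast"])  # 240
```

Because the value is always a number, a condition can use an ordinary "above threshold" check on it, for example above 180 seconds.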