Supervisor Monitoring in Insights

Hi All

Like many others, we run a number of jobs within Supervisor. We wanted a way to monitor the status of our jobs via New Relic, but couldn’t find anything immediately suitable. So I’ve written a quick Supervisor plugin that posts status-change events to New Relic Insights, where everything can be monitored.
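For anyone curious how such a plugin can hang together, here is a minimal sketch of a Supervisor event listener that forwards process state changes to the Insights insert API. It is not the plugin itself, just an illustration under some assumptions: ACCOUNT_ID and INSERT_KEY are hypothetical placeholders, and the attribute names process, group and fromState are made up for the example. Only the `Supervisor:Status` event type and the status attribute come from this thread.

```python
import json
import sys
import urllib.request

# Hypothetical placeholders: substitute your own account ID and insert key.
ACCOUNT_ID = "YOUR_ACCOUNT_ID"
INSERT_KEY = "YOUR_INSERT_KEY"
INSIGHTS_URL = (
    "https://insights-collector.newrelic.com/v1/accounts/%s/events" % ACCOUNT_ID
)


def parse_tokens(line):
    """Turn Supervisor's 'key:value key:value' text into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def build_event(headers, payload):
    """Map one Supervisor process-state event onto an Insights custom event."""
    return {
        "eventType": "Supervisor:Status",
        "status": headers["eventname"],  # e.g. PROCESS_STATE_FATAL
        # The attribute names below are assumptions made for this sketch.
        "process": payload.get("processname"),
        "group": payload.get("groupname"),
        "fromState": payload.get("from_state"),
    }


def post_event(event):
    """POST a single event to the Insights insert API."""
    request = urllib.request.Request(
        INSIGHTS_URL,
        data=json.dumps([event]).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Insert-Key": INSERT_KEY},
    )
    urllib.request.urlopen(request, timeout=10).close()


def main():
    """Run the event listener loop: READY, read one event, acknowledge."""
    while True:
        sys.stdout.write("READY\n")
        sys.stdout.flush()
        header_line = sys.stdin.readline()
        if not header_line:  # stdin closed, Supervisor is shutting us down
            break
        headers = parse_tokens(header_line)
        payload = parse_tokens(sys.stdin.read(int(headers["len"])))
        try:
            post_event(build_event(headers, payload))
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            sys.stderr.write("insights post failed: %s\n" % exc)
        # Always acknowledge so Supervisor does not keep redelivering the event.
        sys.stdout.write("RESULT 2\nOK")
        sys.stdout.flush()
```

To try it, call main() at the bottom of the script and register it under an [eventlistener:x] section in the Supervisor config with events=PROCESS_STATE. The sketch acknowledges every event even when the post fails, which trades lost events for a listener that never blocks Supervisor's event queue.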

Once NRQL Alerting is publicly released, we’ll also be able to easily set up alerts for jobs going into a critical failure state! Very exciting.

So I just wanted to share this in case anyone else had a similar problem they wanted to solve:

Super cool, thanks for posting, Andy!

I’d love to see what the resulting dashboard looks like and what the NRQL query is that you’re using in the alert condition when you have a chance.

Thanks again,

AJ

Here is an example of what you can do:

I should note that many of our Supervisor tasks cycle a lot, which is why we see so many events here. With long-running processes, there would be very few events (or possibly none if they never change state).

I don’t have access to the NRQL alerting just yet, but I would imagine something like this might do the trick:

SELECT count(*) FROM `Supervisor:Status` WHERE status = 'PROCESS_STATE_FATAL' SINCE 1 HOUR AGO
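
Until NRQL alerting is available, the same count could be polled from a script through the Insights query API. This is only a rough sketch: ACCOUNT_ID and QUERY_KEY are hypothetical placeholders, the status value is written as a quoted string literal, and the shape of the JSON response (a results list whose first entry holds count) is my assumption about what a count(*) query returns.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholders: substitute your own account ID and query key.
ACCOUNT_ID = "YOUR_ACCOUNT_ID"
QUERY_KEY = "YOUR_QUERY_KEY"

NRQL = (
    "SELECT count(*) FROM `Supervisor:Status` "
    "WHERE status = 'PROCESS_STATE_FATAL' SINCE 1 HOUR AGO"
)


def query_url(account_id, nrql):
    """Build the Insights query API URL with the NRQL statement URL-encoded."""
    return "https://insights-api.newrelic.com/v1/accounts/%s/query?nrql=%s" % (
        account_id,
        urllib.parse.quote(nrql),
    )


def fatal_count(account_id=ACCOUNT_ID, query_key=QUERY_KEY):
    """Return how many FATAL events were recorded in the last hour."""
    request = urllib.request.Request(
        query_url(account_id, NRQL),
        headers={"Accept": "application/json", "X-Query-Key": query_key},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    # Assumed response shape for a count(*) query: {"results": [{"count": N}]}
    return body["results"][0]["count"]
```

A cron job could call fatal_count() and notify someone whenever it returns anything above zero.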

Looks awesome! Amazing share, @andy.raines :thumbsup: