I need to be able to alert the developers when their software stops running on a server. I know that within Infrastructure we can add an alert that requires a process to be running, but it would be very helpful to write an NRQL query that checks whether the process is running without being told how many servers it runs on. My question is: is this possible? I have provided a sample query below that counts the processes running on each server that match a particular pattern:
SELECT count(commandLine) FROM ProcessSample FACET hostname WHERE commandLine LIKE '/usr/java/nds-%/bin/java' AND apmApplicationIds LIKE '%|95505343|%' SINCE 3 minutes ago
My thinking is that if I could run this query and require at least 1 matching process on each host at all times, it would work, provided apmApplicationIds is always populated. I'm just not sure that it always will be. If it isn't, I could add a label to the host to indicate that the service is up, but I'm hoping there are other ways of doing this as well.
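As a rough sketch of the alert query I have in mind (this assumes an NRQL alert condition evaluates the query separately for each hostname facet, and it drops the apmApplicationIds filter in case that attribute is not always populated), the condition would fire when the count for any host falls below 1:

```
SELECT count(*) FROM ProcessSample WHERE commandLine LIKE '/usr/java/nds-%/bin/java' FACET hostname
```

One thing I have not verified: if a stopped process simply stops producing ProcessSample events, the host's facet may disappear from the results rather than report 0, in which case a "below 1" threshold alone might never trigger and some form of missing-data handling would be needed.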