OOM_Killer alert Invoke

There are some OOM-killer processes running on my application. I want to trigger an alert whenever an OOM-killer process is applied. Can anyone let me know how we can set up an alert when the OOM-Killer is initiated?

Hey @rtanniru! I see that you have also submitted a private support ticket regarding this question. Please update this thread when you get it all sorted! I am sure what you learn and share will benefit the rest of the community members. :blush:

Any update on this?

Is it not possible to collect that information with the New Relic Infra agent?

The somewhat unsatisfying answer is “it depends”.

It is going to vary based on how this is implemented in your system. Does the OOM-killer process only spawn when something needs to be OOM-killed? If so, you should be able to set up an alert condition that counts the number of OOM-killer processes. If it’s always running, could you instrument that process with one of the New Relic language agents? Is it a custom process to which you could add some logic that inserts events into Insights, so that you can then create a NRQL condition to notify you? Without more details of your implementation we can offer some general guidance, but we’ll be short on specifics.

If you add some more context here, we can hopefully at least give a more definitive answer.


Hey @swebber - Were you able to get anywhere with David’s guidance? Any additional information you can get would help with more specifics on this :slight_smile:

Hey @parrott ,

I am referring to your post. I need similar tracking of the OOM killer. Can you please elaborate on the quote I extracted from your answer? What do you mean by setting up a condition that counts the number of OOM-killed processes? How do you set up that condition in the agent? Are there any docs about this?

I set it up with the nri-flex integration by grepping the log, keeping only the last match with tail, and then extracting fields with awk. The problem with this approach is how to handle log file rotation. Also, the alert will not close until the log file is rotated and no new OOM appears in the new file. All the examples in the Flex tutorial look at proc files that contain just one line, which is updated by the system.
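For anyone hitting the same rotation problem, one common way around it is to remember the inode and byte offset of the log between runs and start again from the top when either indicates a new file. Below is a minimal Python sketch of that idea, not what I actually deployed. The function name `read_new_lines` and the `state` dict are made up for illustration, and the state would still have to be persisted somewhere between invocations (for example in a small JSON file).

```python
import os


def read_new_lines(path, state):
    """Return lines appended to `path` since the previous call.

    `state` is a plain dict (hypothetical helper state) holding the inode
    and byte offset seen last time; it is updated in place. A changed inode
    or a file smaller than the stored offset is treated as a rotation, so
    reading restarts from the beginning of the new file.
    """
    st = os.stat(path)
    rotated = state.get("inode") != st.st_ino or st.st_size < state.get("offset", 0)
    if rotated:
        state["inode"] = st.st_ino
        state["offset"] = 0

    with open(path, "rb") as fh:
        fh.seek(state["offset"])
        data = fh.read()
        state["offset"] = fh.tell()

    return data.decode("utf-8", errors="replace").splitlines()


def oom_lines(lines):
    """Keep only lines that look like kernel OOM kill reports (assumed wording)."""
    return [line for line in lines if "Killed process" in line]
```

Because only lines written since the last call are returned, an old OOM entry is reported once and then stops matching, which also lets the alert close. Lines written to the old file after the last read but before the rotation are missed by this sketch.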

I solved the problem by using dmesg and the date function to fetch events that occurred in the last minute. This is implemented with nri-flex, so on the New Relic side I receive the process name. From this I can create a NRQL condition and alert on it.
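The same "last minute of dmesg" idea can be sketched in Python for anyone who prefers a script over a shell pipeline. This is only an illustration, not my actual Flex config. It assumes the default dmesg format with seconds-since-boot timestamps (`[ 1234.567890] ...`) and kernel messages containing `Killed process <pid> (<name>)`; the names `recent_oom_kills` and `collect` are made up for this example.

```python
import json
import re
import subprocess

# Assumed line shape: "[  950.500000] Out of memory: Killed process 4321 (java) ..."
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s.*?Killed process (\d+) \(([^)]*)\)")


def recent_oom_kills(lines, uptime_seconds, window_seconds=60.0):
    """Return OOM kills whose dmesg timestamp falls inside the last window."""
    cutoff = uptime_seconds - window_seconds
    kills = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not an OOM kill report
        if float(match.group(1)) < cutoff:
            continue  # older than the window
        kills.append({"pid": int(match.group(2)), "process": match.group(3)})
    return kills


def collect():
    """Read the kernel ring buffer and uptime on a Linux host (may need root)."""
    out = subprocess.run(["dmesg"], capture_output=True, text=True, check=True).stdout
    with open("/proc/uptime") as fh:
        uptime = float(fh.read().split()[0])
    return recent_oom_kills(out.splitlines(), uptime)
```

Calling `print(json.dumps(collect()))` once a minute would emit one JSON object per recent kill, including the process name. Note that dmesg timestamps can drift from real time after a suspend/resume, so the window is approximate on such hosts.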
