We are seeing a peculiar issue and wanted to know if anyone else is seeing the same.
I see that there is a “default kubernetes alert policy” out of the box when you integrate Kubernetes with New Relic. It has some good alert conditions that we wanted to make use of. However, one particular alert condition, “Container restarted too much”, is very noisy and appears to be alerting on data from the past. Our containers crashed at some point during an operations maintenance window about three weeks ago, which generated a bunch of violations/incidents that we have since cleaned up. However, this particular alert condition does not go away, even after acknowledging the alerts, closing the violations manually, and confirming that there is currently no issue with the containers themselves.
The condition is set to alert us if the number of restarts is above 100 in 5 minutes. However, it appears to me that it is evaluating a 5-minute timeframe from the past and triggering alerts as soon as we enable the condition.
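To make the expectation concrete, here is a minimal Python sketch of the behavior we assumed the condition would have. It is not New Relic's implementation: the function names, the timestamps, and the sample data are all hypothetical and only illustrate "more than 100 restarts in the 5 minutes ending now".

```python
from datetime import datetime, timedelta

def restarts_in_window(restart_times, now, window=timedelta(minutes=5)):
    """Count restarts that fall in the window (now - window, now]."""
    start = now - window
    return sum(1 for t in restart_times if start < t <= now)

def should_alert(restart_times, now, threshold=100, window=timedelta(minutes=5)):
    """Alert only if the restarts inside the current window exceed the threshold."""
    return restarts_in_window(restart_times, now, window) > threshold

# Hypothetical data: 150 restarts, one second apart, three weeks before "now".
now = datetime(2024, 1, 22, 12, 0, 0)  # arbitrary evaluation time
old_crash = now - timedelta(weeks=3)
old_restarts = [old_crash + timedelta(seconds=i) for i in range(150)]

# Evaluated during the old incident, the condition should fire.
print(should_alert(old_restarts, old_crash + timedelta(minutes=3)))

# Evaluated today, the old restarts are outside the window, so it should not.
print(should_alert(old_restarts, now))
```

With this sample data, the second check is the one we expected to stay quiet, since none of the old restarts fall inside the current 5-minute window.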
How can I make this condition evaluate current data instead of past data?