Policy that is not triggering alert

I have a metric where 1 is good and anything less than 1 is bad, and a policy with this condition:

NRQL> FROM Metric SELECT latest(streams_job_health) WHERE tenantId != 'c26d86bb2ea97e6332a25efd0447a09d' AND tenantId != 'te_c414joz5' AND tenantId != 'te_4mlwjqga' AND (jobName NOT LIKE 'c26d86bb2ea97e6332a25efd0447a09d%') FACET jobName
Metric query result is < 1 unit for at least 5 mins

When I run that query, I can see that it does find some bad ones, and if I add TIMESERIES I can see they have been bad for hours. But no incident is being raised, and I cannot figure out why.

Account: 2417212
Policy id: 554811

Hi, @maeve.oreilly: Alerts thinks there is an open incident for the policy. Because the incident preference is set to “One open incident per policy”, Alerts will not create a new incident until the existing one is closed.

But when I look at open incidents, Alerts does not show any. I will open a ticket so our support engineers can help troubleshoot. Please watch your email.

Hi @maeve.oreilly and @philweber

The Incidents page now has a default 7-day window. You’ll need to expand this window to the time the incident opened in order to get the incident to show up.

I would recommend, instead, looking on the Events page. From there, you can search for the policy name or the condition name and find individual violations. So long as these are critical violations, there should also be an incident link.

In this particular case, I found an incident that has been open for over 240 days, and all of the new violations have been rolling up into it. As long as that incident stays open, the policy cannot open another one.

I suggest changing the incident preferences on this policy, if you’d like to ensure that this sort of thing doesn’t happen again.
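
To illustrate why a single stale incident can hide every later violation, here is a minimal Python sketch of the rollup behaviour described above. It is a toy model, not New Relic's implementation: the `Policy` class, the preference constants, and the condition and violation names are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field

# Hypothetical preference names for this sketch (not New Relic API values).
PER_POLICY = "per_policy"
PER_CONDITION = "per_condition"


@dataclass
class Policy:
    preference: str
    # Maps a rollup key to the violations attached to its open incident.
    open_incidents: dict = field(default_factory=dict)

    def _key(self, condition: str) -> str:
        # With one incident per policy, every condition shares a single key.
        return "policy" if self.preference == PER_POLICY else condition

    def record_violation(self, condition: str, violation: str) -> bool:
        """Attach a violation; return True only if a new incident opened."""
        key = self._key(condition)
        if key in self.open_incidents:
            # An incident is already open, so the violation rolls up into it.
            self.open_incidents[key].append(violation)
            return False
        self.open_incidents[key] = [violation]
        return True

    def close_incident(self, condition: str) -> None:
        """Close the open incident that this condition rolls up into."""
        self.open_incidents.pop(self._key(condition), None)


# One incident per policy: the old incident absorbs the new violation.
stale = Policy(PER_POLICY)
opened_old = stale.record_violation("old_condition", "violation-1")
opened_new = stale.record_violation("streams_job_health", "violation-2")

# One incident per condition: the new condition opens its own incident.
split = Policy(PER_CONDITION)
split.record_violation("old_condition", "violation-1")
opened_split = split.record_violation("streams_job_health", "violation-2")

print(opened_old, opened_new, opened_split)
```

In this model, closing the stale incident (`stale.close_incident("old_condition")`) lets the next violation open a fresh incident, which mirrors the two ways out here: close the old incident, or switch to a preference that does not roll everything into one.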
