As this seems to be the only way to get support now, can you let me know why email and Slack alerts have stopped on my account?
@sandy - can you share some links for the alert incidents you expect to have sent notifications to Slack/Email?
I can take a look internally for any potential issues with those.
Hey @sandy -
The very first violation in that incident dates from February this year and is somehow still open.
While that violation is open, the incident will not close.
Notifications rely on the incident lifecycle - Open, Acknowledged, Closed. You get a notification for each of these.
Since that incident has not yet closed, you have not received a new notification from it since February.
The following link should take you to the violation that remains open.
You’ll see a button marked "manually close violation". Using it will close the violation, which lets the incident close, and you should then be notified.
Side note: reconfiguring your incident preference can help you avoid such long-running incidents and ensure you get the notifications you need:
Are you saying we need to manually acknowledge each violation, otherwise any subsequent violations will not be alerted on and are effectively hidden? And does this apply even if the initial violation (e.g. a ping check) had recovered after 5 mins?
@sandy - Not necessarily -
You shouldn’t need to manually close violations (unless you have opted in to violations not auto-closing).
Typically, when the opposite of a condition threshold is met, the violation will close. For example, if your threshold is:
Transaction Error rate greater than 5% for at least 10 minutes, then for the violation triggered by this condition to close, the application must have a
Transaction Error rate lower than 5% for at least 10 minutes.
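To make the open/close rule above concrete, here is a minimal sketch in Python. It is only an illustration of the rule as described in this thread, not how the alerting service is implemented; the class name `ViolationTracker`, its fields, and the one-reading-per-minute assumption are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ViolationTracker:
    """Hypothetical model: one error-rate reading arrives per minute."""
    threshold: float = 5.0   # percent, taken from the example above
    duration: int = 10       # minutes the rate must stay on one side
    is_open: bool = False
    _streak: int = 0         # consecutive minutes arguing for a state change

    def observe(self, error_rate: float) -> bool:
        if self.is_open:
            # An open violation needs the rate BELOW the threshold to recover.
            progressing = error_rate < self.threshold
        else:
            # A closed one needs the rate ABOVE the threshold to open.
            progressing = error_rate > self.threshold
        self._streak = self._streak + 1 if progressing else 0
        if self._streak >= self.duration:
            self.is_open = not self.is_open
            self._streak = 0
        return self.is_open


tracker = ViolationTracker()
# 10 bad minutes, 5 good minutes, 1 bad minute, then 10 good minutes.
readings = [8.0] * 10 + [2.0] * 5 + [8.0] + [2.0] * 10
states = [tracker.observe(rate) for rate in readings]
```

In this sketch the violation opens on the 10th bad minute, the single bad reading in the middle resets the recovery count, and the violation only closes after a full 10 consecutive minutes below 5%.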
Violations are what trigger incidents to open. Acknowledging an incident is not required for the incident to close - however, its violations must end before the incident can close.
In your case, since the violation has been open since February, it’s very likely that its status got stuck on our side, and manually closing it is the only way to end the incident.
Typically this isn’t required though - since normal operating behaviour is for violations to close themselves.
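For anyone following along, the relationship between violations, incidents, and notifications described above can be sketched roughly as below. This is a simplified model, not the actual service logic: the `Incident` class and its method names are made up for illustration, and it assumes an incident preference that rolls every violation in the policy into a single incident.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical model of one policy-wide incident."""
    state: str = "closed"
    open_violations: set = field(default_factory=set)
    notifications: list = field(default_factory=list)

    def violation_opened(self, violation_id: str) -> None:
        self.open_violations.add(violation_id)
        if self.state == "closed":
            self.state = "open"
            self.notifications.append("opened")
        # Otherwise the violation joins the existing incident silently.

    def acknowledge(self) -> None:
        if self.state == "open":
            self.state = "acknowledged"
            self.notifications.append("acknowledged")

    def violation_closed(self, violation_id: str) -> None:
        self.open_violations.discard(violation_id)
        # The incident can only close once no violations remain open.
        if self.state != "closed" and not self.open_violations:
            self.state = "closed"
            self.notifications.append("closed")


incident = Incident()
incident.violation_opened("stuck-february")  # sends "opened"
incident.violation_opened("ping-1")          # no notification
incident.violation_closed("ping-1")          # incident stays open
incident.violation_closed("stuck-february")  # manual close, sends "closed"
incident.violation_opened("ping-2")          # sends "opened" again
```

Under these assumptions, the ping check that failed and recovered while the stuck violation was open produces no notification at all, which matches the missed alerts described earlier in the thread.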
Ok that makes more sense - something got stuck on the New Relic side.
We’ve been using alerts for a while, and up to now they’ve been working as I’d expect with no manual interventions.
I’ll manually clear the stuck ones and see if things go back to normal.
Thanks @sandy - let us know how that goes & if you have any further questions.
I have a downtime alert policy. On the summary page it shows 1 open incident for each site checked (see screenshot).
When I click through, though, there are no open incidents, so I can’t clear them and can’t be fully sure that future alerts will be received. Also, under alerts / open incidents there are no Downtime-related ones, yet we did miss receiving downtime alerts as described above. Can you check if we have some data corruption on our account?
@sandy - I think those open incidents just reference the same open violation here:
Could you try manually closing that?