Notifications not received - Synthetics & Infrastructure

We have 3 monitors running at the moment, across Synthetics and Infrastructure. We also have Pingdom running. I have noticed that lately we do not get our Synthetics alerts via Slack, whereas Pingdom has been consistent and very responsive with error alerts. Is there an issue with our account or with the way these alerts are triggered?

Also, sometimes our MySQL server goes down but we do not get an alert, e.g. when it is restarted on the server but fails to start up, or when it crashes. Which Infrastructure metric can we use to check the uptime of the MySQL instance? The same question applies to Nginx.

Currently I am using “Connections Dropped Per Second” (Nginx) and “Connection Errors Max Connections” (MySQL), but I am unsure whether those are the appropriate ones to check.

@jsteyn -

Would you be able to link to some events you expected, but did not receive notification for?

I’d like to see if there is anything visible holding these notifications back and, if not, dig a little deeper on our side to see why you didn’t get the notifications you expected.

(Note - only New Relic Admins and users in your account will have access to the links)

Hi,

We are not receiving any Synthetics alerts currently.

Regards,
Johan

Hi @jsteyn - Can you share a link to the account?

https://infrastructure.eu.newrelic.com/accounts/2532703/settings/alerts

@jsteyn - I think the issue here lies in your policy settings in Alerts.

I see that you have a policy with 28 conditions.

The Incident Preference set on that policy is ‘By Policy’. This basically means that when an issue happens and a violation takes place, an incident is opened.

However, while that first incident is open, all following violations will roll up into the same incident.

So you’re getting notified of the initial problem, but the later violations are attached to that same incident without notifying you.


I’d suggest changing the preference to ‘By Condition’, which will open a new incident for every condition that breaches its configured threshold. This may result in you receiving a lot more notifications, but that sounds like what you are looking for.
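To make the difference concrete, here is a minimal sketch of the roll-up behaviour described above. It is a toy model only, not New Relic's API: `PolicyModel`, `notified`, and the condition names are hypothetical, and it assumes that a violation notifies only when it opens a new incident and that no incident closes during the run.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyModel:
    """Toy model of an alert policy (hypothetical, not New Relic's API)."""
    preference: str  # "by_policy" or "by_condition"
    open_incidents: dict = field(default_factory=dict)
    notifications: list = field(default_factory=list)

    def violate(self, condition: str) -> None:
        # By Policy: one shared incident key for the whole policy.
        # By Condition: a separate incident key per condition.
        key = "policy" if self.preference == "by_policy" else condition
        if key in self.open_incidents:
            # An incident is already open: the violation rolls up silently.
            self.open_incidents[key].append(condition)
        else:
            # No open incident for this key: open one and notify.
            self.open_incidents[key] = [condition]
            self.notifications.append(condition)


def notified(preference: str, violations: list) -> list:
    """Return the conditions that produced a notification, in order."""
    policy = PolicyModel(preference)
    for condition in violations:
        policy.violate(condition)
    return policy.notifications


# Hypothetical condition names, for illustration only.
violations = ["synthetics-ping", "mysql-down", "nginx-down"]
print(notified("by_policy", violations))     # only the first violation notifies
print(notified("by_condition", violations))  # each condition notifies once
```

In this model, ‘By Policy’ yields a single notification for the three violations, while ‘By Condition’ yields one per condition.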

Take a look at my colleague Steve’s post about Incident Preference; hopefully it will explain things a little better than I have here :smiley:

Perfect, will investigate further and try the “By Condition” setting.

Awesome! Let us know how it goes :smiley:

Hello. I’ve set up an alerts policy with 2 alert conditions and 2 alert channels (email and Slack). The test notifications work fine for both email and Slack, but if I try to actually test a condition properly (e.g. power off a virtual machine for more than 5 minutes), nothing comes through on either channel.

Let me know if you’d prefer me to post this issue somewhere else.

Regards,

David.

Hey @david.harris2

Could you share a link to the policy in question here? Could you also share a link to where the VM you shut down is reporting in NR?

Thanks

Ok here’s the policy (id: 994640)

https://one.newrelic.com/launcher/nrai.launcher?pane=eyJuZXJkbGV0SWQiOiJhbGVydGluZy11aS1jbGFzc2ljLnBvbGljaWVzIiwibmF2IjoiUG9saWNpZXMiLCJwb2xpY3lJZCI6Ijk5NDY0MCJ9&sidebars[0]=eyJuZXJkbGV0SWQiOiJucmFpLm5hdmlnYXRpb24tYmFyIiwibmF2IjoiUG9saWNpZXMifQ&platform[accountId]=2026003

Here’s the VM:

https://one.newrelic.com/launcher/infra.infra?pane=eyJuZXJkbGV0SWQiOiJpbmZyYS5ob3N0cyIsImZlYXR1cmUiOiJzeXN0ZW0iLCJob3N0c0ZpbHRlcnMiOnsiYW5kIjpbeyJpcyI6eyJlbnRpdHlOYW1lIjoiSEVBLUFCLUFWLTAxIn19XX19&platform[timeRange][duration]=1800000&platform[accountId]=2026003

Thanks

Hi @david.harris2

The last occurrence of that host going offline appears to be on August 3rd - that did create an incident, see here

That incident should have sent notifications. It looks like it triggered a Slack notification, as well as an email to alerts@[your-domain]

Can you confirm if those notifications were received?

Hi Ryan,

Those notifications were not received. The strange thing is that it appears to be working now. I just tested by powering off the VM.

Very strange.

Interesting! I don’t see anything here that implies those alerts would be blocked. It’s good to know that you are receiving them now though.

If this comes up again please do let us know. We can try searching again once we have solid examples where this is happening.