Feature Idea: Synthetic Ping failures should trigger only when all locations fail at the same time

alerts-beta
feature-idea
new-alerts
rfb

#1

Hello!

We have been trying the new Alerts feature in combination with a Synthetics Ping test of our login page. We integrated this with PagerDuty and it would be great except for one really annoying problem:

From time to time for whatever reason a ping to our app login page might fail from one of 5 locations. This does not at all indicate that the page is down. It could be that the AWS location sending the ping has an issue with reaching our site. Or that the single request was dropped on our end.

Again, these are not reasons to wake up the on-call tech at 3am (as is the purpose of PagerDuty). However, the way things are now there is no way to specify how many of the checks need to fail before an alert is sent. For example, we may want ALL of the locations to fail twice before sending an alert. That would be 30 minutes of downtime and not ideal, but since we combine New Relic with MANY other monitoring and alerting tools we need a way to set the “volume” of these alerts.

A basic default for us would be to ONLY send an alert if ALL locations fail in the same testing interval.
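To make the requested rule concrete, here is a minimal sketch of the logic, not anything New Relic provides. It assumes each testing interval yields one pass/fail result per location; the function names, the location names, and the `required` parameter are all hypothetical.

```python
from typing import Mapping, Sequence


def all_locations_failed(results: Mapping[str, bool]) -> bool:
    """True only if every location failed in this interval.

    `results` maps a location name to True when its ping succeeded.
    An interval with no results never counts as a failure.
    """
    return bool(results) and not any(results.values())


def should_alert(history: Sequence[Mapping[str, bool]], required: int = 1) -> bool:
    """Alert when the last `required` intervals all failed from ALL locations."""
    if required < 1:
        raise ValueError("required must be at least 1")
    recent = history[-required:]
    return len(recent) == required and all(all_locations_failed(r) for r in recent)


# Hypothetical location names, for illustration only.
one_bad = {"us-east": True, "eu-west": True, "sa-east": False}
all_bad = {"us-east": False, "eu-west": False, "sa-east": False}

print(should_alert([one_bad]))                      # a single flaky location stays quiet
print(should_alert([all_bad]))                      # every location failed in one interval
print(should_alert([all_bad, all_bad], required=2)) # "ALL locations fail twice"
```

With `required=1` this is the basic default described above; `required=2` corresponds to the "fail twice" example.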

If this is currently possible, please show me how. If not, please put in a feature request.

Thank you!!
// eric


New Relic edit

  • I want this, too
  • I have more info to share (reply below)
  • I have a solution for this

0 voters

We take feature ideas seriously and our product managers review every one when plotting their roadmaps. However, there is no guarantee this feature will be implemented. This post ensures the idea is put on the table and discussed though. So please vote and share your extra details with our team.


#2

Hi @ethompsy

The alert and retry is explained in a previous post which may assist in your query.


Ping Monitor False Reporting
#3

Hello @stefan_garnham!
Thanks for that explanation. I read it in full. However, this does not resolve my issue…

We don’t care if one geographic location cannot reach us while others can. We MAY want to look this up later in the graph but as an alert this is worthless noise. Our techs cannot control networking between disparate geographic locations and our data center. They can restart our stack if it were to crash. And the single most important indicator of our stack being down (as provided by the New Relic™ Synthetics Ping test) would be the case where the ping fails from ALL (selected) locations. That would eliminate the possibility of the issue being someone else’s network problem and indicate with near 100% certainty that we have a serious problem that our on-call techs can fix.

Does this make sense?

Without this feature I am faced with only one option: I will be turning off the Synthetic Alert today. If this feature does not get implemented I will be writing off the Synthetic Check -> Alerts integration as useless to us. In my organization I am tasked with the evaluation of New Relic™ as a viable solution for a plethora of requirements.

Don’t get me wrong: I REALLY like New Relic™. But this feature is a dud without some way to tune a threshold of “availability zones down” before sending an alert.


#4

@ethompsy

Thanks for taking the time to outline your feature request! It makes a ton of sense, and I’ll get that filed and sent to the Alerts product manager for consideration. We’ve heard others request the same configuration option recently, so we understand how valuable this use case is for our customers. However, I’m unable to suggest when or if this feature will be available, as we have many considerations to balance!

Thanks again for your feedback!


#5

Hello @rmcdonough!
No problem on the timing. We will watch the release notes.
I just wanted to get the request in as I know we are not alone on this need.


#6

I logged this request with New Relic 6+ months ago. I hope they are working on a comprehensive/flexible alerting system.


#7

@cookcr I have faith that they are. I would also imagine that they have their hands full as their product is pretty awesome.


#8

This is a must-have feature for us. Synthetics alerting is a lot less useful otherwise.


#9

@jamesh I have created a feature request for you too. Once there is an update you will be notified.


#10

I want to +1 this.

Knowing that one location is flakey is marginally useful, and might indicate an issue to investigate with a CDN or something.

Knowing when all or most locations fail at the same time is MUCH more valuable as an actionable alert, and that’s what this product is supposed to be for. I have a feeling that if I enable this alert as-is, I’m going to have a revolt from my ops people when they start waking up at 3am every time São Paulo can’t find us.
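The "all or most locations" variant can be sketched as a simple failure-count threshold. This is an illustration under assumptions, not an existing New Relic setting: `should_page`, `min_failing`, and the location names are hypothetical.

```python
from typing import Mapping


def failing_locations(results: Mapping[str, bool]) -> int:
    """Count the locations whose check failed (value is False)."""
    return sum(1 for ok in results.values() if not ok)


def should_page(results: Mapping[str, bool], min_failing: int) -> bool:
    """Page only when at least `min_failing` locations fail in the same interval."""
    if min_failing < 1:
        raise ValueError("min_failing must be at least 1")
    return failing_locations(results) >= min_failing


# Five hypothetical locations; only one of them cannot reach the site.
interval = {
    "sao-paulo": False,
    "virginia": True,
    "oregon": True,
    "dublin": True,
    "tokyo": True,
}

print(should_page(interval, min_failing=4))  # one flaky location does not page
print(should_page(interval, min_failing=1))  # the as-is behavior described in this thread
```

Setting `min_failing` to the number of selected locations gives the strict "all locations" rule; a lower value trades some noise for earlier detection.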


#11

@dougw Thanks for your input, your use case is a great example for us to take to our product managers. I agree that knowing how many locations are failing would be super valuable.

While there hasn’t been any announcement for when/if this feature would be available, I hope we can avoid any revolts, mutinies, or full blown rebellions :scream:. To that cause I have added your +1.


#12

The reliability of the Linode servers has been pretty bad. New Relic hasn’t been updating their status page when these issues occur with their cloud providers. I’m quite tired of the noise. We get woken up in the middle of the night because one monitor location is spamming email. Can you provide an ETA on this feature? It might be worth adding this robustness if you are going to continue to use Linode as a provider. Here are some recent examples:

http://status.linode.com/incidents/pxxyjq5hcmfh

http://status.linode.com/incidents/3v13kt5dgjnv


#13

We too had multiple engineers get paged with false alarms last night because of the Linode issues. So long as there’s no “2nd check” from another datacenter we can’t trust the reliability of NR Synthetics to determine whether we’re down or you’re down. Also no status post from NR means I have to go spelunking through your vendors’ status pages to find out what’s going on.


#14

@mwhittingham and @jamesh Although I cannot give an accurate ETA on this feature right now, I would be more than happy to pass this input along to my product managers. They always want to hear about these kinds of use cases. And the feedback you have to offer is great! Thanks for chiming in! :blush:


#15

Have there been any updates on this feature request? We have the exact same problem: one of many locations fails and we get notifications that the site is down. We only want to get alerts when all locations are failing. Hugely annoying and noisy without this feature.


#16

I am sorry to hear that you too are experiencing this as noise, @ggyssler. I don’t have an update to share right now. I will add a feature request on your behalf. Thanks for letting me know, and check back for updates on this in the future!


#17

This is a big thing for us too. Getting lots of noise from offshore New Relic endpoints when our local one is just fine.

Also an issue for scripted browsers where we do an end to end transaction and get woken up overnight because one location isn’t happy.

I will note that Dynatrace has this feature already - please don’t lag behind for long :slight_smile:


#18

Thanks for reaching out to us, @mraynor! I have added a poll above so that our product can see how many votes are behind this request. I’ll pass along your input—any and all use cases are always mighty helpful. Thanks so much!


#19

Guys, any update on the release of the requested feature? The test retry option (‘n’ failures should trigger an alert instead of the first one from any location) is a MUST, otherwise we will be forced to look into other products as it’s raising too many false alarms.


#20

Hello!
This feature would be a great first step to fixing the signal-to-noise ratio on synthetic alerts, but synthetic alerts need even more knobs and levers for tuning. I’ve been asking for this too, since last year at FutureStack.

Right now there is no way to tune the signal-to-noise ratio of synthetics. Please add this!