
UPDATE: User defined alerts for Synthetics

alerts
synthetics
rfb
marchmayhem2018

#1

Our customers have spoken, and we heard you when you told us that new alert conditions for Synthetics are something you value. We have good news to share: this feature is being designed RIGHT NOW!

See?! I told you so! So what does that mean?

Well, we already know a lot about what you all need. For example, after poring over the many, many bits of feedback you’ve given us, we know that we can solve the vast majority of your use cases with the following two conditions:

Alert me when X number of locations are failing at the same time
Alert me when X out of the last Y checks have failed
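
To make the two conditions concrete, here is a minimal sketch in Python. It is only an illustration, not how New Relic evaluates conditions: the function names, the location names, and the reading of "X" as "at least X" are all assumptions.

```python
from collections import deque
from typing import Deque, Dict


def locations_failing(latest_by_location: Dict[str, bool], x: int) -> bool:
    """True when at least x locations' most recent check failed.

    latest_by_location maps a (hypothetical) location name to True
    if that location's latest check failed.
    """
    return sum(latest_by_location.values()) >= x


def x_of_last_y_failed(results: Deque[bool], x: int) -> bool:
    """True when at least x of the checks held in the window failed.

    results is a deque(maxlen=y) of booleans, True meaning "failed".
    """
    return sum(results) >= x


# Condition 1: alert when 2 or more locations are failing at the same time.
latest = {"us-east": True, "eu-west": True, "ap-south": False}
multi_location_alert = locations_failing(latest, x=2)

# Condition 2: alert when 4 out of the last 5 checks have failed.
window: Deque[bool] = deque(maxlen=5)
for failed in [True, True, False, True, True]:
    window.append(failed)
sequential_alert = x_of_last_y_failed(window, x=4)
```

Both example inputs meet their condition, so each call returns `True`.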

Not coincidentally, those are exactly the two conditions we’re planning to introduce! But opening the violation and sending the notification are only half of the equation. Right now our team is trying to figure out the right conditions for closing the violation, and we’re hoping you can help us.

Please fill out our little poll below for your opportunity to help guide the design of this exciting new feature:

If my alert condition says “alert me when 4 out of the last 5 checks have failed” and that condition is met, when would you expect that violation to close?

  • As soon as the failures drop below the threshold I set --> The violation would close as soon as we see 5 checks with fewer than 4 failures, meaning even if 3 out of the next 5 checks fail we would close the violation
  • As soon as the failures drop to half the threshold I set --> The violation would close as soon as we see 5 checks with 2 or fewer failures
  • As soon as the failures drop to zero --> We would have to see a full 5 checks with no failures before closing the violation
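
The three poll options can be compared side by side with a small Python sketch. The policy names and the `should_close` helper are made up for illustration, and rounding "half the threshold" down for odd thresholds is an assumption the poll does not address.

```python
from typing import Sequence


def should_close(last_checks: Sequence[bool], open_threshold: int, policy: str) -> bool:
    """Decide whether an open violation closes under one of the poll options.

    last_checks holds the most recent Y results, True meaning "failed".
    """
    failures = sum(last_checks)
    if policy == "below_threshold":
        # Option 1: close once failures drop below the open threshold.
        return failures < open_threshold
    if policy == "half_threshold":
        # Option 2: close at half the open threshold (rounded down, an assumption).
        return failures <= open_threshold // 2
    if policy == "zero":
        # Option 3: close only after a full window with no failures.
        return failures == 0
    raise ValueError(f"unknown policy: {policy}")


# 3 failures in the last 5 checks, with the condition "4 out of the last 5".
recent = [True, False, True, True, False]
decisions = {
    policy: should_close(recent, 4, policy)
    for policy in ("below_threshold", "half_threshold", "zero")
}
```

With 3 failures in the window, only the first option closes the violation; the other two keep it open.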


Thanks for your feedback and we look forward to sharing another update with you next month!


March Mayhem Champion! 🏀
#2

#3

@jmarcel Thank you. As you expect that there will be a difference between the threshold to cause the alert and the threshold to clear the alert, why not allow both values to be applied, where the threshold clear value will default to the threshold alert value? Are you just checking if every voter wants the values to be the same (everyone votes option 1)?

I would want to know if some of my customers are potentially experiencing a regional outage, so I wouldn’t set the alert threshold that high.


#4

Glad to see that there is progress on this feature. I would agree with @Trevor_Dearham though, why not have the functionality to set the threshold to close as configurable?
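
A rough Python sketch of what a configurable close threshold might look like, with the close value defaulting to the alert value as suggested above. The `SequentialCondition` class and its field names are hypothetical and do not reflect any actual New Relic API.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class SequentialCondition:
    """Hypothetical "X of the last Y checks failed" condition."""

    window_size: int
    open_threshold: int
    # The violation closes once failures in a full window fall below this value.
    # None means "use the open threshold".
    close_threshold: Optional[int] = None
    violation_open: bool = field(default=False, init=False)
    _window: Deque[bool] = field(init=False, repr=False)

    def __post_init__(self) -> None:
        if self.close_threshold is None:
            self.close_threshold = self.open_threshold
        self._window = deque(maxlen=self.window_size)

    def record(self, failed: bool) -> bool:
        """Record one check result and return whether a violation is open."""
        self._window.append(failed)
        failures = sum(self._window)
        if not self.violation_open and failures >= self.open_threshold:
            self.violation_open = True
        elif (
            self.violation_open
            and len(self._window) == self.window_size
            and failures < self.close_threshold
        ):
            self.violation_open = False
        return self.violation_open


results = [True, True, True, True, False, False, False]

# Close threshold defaults to the open threshold (4).
default = SequentialCondition(window_size=5, open_threshold=4)
default_states = [default.record(r) for r in results]

# Stricter close: the window must contain no failures at all.
strict = SequentialCondition(window_size=5, open_threshold=4, close_threshold=1)
strict_states = [strict.record(r) for r in results]
```

With the default, the violation opens on the fourth failure and closes two checks later, once the window holds only 3 failures. With `close_threshold=1` the same sequence leaves the violation open, because the window still contains failures.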


#5

@jmarcel Great to see! How will this impact/interact with the current soft failure (3/3 retry) logic?


#6

@peckb1 - I would expect the current retry mechanism would still be in place


#7

Update 2:

Work on this is in progress. The majority of it is happening in our data pipeline. We’re building out a direct interface between NRDB and Alerts that sends non-aggregated event data to our evaluation tier so that our analyzers (the things that evaluate your configured conditions) have access to the raw events needed to perform their duties. This is an exciting extension of our internal platform and will be useful in many more ways in the future.

We also have a few more designs to share. This first one shows a potential confirmation screen after creating a new monitor. We’re playing with the idea of removing the alert configuration step from monitor creation within Synthetics, since many monitors are first created without alert conditions (in a testing phase) and alert conditions are added later.

This second screenshot simply shows the inputs needed to create a “Sequential check” condition. We’re also considering adding a warning threshold.

Some good questions:

My thoughts on this are around the balance between simplicity and flexibility. If in practice one of the heuristics above is very effective, starting there seems ideal.

[quote=“stefan_garnham, post:6, topic:55054, full:true”]
@peckb1 - I would expect the current retry mechanism would still be in place
[/quote]
My opinion is that the retry mechanism used by Synthetics today shouldn’t be something customers need to think about. Ideally, customers will think of it as a black box containing the logic that ensures the results of the check are correct. Then, with these new condition types, you just think about checks failing and succeeding in patterns that are worthy of grabbing your attention.

As always, we’d love your feedback, opinions and advice!


#8

The define thresholds screen looks useful. At present we’ve had to create a NRQL-based alert that triggers only if X ‘FAILED’ occurrences are seen.

When released, will this be available to implement on all existing scripts?


#9

UPDATE #3

UIs have been designed and work on the data pipeline continues. Our Alerts team has their work on evaluating the pattern of successes and failures queued up and ready to go as the next thing they tackle. In the meantime, we are starting to design how we will incorporate the current Synthetics Alert conditions into the new model in a way that makes the most sense for you, our customers.

We’re considering two different options, which means it’s time for… ANOTHER POLL!

  • Keep the condition that allows a failure to be thrown every time a Synthetics location fails. This means if you’re running checks on 5 locations and they all started failing you’d get 5 alerts, one for each location. Then you would see a recovery for each location once they recover.
  • Migrate the current failure conditions to the new condition that alerts ‘whenever X locations fail’. X, in this case, would be 1. Under this scheme you would get just one alert when the first of your 5 locations fails and would see a recovery when they are all passing again.
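
To see how the two options differ in notification volume, here is a small Python sketch that replays the 5-location example. The function names, location names, and event format are assumptions for illustration only.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, bool]  # (location, failed)


def per_location_notifications(events: List[Event]) -> List[str]:
    """Option 1: one alert and one recovery per location."""
    failing: Dict[str, bool] = {}
    out: List[str] = []
    for location, failed in events:
        was_failing = failing.get(location, False)
        if failed and not was_failing:
            out.append(f"ALERT {location}")
        elif not failed and was_failing:
            out.append(f"RECOVERY {location}")
        failing[location] = failed
    return out


def aggregated_notifications(events: List[Event], x: int = 1) -> List[str]:
    """Option 2: one alert when x locations fail, one recovery when all pass."""
    failing: Dict[str, bool] = {}
    violation_open = False
    out: List[str] = []
    for location, failed in events:
        failing[location] = failed
        count = sum(failing.values())
        if not violation_open and count >= x:
            violation_open = True
            out.append("ALERT")
        elif violation_open and count == 0:
            violation_open = False
            out.append("RECOVERY")
    return out


# All 5 locations start failing, then all 5 recover.
locations = ["loc-1", "loc-2", "loc-3", "loc-4", "loc-5"]
events = [(loc, True) for loc in locations] + [(loc, False) for loc in locations]

per_location = per_location_notifications(events)
aggregated = aggregated_notifications(events)
```

For this sequence the first option produces 10 notifications (5 alerts, 5 recoveries), while the second produces 2 (one alert on the first failure, one recovery once every location passes).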


Which do you prefer? You can tell us in the comments why, or if there’s some other way we should do it.

Thanks!