NRQL Alert for Synthetic Browser

I have a synthetic browser check that runs every 15min from 4 locations.

I’d like to alert when all 4 locations fail. The catch is that I only want to alert if the script makes it past a certain point. Thus, the new(-ish?) multi-location synthetic alert is not enough.

I’ve tried building a NRQL alert:

SELECT count(*) from SyntheticCheck where monitorId = 'cbf5a350-e793-47e3-8968-17f8928d5eb0' and result = 'FAILED' and custom.failedAt = 'AOSAPP'

but I’m having trouble understanding what I should choose for:

  • critical violation → above or equals 4 for at least 15 min
  • window duration → 15 min
  • sliding window → ?
  • streaming method → timer
  • timer → 15 min

And since I want it to auto-close, I apparently need a loss of signal to “close all open violations” after → 10min.

I feel like I’ve spent all day looking at various blogs and Q&A posts, some of which turn out to be outdated and no longer applicable.

The above settings seemed pretty close, but when testing I saw the following

  • app disabled at 11am MDT
  • violation opened at 11:15am
  • slack message at 11:30am
  • app enabled at 11:31am
  • violation closed at 11:43am
  • Slack message at 11:43

Now as I typed all the above, I’m thinking that I need to change “for at least” to “at least once”.

Maybe that’s all I need to do, but before jumping through the hoops to test this again, I’d appreciate any feedback on other changes or considerations.

Thank you

Hi @charles.wilt

I think you are really close to what you want.

I’d suggest changing it to at least once as you mentioned and also setting the sliding window to 1 minute.

This will make the window slide every 1 minute (returning the data points), so the condition doesn’t need to wait until the window finishes (15 minutes) to create incidents and send notifications.
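
To illustrate the difference, here is a minimal Python sketch of the window aggregation only. It is not New Relic's implementation: the function names and the failure timestamps (four locations failing at minutes 16 to 19) are hypothetical, and it ignores the streaming method, timer, and notification delays.

```python
from typing import List, Optional, Tuple


def window_counts(event_minutes: List[int], window: int, slide: int,
                  end: int) -> List[Tuple[int, int]]:
    """Return (window_end, count) for every window closing at or before `end`.

    Each window covers the half-open interval [window_end - window, window_end).
    slide == window gives non-overlapping (tumbling) windows;
    slide < window gives overlapping (sliding) windows.
    """
    results = []
    t = window
    while t <= end:
        count = sum(1 for m in event_minutes if t - window <= m < t)
        results.append((t, count))
        t += slide
    return results


def first_breach(counts: List[Tuple[int, int]], threshold: int) -> Optional[int]:
    """Return the end minute of the first window whose count meets the threshold."""
    for window_end, count in counts:
        if count >= threshold:
            return window_end
    return None


# Hypothetical data: one failed check per location, at minutes 16, 17, 18, 19.
failures = [16, 17, 18, 19]

# 15-minute window that only advances every 15 minutes.
tumbling = first_breach(window_counts(failures, window=15, slide=15, end=45), 4)

# 15-minute window that advances every 1 minute.
sliding = first_breach(window_counts(failures, window=15, slide=1, end=45), 4)

print("tumbling window first reaches 4 at minute", tumbling)
print("sliding window first reaches 4 at minute", sliding)
```

In this toy model the sliding window reaches the threshold as soon as the fourth failure is inside the last 15 minutes, while the non-sliding window has to wait for its boundary. If the four failures happened to straddle a boundary, the non-sliding window would never count all four together.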

Please let me know if this has worked for you.

thanks

Rodrigo

Hi Rodrigo,

I made the suggested changes and am testing the alert. In the edit alert condition view, the “preview”(?) is showing that a critical violation should have been opened at 8:53am or so, but it’s 9:30am now and I still haven’t seen a Slack message come through…
https://onenr.io/0KQXKylqWja

Scratch that…the slack came through at 9:27…

Still missing something apparently.

Charles

I tried changing the “at least once in 15min” to “at least once in 1min”

  • app disabled at 11:03
  • violation opened at 11:05
  • slack message at 11:37…

Why is it taking 30min to send the slack alert?

Also, that change seemed to make it too sensitive, closing and reopening…

Hi @charles.wilt

Thanks for reaching back out with all the updated info, we really appreciate it.

This is strange, I am unable to see why this is occurring even with the link provided “https://onenr.io/0KQXKylqWja”. I have gone ahead and looped in the Alerts Engineering team to have a look at this.

Please note they will reach out via this post. Should you have any additional updates or questions please do reach out!

Hi @charles.wilt - I am going to check with Engineering on this through a support ticket. I don’t quite see the logic in this timing and how your configuration is affecting this. I suspect sliding windows has something to do with it but am going to get some clarification and will be in touch in the ticket. I will also be posting the solution here for other Explorer’s Hub users as well.
