Conditional NRQL Alert not triggered

Hi NewRelic,

We have created an Alert Policy with one Condition using Terraform. I was able to trigger the NRQL Condition alert once by injecting test data into our data sources. Unfortunately, after this first alert, we have not been able to trigger this Condition again.

I have gone through many NewRelic Discuss topics and the documentation. So far I have tried:

  • Tried many query variations
  • Changed the Evaluation Offset so that we are sure data is ready to be analyzed
  • Added configuration properties for possible Signal Loss
  • Changed the Threshold configuration approach and values
  • Changed the Incident Preference configuration
  • Changed Terraform and Provider versions
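
For context, a condition of this kind might be declared in Terraform roughly as follows. This is only a sketch, not our actual configuration: the resource names, threshold values and signal-loss durations are placeholders, and the query is modelled on the one that appears to be encoded in the Query Builder link further down. Attribute names follow the `newrelic_nrql_alert_condition` resource as I understand it; older provider versions may additionally require `value_function`, and newer ones may deprecate `evaluation_offset`, so please check the documentation for the provider version in use.

```hcl
terraform {
  required_providers {
    newrelic = {
      source = "newrelic/newrelic"
    }
  }
}

# Assumption: credentials are supplied through the NEW_RELIC_API_KEY and
# NEW_RELIC_ACCOUNT_ID environment variables.
provider "newrelic" {}

resource "newrelic_alert_policy" "example" {
  name                = "example-policy" # placeholder name
  incident_preference = "PER_CONDITION"  # one of the incident preferences we varied
}

resource "newrelic_nrql_alert_condition" "example" {
  policy_id = newrelic_alert_policy.example.id
  type      = "static"
  name      = "example-condition" # placeholder name
  enabled   = true

  nrql {
    # Condition queries carry no SINCE/UNTIL/TIMESERIES clauses.
    query = "SELECT average(`provider.messagesInPerSec.Average`) * 60 FROM AwsMskTopicSample WHERE provider.clusterName = 'ihm-streams-ci' AND provider.topic = 'digital.finance.dlq'"

    # Evaluate windows 15 minutes in the past to allow for latent AWS data.
    evaluation_offset = 15
  }

  critical {
    operator              = "above"
    threshold             = 0   # placeholder threshold
    threshold_duration    = 300 # seconds; placeholder
    threshold_occurrences = "at_least_once"
  }

  # Signal-loss settings (placeholder values).
  expiration_duration            = 600
  open_violation_on_expiration   = false
  close_violations_on_expiration = true
}
```

These are the settings that the attempts listed above varied: the query, the evaluation offset, the signal-loss properties, the threshold block and the policy's incident preference.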

None of those changes triggered the Condition again. I have checked the query in Insights and Query Builder, and everything looks fine. The query linked below shows that the Threshold Condition should be satisfied:

https://one.newrelic.com/launcher/nrai.launcher?pane=eyJuZXJkbGV0SWQiOiJhbGVydGluZy11aS1jbGFzc2ljLnBvbGljaWVzIiwibmF2IjoiUG9saWNpZXMiLCJwb2xpY3lJZCI6IjExNTEwNzkifQ==&overlay=eyJuZXJkbGV0SWQiOiJ3YW5kYS1kYXRhLWV4cGxvcmF0aW9uLmRhdGEtZXhwbG9yZXIiLCJpbml0aWFsQWN0aXZlSW50ZXJmYWNlIjoibnJxbEVkaXRvciIsImluaXRpYWxRdWVyaWVzIjpbeyJhY2NvdW50SWQiOjIyMTM2NjgsIm5ycWwiOiJTRUxFQ1QgYXZlcmFnZShgcHJvdmlkZXIubWVzc2FnZXNJblBlclNlYy5BdmVyYWdlYCkgKiA2MCBGUk9NIEF3c01za1RvcGljU2FtcGxlIFdIRVJFIHByb3ZpZGVyLmNsdXN0ZXJOYW1lPSdpaG0tc3RyZWFtcy1jaScgYW5kIHByb3ZpZGVyLnRvcGljPSdkaWdpdGFsLmZpbmFuY2UuZGxxJyBTSU5DRSAnMjAyMC0xMi0wNSAwOTowNTowMC0wNjAwJyAgVElNRVNFUklFUyBVTlRJTCAnMjAyMC0xMi0wNSAwOTo1MTowMC0wNjAwJyAgIn1dLCJpbml0aWFsTnJxbFZhbHVlIjoiIiwiaW5pdGlhbEFjY291bnRJZCI6MjIxMzY2OCwiaW5pdGlhbENoYXJ0U2V0dGluZ3MiOnsiY2hhcnRUeXBlIjoiQ0hBUlRfTElORSIsImxpbmtlZERhc2hib2FyZElkIjpudWxsfSwidGltZVJhbmdlIjp7InN0YXJ0IjoxNjA3MTgwNzQxODI5LCJlbmQiOjE2MDcxODM1MTQ5NTJ9LCJpbml0aWFsVGltZVdpbmRvd092ZXJyaWRlIjpudWxsfQ==&sidebars[0]=eyJuZXJkbGV0SWQiOiJucmFpLm5hdmlnYXRpb24tYmFyIiwibmF2IjoiUG9saWNpZXMifQ==&platform[accountId]=2213668

Here is the actual Alert Policy link:
https://one.newrelic.com/launcher/nrai.launcher?platform[accountId]=2213668&pane=eyJuZXJkbGV0SWQiOiJhbGVydGluZy11aS1jbGFzc2ljLnBvbGljaWVzIiwibmF2IjoiUG9saWNpZXMiLCJwb2xpY3lJZCI6IjExNTUxMzAifQ&sidebars[0]=eyJuZXJkbGV0SWQiOiJucmFpLm5hdmlnYXRpb24tYmFyIiwibmF2IjoiUG9saWNpZXMifQ

EDIT
I found a way to trigger alerts by saving the Alert Condition using the NewRelic UI instead of the API (Conditional NRQL Alert not triggered). But this does not seem to be a good solution, as it involves a manual step. We want to automate our NewRelic Alert Policy infrastructure with Terraform without having to touch the NewRelic UI.

Thanks

I found the solution in the topic Creating New Relic Infrastructure Alerts using Terraform - Alerts don't trigger.

In fact, it is not really a solution. @RyanVeitch mentioned the following:

Could you try go into the UI and resave the same condition created via Terraform?

We’ve seen rare cases where conditions created via the API (or terraform in this case), have not synced to the Alerts DB upon creation. Usually resaving the condition re-syncs it to the DB and ensures the condition will trigger

After I saved the Alert Condition in the UI, an Incident was created. Is there any way to fix this problem without having to create the condition through the API and then save it again in the NewRelic UI?

Hi @PedroGregorio,

Sorry to see that you’re having issues here! I’m having some issues getting to the specific Alerts condition that you referenced here. Could you provide an updated permalink to the most recent one that you’re working off of?

One thing I’ll note is that latent data coming from AWS is a common issue that drives tickets and forum posts like this. One immediate solution that you might consider would be to increase your evaluation offset to 15 minutes (if you’re using default settings, that’s 15 1-minute windows under “advanced settings” in the condition UI). More background on that here: Relic Solution: Better Latent Than Never – How Data Latency Affects NRQL Alert Conditions

I’d suggest giving that a shot and seeing if it works for you here. Let me know what happens!

Hi @Masen, thank you for your reply.

I have already tried setting the offset to 15; that is the value we have been using since the beginning of our tests. This is our updated Policy: https://one.newrelic.com/launcher/nrai.launcher?platform[accountId]=2213668&pane=eyJuZXJkbGV0SWQiOiJhbGVydGluZy11aS1jbGFzc2ljLnBvbGljaWVzIiwibmF2IjoiUG9saWNpZXMiLCJwb2xpY3lJZCI6IjExNTUxMzAifQ&sidebars[0]=eyJuZXJkbGV0SWQiOiJucmFpLm5hdmlnYXRpb24tYmFyIiwibmF2IjoiUG9saWNpZXMifQ

Please take a look at my previous comment (Conditional NRQL Alert not triggered). After I applied @RyanVeitch's suggestion and re-saved the Alert Condition, which was originally created through the API, using the NewRelic UI, alerts started working (it seems that in some rare cases conditions created through the API are not synced to the NewRelic DB).

But as we are trying to automate our Alert Policy infrastructure, this does not seem like a good solution, as we would have to go to the NewRelic UI and save our Alert Conditions manually.

Is there a way to solve this issue without having to save the Alert Condition in NewRelic UI?

Hi @PedroGregorio

I’ve been digging into this and have isolated the ihm-digital-streaming-monitor-ci-alert-condition alert Condition from the Policy you linked to. This condition has a query that exactly matches the query you provided.

However, the query you provided is scoped to about 1 hour from 7 December, and this condition wasn’t created until 15 December. Do you have a query showing this data in a violating state since the condition was created? Or a link to a different condition that existed on 7 December, which should have opened a violation on that day? In order to troubleshoot this type of issue, we really need to be able to see an alert condition in existence at the time of a metric breaching a threshold.

I am eager to get to the bottom of this, but unfortunately I can’t help with the condition you provided a link to. If you can provide the following, we can start our investigation:

  1. A URL link to an alert Condition that is enabled
  2. A URL link to a query showing the data that the condition is targeting breaching the threshold. The time window needs to span a time in which the Condition mentioned above was enabled.

I hope everything is working well for you, and do let us know if you’d like to continue troubleshooting on this with us!