As part of our new What’s On Deck? series, I want to share with you a feature we are actively developing and that will be coming to you soon. Follow the #whats-on-deck tag in general for new Alerts updates, and follow this thread specifically if you’d like to get updates on how close we are to release!
We will be releasing the ability to use Sliding Window Aggregation (SWA) in your NRQL alert conditions. Sliding Windows is something that is currently available in NRQL queries. I encourage you to read more on it in this documentation. We will be adding this functionality to Alerts soon, and I want to make sure you’re aware so that it won’t take you by surprise, and so that you can get excited for this new functionality.
As part of this improvement, we will also be increasing the maximum aggregation window size from 15 minutes to 120 minutes!
- It will allow for more consistent aggregation of erratic or volatile signals
- More accurate and reliable alerting for infrequent or inconsistent signals
- Ease of troubleshooting – you can duplicate sliding window behavior in ad-hoc NRQL queries
- You can use aggregators other than sum
The documentation on Sliding Windows in NRQL queries covers the basics, but I’ll quickly go over the formula we’ll be using (and that you can use too) to convert your “Sum of query results” alert conditions over to using SWA.
First of all, here’s the formula – you can reproduce this in NRQL for now, so I’m using TIMESERIES, which we do not normally allow in Alert conditions:
<your query> TIMESERIES <your threshold window> SLIDE BY <your aggregation window>
So if you have a threshold like “Sum of query results is over 100 at least once in 3 minutes”, your threshold window is 3 minutes. Let’s assume you have an aggregation window of 1 minute (the default). This would result in the last 3 minutes’ worth of data being aggregated each minute.
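If it helps to see the formula filled in, here is a small Python sketch that builds the clause from the two windows. The function name and the base query (`SELECT sum(duration) FROM Transaction`) are hypothetical placeholders for illustration only; swap in your own query.

```python
def sliding_window_clause(threshold_minutes: int, aggregation_minutes: int = 1) -> str:
    """Return the TIMESERIES ... SLIDE BY clause for the given windows (in minutes)."""
    def minutes(n: int) -> str:
        # NRQL reads naturally with singular/plural units.
        return f"{n} minute" if n == 1 else f"{n} minutes"

    return (
        f"TIMESERIES {minutes(threshold_minutes)} "
        f"SLIDE BY {minutes(aggregation_minutes)}"
    )


# Hypothetical base query, used only for illustration.
base_query = "SELECT sum(duration) FROM Transaction"

# Threshold window of 3 minutes, default aggregation window of 1 minute.
print(f"{base_query} {sliding_window_clause(3)}")
# SELECT sum(duration) FROM Transaction TIMESERIES 3 minutes SLIDE BY 1 minute
```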
Here’s an example of what that would look like. Imagine each block is 1 aggregation window of data, and inside the block is the aggregated value for that window. I’m going to use sum for my aggregator, since that just makes things easier to think about.
- On minutes 1 and 2, no evaluation would take place. That’s because a buffer is being filled, and we do not yet have 3 minutes’ worth of data.
- On minute 3, we now have a full buffer of data (3 minutes’ worth), so we can aggregate the values and evaluate the result.
- On minute 4, the 3-minute window slides by 1 minute, and a value of 9 is evaluated.
- On minute 5, the 3-minute window slides by 1 minute again, and a value of 12 is evaluated.
- and so forth
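The walkthrough above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Alerts pipeline is implemented. The per-minute values 1 through 5 are an assumption on my part: they are simply values that reproduce the 9 and 12 in the list above.

```python
from collections import deque


def sliding_sums(values, window=3):
    """Yield (minute, sum of the last `window` values) once the buffer is full."""
    buffer = deque(maxlen=window)  # oldest value drops out as the window slides
    for minute, value in enumerate(values, start=1):
        buffer.append(value)
        if len(buffer) == window:  # minutes 1 and 2 only fill the buffer
            yield minute, sum(buffer)


# Assumed aggregated value for each 1-minute block (hypothetical data).
per_minute = [1, 2, 3, 4, 5]

for minute, total in sliding_sums(per_minute):
    print(f"minute {minute}: {total}")
# minute 3: 6
# minute 4: 9
# minute 5: 12
```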
We do have plans to allow this, but for now you will be able to use sliders in the UI to control these values. Keep in mind that your aggregation window (used for SLIDE BY) needs to be smaller than your threshold window (used for TIMESERIES), and the threshold window should be evenly divisible by the aggregation window.
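Those two rules are easy to check for yourself before you set up a condition. Here is a minimal Python sketch, assuming both windows are given in the same unit. The function name and messages are hypothetical, and this is not the validation we will ship.

```python
def check_windows(threshold_window: int, aggregation_window: int) -> list:
    """Return a list of problems with a TIMESERIES / SLIDE BY pairing.

    threshold_window   -> the value used for TIMESERIES
    aggregation_window -> the value used for SLIDE BY
    Both must be in the same unit (for example, seconds).
    """
    problems = []
    if aggregation_window >= threshold_window:
        problems.append(
            "aggregation window (SLIDE BY) must be smaller than "
            "threshold window (TIMESERIES)"
        )
    elif threshold_window % aggregation_window != 0:
        problems.append(
            "threshold window should be evenly divisible by the aggregation window"
        )
    return problems


print(check_windows(180, 60))  # []  (3 minutes sliding by 1 minute is fine)
print(check_windows(90, 60))   # one problem: 90 is not evenly divisible by 60
```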
Which brings us to …
We plan to disallow these cases, but I want to make sure you all understand the why.
1st way to break your slide-by: use a SLIDE BY setting that is the same as your threshold window
This is not a terrible way to break your condition, but it will make it so that you’re not really getting slide-by functionality.
If your threshold window and aggregation window are the same value, you wind up actually getting the traditional alerts behavior. That is, instead of getting a nice, incremental slide, like this
You wind up with a “cascading” aggregation, which looks more like this
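To make the difference concrete, here is a small Python sketch that lists which minutes each evaluated window covers. It assumes the first window ends as soon as one full threshold window of data exists. The function name is hypothetical.

```python
def window_ranges(threshold, slide, total_minutes):
    """List the (first, last) minute covered by each evaluated window."""
    ranges = []
    end = threshold  # first evaluation happens once the buffer is full
    while end <= total_minutes:
        ranges.append((end - threshold + 1, end))
        end += slide
    return ranges


# Incremental slide: 3-minute window, sliding by 1 minute.
print(window_ranges(3, 1, 6))  # [(1, 3), (2, 4), (3, 5), (4, 6)]

# "Cascading": 3-minute window, sliding by 3 minutes. No overlap at all.
print(window_ranges(3, 3, 6))  # [(1, 3), (4, 6)]
```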
2nd way to break your slide-by: use a SLIDE BY setting that is larger than your threshold window
This is a pretty terrible way to break your condition, since you will wind up with gaps which are not evaluated.
Let’s say you had your threshold window set to 3 minutes, but your aggregation window set to 6 minutes. That would look like TIMESERIES 3 minutes SLIDE BY 6 minutes.
You would wind up with behavior like this
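A quick Python sketch shows where the gaps land for TIMESERIES 3 minutes SLIDE BY 6 minutes. It assumes the first window ends at minute 3, which is my simplification. The function name is hypothetical.

```python
def unevaluated_minutes(threshold, slide, total_minutes):
    """Return the minutes that never fall inside any evaluated window."""
    covered = set()
    end = threshold  # first window ends once `threshold` minutes exist
    while end <= total_minutes:
        covered.update(range(end - threshold + 1, end + 1))
        end += slide
    return [m for m in range(1, total_minutes + 1) if m not in covered]


# 3-minute window sliding by 6 minutes: half of the data is never evaluated.
print(unevaluated_minutes(3, 6, 12))  # [4, 5, 6, 10, 11, 12]

# 3-minute window sliding by 1 minute: nothing is skipped.
print(unevaluated_minutes(3, 1, 12))  # []
```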
3rd way to break your slide-by: use a SLIDE BY setting that does not divide evenly into your threshold window
This is a somewhat terrible way to break your condition. Your SLIDE BY setting is lower than your threshold window, but it will still leave a gap once in a while.
Here’s an example: imagine a SLIDE BY setting of 60 seconds and a threshold window of 90 seconds. For the first minute, everything is good, but on every other minute, the SLIDE BY setting moves forward by half a window, which leaves half a window as a “gap” that does not get evaluated.
We will have validation in place to disallow these, but it’s important that you understand why.
You may think that SWA sounds familiar. That’s because I included this feature in my big announcement, which I recommend checking out at this link if you haven’t already.
Sliding Windows Aggregation will be our replacement for the Sum of query results threshold type in NRQL alert conditions (documented here). This will be a gradual replacement, so you will still be able to use your Sum of query results thresholds for a time after SWA is released.
In a nutshell, Sum of query results thresholds will only sum data. They won’t give you the maximum value, the minimum value, or the average value; they will only add data points together and give you a sum over a rolling time window. While this is certainly useful for some use cases, there are many other cases where a max or some other aggregator is needed.
Yes! When you use Sliding Windows Aggregation (SWA), the aggregator function in your NRQL query controls how we aggregate the sliding window. If you use average, we will give you the average over the sliding window, instead of always the sum and only the sum.
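Here is a rough Python sketch of that idea: the same sliding window, with the aggregator swapped out. The per-minute values are made up for illustration, and the function name is hypothetical. It shows the concept only, not how NRQL computes results.

```python
from collections import deque
from statistics import mean


def sliding_aggregate(values, window, aggregator):
    """Apply `aggregator` to each full sliding window of `values`."""
    buffer = deque(maxlen=window)
    results = []
    for value in values:
        buffer.append(value)
        if len(buffer) == window:  # skip until the buffer is full
            results.append(aggregator(buffer))
    return results


# Hypothetical aggregated value for each 1-minute block.
per_minute = [1, 2, 3, 4, 5]

print(sliding_aggregate(per_minute, 3, sum))   # sums: 6, 9, 12
print(sliding_aggregate(per_minute, 3, mean))  # averages: 2, 3, 4
print(sliding_aggregate(per_minute, 3, max))   # maximums: 3, 4, 5
```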
I can’t share an exact date with you yet, but I encourage you to try out the SLIDE BY function in ad-hoc NRQL queries to see how it works now!