Baseline alerting is awesome, but as we have previously described in this post, it isn’t suitable for all data. Sometimes you might also want a little more control over your baseline range, or wish you could tweak the tolerance of one side independently of the other. For those situations where baselines just don’t quite fit the bill, here is an alternative you can build with the power of NRQL alerting.
In this post, we are going to use two primary concepts to build our NRQL queries:
- selecting data from a range of values (or a channel)
- using the hourOf() function to select data from certain time ranges
Using these two concepts, you can essentially create an alert for when values are outside (above or below) a channel during a given time. The edges of the channel are static values that you have manually selected as the range for what you consider the “baseline” to be. Because you select these values, one end can be more or less tolerant than the other.
When selecting a range, you will want to target a particular attribute and specify that the query should select events where that attribute was above or below a certain value. For example, you could write something like this:
SELECT count(*) FROM Transaction WHERE appId = 9999999 AND (duration < 0.1 OR duration > 1.5)
This query counts the transactions that fall outside the specified range of duration values: anything that takes less than 0.1 seconds or more than 1.5 seconds.
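To make the channel logic concrete, here is a minimal Python sketch of the same comparison run over a local list of durations. It is not NRQL and does not talk to New Relic; the function name and the sample values are made up for illustration, and only the 0.1 and 1.5 bounds come from the query above.

```python
def count_outside_channel(durations, low=0.1, high=1.5):
    """Count values strictly below `low` or strictly above `high`.

    Mirrors the WHERE clause (duration < 0.1 OR duration > 1.5);
    values exactly on an edge count as inside the channel.
    """
    return sum(1 for d in durations if d < low or d > high)


# Hypothetical sample durations, in seconds.
sample = [0.05, 0.2, 0.9, 1.5, 1.7, 2.4]
outside = count_outside_channel(sample)
```

Because the two bounds are separate parameters, you can loosen one side (say, `high=3.0`) without touching the other, which is the asymmetric tolerance described earlier.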
Again, although this gives you flexibility in creating a custom range and tolerance for your data, these are still static thresholds. If you have data that follows a predictable pattern (say, high values during the day and low values during the night), you can build even more specifications into your query so that it will only look at data from certain hours of the day.
You can set up NRQL alert conditions that only monitor during certain times of day using WHERE hourOf(timestamp) != <someTime:00>.
As an example, here is a NRQL alert query that targets average transaction duration from a certain app, but excludes any data from between 2:00am and 4:00am:
SELECT average(duration) FROM Transaction WHERE appId = 9999999 AND hourOf(timestamp) != '2:00' AND hourOf(timestamp) != '3:00'
The limitation of using hourOf() comes from a side effect: the datastream drops off during the hours that are excluded. That makes this approach useful for thresholds set to over X amount, but not for thresholds set to less than X amount. If the threshold is below a certain value, you will get false positives during these built-in gaps, because the query will not return any data during these periods (which Alerts will treat as a zero).
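The gap problem is easier to see with numbers. The following Python sketch is a local simulation only, not how Alerts is implemented: it assumes, as described above, that an hour with no returned data is evaluated as zero. The helper names, sample events, and threshold values are all hypothetical.

```python
EXCLUDED_HOURS = {2, 3}  # the hours filtered out by the hourOf() clauses


def hourly_average(events, hour):
    """Average duration for one hour, or None when no data is returned."""
    if hour in EXCLUDED_HOURS:
        return None  # the filter removes every event in this hour
    values = [duration for h, duration in events if h == hour]
    return sum(values) / len(values) if values else None


def violates(value, threshold, direction):
    """Check one hour against a static threshold."""
    # Assumption taken from the text: a window with no data counts as zero.
    observed = 0.0 if value is None else value
    if direction == "above":
        return observed > threshold
    return observed < threshold


# Hypothetical (hour, duration) events; durations stay healthy all night.
events = [(1, 0.8), (1, 1.0), (2, 0.9), (3, 0.7), (4, 0.9)]
hours = range(1, 5)

below_alerts = [h for h in hours if violates(hourly_average(events, h), 0.5, "below")]
above_alerts = [h for h in hours if violates(hourly_average(events, h), 1.2, "above")]
```

With this data, the "below 0.5" threshold fires during hours 2 and 3 even though the real durations in those hours were normal, while the "above 1.2" threshold stays quiet throughout.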
However, by combining this with a range, you can create a threshold that a) looks for events where your targeted attribute is above or below your channel, and b) only looks at events during a specific range of time. Here is what this might look like all together:
SELECT count(*) FROM Transaction WHERE appId = 9999999 AND (duration < 0.1 OR duration > 1.5) AND hourOf(timestamp) != '2:00' AND hourOf(timestamp) != '3:00'
Because your query should only be returning positive values based on the criteria, you can use the “above X amount” threshold with no trouble.
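If you maintain several of these conditions, it can help to assemble the query text from its parts. The sketch below is plain string building in Python and nothing more: `build_channel_query` is a hypothetical helper, not part of any New Relic tooling, and it simply reproduces the clause pattern used in the combined query above.

```python
def build_channel_query(app_id, low, high, excluded_hours):
    """Assemble a channel-plus-hours NRQL query string.

    Only mirrors the clause shapes shown in this post; it does not
    validate the result against New Relic.
    """
    clauses = [
        f"appId = {app_id}",
        f"(duration < {low} OR duration > {high})",
    ]
    # One exclusion clause per hour, written as 'H:00' like the examples here.
    clauses += [f"hourOf(timestamp) != '{hour}:00'" for hour in excluded_hours]
    return "SELECT count(*) FROM Transaction WHERE " + " AND ".join(clauses)


query = build_channel_query(9999999, 0.1, 1.5, [2, 3])
```

Changing `low`, `high`, or the list of hours then gives you a new condition query without retyping the whole statement.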
With a bit more querying, you can customize and fine-tune this as much as you want! For the times when you need a bit more control than baselines can offer, try this out and let me know what you think.