Outlier Detection Threshold
Outlier detection currently expresses its threshold as a flat number, not a percentage. This is a problem because our traffic volume varies significantly. For example, under moderate load a normal sample set might be (91, 98, 167), with a maximum deviation of 48 from the average, while under lower load a problem sample set might look like (1, 2, 62), with a maximum deviation of 40. The maximum deviation for normal behavior (under moderate load) is higher than the maximum deviation for problem behavior (under lower load), so no single flat threshold can flag the problem without also flagging the normal case.
Suggestion #1: Allow the threshold to be a percentage of the average. This would make outlier detection more meaningful when the normal values vary significantly over time.
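To make the comparison concrete, here is a minimal Python sketch of the two ways of measuring the same deviation. The function name `max_deviation` is my own, and this is only an illustration of the arithmetic, not how New Relic computes anything internally.

```python
from statistics import mean


def max_deviation(samples):
    """Return (flat, percent) for the sample farthest from the average of all samples.

    `flat` is the absolute distance from the average; `percent` is that
    distance expressed as a percentage of the average.
    """
    avg = mean(samples)
    flat = max(abs(s - avg) for s in samples)
    if flat == 0:
        percent = 0.0           # all samples identical: nothing deviates
    elif avg == 0:
        percent = float("inf")  # a percentage is undefined against a zero average
    else:
        percent = flat / abs(avg) * 100
    return flat, percent


# The two sample sets from the example above.
for label, samples in [
    ("moderate load, normal", (91, 98, 167)),
    ("lower load, problem", (1, 2, 62)),
]:
    flat, percent = max_deviation(samples)
    print(f"{label}: flat={flat:.0f}, percent={percent:.0f}%")
```

Using unrounded averages, this reports a flat deviation of about 48 (about 41%) for the normal set and about 40 (about 186%) for the problem set. The flat numbers rank the two sets the wrong way round, while the percentages separate them clearly.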
Average Calculation for Small Sample Sets
We also have a problem because our sample sets are small. When a single value deviates, it pulls the average toward itself, which shrinks its own measured deviation and makes it less likely to be detected. This forces us to lower the threshold, which increases the risk of false positives.
Suggestion #2: Allow an option (off by default) to calculate the deviation for each sample as its distance from the average of the other values (not all values). This would eliminate the problem with small sample sets where deviations tend to mask themselves by shifting the average.
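A rough Python sketch of what I mean by "the average of the other values" follows. The function name `leave_one_out_deviations` and the tuple layout are hypothetical, chosen only for illustration.

```python
def leave_one_out_deviations(samples):
    """For each sample, measure its distance from the average of the OTHER samples.

    Returns a list of (sample, baseline, flat, percent) tuples, where
    `baseline` is the average with that sample excluded.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to exclude one")
    total = sum(samples)
    results = []
    for s in samples:
        # Subtracting the sample from the running total avoids re-summing
        # the remaining values for every sample.
        baseline = (total - s) / (n - 1)
        flat = abs(s - baseline)
        if flat == 0:
            percent = 0.0
        elif baseline == 0:
            percent = float("inf")  # undefined against a zero baseline
        else:
            percent = flat / abs(baseline) * 100
        results.append((s, baseline, flat, percent))
    return results


for sample, baseline, flat, percent in leave_one_out_deviations((1, 2, 62)):
    print(f"sample={sample}: baseline={baseline:.1f}, flat={flat:.1f}, percent={percent:.0f}%")
```

For (1, 2, 62), the value 62 is compared against a baseline of 1.5 instead of an average of roughly 22, so it no longer masks itself. The exact figures differ a little from the table in this post because the table rounds the averages before computing the deviations.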
This table shows how these suggestions could improve New Relic’s ability to recognize a problem.
- The fourth column shows the current deviation calculation
- The fifth column illustrates suggestion #1
- The last column illustrates both suggestions together
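| Scenario | Samples | Average | Deviation | Deviation % | Average (excl. 3rd) | Deviation (for 3rd) | Deviation % (for 3rd) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Normal | 1, 3, 6 | 3 | 3 | 100% | 2 | 4 | 200% |
| Problem | 1, 2, 62 | 22 | 40 | 182% | 2 | 60 | 3000% |
| Normal | 91, 98, 167 | 119 | 48 | 40% | 95 | 72 | 76% |
| Normal | 89, 96, 130 | 105 | 25 | 24% | 93 | 37 | 40% |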
Note regarding suggestion #1: I recognize that a percentage could be erratic when the numbers are low, and it may be valuable to include something to stabilize it in that range, but I don’t think that’s important enough to hold up the initial feature.
Note regarding suggestion #2: If you’re concerned about performance, you could put a limit on how many samples can be evaluated this way, and if the number of samples exceeds the limit, the deviation calculation could revert to the average of all values.
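The fallback described in this note could look something like the sketch below. The parameter `max_leave_one_out` and its default of 100 are placeholders I made up; the right limit, if one is needed at all, would be for the product team to decide.

```python
def deviation_baselines(samples, max_leave_one_out=100):
    """Return the baseline that each sample's deviation would be measured against.

    `max_leave_one_out` is a hypothetical limit. At or below it, each
    sample is compared with the average of the other samples; above it
    (or with fewer than two samples), every sample is compared with the
    average of all samples.
    """
    n = len(samples)
    if n == 0:
        return []
    total = sum(samples)
    if n < 2 or n > max_leave_one_out:
        overall = total / n
        return [overall] * n
    return [(total - s) / (n - 1) for s in samples]


print(deviation_baselines([1, 2, 62]))                       # per-sample baselines
print(deviation_baselines([1, 2, 62], max_leave_one_out=2))  # falls back to the overall average
```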
Final note on posting this Feature Idea: I tried to tag this post with “outlier”, but the tag doesn’t exist, and it appears that I cannot create arbitrary tags. It may be helpful to create an “outlier” tag.