Your data. Anywhere you go.

New Relic for iOS or Android


Download on the App Store    Android App on Google play


New Relic Insights App for iOS


Download on the App Store


Learn more

Close icon

Dashboard/NRQL: facet over error rate

dashboard
nrql
insights

#1

Hi there. On my dashboards, I have “Error rate” charts from APM. All others are already converted to NRQL-based ones (the New dashboard type). When I use the builder to get the error rate chart it works fine showing data similar to APM one. E.g. this one works fine: SELECT percentage(count(*), WHERE error is true) AS 'Error rate' FROM Transaction WHERE appName = '.....' TIMESERIES SINCE 3 hours ago

But what I want to achieve is to facet it by error type, so I can see the percentage of each error code across the overall traffic. That is, if there are 10 transactions, one with http 400, and 2 with http 500, and 7 http 200, I want to see 2 points on the chart: 10% and 20%.

But the moment I add a facet, the “100%” becomes not the overall count of transactions but the count of error transactions only. So if at certain point in time I have only errors with response code 405, that 405 takes 100%, even though the overall error rate may be 0.5%.

I also see all sorts of weird behavior when there are different consecutive error types, or when the calculation is done through select count(transactionname)/count(name) from transactionError, transaction.

Is this scenario supported at all? If yes - how can I get the expected results for the dashboard.


#2

Hi @Mykhailo_Makarov

Although helping find a query that will split out your error percentage by HTTP response code is beyond the scope of support, I did want to jump in to explain why you’re seeing the behavior you describe when adding FACET errorType to your query.

Any type of filter command (in this case, percentage) will not work well with FACET. FACET works before the filter, meaning that events are first grouped by the value they have for the faceted attribute and then events where the faceted attribute value is null (in this case, errorType) are dropped. So before percentage is run, all events without an error type are not included in the query results. Then percentage is run discretely on each facet, returning a 100% value for each facet so long as there was at least one event with that value (e.g. 100% of the returns for an errorType=500 value had an errorType of 500).

I hope this helps to understand better why you’re seeing the results you’re seeing. I will turn this over to the community to help find a way to split out error percentage by HTTP response code.


#3

Thanks @Fidelicatessen, that explains. I think a lot of people may find this information helpful.

From my side, after a numbers of experiments, I think the closest I got to what I wanted is SELECT count(*) from Transaction where appName = '...' TIMESERIES 1 minute facet response.status LIMIT 10 SINCE 6 hours ago EXTRAPOLATE. But this anyway is not what the original goal was, I think I’m not going to use this on the dashboard.

My takeaway from this is that at present facets are only usable with count/min/max - according to this (by the way - why not average?). Which is extremely disappointing, to be honest. I actually won’t suggest that you’d change the behavior of aggregation functions, who knows how hard that is.

Instead, how about adding a virtual variables available in query when facet is used? E.g.:

  • faceted_count representing the Sum(all the events by facets in the query)
  • faceted_distinct representing the Sum(all the unique events by facets in the query)
  • nofacet_count representing the count(*) without the facet

This way I could do things like SELECT count(*)/nofacet_count*100 from Transaction where [.....] facet errorType [...] or SELECT count(*)/faceted_distinct*100 from Transaction where [.....] facet response.status [...] which would solve the issue without actually changing the engine much.


#4

Great share, @Mykhailo_Makarov! Thanks for sharing your takeaways with us!