Observations during a Performance Testing Window

I want to monitor and see any differences during a particular performance testing window. Do we have any pointers on what I need to check?

Doc links or PDFs would be helpful.

@RyanVeitch
Regards

Hey @moses.arock -

As I understand it, you’re looking for a way to target your New Relic metrics to just the portion of time you’ll be running performance testing…? Is that right?

I don’t know of any docs on the subject, but some recommendations that come to mind:

  1. Set your timepicker in the APM/Browser/Synthetics/Mobile/Infrastructure windows to show only the period since the start of your performance testing. Or start the timepicker slightly earlier, to pick up the change in performance from normal operation.
  2. Build a Dashboard based on NRQL with widgets important to your test, such as Page Load time (SELECT max(duration) FROM PageView WHERE appName = 'perfTestApp') or your APM Transaction Response Time (SELECT max(duration) FROM Transaction WHERE appName = 'perfTestApp').
  3. Send a custom attribute in your Insights events from whichever event source you need (APM/Browser/etc.), something like 'testing':'true'. This will help you target your NRQL queries at the performance test data.
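To tie points 1 and 3 together, here is a minimal Python sketch that builds an NRQL string limited to one test window, optionally filtered on the custom attribute. The helper name `nrql_for_window` and its parameters are made up for illustration and are not part of any New Relic library. It assumes the attribute is called `testing` with the string value `'true'`, as in the example above, and that your account accepts `SINCE`/`UNTIL` timestamps in `YYYY-MM-DD HH:MM:SS` form; check how the time zone is interpreted before relying on it.

```python
from datetime import datetime


def nrql_for_window(select: str, event: str, app_name: str,
                    start: datetime, end: datetime,
                    test_flag: bool = False) -> str:
    """Build an NRQL query string restricted to a single test window.

    Hypothetical helper: it only formats text, it does not call New Relic.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    where = f"appName = '{app_name}'"
    if test_flag:
        # Assumes the custom attribute is named 'testing' with value 'true'.
        where += " AND testing = 'true'"
    return (f"SELECT {select} FROM {event} WHERE {where} "
            f"SINCE '{start.strftime(fmt)}' UNTIL '{end.strftime(fmt)}'")


if __name__ == "__main__":
    print(nrql_for_window(
        "percentile(duration, 95)", "Transaction", "perfTestApp",
        datetime(2024, 1, 15, 10, 0, 0), datetime(2024, 1, 15, 11, 0, 0),
        test_flag=True,
    ))
```

The resulting string can be pasted into a dashboard widget or the query builder; starting the window a little before the test makes the change from normal operation visible.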

Honestly, this really comes down to the metrics that are important to you. Not everyone has the same requirements or expectations for performance load testing.

It’s good to determine what you expect to see, and what you hope to learn from your tests, and then you can better work to figure out the metrics you need to look at in New Relic.

Yes, I am looking to understand this from the New Relic side: how I can highlight the issues and how to build metrics out of them.

I think it’s best to let some other users respond in this thread. Users who may have run similar tests on their systems who can better help understand what metrics are important…

@stefan_garnham - @MKhanna - @jfry - @reopelle.scott - Do you folks have experience with understanding the metrics behind performance testing??

Hi @moses.arock

Here is how I would approach this scenario.

  1. First, I would quickly define what counts as an issue. Example: if Apdex is below 0.95 for a T of 0.35 sec, that’s an issue for me; if the error rate is higher than 5% for 2 mins, that’s an issue; if CPU and memory are around 90%, that’s an issue. Based on this I would set up alerts, so I won’t have to watch the screen while the test runs.
  2. Once the alerts are set up, it’s time to make one place to look at the complete picture. You have two choices again:
    a. For the time frame in which the alert fired, you can go to the APM Overview and start the investigation from there.
    b. You can build a troubleshooting dashboard that collects all the widgets that will help you identify the problem. Example: when the Apdex fell, which transactions ran, which database queries were the slowest, etc.
  3. Have a read through this if you haven’t already: https://learn.newrelic.com/monitoring-performance-with-apm
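To make the "what is an issue?" definition in step 1 concrete, here is a minimal Python sketch that scores a batch of response times with the standard Apdex formula (satisfied requests plus half of the tolerating ones, divided by the total, where tolerating means between T and 4T) and checks it against the example thresholds. The function names and default values are illustrative assumptions taken from the numbers in this post. The "for 2 mins" duration and the CPU/memory checks are not modelled here; those belong in the alert conditions themselves.

```python
def apdex(durations, t):
    """Standard Apdex score: (satisfied + tolerating / 2) / total."""
    if not durations:
        raise ValueError("need at least one duration")
    satisfied = sum(1 for d in durations if d <= t)
    tolerating = sum(1 for d in durations if t < d <= 4 * t)
    return (satisfied + tolerating / 2) / len(durations)


def find_issues(durations, error_count, t=0.35,
                min_apdex=0.95, max_error_rate=0.05):
    """Return a list of human-readable issues for one sample of requests.

    Thresholds default to the example values in the post (assumptions).
    """
    issues = []
    score = apdex(durations, t)
    if score < min_apdex:
        issues.append(f"Apdex {score:.2f} is below {min_apdex}")
    error_rate = error_count / len(durations)
    if error_rate > max_error_rate:
        issues.append(f"Error rate {error_rate:.1%} is above {max_error_rate:.0%}")
    return issues


if __name__ == "__main__":
    sample = [0.1, 0.2, 0.3, 0.5, 2.0]  # response times in seconds
    for issue in find_issues(sample, error_count=1):
        print(issue)
```

Writing the thresholds down like this before the test starts makes it easier to translate them into alert conditions afterwards.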

Hope these help.

Happy performance testing.

Along with MKhanna’s suggestions, I’d also recommend setting up a few Key Transactions to monitor as well.

When we do perf testing, it’s usually isolated to a few select areas at a time (specific pages / controllers / etc.) so setting up alerts on a Key Transaction as well as monitoring overall transaction times and APDEX results for specific time frames (above post) should give a pretty good set of details.

The Key Transaction section would then give you a bit more of an isolated view (Service Map and Transaction details) for these areas instead of hoping to find those metrics in the general APM pages.

Some added details here:
https://docs.newrelic.com/docs/apm/transactions/key-transactions/introduction-key-transactions

Thanks so much for helping out here @MKhanna & @jfry :smiley:

Thanks for your suggestions @jfry

Let us know if you need anything else on this @moses.arock

Thanks @MKhanna for your inputs.

@RyanVeitch - Will the Thread Profiler help in understanding the data? For example, if the response time is high and CPU usage is low, will I be able to track that down in the profiler?

Regards

There’s a depth behind this question that no one has hit on… I recently worked with our team that is “testing” our eCom solution. Here’s what I discovered as we progressed through that journey.

This example focuses on the use of Insights Data Apps to create fully functional dashboards.

  1. Define Service Level Indicators. Latency is the most popular, but don’t forget about availability: can the service respond at a specific throughput?
  2. Set clear performance thresholds and measure the percentiles of time that you are hitting those. They can either be individual calls/transactions or functional groupings.

As an example: our search function should return a result (as an HTTP 200) in under 1500ms 99.9% of the time.

  3. Run the tests at increasing request levels: 100 rpm for test 1, 200 rpm for test 2, and so on until you hit the ‘wall’. This will also show you the throughput capability of your app. If you only plan to have 50 rpm, you can test at 200 to make sure you are covered for seasonal spikes, stuff like that.
  • Once you hit the wall, STOP. You can’t do anything advantageous after you hit the wall. The wall can be defined as the level where your tests fail the performance levels set in steps 1-2.
  4. Review the tests by selecting the time of the test with the timepicker. Review and record results.
  5. Once you know where you are failing, use APM to figure out what caused the issues: DB slowness, network hops, slowly performing code, etc. Make improvements and retry.
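The ramp-up procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TestRun`, `compliance` and `find_wall` are hypothetical names, the durations would come from your own exported test results, and the defaults mirror the search example (under 1500 ms, 99.9% of the time). HTTP status is ignored here to keep it short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TestRun:
    rpm: int                    # request rate used for this run
    durations_ms: List[float]   # observed response times in milliseconds


def compliance(durations_ms: List[float], threshold_ms: float) -> float:
    """Fraction of requests that finished under the threshold."""
    met = sum(1 for d in durations_ms if d < threshold_ms)
    return met / len(durations_ms)


def find_wall(runs: List[TestRun], threshold_ms: float = 1500.0,
              target: float = 0.999) -> Optional[int]:
    """Return the lowest rpm at which the target is missed, or None."""
    for run in sorted(runs, key=lambda r: r.rpm):
        if compliance(run.durations_ms, threshold_ms) < target:
            return run.rpm
    return None


if __name__ == "__main__":
    runs = [
        TestRun(100, [200.0] * 1000),
        TestRun(200, [300.0] * 999 + [1400.0]),
        TestRun(300, [400.0] * 990 + [2000.0] * 10),
    ]
    print("Wall hit at rpm:", find_wall(runs))
```

Once `find_wall` returns a value there is no point pushing the rate higher; that is the level to investigate in APM.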

Here’s a small example of how my data apps were originally formatted. I’ve added a lot since then, but this should get you started… Creating a template data app

Please don’t use any averages to measure your latency…
Please read this: If you use averages, you ARE missing the most critical events

and watch this: https://www.youtube.com/watch?v=lJ8ydIuPFeU
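A quick way to see why averages hide the worst events is to compare the mean with high percentiles on a small made-up sample. The Python sketch below uses a simple nearest-rank percentile written for this illustration; it is not how NRQL computes `percentile()`, and the numbers are invented.

```python
import math
import statistics


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: smallest value covering pct% of the samples."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    # Invented sample: 98 fast requests and 2 very slow ones (seconds).
    durations = [0.2] * 98 + [10.0] * 2
    print("average:", round(statistics.mean(durations), 3))
    print("p50:", nearest_rank_percentile(durations, 50))
    print("p99:", nearest_rank_percentile(durations, 99))
```

In this sample the average stays under half a second while the 99th percentile is 10 seconds, so anyone watching only the average would never see the requests that hurt users the most.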

Great suggestions! Thanks for sharing @reopelle.scott :smiley:

Thanks @reopelle.scott for your suggestions.

You are welcome and you can IM me if you need any additional help or suggestions.
There is considerably more to the story, I was trying to get out the door yesterday when I typed this up… Don’t beat me up on the “Averages” in the screenshot below… I was trying to demonstrate the wild variations between averages & high percentiles.

This was what I meant by breaking down into monitoring specific interesting calls you want to observe. Here’s the query…

Performance Percentile… transactions that met your threshold (as opposed to those that failed).
SELECT percentage(count(duration), WHERE duration < 2) AS 'Met Requirements' FROM Transaction WHERE appName = '<appname>' AND name LIKE '%<specific/API/call>/#POST%' AND duration IS NOT NULL SINCE 7 days ago

The statistics chart…
SELECT min(duration) AS 'Minimum', average(duration) AS 'Average', max(duration) AS 'Max', percentile(duration, 90, 95, 99) AS 'Percentile', count(duration) AS 'Count' FROM Transaction WHERE appName = '<appname>' AND name LIKE '%<some/API>/#POST%' SINCE 7 days ago

and the large chart at the bottom with all API calls
SELECT count(duration) AS 'Count', average(duration) AS 'AVG', percentile(duration, 90, 95, 99) AS 'PERCENTILE', max(duration) FROM Transaction WHERE appName = '<appname>' AND name LIKE '%api%' SINCE 7 days ago FACET name LIMIT 100

Thanks @reopelle.scott for sharing your inputs and queries.