I am running an ASP.NET application across several IIS servers on Windows Server 2016. We recently started seeing high CPU usage on the w3wp.exe processes, along with high network I/O throughput, which is slowing the app down.
Our goal is to raise incidents as Zendesk tickets. Currently, however, we are getting a lot of alerts that close shortly after they open because CPU usage recovers, and we would like to understand what threshold would best avoid these false alerts. Are there any guidelines or recommendations I could follow to make these alerts more intelligent?
The alert in question is a CPU usage infrastructure alert with a critical threshold of over 90% for 10 minutes.
We are currently monitoring with the infrastructure agent.
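To make the symptom concrete, here is a rough Python sketch of how I understand a "over 90% for 10 minutes" condition to behave. It is only an illustration, not the monitoring tool's actual evaluation logic: the `find_incidents` function, the one-sample-per-minute assumption, and the sample data are all made up. The rule I am assuming is that an incident opens once every sample in the last 10 minutes is above the threshold, and closes on the first sample at or below it.

```python
from typing import List, Tuple


def find_incidents(
    samples: List[float], threshold: float = 90.0, window: int = 10
) -> List[Tuple[int, int]]:
    """Return (open_minute, close_minute) pairs for a sustained-threshold rule.

    Assumes one CPU sample per minute (hypothetical, for illustration only).
    """
    incidents = []
    streak = 0        # consecutive samples above the threshold
    open_at = None    # minute the current incident opened, if any
    for minute, cpu in enumerate(samples):
        if cpu > threshold:
            streak += 1
            if streak >= window and open_at is None:
                open_at = minute
        else:
            streak = 0
            if open_at is not None:
                # First sample at or below the threshold closes the incident.
                incidents.append((open_at, minute))
                open_at = None
    if open_at is not None:
        # Incident still open at the end of the data.
        incidents.append((open_at, len(samples)))
    return incidents


# Made-up data: CPU sits at 95% but dips to 85% for one minute after each
# 10-minute stretch, so every incident closes a minute after it opens.
samples = [95.0] * 10 + [85.0] + [95.0] * 10 + [85.0]
print(find_incidents(samples))  # [(9, 10), (20, 21)]
```

With data shaped like this, each incident lasts only a minute before it recovers, which matches the short-lived alerts we are seeing.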
Your suggestions are greatly appreciated!