Finding the Where and When of an Incident with Automap

Introducing New Relic One Automap

To quickly resolve issues affecting a cluster of services in your architecture, you need to effortlessly and rapidly identify both where AND when issues originate. Today, with New Relic One Automap, you can.

Wake Up, Something is Wrong

It’s the “3:00 AM” scenario every on-call engineer dreads and yet always expects: a notification indicating that a service is alerting in Production. If your users aren’t already impacted, they likely will be soon. Therefore, triage needs to begin immediately; the incident clock has started, and every minute counts.

Incident Analysis Scope is about When and Where

With modern architectures spanning hundreds of interdependent services, most incidents don’t just involve single services. Instead, they can include dozens of clustered services with issues propagating like a wave, wreaking havoc through your architecture. Often, the service that alerted you to the issue isn’t the cause but one of several services impacted. The true source of the issue may have started further upstream and occurred well before your service ever experienced any degradation.

Built on top of the entity-centric New Relic One platform, New Relic One Automap intelligently expands to include all services related to the issue, providing a full visual scope of the incident. Then, using Automap Timewarp, you can step backward in time and understand how the issue propagated through the architecture and track the issue back to where and when it all started.

First, Determine the When

Establishing the timeline of events is critical in any investigation, and diagnosing an incident is no different. This is especially true when the incident involves multiple services with complex interdependencies! Before you can understand what happened, you first need to scope your investigation by identifying when it happened. When did things begin to go from “operating as expected” to “operating abnormally”?

With Automap, identifying the when is easy. Automap:

  • Automatically constructs a timeline through seamless integration with New Relic One’s Alerting & Anomaly Detection
  • Enables you to explore the incident timeline backward, up to three hours earlier, in one-minute increments
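
To make the idea of stepping backward through an incident timeline concrete, here is a minimal conceptual sketch. It is not Automap's implementation or any New Relic API: the function `find_onset`, the `unhealthy_at` callback, and the sample service names are all hypothetical. Only the three-hour window and the one-minute step come from the description above.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=3)   # lookback window described in the text
STEP = timedelta(minutes=1)   # step size described in the text


def find_onset(alert_time, unhealthy_at):
    """Walk backward from alert_time one minute at a time.

    unhealthy_at is a hypothetical callable mapping a datetime to the set
    of service names that were unhealthy at that minute. Returns a
    (timestamp, services) pair for the earliest minute in the unbroken
    run of unhealthy minutes ending at alert_time, or None if nothing
    was unhealthy at alert_time.
    """
    onset = None
    t = alert_time
    while alert_time - t <= WINDOW:
        services = unhealthy_at(t)
        if not services:
            break  # reached a minute where every service was healthy
        onset = (t, services)
        t -= STEP
    return onset


# Hypothetical sample data: the issue starts in "inventory" and spreads.
alert = datetime(2021, 1, 1, 3, 0)
history = {
    alert: {"checkout", "cart", "inventory"},
    alert - STEP: {"cart", "inventory"},
    alert - 2 * STEP: {"inventory"},
}
onset = find_onset(alert, lambda t: history.get(t, set()))
print(onset)
```

With this sample data, the walk stops at 02:58, two minutes before the alert, with only the hypothetical "inventory" service unhealthy. That gives both a when and a candidate where, even though a different service raised the alert.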

Next, Identify the Where

Only after you’ve established the timeframe for the issue can your investigation proceed to the where: in which services did the issue originate? With its intuitive visualization and automated scope detection, Automap indicates where to direct your analysis. Service health is clearly indicated via the familiar “traffic light” red/yellow/green coloring, and anomalous behavior is indicated by the presence of a purple center. The Automap UI allows you to customize your view to get precisely the information needed without ever feeling overwhelmed.

Now you can find the What

Previously, identifying when and where an incident occurred took minutes, even hours. Now, with Automap in the New Relic One Platform, identifying the true source of an incident in a complex service architecture takes seconds. Only once you’ve determined when and where your issue began can you investigate what happened. With Automap, the when and where are now easy, and with the power of New Relic One, you can quickly answer the question you had when you first woke up: “What happened?”

New Relic One Automap is included for all full users. Check it out now here.