Alerts Best Practice Guide
Alerting is an indispensable practice that keeps your teams informed about potential performance issues, often before they affect your users. You can’t possibly watch your site every second of every day, and with alerting, you don’t have to. You’ll be notified as soon as an issue arises.
In this post, we want to help you set up New Relic Alerts so that you can make the most of this pivotal tool. As with any project, a little pre-planning and preparation will make you more successful, so we’re sharing the features to which we think you should pay the most attention.
Once you have reviewed these best practices, show off your newfound skills: take the Alerts Best Practices Quiz to earn your badge.
Set up Policies and Conditions:
Creating an effective alert policy can be challenging. It takes planning to develop the correct set of conditions (what you are alerting on), thresholds (the values that will trigger the alert), and notification channels (where the alert information will be sent). You can think of a policy as a collection of conditions, all designed to target specific entities (apps, hosts, monitors, etc.). You can have multiple conditions per policy. Notification channels, however, are attached to policies rather than to individual conditions, so you should group your conditions into separate policies according to the team you expect to receive the notification.
- Tutorial: Alert Policies
- Tutorial: Alert Notification Channels
- Defining Alert Conditions
- Configuring Alert Policies
- Notification Channels
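To make the relationship between policies, conditions, and notification channels concrete, here is a minimal sketch in Python. It is only a mental model, not the New Relic API: the `Condition` and `Policy` classes, their field names, and the `group_by_team` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Tuple


@dataclass
class Condition:
    name: str            # what you are alerting on
    metric: str          # e.g. "error_rate" (hypothetical metric name)
    threshold: float     # the value that triggers the alert
    entities: List[str]  # apps, hosts, or monitors the condition targets


@dataclass
class Policy:
    name: str
    # Channels belong to the policy, not to individual conditions.
    channels: List[str] = field(default_factory=list)
    conditions: List[Condition] = field(default_factory=list)


def group_by_team(
    items: Iterable[Tuple[str, str, Condition]]
) -> Dict[str, Policy]:
    """Build one policy per team from (team, channel, condition) tuples."""
    policies: Dict[str, Policy] = {}
    for team, channel, condition in items:
        if team not in policies:
            policies[team] = Policy(name=f"{team} policy")
        policy = policies[team]
        if channel not in policy.channels:
            policy.channels.append(channel)
        policy.conditions.append(condition)
    return policies
```

Grouping this way means that every condition in a policy notifies the same people, which is the reason to split policies along team lines.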
Choose your Incident Preference:
Incident preference is a policy-wide setting that determines how often you receive alert notifications for each policy. By default, if there’s an open incident for a policy, any new violation of any condition within that policy will roll up into that initial incident, and no new notifications will be sent. The incident must be closed, and a new violation must occur, before a new open-incident notification will be sent. Incident preference lets you choose whether a new incident is created per policy, per condition, or per condition and entity. These options produce the fewest to the most notifications, in that order. Setting up an account-wide standard practice for alerts will help ensure you have the notifications you need, when you need them.
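The effect of each incident preference on notification volume can be sketched as follows. This is an illustrative model only: the preference identifiers and the `(policy, condition, entity)` tuple shape are assumptions made for this example, and the sketch assumes no incident is closed while the violations arrive.

```python
from typing import Iterable, Tuple

# Identifiers for the three options described above (assumed names).
PER_POLICY = "PER_POLICY"
PER_CONDITION = "PER_CONDITION"
PER_CONDITION_AND_TARGET = "PER_CONDITION_AND_TARGET"

Violation = Tuple[str, str, str]  # (policy, condition, entity)


def incident_key(violation: Violation, preference: str) -> tuple:
    """Return the key that decides which incident a violation rolls up into."""
    policy, condition, entity = violation
    if preference == PER_POLICY:
        return (policy,)
    if preference == PER_CONDITION:
        return (policy, condition)
    if preference == PER_CONDITION_AND_TARGET:
        return (policy, condition, entity)
    raise ValueError(f"unknown incident preference: {preference}")


def count_notifications(violations: Iterable[Violation], preference: str) -> int:
    """Count new-incident notifications, assuming nothing closes in between."""
    open_incidents = set()
    notifications = 0
    for violation in violations:
        key = incident_key(violation, preference)
        if key not in open_incidents:
            # First violation for this key opens an incident and notifies.
            open_incidents.add(key)
            notifications += 1
        # Otherwise the violation rolls up into the already open incident.
    return notifications
```

Feeding the same list of violations through each preference shows why the per-policy option is the quietest and the per-condition-and-entity option the noisiest.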
Set up a notification channel:
Notification channels dictate where your alert notifications are sent. You can configure a notification channel to notify a specific user on an account or a particular email address, or you can choose one of several pre-configured integrations with messaging services. Additionally, you can use webhooks to send notifications just about anywhere you can imagine. Remember that notification channels are set against policies, so make sure that conditions requiring attention from the same individual or team are grouped into the same policy.
- Notification Channels
- Tutorial: Alerting Incident Life Cycle
- Relic Solution: Alert Incident Preferences are the key to consistent Alert notifications
- Relic Solution: All about webhooks
Understanding what you are seeing
Learn the language of alerting:
New Relic Alerts uses a number of terms that may be specific to New Relic. Understanding these terms helps you get started with alerts quickly and without confusion. Check out our Alerts Glossary doc for definitions.
Explore an Incident:
The incident view in New Relic Alerts groups together all violations that occur in a policy, allowing you to view them as a timeline. Exploring this timeline gives you a better understanding of what triggered an event, what issues followed, and how to address them. A team member can acknowledge an incident to signal that they are working on the issues. An incident closes automatically once all of its violations have closed, or you can close the violations manually.
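The incident life cycle described above can be pictured with a small sketch. The `Incident` class and its method names are hypothetical and exist only to illustrate the rule that an incident stays open while any of its violations is open.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Incident:
    # Maps a violation id to True while that violation is still open.
    violations: Dict[str, bool] = field(default_factory=dict)
    acknowledged_by: Optional[str] = None

    def open_violation(self, violation_id: str) -> None:
        self.violations[violation_id] = True

    def close_violation(self, violation_id: str) -> None:
        # Covers both automatic and manual closing of a violation.
        self.violations[violation_id] = False

    def acknowledge(self, user: str) -> None:
        # Signals that someone is working on the incident.
        self.acknowledged_by = user

    @property
    def is_open(self) -> bool:
        # The incident closes once every violation has closed.
        return any(self.violations.values())
```

Note that acknowledging an incident does not close it; only closing the last open violation does.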
Taking action on your alerts
There are countless possible reasons for an application’s response times to increase, from slow database calls to slow external services and more. Tracking down the cause can take a lot of time. New Relic Alerts Incident Context analyzes your application at the time your thresholds were breached. If it detects slow database calls or slow external services at times that correlate with the alert violation, you’ll see that in the incident UI.
Send Alerts data to Insights:
By default, Alerts events are not available in Insights. However, by using the flexible Webhook Notification Channel to send alert notifications to the Insights Insert API, you can track Alerts as Custom Events. Sending Alerts data to Insights allows you to create dashboards that can answer questions like “Which policies are most frequently violated?” or “What are the five most recent Alerts violations?”
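In practice the webhook channel posts to the Insert API directly, but it can help to see the shape of the request it needs to produce. The sketch below builds (without sending) such a request in Python. Treat the details as assumptions to verify against the Insert API documentation: the collector URL, the `X-Insert-Key` header, the `AlertNotification` event type, and the payload field names in the usage note are illustrative, and both helper functions are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List


def to_insights_event(
    payload: Dict[str, Any], event_type: str = "AlertNotification"
) -> Dict[str, Any]:
    """Flatten a webhook payload into one custom event."""
    event: Dict[str, Any] = {"eventType": event_type}
    for key, value in payload.items():
        # Custom event attributes are flat key/value pairs, so nested
        # lists and objects are skipped in this sketch.
        if isinstance(value, (str, int, float, bool)):
            event[key] = value
    return event


def build_insert_request(
    account_id: int, insert_key: str, events: List[Dict[str, Any]]
) -> urllib.request.Request:
    """Build the POST request for the Insert API (assumed URL and header)."""
    url = (
        "https://insights-collector.newrelic.com"
        f"/v1/accounts/{account_id}/events"
    )
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Insert-Key": insert_key,
        },
    )
```

For example, `build_insert_request(12345, "YOUR_INSERT_KEY", [to_insights_event({"policy_name": "Web policy", "incident_id": 42})])` returns a request object you could pass to `urllib.request.urlopen`. Once events arrive, questions such as which policies are violated most often become queries over the chosen event type.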
Set up a Baseline Alert:
Applications aren’t always easy to predict. Perhaps you have higher throughput during business hours, so the potential for resource exhaustion is higher then than at the weekend. A single fixed threshold for your apps to meet 24/7 is therefore not always a viable option. Baseline alerts help here because they learn from patterns in your data. Initially, they draw on up to two weeks of data. Over longer periods, those patterns are analyzed and used to determine what the baseline is for your application at any given time. This allows you to set up a condition that triggers when any of your standard alertable metrics (error rate, response time, etc.) deviates from what is considered “normal” for your application.
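To illustrate the general idea of alerting on deviation from a learned pattern rather than on a fixed threshold, here is a deliberately simple sketch. It is not New Relic’s baseline algorithm: it assumes a per-hour-of-week mean and standard deviation as the “baseline”, and the function names and the `tolerance` parameter are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, Tuple


def build_baseline(
    history: Iterable[Tuple[int, float]]
) -> Dict[int, Tuple[float, float]]:
    """Learn (mean, standard deviation) per hour of week from past samples.

    history yields (hour_of_week, value) pairs, with hour_of_week in 0..167.
    """
    buckets = defaultdict(list)
    for hour, value in history:
        buckets[hour].append(value)
    return {hour: (mean(vals), pstdev(vals)) for hour, vals in buckets.items()}


def deviates(
    baseline: Dict[int, Tuple[float, float]],
    hour: int,
    value: float,
    tolerance: float = 3.0,
) -> bool:
    """True when value is more than `tolerance` deviations from that hour's mean."""
    if hour not in baseline:
        return False  # no pattern learned yet for this hour
    average, spread = baseline[hour]
    return abs(value - average) > tolerance * spread
```

With a baseline like this, a throughput value that is perfectly ordinary on a weekday morning can still trigger during a quiet weekend hour, which a single static threshold cannot express.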
Ready to Learn More?
Looking for more Alerts best practices and tips? Check out the Alerts Level Up category.