Relic Solution: Setting Up Disk Alerts In Infrastructure

Infrastructure is a great product for monitoring all aspects of your servers. However, ensuring that you set up your alert conditions properly is the key to getting notified when something is going wrong.

Let’s take disk alert conditions. Sometimes these alert conditions get set up to monitor disk fullness, but then the alert condition doesn’t open a violation when a disk on one of the targeted hosts violates the threshold.

Why is this?

It can be important, when setting up disk alert conditions, to make sure that you’re targeting Storage Metrics, not System Metrics, depending on your needs.

System Metrics focuses on overall system health. If you target this in your disk fullness alert condition, the condition will never fire unless all the storage capacity on a given host combined reaches the specified threshold.

Here’s an example of what I mean. Let’s say you have a host containing 3 disks: an 80GB disk and two 10GB disks. If you’re using System Metrics for your disk fullness alert condition, our alerts evaluation system will treat this as a single 100GB system. If your threshold is set to alert you when disk fullness goes over 90%, more than 90GB of disk space would have to be in use before that alert condition opens a violation. Even if both of the 10GB disks were completely full, the alert condition would not open a violation until the 80GB disk held more than 70GB.
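The arithmetic in that example can be checked with a short sketch. This is plain Python, not anything from the Infrastructure product; the function name and the 60GB starting figure are assumptions made up for illustration.

```python
# Disk capacities from the example above, in GB.
capacities_gb = [80, 10, 10]
threshold_percent = 90


def combined_used_percent(used_gb, capacities_gb):
    """Percent used when all disks on a host are treated as one pool."""
    return 100 * sum(used_gb) / sum(capacities_gb)


# Assumed state: both 10GB disks are full and the 80GB disk holds 60GB.
current = combined_used_percent([60, 10, 10], capacities_gb)
print(current)  # 80.0, still under the 90% threshold

# Space the 80GB disk must hold before the combined figure reaches 90%,
# given that the two small disks already contribute 20GB.
needed_on_large_disk_gb = threshold_percent * sum(capacities_gb) / 100 - 20
print(needed_on_large_disk_gb)  # 70.0
```

Two completely full disks leave the combined figure at 80%, which is why a host-level condition stays quiet in this scenario.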

OK, so how should I set up my disk fullness alert conditions?

If you scope your alert condition to Storage Metrics instead, you will be able to filter the list of hosts that’s being targeted, and you will also be able to filter on every single disk attached to that list of hosts. An alert condition scoped to Storage Metrics will then open a violation whenever any single disk violates the threshold you specify.
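As a rough illustration of per-disk evaluation, here is a minimal Python sketch. It does not call any New Relic API; the mount names, the `(used, capacity)` layout, and the function name are hypothetical.

```python
# Hypothetical host: mount point -> (used GB, capacity GB).
disks = {
    "/data": (60, 80),
    "/var": (10, 10),
    "/tmp": (10, 10),
}


def disks_over_threshold(disks, threshold_percent):
    """Return the mounts whose own used percentage exceeds the threshold."""
    return sorted(
        mount
        for mount, (used_gb, capacity_gb) in disks.items()
        if 100 * used_gb / capacity_gb > threshold_percent
    )


# Each disk is judged on its own, so the two full disks are reported
# even though the host as a whole is only 80% used.
print(disks_over_threshold(disks, 90))  # ['/tmp', '/var']
```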

So I should always use Storage Metrics for disk alert conditions?

It depends on your use case. It can be useful to know about host-level disk performance – that’s when you use System Metrics. At other times, you will want to scope to individual disks using Storage Metrics. Regardless of what your use case is, now you know the difference and can set up alert conditions on your account to better suit your needs.


Is it possible to alert on some value other than % utilization or to make % utilization calculated based on a fixed value? For example, a 1GB volume that is 90% used has 100MB free whereas a 1TB volume has 100GB free. Assuming a utilization rate of 10MB per hour, that 1TB volume has nothing to worry about at 90%.

We’ve forced this in other environments by setting thresholds that are calculated on the back-end and then injected into the environment as a custom % threshold. Maybe a custom metric as talked about in Relic Solution: How to Set Up Alertable Metrics Using Infrastructure SDK?


Hi @jbiggley

That’s a very interesting suggestion! It sounds like you’d like to set up an alert condition that will let you know when fill rate rises above a certain percentage of free space.

Although this is currently not possible with our products out of the box, I think you could probably set something up using the Infrastructure SDK and custom metrics.
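One value such a custom metric might report is the estimated time until a volume fills up. The sketch below only shows that calculation in Python; it does not use the Infrastructure SDK, and the function name and inputs are assumptions based on the round numbers in the question above.

```python
def hours_until_full(free_mb, fill_rate_mb_per_hour):
    """Estimate hours remaining before a volume runs out of space.

    Returns infinity when the volume is not filling (rate of zero or less).
    """
    if fill_rate_mb_per_hour <= 0:
        return float("inf")
    return free_mb / fill_rate_mb_per_hour


# 1GB volume at 90% used: about 100MB free, filling at 10MB per hour.
print(hours_until_full(100, 10))  # 10.0

# 1TB volume at 90% used: about 100GB (100,000MB) free, same fill rate.
print(hours_until_full(100_000, 10))  # 10000.0
```

Both volumes sit at the same percentage, but one has ten hours left and the other has more than a year, which is the gap a percentage threshold cannot express.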

Let us know how it goes!

