DISK I/O Metrics


#1

Answer:

Disk I/O is measured as time spent servicing requests. On Linux, we measure it as “wall-clock time with read or write requests being processed by the disk”. This value comes from a counter that the OS keeps in /proc, which the Server Monitor Agent reads. The Disk I/O graph shows the percentage of time a disk is busy with a read or write command. For example, if a disk reports a read or write for 55 seconds out of a minute, the value shown is 92% Disk I/O Utilization.
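As a rough illustration of the arithmetic above, here is a minimal Python sketch. The function name and its parameters are hypothetical and are not taken from the Server Monitor Agent; it only assumes two readings of a cumulative "busy time" counter in milliseconds, taken a known interval apart.

```python
def utilization_percent(busy_ms_start, busy_ms_end, interval_ms):
    """Percent of wall-clock time the disk was busy between two counter readings.

    Hypothetical helper: busy_ms_* are cumulative "time spent doing I/O"
    readings in milliseconds, interval_ms is the wall-clock time between them.
    """
    busy_ms = busy_ms_end - busy_ms_start
    return 100.0 * busy_ms / interval_ms


# The disk was busy for 55 seconds of a 60-second sample window.
print(round(utilization_percent(0, 55_000, 60_000)))  # 92
```

Note that the result depends only on how long the disk was busy, not on how much data moved during that time.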

This shows only the percentage of time that the disk is in use; it does not reflect how full the disk is or how much data is being read or written. Databases most commonly show very high Disk I/O utilization because of their large number of read and write requests. The metric is not equivalent to IOPS (I/O operations per second), nor to “capacity”, meaning how close your usage is to the maximum possible. So you can imagine a situation where a program that logs twice a second brings utilization to 100%, even though you could successfully write 10x as much data to the same disk without issue.

We gather this information directly from the Linux OS by reading the /proc file. From this we gather:

IO Utilization (percentage of time spent reading and writing)
IO Rate (amount of data being read and written over time, in kb/s)
IO Operations per second (how many operations are executed per second)
Disk Space utilization
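To make the first three metrics concrete, here is a hedged Python sketch that derives them from two samples of a disk counter line. It assumes the counters come from /proc/diskstats in the field layout described in the Linux kernel documentation (the post above only says "/proc"), and the sample lines, `DiskSample`, `parse_diskstats_line`, and `derive_metrics` are invented for illustration. It is not the agent's actual implementation, and disk space utilization is left out.

```python
from typing import NamedTuple

SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


class DiskSample(NamedTuple):
    reads: int            # reads completed
    sectors_read: int
    writes: int           # writes completed
    sectors_written: int
    busy_ms: int          # time spent doing I/O, in milliseconds


def parse_diskstats_line(line):
    """Return (device_name, DiskSample) from one /proc/diskstats-style line."""
    f = line.split()
    return f[2], DiskSample(int(f[3]), int(f[5]), int(f[7]), int(f[9]), int(f[12]))


def derive_metrics(before, after, interval_s):
    """Turn two cumulative samples taken interval_s seconds apart into rates."""
    ops = (after.reads - before.reads) + (after.writes - before.writes)
    sectors = (after.sectors_read - before.sectors_read) + (
        after.sectors_written - before.sectors_written
    )
    return {
        "utilization_pct": 100.0 * (after.busy_ms - before.busy_ms) / (interval_s * 1000),
        "rate_kb_s": sectors * SECTOR_BYTES / 1024 / interval_s,
        "iops": ops / interval_s,
    }


# Two made-up samples for the same device, taken 60 seconds apart.
line_t0 = "   8       0 sda 1000 0 80000 500 2000 0 160000 900 0 1200 1400"
line_t1 = "   8       0 sda 1600 0 128000 800 2600 0 208000 1300 0 34200 2100"

name, before = parse_diskstats_line(line_t0)
_, after = parse_diskstats_line(line_t1)
print(name, derive_metrics(before, after, 60))
```

With these made-up numbers the disk is busy 55% of the time while handling 20 operations per second at 800 KB/s, which shows that utilization, rate, and IOPS are separate measurements.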


#2

I’ve noticed a big problem with the disk I/O graphs (or perhaps any New Relic graphs, for that matter). I got an alert for disk I/O hitting 100% for about 40 minutes. Looking at the alert’s detail graph, it’s very precise. However, when I go to the 12-hour view, it shows just a pinprick at 100% (where it should still have shown the problem much more accurately). Even worse, when I switch to the 24-hour view, it doesn’t show anything exceeding 50%. That’s inaccurate and misleading. It appears that the graph is built using averaging, not actual data. If so, that’s not a good method: it can and does lose efficacy and accuracy, two traits that are critical to our monitoring needs.


#3

All New Relic graphs gradually aggregate data over longer time windows as the time period that you’re looking at increases. The exact behavior of this windowing is described in our documentation. (This behavior is also influenced by how old the data is.)

We do this in order to make graphs more readable at longer time windows, while preserving the granularity at shorter time windows. Until the data ages out, you can always zoom in to see the greatest accuracy.
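The effect of this aggregation on a short spike can be sketched in Python. The bucket sizes, the 10% baseline, and the `average_buckets` helper below are assumptions chosen for illustration only; they are not New Relic's actual windowing rules, which are described in the documentation.

```python
def average_buckets(samples, bucket_size):
    """Average consecutive samples into buckets of bucket_size points."""
    buckets = []
    for i in range(0, len(samples), bucket_size):
        chunk = samples[i:i + bucket_size]
        buckets.append(sum(chunk) / len(chunk))
    return buckets


# 24 hours of per-minute utilization: a 10% baseline with a 40-minute spike at 100%.
samples = [10.0] * (24 * 60)
for minute in range(600, 640):
    samples[minute] = 100.0

print(max(average_buckets(samples, 1)))    # 100.0 (per-minute points)
print(max(average_buckets(samples, 60)))   # 70.0  (hourly averages)
print(max(average_buckets(samples, 120)))  # 40.0  (two-hour averages)
```

Under these assumed bucket sizes, the same 40-minute spike peaks at 100%, 70%, or 40% depending on how wide the averaging window is, which matches the behavior of zooming out and seeing a lower peak.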


archived #4