NRQL query | Alerts

Hello, :slight_smile:

How can I get CPU utilization % using K8sNodeSample? Could you kindly correct my query below…

FROM K8sNodeSample SELECT latest(cpuUsedCores) - (How to get total core)/100 WHERE clusterName='mycluster' FACET nodeName

Or is there any other query that I can use to set up a high CPU utilization alert for pods? Thank you!

Hey @Madhu_Sharma1

Does the coreCount attribute work for you?

Then, for percentage used, you can try:

SELECT ((latest(numeric(cpuUsedCores)) / latest(numeric(coreCount))) * 100) AS 'CPU Used %%' FROM K8sNodeSample WHERE cpuUsedCores is NOT NULL AND coreCount IS NOT NULL SINCE 3 HOURS AGO TIMESERIES AUTO

Here’s what that looks like on some demo data over 3hrs:


@RyanVeitch You rock!! Thank you :slight_smile:


No worries! Glad to help :smiley:

@RyanVeitch If you could also help me verify two other queries, that would be great… How do I get memory used % and disk used % for the nodes in a cluster? I don’t see any attribute for total memory bytes here:

SELECT (latest(numeric(memoryUsedBytes)) / latest(What will go here)) AS 'Memory Used %%' FROM K8sNodeSample WHERE clusterName='MyclusterName' FACET nodeName TIMESERIES

Also, below is the one that I am using for disk used %. Please correct it if the query looks wrong:

SELECT (latest(numeric(fsAvailableBytes)) / latest(numeric(fsCapacityBytes)) *100) AS 'Disk/storage Used %%' FROM K8sNodeSample WHERE clusterName='MyClusterName' FACET nodeName TIMESERIES


Hey @Madhu_Sharma1

I don’t see an attribute that maps to total memory bytes either, but I think this is workable:

SELECT (latest(memoryUsedBytes) / latest(memoryAvailableBytes + memoryUsedBytes)) * 100 AS 'Memory Used %%' FROM K8sNodeSample WHERE clusterName = 'MyClusterName' FACET nodeName TIMESERIES 

Here we are adding the memoryAvailableBytes to the memoryUsedBytes to synthetically create our own memoryTotalBytes.
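The arithmetic behind that query can be sketched in Python. This is only an illustration, assuming that used plus available bytes is a reasonable stand-in for total memory; the function name and the sample values are hypothetical and do not come from any New Relic API.

```python
def memory_used_percent(used_bytes: float, available_bytes: float) -> float:
    """Percentage of memory used, with the total synthesized as used + available."""
    total_bytes = used_bytes + available_bytes  # stand-in for a memoryTotalBytes attribute
    if total_bytes <= 0:
        raise ValueError("used + available must be positive")
    return 100.0 * used_bytes / total_bytes


# Hypothetical node: 6 GiB used, 10 GiB available.
print(memory_used_percent(6 * 1024**3, 10 * 1024**3))
```

If the agent counts cache or buffers differently from the chart you are comparing against, the synthesized total may not match the node's physical memory exactly.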


Also - your disk used query looks fine to me :slight_smile:


@RyanVeitch If I compare the node resource utilization graph from cluster explorer with the graphs that I get from the three queries we discussed above (CPU, disk and memory), I get different results.

For instance, cluster explorer says node1 disk usage is 20% but my NRQL query shows 80% :confused:
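One possible explanation for a 20% versus 80% gap, offered as an assumption rather than a confirmed diagnosis: dividing fsAvailableBytes by fsCapacityBytes gives the share of the filesystem that is free, not the share that is used. The Python sketch below works through that arithmetic with hypothetical numbers; the function names are illustrative only.

```python
def disk_available_percent(available_bytes: float, capacity_bytes: float) -> float:
    """What available / capacity * 100 actually measures: free space."""
    return 100.0 * available_bytes / capacity_bytes


def disk_used_percent(available_bytes: float, capacity_bytes: float) -> float:
    """Used space derived from the same two values: (capacity - available) / capacity."""
    return 100.0 * (capacity_bytes - available_bytes) / capacity_bytes


# Hypothetical 100 GB filesystem with 80 GB still free.
capacity = 100 * 10**9
available = 80 * 10**9
print(disk_available_percent(available, capacity))  # the free share
print(disk_used_percent(available, capacity))       # the used share
```

If this is what is happening, wrapping the ratio as 100 minus the available percentage in the NRQL query should bring the two charts closer. Whether K8sNodeSample also exposes a used-bytes attribute directly is worth checking in the data dictionary.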

So these queries are looking at the latest reported value. It’s possible the chart you are looking at is average.

You can change that in your queries from latest() to average() to see if that helps bring these into alignment.
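To see why the two aggregations can disagree, here is a small Python sketch over a made-up series of utilization samples. The values are hypothetical, and the function only mimics the idea of latest() versus average() over a time window; it is not how NRQL is implemented.

```python
from statistics import mean


def latest_and_average(samples: list[float]) -> tuple[float, float]:
    """Return (most recent sample, mean of all samples) for one time window."""
    if not samples:
        raise ValueError("need at least one sample")
    return samples[-1], mean(samples)


# Hypothetical utilization percentages, oldest first, ending in a spike.
window = [35.0, 40.0, 90.0]
latest_value, average_value = latest_and_average(window)
print(latest_value, average_value)
```

A spike in the final sample pulls the latest value well above the average, so a chart using one aggregation can look quite different from a chart using the other.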


I think I got the answer for disk. My query is just looking at the sda1 drive. How can I monitor the other drives that I have?

Also, the CPU graph looks fine too… but I still don’t know why the disk graph is mismatched.

Here are the links to my alert policy and to cluster explorer (time range duration of 1800000 ms).

Also, what is the difference between the storage graph and disk used %?

Sorry for too many questions…

@RyanVeitch Should I open a new ticket if this is closed?

Hi @Madhu_Sharma1 - no here is ok! Sorry, I didn’t see your reply earlier…

Can you clarify the exact disk charts you are hoping to see matched to the query we built?

As for the difference between Storage Used % and Disk Usage, below are the queries showing each of these:

Storage Usage %

SELECT average(diskUsedPercent) as 'Storage used %' FROM StorageSample WHERE `entityGuid` = 'anEntityGuid' TIMESERIES auto

Disk Usage

SELECT latest(diskUsedPercent) as 'Used %' FROM StorageSample FACET device WHERE entityId = 'anEntityId' LIMIT 4

So the main difference is that they use different aggregations. Disk Usage looks at the latest report, and Storage Usage looks at an average over the selected timeframe. Both of these query the Infrastructure agent’s default StorageSample event type, not a K8s-specific event type, which may also produce some differences between these charts and your queries, which look at the K8sNodeSample event type.

@RyanVeitch It’s okay :slight_smile: Thanks for your response.

So, for my node I see a memory utilization of 21% in cluster explorer, and for the same node I see 44% using our query. Why is there a difference?

Hi @Madhu_Sharma1

The queries here are different, so the result sets are expected to be different too.

The Memory Usage you see in your K8s Cluster Explorer is using the query:

SELECT average(memoryUsedBytes/memoryTotalBytes*100) AS 'Memory used %' FROM SystemSample WHERE `entityGuid` = 'myEntityGuid' TIMESERIES auto

The query you are running for alerts looks at the latest memory data from the K8s event type, whereas the query above looks at averages from the SystemSample event type, which is not a K8s-specific event.

You can absolutely use this query for alerts if your goal is to match the Memory Usage chart in the cluster explorer:

SELECT average(memoryUsedBytes/memoryTotalBytes*100) AS 'Memory used %' FROM SystemSample WHERE `entityGuid` = 'myEntityGuid' TIMESERIES auto

@RyanVeitch But we don’t get SystemSample in the Kubernetes integration.

No - that’s an infrastructure metric. This is what the Cluster Explorer is using to chart these metrics though.

Do you have the Infra agent installed? If not, you may be able to get close with the k8s Events, but not exactly match the charts that are built on Infrastructure agent data.