Kubernetes - Get total allocated CPU cores for a cluster

I am trying to get the total allocated CPU cores in a cluster. Ideally, this is the sum of all container CPU requests in the cluster. At node level, the value can be retrieved by running the kubectl describe node <node-name> command, and we used to sum these per-node values to get the total allocated CPU in a cluster.
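For reference, the "sum of all container requests" can be sketched in Python. This is only a minimal sketch, assuming the pod list has been saved beforehand (for example from kubectl get pods --all-namespaces -o json) and parsed into a dict. The function names are made up for illustration, and the sketch ignores init containers and pod overhead, so it may differ slightly from what kubectl describe node reports.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m', '2' or '0.5' to millicores.

    Only the plain and 'm' forms are handled here (an assumption of this sketch).
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def requested_millicores(pod_list: dict) -> int:
    """Sum container CPU requests over pods that are scheduled and not finished."""
    total = 0
    for pod in pod_list.get("items", []):
        # Skip pods that have completed or failed; they no longer hold resources.
        if pod.get("status", {}).get("phase") in ("Succeeded", "Failed"):
            continue
        # Skip pods that are not assigned to a node yet (assumption: only
        # scheduled pods count as "allocated").
        if not pod.get("spec", {}).get("nodeName"):
            continue
        for container in pod["spec"].get("containers", []):
            cpu = container.get("resources", {}).get("requests", {}).get("cpu")
            if cpu is not None:
                total += cpu_to_millicores(cpu)
    return total


# Hypothetical sample data, shaped like the "items" of a pod list.
sample = {
    "items": [
        {"status": {"phase": "Running"},
         "spec": {"nodeName": "node-a",
                  "containers": [{"resources": {"requests": {"cpu": "250m"}}},
                                 {"resources": {"requests": {"cpu": "1"}}}]}},
        {"status": {"phase": "Succeeded"},
         "spec": {"nodeName": "node-a",
                  "containers": [{"resources": {"requests": {"cpu": "500m"}}}]}},
        {"status": {"phase": "Running"},
         "spec": {"nodeName": "node-b",
                  "containers": [{"resources": {}}]}},
    ]
}
print(requested_millicores(sample) / 1000, "cores requested")
```

The finished pod and the container without a CPU request contribute nothing to the total.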

Now we are switching over to NR and trying to create dashboards with these values. We use the NR infra agent to collect metrics from our clusters and build dashboards from those metrics, but we are unable to find a metric that would let us compute the total allocated CPU cores.

We tried the following query:
SELECT latest(cpuRequestedCores)/1000 FROM K8sNodeSample FACET clusterName, nodeName LIMIT MAX
But this returned a value higher than the coreCount (e.g., 27 cores were returned, while the node has only 16 cores).

We later realized that this value also included the container requests associated with ‘Terminated’ pods, and we couldn’t find a way to filter out such pods (we are only interested in ‘Running’ pods).

We later tried a different query using K8sContainerSample:
SELECT sum(rcores) FROM (SELECT latest(cpuRequestedCores) AS rcores FROM K8sContainerSample WHERE status != 'Terminated' FACET clusterName, nodeName, namespace, podName, containerName LIMIT MAX) FACET clusterName LIMIT MAX

This query does work, but only for smaller clusters, because NR has a maximum limit of 2000 rows per query. One of our clusters has more than 2000 containers running, so the inner query truncates the results to 2000 rows. This produces an incorrect value for large clusters (i.e., it shows a lower value of allocated CPU cores than the real one).
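One possible way around the row limit is to run the inner query once per node and add the results up on the client side. The Python sketch below only builds the NRQL strings and sums the numbers that come back; how the queries are actually executed is left out. It assumes that no single node hosts more containers than the row limit, and the helper names per_node_query and cluster_total are hypothetical.

```python
def per_node_query(cluster_name: str, node_name: str) -> str:
    """Build an NRQL query that sums container CPU requests for a single node."""
    for name in (cluster_name, node_name):
        # Assumption: names never contain quotes, so no escaping is attempted.
        if "'" in name:
            raise ValueError(f"unsupported character in name: {name!r}")
    inner = (
        "SELECT latest(cpuRequestedCores) AS rcores FROM K8sContainerSample "
        f"WHERE status != 'Terminated' AND clusterName = '{cluster_name}' "
        f"AND nodeName = '{node_name}' "
        "FACET namespace, podName, containerName LIMIT MAX"
    )
    return f"SELECT sum(rcores) FROM ({inner})"


def cluster_total(per_node_results: list[float]) -> float:
    """Add up the per-node sums returned by the queries above."""
    return sum(per_node_results)


# Hypothetical cluster and node names, for illustration only.
for node in ("node-a", "node-b"):
    print(per_node_query("my-cluster", node))
```

Each inner query now facets only on the containers of one node, so it stays below the limit as long as the assumption above holds.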

Need your help on this. It would have been nice if cpuRequestedCores in K8sNodeSample could return the allocated cpu resources based on ‘Running’ pods.

kubectl describe node nod...

.....
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests           Limits
  --------           --------           ------
  cpu                9581m (60%)        41216m (259%)
  memory             35942203776 (58%)  70791972352 (114%)
  ephemeral-storage  0 (0%)             0 (0%)
  hugepages-1Gi      0 (0%)             0 (0%)
  hugepages-2Mi      0 (0%)             0 (0%)

9581m is the allocated CPU on the node. I couldn’t find this information in the NR metrics.
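The per-node summing we used to do by hand could be scripted roughly as follows. This is a sketch that assumes the output of kubectl describe node has already been captured as text for each node; the function name allocated_cpu_millicores is made up for illustration, and the parser relies on the table layout shown above.

```python
import re

# Matches the "cpu" row of the "Allocated resources" table, e.g. "cpu  9581m (60%)".
CPU_ROW = re.compile(r"^\s*cpu\s+(\d+)(m?)\s+\(\d+%\)", re.MULTILINE)


def allocated_cpu_millicores(describe_output: str) -> int:
    """Extract the CPU requests value from one node's describe output."""
    parts = describe_output.split("Allocated resources:", 1)
    if len(parts) < 2:
        raise ValueError("no 'Allocated resources' section found")
    match = CPU_ROW.search(parts[1])
    if match is None:
        raise ValueError("no cpu row found in 'Allocated resources'")
    value, suffix = match.groups()
    # A plain number means whole cores; an 'm' suffix means millicores.
    return int(value) if suffix == "m" else int(value) * 1000


sample_output = """\
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests           Limits
  --------           --------           ------
  cpu                9581m (60%)        41216m (259%)
  memory             35942203776 (58%)  70791972352 (114%)
"""

# Summing over the captured outputs of all nodes gives the cluster total.
outputs = [sample_output]
total = sum(allocated_cpu_millicores(text) for text in outputs)
print(total / 1000, "cores allocated")
```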

@dinup.pillai Welcome to the community! Thank you for being patient with getting a response to your issue. Looking at your account, I see you do have access to ticketing support, so I’m going to open a ticket so that our team can take a closer look at this with you. Expect an email from them soon!