How to view remaining request capacity


I would like to build a dashboard/alert that shows the sum of K8sContainerSample.cpuRequestedCores versus the total number of schedulable cores in the cluster. I have not been able to figure this out.

For requested cores I can only get the latest values like this:
FROM K8sContainerSample SELECT latest(cpuRequestedCores) FACET containerID
But I am not able to sum the values.

For the total cores in the cluster, I am unsure where to get this metric from.

Any pointers would be appreciated.


Hi @mike.luedke ,

One trick for summing values is to multiply the average by the number of instances.
In this case you could do:
FROM K8sContainerSample SELECT average(cpuRequestedCores)*uniqueCount(containerID)
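
To see why this works, here is a small Python sketch of the arithmetic. The container IDs and values are made up for illustration; the code only mimics what average() and uniqueCount() would compute over a set of events, it does not query New Relic.

```python
# Hypothetical events: (containerID, cpuRequestedCores),
# one event per container per sample interval, in time order.
samples = [
    ("c1", 0.5), ("c2", 1.0), ("c3", 2.0),  # interval 1
    ("c1", 0.5), ("c2", 1.0), ("c3", 2.0),  # interval 2
]

def average_times_unique_count(events):
    """Mimics average(cpuRequestedCores) * uniqueCount(containerID)."""
    values = [cores for _, cores in events]
    unique_ids = {cid for cid, _ in events}
    return sum(values) / len(values) * len(unique_ids)

def sum_of_latest(events):
    """Sums the most recent value seen for each container."""
    latest = {}
    for cid, cores in events:
        latest[cid] = cores  # later events overwrite earlier ones
    return sum(latest.values())

estimate = average_times_unique_count(samples)
exact = sum_of_latest(samples)
```

Both figures agree (3.5 cores) here because every container contributes the same number of samples to the window. If containers report unevenly or come and go during the window, the two can diverge.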

The number of cores is reported by the underlying OS in the SystemSample event type.
To get the total requested cores versus the total cores, you would use something like:
from SystemSample,K8sContainerSample SELECT (average(cpuRequestedCores)*uniqueCount(containerID))/( average(numeric(coreCount))*uniqueCount(hostname) )



Thanks for this suggestion. Unfortunately the result seems to be way off from the actual values that we see in kube-state-metrics. I think the problem with this approach is that we lose too much fidelity when we are talking about thousands of containers.

Edit: I also realized the suggestion above averages all historical values, not just current values. That makes the calculation invalid.
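
A rough Python sketch of that effect, with made-up numbers (the container IDs and values are hypothetical, and the function only imitates average() * uniqueCount() over a list of events):

```python
# Hypothetical events: (minute, containerID, cpuRequestedCores).
# c1 and c2 were replaced by c3 and c4 halfway through the window,
# so only two containers are ever running at the same time.
events = [
    (1, "c1", 1.0), (1, "c2", 1.0),
    (2, "c1", 1.0), (2, "c2", 1.0),
    (3, "c3", 1.0), (3, "c4", 1.0),
    (4, "c3", 1.0), (4, "c4", 1.0),
]

def average_times_unique_count(rows):
    """Imitates average(cpuRequestedCores) * uniqueCount(containerID)."""
    values = [cores for _, _, cores in rows]
    ids = {cid for _, cid, _ in rows}
    return sum(values) / len(values) * len(ids)

# Whole window: counts all four container IDs, including the two that are gone.
whole_window = average_times_unique_count(events)

# Last minute only: counts just the two containers that currently exist.
last_minute = average_times_unique_count([e for e in events if e[0] == 4])
```

Only 2 cores are requested at any point in time, but the whole-window figure comes out as 4 because uniqueCount also counts containers that no longer exist. With thousands of containers and regular churn, the gap grows with the length of the window.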


You could restrict the timeframe so that averages are more accurate:
from SystemSample,K8sContainerSample SELECT (average(cpuRequestedCores)*uniqueCount(containerID))/( average(numeric(coreCount))*uniqueCount(hostname) ) since 2 minutes ago until 1 minute ago

Another approach is to chart the data as a time series so you can see how it evolves:
from SystemSample,K8sContainerSample SELECT (average(cpuRequestedCores)*uniqueCount(containerID))/( average(numeric(coreCount))*uniqueCount(hostname) ) since 1 day ago TIMESERIES