Reduce the amount of data ingested by the Kubernetes integration

Hi. I’d like to know if there’s any way to reduce the amount of data ingested from the Kubernetes integration, for example by increasing the sampling interval. We’re using the nri-bundle for Kubernetes, and you can tune the sampling interval for host metrics; however, there’s no equivalent configuration for the Kubernetes metrics that kube-state-metrics exposes. Is there a way to do so?

Thank you.

Hi @ferran3, welcome to the Explorers Hub.

One option would be to disable ksm metrics:

https://docs.newrelic.com/docs/integrations/kubernetes-integration/installation/kubernetes-integration-install-configure#disable-kube-state-metrics

This would have the following effect:

Disabling kube-state-metrics also disables data collection for the following:

ReplicaSets
DaemonSets
StatefulSets
Namespaces
Deployments
Services
Endpoints
Pods (that are pending)

Additionally, disabling this affects the Kubernetes Cluster Explorer in the following ways:

No pending pods are shown.
No filters based on services.

You can also limit which processes are sent to New Relic, as well as change the process sample rate.

https://docs.newrelic.com/docs/integrations/kubernetes-integration/installation/kubernetes-integration-install-configure#include-matching-metrics

Could you be more specific and show us how to do this with helm? Disabling ksm is pretty trivial with helm, but modifying the yaml file for the infrastructure agent doesn’t work exactly the way the documentation describes…

For example, if I use the following command to deploy the nri-bundle, can you confirm that it will indeed change the storage_sample_rate to once every 50 seconds?
helm upgrade --install newrelic-bundle newrelic/nri-bundle \
  --namespace=default \
  --set global.licenseKey=MyKey \
  --set global.cluster=MyCluster \
  --set kubeEvents.enabled=true \
  --set webhook.enabled=true \
  --set prometheus.enabled=true \
  --set logging.enabled=true \
  --set ksm.enabled=true \
  --set infrastructure.metrics_storage_sample_rate=50

@troy_knapp Did you ever solve this?

I’m still kind of a New Relic newb, so I don’t know how to check whether it’s working or not. My clusters vary pretty wildly, so it’s hard to just look at something like the changes in storage sampling when you’re constantly scaling.

The documentation on helm certainly leaves a lot to be desired.

That sample rate switch is probably not quite right. Try:
--set newrelic-infrastructure.config.metrics_storage_sample_rate=50

Also, you can disable particular ksm collectors:
--set kube-state-metrics.collectors.certificatesigningrequests=false

As @nmcnamara says, some collectors are required to drive the New Relic interface, so testing is needed.

Where exactly should this be done? I need to install nri-bundle without the “replicasets” collector, so I tried a Helm chart installation (using Helm 3, with --set kube-state-metrics.collectors.replicasets=false), but it fails with the following message:

Error: expected at most two arguments, unexpected arguments: –-set, kube-state-metrics.collectors.replicasets=false
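Judging by the `–-set` in the error message, one possible cause is that the command was pasted with a typographic dash instead of two ASCII hyphens, so Helm treats the flag as an extra positional argument. This is a guess, not something confirmed in the thread. A minimal sketch for checking a command string before running it (`is_plain_ascii` is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the argument contains only printable
# ASCII characters, 1 if it contains anything else (for example an en dash).
is_plain_ascii() {
  # In the C locale, the bracket range ' ' to '~' covers printable ASCII bytes.
  if printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'; then
    return 1
  fi
  return 0
}

# Example: warn before running a pasted command.
cmd='helm install newrelic-bundle newrelic/nri-bundle --set kube-state-metrics.collectors.replicasets=false'
if is_plain_ascii "$cmd"; then
  echo "command looks like plain ASCII"
else
  echo "command contains non-ASCII characters; retype the dashes by hand"
fi
```

If the check fails, retyping the `--set` flags manually in the terminal is usually enough.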

Hi dpasseri

I can disable ksm collectors with the following Helm chart install command:

helm install newrelic-bundle newrelic/nri-bundle \
  --set global.licenseKey=LICENSE_KEY \
  --set global.cluster=MYK8S \
  --namespace=newrelic \
  --set infrastructure.enabled=true \
  --set newrelic-infrastructure.config.metrics_network_sample_rate=30 \
  --set newrelic-infrastructure.config.metrics_process_sample_rate=30 \
  --set newrelic-infrastructure.config.metrics_storage_sample_rate=30 \
  --set newrelic-infrastructure.config.metrics_system_sample_rate=30 \
  --set newrelic-infrastructure.config.metrics_nfs_sample_rate=30 \
  --set prometheus.enabled=false \
  --set ksm.enabled=true \
  --set kube-state-metrics.collectors.certificatesigningrequests=false \
  --set kube-state-metrics.collectors.configmaps=false \
  --set kube-state-metrics.collectors.ingresses=false \
  --set kube-state-metrics.collectors.limitranges=false \
  --set kube-state-metrics.collectors.mutatingwebhookconfigurations=false \
  --set kube-state-metrics.collectors.networkpolicies=false \
  --set kube-state-metrics.collectors.poddisruptionbudgets=false \
  --set kube-state-metrics.collectors.persistentvolumeclaims=false \
  --set kube-state-metrics.collectors.persistentvolumes=false \
  --set kube-state-metrics.collectors.replicationcontrollers=false \
  --set kube-state-metrics.collectors.resourcequotas=false \
  --set kube-state-metrics.collectors.secrets=false \
  --set kube-state-metrics.collectors.statefulsets=false \
  --set kube-state-metrics.collectors.storageclasses=false \
  --set kube-state-metrics.collectors.validatingwebhookconfigurations=false \
  --set kube-state-metrics.collectors.volumeattachments=false \
  --set kubeEvents.enabled=true

I have checked the ksm pod log and confirmed it works. Hope this can help you.
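With this many flags, a values file may be easier to maintain. The sketch below writes the same settings as YAML; the key names are copied from the `--set` flags above, and the file name `nri-bundle-values.yaml` is an arbitrary choice. Whether a given chart version still honours these keys is not verified here.

```shell
#!/bin/sh
# Write a Helm values file mirroring the --set flags from the post above.
# The file name is an assumption made for this sketch.
VALUES_FILE="${VALUES_FILE:-nri-bundle-values.yaml}"

cat > "$VALUES_FILE" <<'EOF'
infrastructure:
  enabled: true
newrelic-infrastructure:
  config:
    metrics_network_sample_rate: 30
    metrics_process_sample_rate: 30
    metrics_storage_sample_rate: 30
    metrics_system_sample_rate: 30
    metrics_nfs_sample_rate: 30
prometheus:
  enabled: false
ksm:
  enabled: true
kube-state-metrics:
  collectors:
    certificatesigningrequests: false
    configmaps: false
    ingresses: false
    limitranges: false
    mutatingwebhookconfigurations: false
    networkpolicies: false
    poddisruptionbudgets: false
    persistentvolumeclaims: false
    persistentvolumes: false
    replicationcontrollers: false
    resourcequotas: false
    secrets: false
    statefulsets: false
    storageclasses: false
    validatingwebhookconfigurations: false
    volumeattachments: false
kubeEvents:
  enabled: true
EOF

echo "Wrote $VALUES_FILE"

# The install then shrinks to (not executed by this script):
#   helm install newrelic-bundle newrelic/nri-bundle \
#     --namespace=newrelic \
#     --set global.licenseKey=LICENSE_KEY \
#     --set global.cluster=MYK8S \
#     -f nri-bundle-values.yaml
```

The license key and cluster name stay on the command line here so that the values file can be committed without secrets.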

Another way to reduce data is to set revisionHistoryLimit to zero if you never need to roll back. If you do, you will want to set it to at least 2. The Kubernetes default is 10. By setting it to 0 I reduced my data by about 25%.

Here is the quick script I wrote that loops over the deployments in a namespace.
export K8_NAMESPACE="yournamespace"
kubectl --namespace "$K8_NAMESPACE" get deployments --template '{{range .items}}{{.metadata.name}}{{"\n"}}{{end}}' | xargs -I {} kubectl -n "$K8_NAMESPACE" patch deployment {} -p '{"spec":{"revisionHistoryLimit":0}}'
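If rollbacks are still needed, the same idea can be parametrized so the limit is 2 (or any other value) rather than 0. This is only a sketch: `revision_patch` and `set_revision_history` are hypothetical helper names, and the second one assumes kubectl is installed and pointed at the right cluster.

```shell
#!/bin/sh
# Build the JSON patch body for a given revisionHistoryLimit value.
revision_patch() {
  printf '{"spec":{"revisionHistoryLimit":%d}}' "$1"
}

# Patch every deployment in a namespace.
# Arguments: namespace, limit. Requires kubectl and cluster access.
set_revision_history() {
  ns="$1"
  limit="$2"
  # "-o name" prints one resource per line, e.g. deployment.apps/my-app
  kubectl -n "$ns" get deployments -o name |
    while read -r dep; do
      kubectl -n "$ns" patch "$dep" -p "$(revision_patch "$limit")"
    done
}
```

For example, `set_revision_history yournamespace 2` would keep two old ReplicaSets per deployment.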

@godleon , thanks for the snippet! After applying it, the metrics part of my graph is gone, but my “Infrastructure integration cost” hasn’t changed. Is there a way to reduce it?

Hi @serhii.ostapchuk

I think you can check the logs of the pod named “newrelic-bundle-kube-state-metrics-xxxx”. In my environment, the log contains the line below, which shows which kinds of Kubernetes metrics this pod collects for me.

Active collectors: cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,jobs,namespaces,nodes,pods,replicasets,services

Hope this helps.
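A rough way to pull that line out of the pod log and list the collectors one per line is sketched below. `list_active_collectors` is a hypothetical helper name, and the namespace in the usage comment is an assumption; substitute your own namespace and the real pod name.

```shell
#!/bin/sh
# Read kube-state-metrics log output on stdin and print each active
# collector on its own line.
list_active_collectors() {
  grep 'Active collectors:' |
    sed 's/.*Active collectors: *//' |
    tr ',' '\n'
}

# Usage against a cluster (namespace "newrelic" is an assumption):
#   kubectl -n newrelic logs newrelic-bundle-kube-state-metrics-xxxx | list_active_collectors
```

Comparing this list before and after a chart upgrade is a quick way to confirm that the collector flags took effect.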

@leon.tseng ah, so all of those go to the “Integrations part”. And we cannot reduce the sample rate for them either, right? We can only disable some of them?