
Kube-state-metrics label selector for pod/service

feature-request

#1

Hello,
Thank you for releasing the beta kubernetes integration. We started ingesting data from our test environment successfully with a few tweaks to get it working.

Is it possible for k8s newrelic-infrastructure beta container to search for a different label “k8s-app: kube-state-metrics”?

failed to discover nodeIP with kube-state-metrics, got error: no pod found by label k8s-app=kube-state-metrics
failed to discover kube-state-metrics endpoint, got error: no service found by label k8s-app=kube-state-metrics

Thank you

Ed De


#2

Hi @edde,

At the moment it is not possible to customise either the app name or the label, but that is an interesting idea.

As you know, the K8s integration is taking its first steps and that’s exactly the kind of feedback we’re looking for from our customers, so thanks very much for your input.

I’m going to submit a feature request and flag it to our Product Management team.

If you have any other issues or suggestions feel free to reach out to us again.

Thanks,


#3

@edde @ccastro I found a workaround, as I saw these in our logs as well.

First, get the name of your deployment for kube-state-metrics:

$: kubectl get deployments | grep kube-state-metrics
kube-state-metrics-kube-state-metrics   1         1         1            1           17m

So we will be using “kube-state-metrics-kube-state-metrics” for the following command:

kubectl patch deployments kube-state-metrics-kube-state-metrics --type='json' \
  --patch="$(curl https://gist.githubusercontent.com/ellisio/70058000c102aa911b95d778d1fbcd7a/raw/newrelic-kube-state-metrics-patch.yaml)"

If you inspect that Gist, you’ll see it is a patch that adds the “k8s-app: kube-state-metrics” label to the kube-state-metrics deployment. This label was not included when I ran:

helm install \
  --name kube-state-metrics \
  stable/kube-state-metrics

Once I patched the Deployment these entries stopped happening.


#4

Thanks @ellisio for suggesting a solution for this. Offering a way to configure how to find the kube-state-metrics service is on our roadmap.

It looks like you are only patching the labels for pods that belong to that deployment. Our integration requires the k8s-app label on the kube-state-metrics Service as well. Did you add that label separately?


#5

Hey @rguiu,

Looks like I did patch the service as well. Updated patch commands are (so no one has to click a Gist to see what the contents are):

kubectl patch svc kube-state-metrics-kube-state-metrics --type='json' \
  --patch="[{'op':'add','path':'/metadata/labels/k8s-app','value':'kube-state-metrics'}]"

kubectl patch deployments kube-state-metrics-kube-state-metrics --type='json' \
  --patch="[{'op':'add','path':'/spec/template/metadata/labels/k8s-app','value':'kube-state-metrics'}]"
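
To confirm that both patches took effect, the labeled objects can be listed with the same selector that appears in the error messages. The following is a minimal sketch, not part of the original commands: check_ksm_label is a hypothetical helper name, and the KUBECTL variable is an assumption added so the command can be previewed (for example with KUBECTL=echo) without touching a cluster.

```shell
# Sketch only: check_ksm_label and KUBECTL are illustrative names, not part of
# the New Relic integration or of kubectl itself.
KUBECTL="${KUBECTL:-kubectl}"

check_ksm_label() {
  # $1 = label selector; defaults to the label named in the error messages.
  selector="${1:-k8s-app=kube-state-metrics}"
  # List pods and services in every namespace that carry the label,
  # and print their full label sets for inspection.
  $KUBECTL get pods,services --all-namespaces -l "$selector" --show-labels
}

# Usage (requires cluster access):
#   check_ksm_label
#   check_ksm_label app=kube-state-metrics
```

If the output contains at least one pod and one service, both lookups named in the error messages should have something to match.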

However, I noticed the following error when all the pods came online:

time="2018-04-26T15:03:25Z" level=error msg="Getting the Sampling Path for plugin" data prefix=integration/com.newrelic.kubernetes entity=gcp-5237939866413758294 error="Plugin not registered: integration/com.newrelic.kubernetes" plugin name=nri-kubernetes pluginID=integration/com.newrelic.kubernetes

I also noticed this error, which only happens on one of the three nodes in the default-pool:

time="2018-04-26T15:03:29Z" level=error msg="executing data source" data prefix=integration/com.newrelic.kubernetes error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-04-26T15:03:26Z\" level=warning msg=\"Environment variable NRIA_CACHE_PATH is not set, using default /tmp/nr-kubernetes.json\"\ntime=\"2018-04-26T15:03:29Z\" level=warning msg=\"Recoverable error group: no data found for replicaset object, no data found for namespace object, no data found for deployment object, no data found for pod object\"\ntime=\"2018-04-26T15:03:29Z\" level=warning msg=\"No data was populated\"\ntime=\"2018-04-26T15:03:29Z\" level=panic msg=\"no data was populated\"\ntime=\"2018-04-26T15:03:29Z\" level=fatal msg=\"no data was populated\"\n"

Any idea what those are about?

Cheers,
Andrew


#6

Hey @ellisio,

With regards to the first error (“Plugin not registered: integration/com.newrelic.kubernetes”), it’s an error that should never happen since it basically means that the Kubernetes integration is not available in the image that was downloaded, which is of course not possible.

We have seen that error in the logs when the node does not have enough resources to run the newrelic-infra pod. Could you check the Kubernetes events (kubectl get events) to see if there are any errors deploying the DaemonSet? The error may look like this:

DaemonSet Warning FailedPlacement daemonset-controller failed to place pod on "gke-mycluster-default-pool-21edba18-rrnw": Node didn't have enough resource: cpu, requested: 100, used: 900, capacity: 940
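
One way to look for that event is to filter the event list on its reason string, which avoids depending on the column spacing of the kubectl output. This is a rough sketch under stated assumptions: filter_failed_placement is a hypothetical helper name, and the only thing it relies on is the FailedPlacement reason quoted above.

```shell
# Sketch only: filter_failed_placement is an illustrative helper name.
filter_failed_placement() {
  # Read event lines on stdin and keep those that mention FailedPlacement.
  # "|| true" keeps the exit status at 0 when nothing matches.
  grep -F 'FailedPlacement' || true
}

# Usage (requires cluster access):
#   kubectl get events --all-namespaces | filter_failed_placement
#
# Server-side alternative, assuming the cluster accepts a field selector
# on the event reason:
#   kubectl get events --all-namespaces --field-selector reason=FailedPlacement
```

An empty result only means no such event is currently stored; it does not rule out a resource shortage earlier in the life of the cluster.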

With regards to the second error, it seems to indicate that the integration is not fetching any data from kube-state-metrics. I am assuming you are running kube-state-metrics 1.3.0?

Could you enable verbose mode and share the logs? Instructions: https://docs.newrelic.com/docs/integrations/host-integrations/host-integrations-list/kubernetes-monitoring-integration#verbose

Thanks.


#7

Hey @rguiu,

I just rebuilt our test cluster and installed the agent again. The “executing data source” error no longer happens.

However, the “Getting the Sampling Path for plugin” error still happens. I ran the following and got no results:

kubectl get events | grep "DaemonSet Warning"

The output of running kubectl get events | grep newrelic: https://gist.github.com/ellisio/490aa47bf87101d573914462bacb8b16

Cheers,
Andrew


#8

Hi @ellisio, is the error showing all the time or only once at the beginning? Are you noticing any data being lost? It would help if you could share the full verbose logs so that we can see what is going on.


#9

Hi @edde ,
Version beta2.3 of the Kubernetes integration is now available and includes an improvement related to this issue (discovery of kube-state-metrics by the label app=kube-state-metrics). I invite you to try to deploy again.
Thanks!


#10

Hi all,

It looks like the kube-state-metrics helm chart updated their labels, and now the newrelic-infrastructure helm chart isn’t working either.

Here’s the commit that updated the kube-state-metrics labels: https://github.com/helm/charts/commit/ece06ce7535925760b901d630320b429524e6dea

For now I manually edited the kube-state-metrics service and deployment to add the labels expected by newrelic-infrastructure. But I believe the best course of action is to change the label selectors to the new kube-state-metrics labels.

Thanks!


#11

Hi @aron3,

Thanks a lot for raising this and pointing us to the solution, highly appreciated!

Can you confirm that the following worked to solve the issue?
kubectl label pod/<Kube-state-metrics pod name> "app.kubernetes.io/name=kube-state-metrics" -n <namespace where Kube-state-metrics is installed, usually kube-system>

We’ll add it to our documentation until we release a newer version with the fix.

Thanks,
JF


#12

Hi @jjoly.

I manually edited the yaml files using kubectl edit, but I believe with that syntax it’ll be:

kubectl label pod/<Kube-state-metrics pod name> "app=kube-state-metrics" -n <namespace where Kube-state-metrics is installed, usually kube-system>

(app=kube-state-metrics is the label newrelic-infrastructure is looking for)
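
The labeling step can be sketched as a small helper. Everything here is illustrative: label_ksm and the KUBECTL variable are hypothetical names, the pod and service names are placeholders, and labeling the service as well is an assumption carried over from the earlier reply that the integration also looks up the kube-state-metrics Service by label.

```shell
# Sketch only: label_ksm and KUBECTL are illustrative names.
KUBECTL="${KUBECTL:-kubectl}"

label_ksm() {
  # $1 = namespace, $2 = pod name, $3 = service name (all placeholders).
  # --overwrite lets the command succeed if the label already exists.
  $KUBECTL label pod "$2" app=kube-state-metrics -n "$1" --overwrite
  $KUBECTL label service "$3" app=kube-state-metrics -n "$1" --overwrite
}

# Usage (requires cluster access; names below are placeholders):
#   label_ksm kube-system <Kube-state-metrics pod name> <Kube-state-metrics service name>
```

A label applied directly to a pod is lost when the pod is recreated, so patching the deployment's pod template, as in the earlier patch commands, is the more durable variant.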


#13

Awesome, thanks @aron3 !
You are right about the label: I copied the new one from the helm chart instead of the one we use for detection. Thanks for spotting it.

JF


#14

We updated the New Relic Kubernetes integration to be compatible with the change in label for KSM:
https://docs.newrelic.com/docs/release-notes/platform-release-notes/host-integrations-release-notes/new-relic-integration-kubernetes-193

Helm chart will be updated as well.

Thanks again @aron3 for bringing this change to our attention, highly appreciated!