Kubernetes New Relic agents restarting infinitely

Hi!
For some reason, New Relic infrastructure monitoring for Kubernetes (RKE) is not working as expected. After waiting more than an hour for the cluster to appear on the integration page, I connected to the Kubernetes cluster and saw that the New Relic kubelet pods are restarting endlessly. The pod logs show this:
kubelet container:
time="2022-05-20T12:21:28Z" level=info msg="Waiting for agent container to be ready..."
time="2022-05-20T12:21:28Z" level=info msg="New Relic Kubernetes integration Version: v3.2.0, Platform: linux/amd64, GoVersion: go1.17.9, GitCommit: 07996399732a02ea798be9df034446fbda254010, BuildDate: Mon May 16 11:56:34 UTC 2022\n"
time="2022-05-20T12:21:28Z" level=info msg="Trying to connect to kubelet locally with scheme=\"https\" hostURL=\"10.1.2.206:10250\""
time="2022-05-20T12:21:28Z" level=info msg="Connected to Kubelet through nodeIP with scheme=\"https\" hostURL=\"10.1.2.206:10250\""
time="2022-05-20T12:22:01Z" level=error msg="retrieving scraper data: retrieving kubelet data: kubelet data was not populated after trying all endpoints"

10.1.2.206:10250 is accessible from this container: I ran curl against it from inside the kubelet container, and it connected to the endpoint successfully.
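
A successful TCP/TLS connection does not show that every kubelet endpoint returns data, which is what the error message complains about. Below is a minimal Python sketch of a per-endpoint check. The list of paths (/pods, /stats/summary, /metrics/cadvisor) is my assumption about which kubelet endpoints matter here; the thread does not name them. It also assumes Python 3 is available where you run it and that the pod's service account token is allowed to read those kubelet endpoints.

```python
import ssl
import urllib.error
import urllib.request

# Assumption: kubelet paths worth probing; the thread does not list them.
KUBELET_PATHS = ("/pods", "/stats/summary", "/metrics/cadvisor")

# Standard in-pod location of the mounted service account token.
TOKEN_FILE = "/var/run/secrets/kubernetes.io/serviceaccount/token"


def build_urls(host_url, scheme="https", paths=KUBELET_PATHS):
    """Return one full URL per kubelet path for the given host:port."""
    return [f"{scheme}://{host_url}{path}" for path in paths]


def probe(url, token, timeout=10):
    """Fetch url with a bearer token; return (status, body_length).

    status is None when no HTTP response was received at all.
    """
    # Certificate verification is disabled for this diagnostic only,
    # because the kubelet often serves a self-signed certificate.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    try:
        with urllib.request.urlopen(
            request, timeout=timeout, context=context
        ) as response:
            return response.status, len(response.read())
    except urllib.error.HTTPError as err:
        # The kubelet answered, but with an error code such as 401 or 403.
        return err.code, 0
    except OSError:
        # Connection refused, timeout, TLS failure and similar.
        return None, 0


def main(host_url="10.1.2.206:10250"):
    """Probe each assumed kubelet path and print the outcome."""
    with open(TOKEN_FILE) as handle:
        token = handle.read().strip()
    for url in build_urls(host_url):
        status, size = probe(url, token)
        print(f"{url} -> status={status} bytes={size}")
```

Call main() at the bottom of the file to run it as a script. A 401 or 403 on one path, or a 200 with an empty body, would narrow down which endpoint is leaving the kubelet data unpopulated.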

Any ideas?

Hey there @dkuklov,

I hope you are doing well!

Your question is a bit outside my scope, and I am unable to find exactly what is causing this for you. One thing that can cause constant restarts is insufficient resources. With that said, I would like to loop in one of our Infrastructure experts to look this over and help us pinpoint why you are experiencing this. I appreciate your patience as we continue to provide support.

Please let us know if you have further questions that we can help with in the meantime and we will be more than happy to assist. Our team will reach out here shortly when we have an update for you.

No, there are enough resources for the NR agents.

@dkuklov
I would suggest enabling verbose logs on the infra agent container/kubelet pod.

level=debug logs should tell us more about the issue.
If you are deploying on RKE1, extra configuration is needed to instrument the control plane components.
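
Debug output can get noisy, so here is a small Python sketch for pulling the interesting entries out of saved pod logs. It assumes the logs keep the key=value format shown earlier in this thread (time, level, msg) and that warnings are written as level=warning; the function names are illustrative only, not part of any New Relic tooling.

```python
import re
import sys

# Matches key=value pairs where the value is double-quoted or a bare token.
PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Parse one key=value log line into a dict of field name to value."""
    fields = {}
    for key, raw in PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            # Drop the surrounding quotes and unescape embedded ones.
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields


def filter_by_level(lines, levels=("error", "warning")):
    """Return parsed entries whose level field is one of the given levels."""
    entries = (parse_line(line) for line in lines)
    return [entry for entry in entries if entry.get("level") in levels]


def main():
    """Read log lines from stdin and print the matching entries."""
    for entry in filter_by_level(sys.stdin):
        print(entry.get("time"), entry.get("level"), entry.get("msg"))
```

Call main() at the bottom of the file and pipe the pod logs into it. Pass levels=("debug",) to filter_by_level if you want only the debug entries instead.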

Hope this helps!