New Relic on Docker EE Kubernetes


#1

Hi,

I am trying to set up the New Relic Kubernetes integration on a bare-metal Kubernetes cluster running on Docker EE. The newrelic DaemonSet is created just fine, but the pods on the worker nodes keep terminating and restarting with the following error:

time="2019-03-27T20:13:00Z" level=info msg="New Relic Infrastructure Agent version 1.1.14 Creating Service (1.055075ms)"
time="2019-03-27T20:13:00Z" level=info msg="Agent service manager started successfully. (1.100827ms)" service=newrelic-infra
time="2019-03-27T20:13:00Z" level=info msg="New Relic Infrastructure Agent version 1.1.14 Initializing (1.361709ms)"
time="2019-03-27T20:14:02Z" level=warning msg="network error waiting for endpoint, retrying" error="Head https://infra-api.newrelic.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
time="2019-03-27T20:14:22Z" level=warning msg="network error waiting for endpoint, retrying" error="Head https://infra-api.newrelic.com: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
time="2019-03-27T20:14:47Z" level=warning msg="network error waiting for endpoint, retrying" error="Head https://infra-api.newrelic.com: dial tcp: i/o timeout"
time="2019-03-27T20:15:17Z" level=warning msg="network error waiting for endpoint, retrying" error="Head https://infra-api.newrelic.com: dial tcp: i/o timeout"
time="2019-03-27T20:15:52Z" level=warning msg="network error waiting for endpoint, retrying" error="Head https://infra-api.newrelic.com: dial tcp: i/o timeout"

Can you please assist? The master and worker nodes are in different network VLANs.
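For readers triaging a similar log, a small shell sketch like the one below could tally the two timeout variants that appear above. The function name `summarize_agent_timeouts` is hypothetical, and the patterns are copied from this log excerpt rather than from any agent documentation.

```shell
# Hypothetical helper: count the two timeout variants seen in the agent log.
# Reads log text on stdin and prints one summary line.
summarize_agent_timeouts() {
  awk '
    /request canceled while waiting for connection/ { conn++ }
    /dial tcp: i\/o timeout/                        { dial++ }
    END { printf "waiting_for_connection=%d dial_timeout=%d\n", conn, dial }
  '
}
```

It could be fed with `kubectl logs <POD_NAME> --previous | summarize_agent_timeouts`, which reads the log of the container's previous run. Both messages appear to mean that the connection was never established, so name resolution and routing as seen from the pod are reasonable things to check next.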


#2

Hi @vipalazhi

It looks like the pods are unable to reach https://infra-api.newrelic.com - do they require a proxy to reach the address? Here is more info on the IP ranges used by infra-api:

-David


#3

Thanks for the reply, David. All the nodes have outbound internet access without needing a proxy. I am able to telnet to infra-api.newrelic.com on port 443 and also access the infra-api.newrelic.com API from the worker node where the pods are terminating:

[root@worker1 ~]# telnet infra-api.newrelic.com 443
Trying 162.247.242.5...
Connected to infra-api.newrelic.com.
Escape character is '^]'.
^]

telnet> status
Connected to infra-api.newrelic.com.
Operating in obsolete linemode
Local character echo
Escape character is '^]'.
^]q

telnet> q
Connection closed.

API access works with:

curl -v -X GET --header "X-Api-Key: {My API KEY}" "https://infra-api.newrelic.com/v2/alerts/conditions/{My Condition ID}"
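Since the agent log reports a HEAD request that times out, a check closer to what the agent does might be a HEAD request with an explicit timeout. The following is a minimal sketch; the function name `probe_head` and the `CURL` override variable are hypothetical and only there to make the helper easy to dry-run.

```shell
# Hypothetical helper: send a HEAD request with a time limit and report the result.
probe_head() {
  url=$1
  timeout=${2:-10}            # seconds; an arbitrary default for this sketch
  curl_bin=${CURL:-curl}      # override hook, e.g. CURL=true for a dry run

  if "$curl_bin" --silent --head --max-time "$timeout" --output /dev/null "$url"; then
    echo "OK $url"
  else
    echo "FAIL $url"
    return 1
  fi
}
```

For example, `probe_head https://infra-api.newrelic.com 10`. Run from the worker's own shell, this only exercises the host's resolver and routes, so a success there does not by itself show that a container sees the same path.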


#4

Hello!

Could you confirm that your telnet was from one of the infra containers? We’ve seen this issue before with custom network layer implementations like flannel.

One way to check this is to apply the following YAML in your environment:

apiVersion: v1
kind: Pod
metadata:
  name: issue
spec:
  hostNetwork: true
  dnsPolicy: ClusterFirstWithHostNet
  containers:
    - name: issue
      image: newrelic/infrastructure-k8s:1.6.0
      imagePullPolicy: IfNotPresent
      command: ["/bin/sh"]
      args: ["-c", "apk add curl; nslookup infra-api.newrelic.com; curl --head -H \"User-Agent:New Relic Infrastructure Agent version 1.0.944\" -H \"Content-Type:application/json\" -H \"X-License-Key:<yourLicenseKey>\" -H \"X-NRI-Entity-Key:some entityID\" https://infra-api.newrelic.com -v"]
  nodeSelector:
    issue: "true"
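Because the Pod above has a `nodeSelector` of `issue: "true"`, it will only be scheduled on a node that carries that label. A minimal sketch of the labeling and apply steps follows; the function name `run_issue_pod`, the manifest file name, and the `KUBECTL` override variable are assumptions made for illustration.

```shell
# Hypothetical helper: label one worker node, then create the test Pod on it.
run_issue_pod() {
  node=$1                           # worker node where the agent pods keep restarting
  manifest=$2                       # file holding the Pod definition above (name is an assumption)
  kubectl_bin=${KUBECTL:-kubectl}   # override hook, e.g. KUBECTL=echo for a dry run

  # The label must match the Pod's nodeSelector (issue: "true").
  "$kubectl_bin" label node "$node" issue=true --overwrite &&
  "$kubectl_bin" apply -f "$manifest"
}
```

For example, `run_issue_pod worker1 issue-pod.yaml`. Once the container has run, `kubectl logs issue` should show the nslookup and curl output.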

Then, let's add the environment variables NRIA_STARTUP_CONNECTION_TIMEOUT=360s and NRIA_STARTUP_CONNECTION_RETRY_TIME=15s to the integration YAML and redeploy it (with kubectl apply -f path_to_file.yaml) to update the DaemonSet.
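As an alternative to editing the file, the same two variables could probably be set on the live DaemonSet with `kubectl set env`. This is only a sketch: the DaemonSet name `newrelic-infra`, the `default` namespace, the function name, and the `KUBECTL` override are all assumptions, so adjust them to your deployment.

```shell
# Hypothetical helper: set the two startup variables on a running DaemonSet.
set_startup_timeouts() {
  daemonset=${1:-newrelic-infra}   # assumed DaemonSet name
  namespace=${2:-default}          # assumed namespace

  "${KUBECTL:-kubectl}" set env "daemonset/$daemonset" --namespace "$namespace" \
    NRIA_STARTUP_CONNECTION_TIMEOUT=360s \
    NRIA_STARTUP_CONNECTION_RETRY_TIME=15s
}
```

Changing the live object leaves the YAML file out of date, so putting the variables in the file and re-applying it, as described above, keeps the manifest as the source of truth.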

Then get a shell in the pod with kubectl exec -it <POD_NAME> -- sh and run:

$ apk add curl
$ nslookup infra-api.newrelic.com
$ curl -X POST -H "User-Agent:New Relic Infrastructure Agent version 1.0.944" -H "Content-Type:application/json" -H "X-License-Key:<yourLicenseKey>" -H "X-NRI-Entity-Key:some entityID" https://infra-api.newrelic.com -v 

Regards,

Paul


#5

The problem turned out to be with the kube dbs: we were using Calico with cross-subnet between the nodes, and the IP-in-IP configuration had issues.
This is fixed now.
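For anyone hitting the same symptom on Calico, one place to look is the IP-in-IP mode of the IP pools. A minimal sketch, assuming `calicoctl` is installed and configured for the cluster; the function name `show_calico_ipip` and the `CALICOCTL` override variable are hypothetical.

```shell
# Hypothetical helper: list Calico IP pools with their encapsulation settings.
show_calico_ipip() {
  # The wide output includes an IPIPMODE column for each pool.
  "${CALICOCTL:-calicoctl}" get ippool -o wide
}
```

The IPIPMODE column normally reads Always, CrossSubnet, or Never. Which value is right depends on whether the nodes share a subnet, so treat the output as a starting point rather than a diagnosis.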


#6

That is great news. Thanks for letting us know! :blush: @vipalazhi