Config Overrides

rfb

#1

Is there a list of other configuration overrides that are available?

For example, I would like to communicate with the Kubernetes service at kubernetes.default.svc.cluster.local instead of kubernetes.default.


#2

@sohail.ahmed could you provide more context so that we can understand why you would need to make that change?


#3

I would like the New Relic infrastructure agent to communicate with the Kubernetes API at a specific address. Specifically, my TLS cert is not signed for the shorter API domain name.

time="2018-05-03T23:18:07Z" level=fatal msg="error trying to connect to: https://kubernetes.default/api/v1/nodes/ip-10-17-44-49.ec2.internal/proxy/healthz. Got error: Get https://kubernetes.default/api/v1/nodes/ip-10-17-44-49.ec2.internal/proxy/healthz: x509: certificate is valid for ip-10-17-41-79.xxx.xxx.io, apiserver, leader.telekube.local, kubernetes.default.svc.cluster.local, ip-10-17-41-79.ec2.internal, stable.xxx.xxx.xxx.io, not kubernetes.default "

I cannot fix the cert at the moment for reasons outside of my control, so any additional configuration overrides would be helpful.
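
To illustrate why the handshake fails, here is a minimal Python sketch of a hostname-against-SAN check. It is a simplified approximation of certificate hostname verification, not the agent's or Go's actual implementation; the function name `hostname_in_sans` and the constant `CERT_SANS` are hypothetical, and the SAN list is copied from the error message above.

```python
from typing import Sequence


def hostname_in_sans(hostname: str, sans: Sequence[str]) -> bool:
    """Return True if hostname matches a SAN entry exactly or via a '*.' wildcard."""
    host = hostname.lower().rstrip(".")
    for san in sans:
        pattern = san.lower().rstrip(".")
        if pattern == host:
            return True
        # A leading "*." wildcard covers exactly one left-most label.
        if pattern.startswith("*."):
            _, _, rest = host.partition(".")
            if rest and rest == pattern[2:]:
                return True
    return False


# SAN entries copied from the x509 error in the post above.
CERT_SANS = [
    "ip-10-17-41-79.xxx.xxx.io",
    "apiserver",
    "leader.telekube.local",
    "kubernetes.default.svc.cluster.local",
    "ip-10-17-41-79.ec2.internal",
    "stable.xxx.xxx.xxx.io",
]

if __name__ == "__main__":
    for name in ("kubernetes.default", "kubernetes.default.svc.cluster.local"):
        print(name, hostname_in_sans(name, CERT_SANS))
```

Under these assumptions the short name kubernetes.default is rejected while the fully qualified name is accepted, which is why being able to point the integration at the longer name would avoid the error.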


#4

@sohail.ahmed thanks for the additional information. We will take your input into consideration for future enhancements.


#5

+1 - I’m having the same issue and it would be much easier if I could just change the cluster API that this thing is trying to talk to.


#6

Thanks for the input @tsullens! We’ll take it into consideration.


#7

Is there any update on this, or is there a workaround so that I can get started?
Datadog was easy enough for me to get set up and at this point I’m pretty much being forced to go with them simply because I can’t even trial the New Relic implementation.


#8

+1 @tsullens We are blocked on this issue as well.


#9

+1 am now seeing this issue and it’s blocking us.


#10

@sohail.ahmed @matt.burdan We are looking into this issue. Can you confirm that on your platform the “cert is not signed for the shorter API domain name”?


#11

@jjoly that is correct

time="2018-06-27T18:59:23Z" level=error msg="executing data source" data prefix=integration/com.newrelic.kubernetes error="exit status 1" plugin name=nri-kubernetes stderr="time=\"2018-06-27T18:59:23Z\" level=warning msg=\"Environment variable NRIA_CACHE_PATH is not set, using default /tmp/nr-kubernetes.json\"\ntime=\"2018-06-27T18:59:23Z\" level=panic msg=\"error trying to connect to: https://kubernetes.default/api/v1/nodes/NODE-NAME/proxy/healthz. Got error: Get https://kubernetes.default/api/v1/nodes/NODE-NAME/proxy/healthz: x509: certificate is valid for leader.telekube.local, apiserver, kubernetes.default.svc, kubernetes.default.svc.cluster.local not kubernetes.default \"\ntime=\"2018-06-27T18:59:23Z\" level=fatal msg=\"error trying to connect to: https://kubernetes.default/api/v1/nodes/NODE-NAME/proxy/healthz. Got error: Get https://kubernetes.default/api/v1/nodes/NODE-NAME/proxy/healthz: x509: certificate is valid for NODE-NAME leader.telekube.local, apiserver, kubernetes.default.svc, kubernetes.default.svc.cluster.local, not kubernetes.default \"\n"

#12

Great, thanks for confirming, @matt.burdan. We are adding the ability to set the Kubernetes URL so that it matches the certificate.
I will keep this thread updated.


#13

thanks @jjoly. look forward to the update.


#14

any update on this? saw a new docker tag pushed but don’t see any new configuration in the new k8s resource definition.


#15

@matt.burdan We have released version 1.1.0 which includes support for specifying the Kubernetes API Host and Port by setting the KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT environment variables: https://docs.newrelic.com/docs/release-notes/platform-release-notes/host-integrations-release-notes/new-relic-integration-kubernetes-110


#16

@bpedro thanks so much for the update. are both environment variables required or can I just set the KUBERNETES_SERVICE_HOST with default port?


#17

@matt.burdan you can set just one. Any variable that is not set falls back to its default.

Thanks and let us know how it goes.
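
The fallback behaviour described here can be sketched in Python as follows. This is an illustration only, not the integration's code: the function name `kubernetes_api_url` is hypothetical, the default host is taken from the error logs earlier in the thread, and the default port of 443 is an assumption.

```python
import os

# Assumed defaults for illustration. The host comes from the URLs in the
# error logs earlier in the thread; the port is a guess for an HTTPS endpoint.
DEFAULT_HOST = "kubernetes.default"
DEFAULT_PORT = "443"


def kubernetes_api_url(env=None):
    """Build the API base URL, using the default for any variable that is unset or empty."""
    env = os.environ if env is None else env
    host = env.get("KUBERNETES_SERVICE_HOST") or DEFAULT_HOST
    port = env.get("KUBERNETES_SERVICE_PORT") or DEFAULT_PORT
    return f"https://{host}:{port}"


if __name__ == "__main__":
    # Only the host is overridden; the port falls back to the assumed default.
    print(kubernetes_api_url({"KUBERNETES_SERVICE_HOST": "kubernetes.default.svc.cluster.local"}))
```

Passing the environment as a plain dictionary keeps the sketch easy to try without modifying the real process environment.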


#18

@jjoly I believe the fix worked as now the agent is starting without errors.

time="2018-08-09T00:54:47Z" level=info msg="New Relic Infrastructure Agent version 1.0.944 Creating Service (886.986µs)"
time="2018-08-09T00:54:47Z" level=info msg="Agent service manager started successfully. (1.019695ms)" service=newrelic-infra
time="2018-08-09T00:54:47Z" level=info msg="New Relic Infrastructure Agent version 1.0.944 Initializing (1.206244ms)"
time="2018-08-09T00:54:47Z" level=info msg="New Relic Infrastructure Agent version 1.0.944 Running (404.020128ms)"

However, after a certain period of time we are seeing networking issues:

time="2018-08-09T01:58:58Z" level=error msg="metric sender can't process 1 times" error="Error sending events: Post https://infra-api.newrelic.com/metrics/events/bulk: read tcp $INTERNAL_IP->162.247.242.5:443: read: connection reset by peer"

The networking documentation (https://docs.newrelic.com/docs/apm/new-relic-apm/getting-started/networks#infrastructure) only mentions that outbound access to the infra-api endpoint must be open, which our pods appear to have, so I am not sure what the issue is:

[root@ip-$INTERNAL_IP /]# telnet infra-api.newrelic.com 443
Trying 162.247.242.5...
Connected to infra-api.newrelic.com.
Escape character is '^]'.

I can open another ticket/topic if required


#19

@matt.burdan thanks for the update!

The connection to the Kubernetes API is now working, which is great news.

The connection message you are seeing is a different topic. Here are a couple of things to verify:

Could you check whether you are seeing data in Insights?
SELECT * FROM K8sPodSample since 1 day ago

Regarding the network issue, can you confirm that outgoing connections are open for the following:

    Domain: infra-api.newrelic.com
    Networks
       50.31.164.0/24
       162.247.240.0/22
    Port: 443
    Domain+Port: infra-api.newrelic.com:443
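
As a quick way to check an address against the networks listed above, here is a small Python sketch using the standard `ipaddress` module. The names `ALLOWED_NETWORKS` and `in_listed_networks` are hypothetical; the address tested in the usage lines is the one shown in the telnet output earlier in the thread.

```python
import ipaddress

# Networks copied from the list above.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("50.31.164.0/24"),
    ipaddress.ip_network("162.247.240.0/22"),
]


def in_listed_networks(ip: str) -> bool:
    """Return True if the address falls inside any of the listed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


if __name__ == "__main__":
    # Address that infra-api.newrelic.com resolved to in the telnet output.
    print(in_listed_networks("162.247.242.5"))
```

This only confirms that the resolved address belongs to the listed ranges; it says nothing about whether a firewall or proxy between the pod and the endpoint is resetting the connection.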

#20

@jjoly the response to the Insights query was:
No events found -- do you have the correct event type and time range?

We seem to have an open outgoing connection to the endpoint:

[root@ $INTERNAL_IP/]# lsof -n -i4TCP
COMMAND   PID USER   FD   TYPE     DEVICE SIZE/OFF NODE NAME
newrelic-   1 root    3u  IPv4 1289497145      0t0  TCP $INTERNAL_IP->162.247.242.5:https (ESTABLISHED)

after waiting some time these errors also showed up in the logs:

time="2018-08-11T00:20:03Z" level=error msg="metric sender can't process 1 times" error="Error sending events: Post https://infra-api.newrelic.com/metrics/events/bulk: read tcp $INTERNAL_IP->162.247.242.5:443: read: connection reset by peer"
time="2018-08-11T01:00:06Z" level=info msg="agent has been offline for 86400000000000. Recreating delta store" component=patchSender entityID=i-$RANDOM
time="2018-08-13T01:00:06Z" level=info msg="agent has been offline for 86400000000000. Recreating delta store" component=patchSender entityID=i-$RANDOM
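
The offline value in the last two log lines is easier to read once converted. The sketch below assumes the number is a count of nanoseconds, as it would be if the agent printed a Go `time.Duration` as a plain integer; that is an assumption, not something the log states. The helper name `ns_to_timedelta` is hypothetical.

```python
from datetime import timedelta


def ns_to_timedelta(ns: int) -> timedelta:
    """Convert a nanosecond count to a timedelta (sub-microsecond precision is dropped)."""
    return timedelta(microseconds=ns // 1000)


if __name__ == "__main__":
    offline = ns_to_timedelta(86400000000000)
    print(offline.total_seconds() / 3600, "hours")
```

Under that assumption the agent is reporting that it has been offline for 24 hours, which is consistent with the Insights query returning no events.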