I’m trying to use the Kubernetes integration and my agents have stopped gathering metrics. I’m getting 404 errors on the collection endpoints, but I’m not sure whether that’s because I’ve hit my usage limits or whether some other issue is at play.
I realised that I’ve gone over my 100GB free tier ingestion limit, as my Kubernetes agents were ingesting about 9GB a day (which seems extremely high). I used the 7 day override, which I thought would give me time to address the problem and cut down my metrics gathering, and I tried to reduce the sampling rate by setting some environment variables, but now all my metric collection has stopped.
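As a rough sanity check on those numbers, here is a small sketch of how quickly 9GB a day uses up a 100GB allowance. The helper name days_until_limit is made up for illustration, and the figures are just the approximate ones quoted above.

```shell
#!/bin/sh
# days_until_limit LIMIT_GB DAILY_GB
# Hypothetical helper: prints how many days of ingestion fit inside the limit.
days_until_limit() {
  awk -v limit="$1" -v daily="$2" 'BEGIN { printf "%.1f\n", limit / daily }'
}

# 100GB allowance at roughly 9GB a day
days_until_limit 100 9   # prints 11.1
```

So at that rate the allowance would be gone in about 11 days, which is why the ingest volume looks far too high to me.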
In my logs for the newrelic nri-bundle-newrelic-infrastructure pods I’m getting the errors below:
time="2021-03-14T19:34:20Z" level=warning msg="URL error detected. May be a configuration problem or a network connectivity issue." component=AgentService error="Head \"https://infra-api.eu.newrelic.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" service=newrelic-infra
time="2021-03-14T19:34:20Z" level=warning msg="Collector endpoint not reachable, retrying..." collector_url="https://infra-api.eu.newrelic.com" component=AgentService error="Head \"https://infra-api.eu.newrelic.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)" service=newrelic-infra
I’ve also tried to check that I don’t have a firewall issue by running curl against the endpoint from my cluster:
curl --head -v https://infra-api.eu.newrelic.com
*   Trying 18.104.22.168:443...
* Connected to infra-api.eu.newrelic.com (22.214.171.124) port 443 (#0)
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /Users/<myuser>/opt/anaconda3/ssl/cacert.pem
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: C=US; ST=California; L=San Francisco; O=New Relic, Inc.; CN=*.eu.newrelic.com
*  start date: Aug 19 00:00:00 2020 GMT
*  expire date: Nov 22 00:00:00 2022 GMT
*  subjectAltName: host "infra-api.eu.newrelic.com" matched cert's "*.eu.newrelic.com"
*  issuer: C=US; O=DigiCert Inc; CN=DigiCert SHA2 Secure Server CA
*  SSL certificate verify ok.
> HEAD / HTTP/1.1
> Host: infra-api.eu.newrelic.com
> User-Agent: curl/7.71.1
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 404 Not Found
HTTP/1.1 404 Not Found
< Date: Sun, 14 Mar 2021 19:45:15 GMT
Date: Sun, 14 Mar 2021 19:45:15 GMT
< Content-Type: text/plain; charset=utf-8
Content-Type: text/plain; charset=utf-8
<
I’m not sure if a 404 is expected, but I think it means my firewall isn’t blocking the connection.
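To make the difference between the agent's timeout and my curl result more explicit, here is a minimal sketch of the same HEAD check with a time limit. The function names classify_probe and probe are made up for illustration, and the 10 second limit is an arbitrary assumption, not the agent's actual timeout. The idea is that any HTTP status, even a 404, means the TCP and TLS connection succeeded, whereas curl exit code 28 means the request timed out, which is closer to what the agent is logging.

```shell
#!/bin/sh
# classify_probe EXIT_CODE HTTP_CODE
# Hypothetical helper: interprets curl's exit code and the HTTP status it saw.
classify_probe() {
  exit_code="$1"
  http_code="$2"
  if [ "$exit_code" -eq 28 ]; then
    # curl exit code 28 means the operation timed out
    echo "timeout"
  elif [ "$exit_code" -ne 0 ]; then
    # any other non-zero exit: DNS, connect or TLS failure
    echo "connection-error"
  else
    # curl completed the request, so the endpoint answered with some status
    echo "reachable (HTTP $http_code)"
  fi
}

# probe URL
# Sends a HEAD request, giving up after 10 seconds (an assumed limit).
probe() {
  http_code=$(curl --head --silent --output /dev/null --max-time 10 \
    --write-out '%{http_code}' "$1")
  classify_probe "$?" "$http_code"
}

# Example (needs network access, so left commented out):
# probe https://infra-api.eu.newrelic.com
```

Running probe from inside one of the infrastructure pods rather than from a workstation should show whether the pods themselves can reach the endpoint, since that is where the timeouts are being logged.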
Any idea if this is a configuration problem or an account issue?
Thanks in advance for any help.