Container does not stop when agent is stopped

Hi,

I followed this guide to set up the New Relic agent on Ubuntu Core, and everything works fine except in one case: when there is a network issue, the New Relic agent stops but the container keeps running. If the agent stops, the container should terminate and start again.

Please see my logs. How do I get the Docker container to restart when the New Relic agent stops?

time="2021-01-12T23:39:57Z" level=warning msg="URL error detected. May be a configuration problem or a network connectivity issue." component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on [::1]:53: read udp [::1]:44844->[::1]:53: read: connection refused" service=newrelic-infra

time="2021-01-12T23:39:57Z" level=error msg="Can't reach the New Relic collector." component=AgentService error="Head \"https://infra-api.newrelic.com\": dial tcp: lookup infra-api.newrelic.com on [::1]:53: read udp [::1]:44844->[::1]:53: read: connection refused" service=newrelic-infra
time="2021-01-12T23:39:57Z" level=info msg="agent process exited, stopping agent service daemon..." exit_code=1
time="2021-01-13T02:31:10Z" level=info msg="service is stopping. waiting for agent process to terminate..."
[WARN tini (3602)] Tini is not running as PID 1 and isn't registered as a child subreaper.
Zombie processes will not be re-parented to Tini, so zombie reaping won't work.
To fix the problem, use the -s option or set the environment variable TINI_SUBREAPER to register Tini as a child subreaper, or run Tini as PID 1.

Hi @jhe, this looks like DNS resolution is failing and so the agent can’t complete its startup process.

Can you verify that the host running the agent can properly resolve infra-api.newrelic.com? To do this, please run either dig or nslookup against that hostname.
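
If dig or nslookup is not installed where the agent runs, a rough equivalent of that check can be sketched in Python. This is only an illustration and assumes a Python 3 interpreter is available; the function name can_resolve and its resolver parameter are made up for this example and are not part of the agent.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    The resolver parameter is a hypothetical hook so the check can be
    exercised without a network; by default it uses the system resolver.
    """
    try:
        # Port 443 because the agent talks to the collector over HTTPS.
        results = resolver(hostname, 443)
    except socket.gaierror:
        # Raised when the lookup fails, e.g. no reachable DNS server.
        return False
    return len(results) > 0
```

Calling can_resolve("infra-api.newrelic.com") from the same environment as the agent should return False while DNS is failing as in the logs above, and True once resolution works.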

Hi,

Yes, DNS resolution failed. In this scenario the New Relic agent starts before there is a network connection, which is why the lookup fails.

I want the agent container to fail or exit when there is no network connection, so that the container is forced to restart. Right now, if there is no network connection, the container gets stuck in a zombie state.

This issue won’t tell the container to exit. I think that would be unusual, because it would make the issue with the agent harder to troubleshoot. Telling the container to attempt a restart would require some additional logic, and that’s not a feature we have at this time.
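
As a rough sketch of what such additional logic could look like (this is not an agent feature): a small wrapper, used as the container's entrypoint, that runs the agent as a child process and exits with the same status. The container then stops when the agent stops, and a Docker restart policy such as --restart=on-failure can start it again. The function name run_and_propagate and the idea of passing the agent command as arguments are assumptions for illustration, and the sketch assumes Python 3 is available in the image.

```python
import subprocess
import sys


def run_and_propagate(command):
    """Run command as a child process, wait for it, and return its exit code.

    command is a list such as ["/path/to/agent", "--flag"]; the actual
    agent command is not specified here and must be supplied by the caller.
    """
    completed = subprocess.run(command)
    # A negative value means the child was killed by a signal; it is still
    # non-zero, so it will be reported as a failure.
    return completed.returncode


# Hypothetical usage: python3 entrypoint.py <agent command> [args...]
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_and_propagate(sys.argv[1:]))
```

The wrapper deliberately does nothing clever: it only makes sure the agent's exit status becomes the container's exit status, and leaves the decision to restart to the container runtime.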