@juliandelrio and @admin16,
Error Message Explanation
The error message ENOENT (no such file or directory) indicates that the agent or its worker daemon cannot reach /tmp/.newrelic.sock. This file is a Unix Domain Socket (UDS), which the agent and daemon use to communicate by default. This error is often related to permissions and security issues.
The error message ECONNREFUSED indicates that the agent cannot detect its worker daemon. This might be because the daemon is not running, because the agent and daemon are configured to communicate in different ways, or because the daemon does not have permission to write to the current UDS at /tmp/.newrelic.sock.
Resolution
To resolve both of these issues, work through the following steps.
- Delete the current file at
/tmp/.newrelic.sock
.
- If it exists, delete the file at /etc/newrelic/newrelic.cfg or simply rename it, for example to newrelic.cfg.bak.
- Change the communication method of the agent and daemon by adding the following setting to the newrelic.ini file:
newrelic.daemon.port= "@newrelic-daemon"
- Kill the running watcher and worker daemon processes with this command:
killall newrelic-daemon
- Restart the web server to redeploy the agent, which will restart/respawn the daemon.
Explanation
Step 1: Daemon File Permissions
We delete the current UDS at /tmp/.newrelic.sock in case the current worker daemon lacks permission to write to it. This may occur if the daemon did not restart properly and the previous daemon still has this file locked. Deleting this file allows the current daemon to create a new one that it can write to.
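As a rough sketch of step 1, the helper below only deletes the path if it really is a socket, so a mistyped path cannot remove an ordinary file. The function name remove_stale_socket is my own invention for illustration and is not part of any New Relic tooling.

```shell
# Remove the given path only if it is a Unix domain socket; refuse anything else.
remove_stale_socket() {
  sock=$1
  if [ ! -e "$sock" ]; then
    echo "nothing to remove at $sock"
  elif [ -S "$sock" ]; then
    rm -f -- "$sock" && echo "removed stale socket $sock"
  else
    echo "refusing to remove $sock: not a socket" >&2
    return 1
  fi
}

# Usage (typically needs root):
# remove_stale_socket /tmp/.newrelic.sock
```

The daemon should create a fresh socket the next time it starts, so nothing needs to be recreated by hand.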
Step 2: External Startup Mode
If the file newrelic.cfg exists in the path /etc/newrelic/ then it forces the agent into external startup mode, which means the daemon must be started manually. If this file is present, then in the agent and daemon logs you will also see startup=init when the agent and daemon start up and establish a connection.
We recommend using agent startup mode instead, because it ensures that the agent spawns a worker daemon if it cannot detect one running already. This is a more reliable method of starting the daemon. Deleting or renaming the newrelic.cfg file switches the agent back to agent startup mode.
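Renaming rather than deleting keeps a backup in case you want external startup mode back later. Here is a minimal sketch; disable_cfg is a hypothetical helper name, and the directory is passed in so you can try it somewhere harmless first.

```shell
# Rename newrelic.cfg to newrelic.cfg.bak inside the given directory, if present.
disable_cfg() {
  dir=$1
  if [ -f "$dir/newrelic.cfg" ]; then
    mv -- "$dir/newrelic.cfg" "$dir/newrelic.cfg.bak" &&
      echo "renamed to $dir/newrelic.cfg.bak"
  else
    echo "no newrelic.cfg in $dir"
  fi
}

# Usage (typically needs root):
# disable_cfg /etc/newrelic
```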
Settings should be populated in the newrelic.ini file rather than the newrelic.cfg file. The newrelic.ini file will be located with PHP on the host. If you are not sure where the newrelic.ini file is, run php -i to get the PHP info at the command line. Near the top, the entry called Additional .ini files parsed will list the path to the newrelic.ini file.
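If you would rather not scroll through the full PHP info, something like the sketch below can pull out just the path. It assumes the file name ends in newrelic.ini (for example 20-newrelic.ini) and uses php --ini, which prints the same list of parsed ini files; extract_newrelic_ini is a name I made up for illustration.

```shell
# Print any paths ending in newrelic.ini found in `php --ini` style output on stdin.
extract_newrelic_ini() {
  tr ',' '\n' | grep -o '/[^ ,]*newrelic\.ini' || true
}

# Only query PHP if the CLI is installed on this host.
if command -v php >/dev/null 2>&1; then
  php --ini | extract_newrelic_ini
fi
```

Keep in mind that the command-line PHP and the web server's PHP can load different ini directories, so double-check against the PHP info page served by the web server if the two disagree.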
Step 3: Agent-Daemon Communication Method and the Port Setting
Setting the port in the newrelic.ini file will change the method of communication for the agent and its worker daemon. By default this communication is over the Unix Domain Socket (UDS) /tmp/.newrelic.sock. However, some operating systems that run systemd, such as CentOS 7, will not permit a process to access another process’s temp files. Because of this, you can instruct the agent and daemon to switch to communicating over an abstract socket by setting newrelic.daemon.port to @newrelic-daemon.
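For anyone scripting this change, here is one possible sketch. It drops any existing uncommented newrelic.daemon.port line and appends the new value, writing back through cat so the file keeps its owner and permissions. The helper name set_daemon_port is hypothetical, and appending at the end of the file is my assumption about what is acceptable for your ini layout.

```shell
# Set newrelic.daemon.port in the given ini file, replacing any uncommented existing line.
set_daemon_port() {
  ini=$1
  value=$2
  [ -r "$ini" ] || return 1
  tmp=$(mktemp) || return 1
  # Keep every line except an existing newrelic.daemon.port assignment.
  grep -v '^[[:space:]]*newrelic\.daemon\.port[[:space:]]*=' "$ini" > "$tmp" || :
  printf 'newrelic.daemon.port = "%s"\n' "$value" >> "$tmp"
  # Overwrite in place so ownership and mode of the ini file are preserved.
  cat "$tmp" > "$ini"
  rm -f "$tmp"
}

# Usage (path is an example only; use the one reported by php -i):
# set_daemon_port /path/to/newrelic.ini "@newrelic-daemon"
```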
SELinux can also prevent the use of the default UDS. Check the status of SELinux on your system by running sestatus. If the abstract socket does not resolve the problem, you could also set SELinux to permissive mode, create a policy that allows the agent and daemon to communicate, or try one of the alternative communication methods below.
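To read just the mode out of the sestatus report, a small filter like this may help. It assumes the usual "Current mode:" line in sestatus output; selinux_mode is a name I chose for illustration. When SELinux is disabled, that line is absent and the filter prints nothing.

```shell
# Print the value of the "Current mode" line from sestatus-style output on stdin.
selinux_mode() {
  awk -F: '/^Current mode/ { sub(/^[ \t]+/, "", $2); print $2 }'
}

# Only run sestatus if it is installed on this host.
if command -v sestatus >/dev/null 2>&1; then
  sestatus | selinux_mode
fi
```

A result of enforcing means SELinux may be blocking the socket, while permissive means denials are only logged.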
Alternatives to the abstract socket
An abstract socket will not work in some cases, for example in Docker. In this case you can use an alternate UDS path or an unused TCP port. I have listed two examples below that you could choose from:
newrelic.daemon.port= "/run/.newrelic.sock"
- A UDS in systemd’s recommended location: /run. As explained on StackExchange, the /run directory is the companion directory to /var/run, just as /bin is the companion of /usr/bin.
newrelic.daemon.port= "9500"
- This number is set to an unused TCP Port, but could be set to any number from 1 - 65534. See Wikipedia for a list of TCP/UDP ports that are already used.
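To summarize the three forms of the port value described in this post, the sketch below classifies a value by its first character. This mirrors only the examples given here (a leading @ for an abstract socket, a leading / for a UDS path, digits for a TCP port); classify_port is a hypothetical helper, it does not check the numeric range, and it is not how the agent itself parses the setting.

```shell
# Classify a newrelic.daemon.port value by the forms shown in this post.
classify_port() {
  case "$1" in
    @*) echo "abstract socket" ;;
    /*) echo "unix domain socket" ;;
    ''|*[!0-9]*) echo "unrecognized" ;;
    *) echo "tcp port" ;;
  esac
}

classify_port "@newrelic-daemon"
classify_port "/run/.newrelic.sock"
classify_port "9500"
```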
External Startup Mode and the Port Setting
If you need to use external startup mode, be sure to set the port identically in the newrelic.ini file and in the newrelic.cfg file. In this case the newrelic.ini file will control the agent’s behavior and the newrelic.cfg file will control the daemon’s behavior. If they are set to different ports, they will be trying to communicate in different ways and the agent will not be able to detect the daemon. This can result in an ECONNREFUSED error.
Any setting in the newrelic.ini file that starts with newrelic.daemon. has an equivalent in the newrelic.cfg file. For example, newrelic.daemon.port in the newrelic.ini file has the equivalent port in the newrelic.cfg file.
- In newrelic.ini:
newrelic.daemon.port= "@newrelic-daemon"
- In newrelic.cfg:
port= "@newrelic-daemon"
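A quick way to confirm the two files agree is to read both values and compare them. The sketch below assumes simple key = "value" lines like the ones above and ignores commented-out lines; read_setting and ports_match are names I made up for illustration.

```shell
# Print the last uncommented value assigned to a key in a file.
# $1: key as a basic regular expression, $2: file
read_setting() {
  sed -n "s/^[[:space:]]*${1}[[:space:]]*=[[:space:]]*//p" "$2" |
    tail -n 1 | tr -d '"' | sed 's/[[:space:]]*$//'
}

# Succeed only if the ini and cfg files name the same, non-empty port.
# $1: newrelic.ini path, $2: newrelic.cfg path
ports_match() {
  ini_port=$(read_setting 'newrelic\.daemon\.port' "$1")
  cfg_port=$(read_setting 'port' "$2")
  [ -n "$ini_port" ] && [ "$ini_port" = "$cfg_port" ]
}

# Usage (ini path is an example only):
# ports_match /path/to/newrelic.ini /etc/newrelic/newrelic.cfg && echo "ports match"
```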
Step 4: Force Killing the Agent and Daemon
The agent and the daemon processes read all their settings when they are initially spawned, and will not read new settings until they receive a restart signal. Typically they can be restarted by simply restarting the web server. This will redeploy all PHP extensions, including the New Relic PHP agent. In agent startup mode, the agent will spawn a new daemon with the new settings. In external startup mode, you must also restart the daemon manually.
Note: some web servers provide a graceful restart option, which allows current requests to complete before the web server restarts. On production servers, this is the recommended method for restarting the web server.
It is possible that the agent and daemon processes will not respond correctly to the restart signal. In this case they are considered hung processes, or zombie daemons (which I prefer, because how fun is it to say: “The daemon failed to respawn and became a zombie…”). To check if this is the case, you can run ps aux | grep newrelic-daemon. This should return information on two daemon processes, a watcher and a worker, as well as on the grep itself. This information includes the process ID, as well as the start time of the process. I have included an example below from an Ubuntu 14 (Trusty) host:
$ ps aux | grep newrelic-daemon
root 1656 0.0 0.7 187344 7444 ? Ssl 12:03 0:00 /usr/bin/newrelic-daemon --agent --pidfile /var/run/newrelic-daemon.pid --logfile /var/log/newrelic/newrelic-daemon.log --port /tmp/.newrelic.sock --tls --define utilization.detect_aws=true --define utilization.detect_docker=true
root 1662 0.0 0.9 253936 9488 ? Sl 12:03 0:00 /usr/bin/newrelic-daemon --agent --pidfile /var/run/newrelic-daemon.pid --logfile /var/log/newrelic/newrelic-daemon.log --port /tmp/.newrelic.sock --tls --define utilization.detect_aws=true --define utilization.detect_docker=true -no-pidfile
user2 2596 0.0 0.2 15948 2188 pts/2 S+ 12:04 0:00 grep --color=auto newrelic-daemon
If the start time of the processes does not reflect the last time that you restarted the web server, then you can force kill them. This can be accomplished by identifying them by process ID with kill -9 PID. I have provided an example below. After killing the watcher and worker daemons, only the grep process remains:
$ sudo kill -9 1656
$ sudo kill -9 1662
$ ps aux | grep newrelic-daemon
user2 4132 0.0 0.2 15944 2128 pts/2 S+ 12:16 0:00 grep --color=auto newrelic-daemon
Alternatively, you can force kill both processes at once by process name, as in the resolution instructions above:
killall newrelic-daemon
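If you only want the process IDs and start times from the ps aux output, without the grep line, a small awk filter can do it. This sketch assumes the column layout shown in the example above (PID in column 2, START in column 9, command in column 11); list_daemons is a name I chose for illustration.

```shell
# From `ps aux` output on stdin, print "PID START" for each newrelic-daemon process.
# Matching on the command column skips the grep process itself.
list_daemons() {
  awk '$11 ~ /newrelic-daemon$/ { print $2, $9 }'
}

# Usage:
# ps aux | list_daemons
```

Compare the start times it prints against the time of your last web server restart before deciding whether to kill anything.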
Step 5: Restart the Web Server to Apply Changes
As discussed in the step 4 explanation above, new settings are not deployed to the agent and daemon until these processes are restarted. The agent restarts with PHP as a PHP extension when the web server restarts. The daemon is either respawned by the agent in agent startup mode, or must be restarted manually in external startup mode. It is recommended to restart the web server gracefully on production hosts to prevent dropping requests.