When attempting to start the Ambari agent, it does not start. The start command returns no error message, but a subsequent ambari-agent status returns the error message "Agent not running".
After running ambari-agent start once, observe that the /var/log/ambari-agent/ambari-agent.log file contains only a few lines for each attempted start:
INFO 2017-03-24 10:37:53,103 main.py:90 - loglevel=logging.INFO
INFO 2017-03-24 10:37:53,103 main.py:90 - loglevel=logging.INFO
INFO 2017-03-24 10:37:53,103 main.py:90 - loglevel=logging.INFO
INFO 2017-03-24 10:37:53,104 DataCleaner.py:39 - Data cleanup thread started
INFO 2017-03-24 10:37:53,106 DataCleaner.py:120 - Data cleanup started
INFO 2017-03-24 10:37:53,108 DataCleaner.py:122 - Data cleanup finished

The logs don't show an error even though it is not a complete startup sequence.
Check for fuser processes in uninterruptible sleep state using the command ps -flye | grep fuser. Confirm that the output looks similar to the output below (multiple fuser processes in 'D' state):
D root 513702 513701 0 80 0 2300 25605 rpc_wa 13:09 pts/14 00:00:00 fuser 8670 tcp
S root 521174 1 0 80 0 1264 2825 wait 13:47 pts/14 00:00:00 /bin/sh -c fuser 8670/tcp 2>/dev/null | awk '{print $2}'
D root 521175 521174 0 80 0 2296 1396 rpc_wa 13:47 pts/14 00:00:00 fuser 8670 tcp
S root 521929 521921 0 80 0 1264 2825 wait 13:56 pts/14 00:00:00 /bin/sh -c fuser 8670/tcp 2>/dev/null | awk '{print $2}'
D root 521930 521929 0 80 0 2292 1396 rpc_wa 13:56 pts/14 00:00:00 fuser 8670 tcp
S gpadmin 523226 523026 0 80 0 904 25812 pipe_w 14:11 pts/15 00:00:00 grep fuser
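
To list only the stuck processes, the ps output can be passed through a small filter. The following is a minimal sketch, assuming the 14-column ps -flye layout shown above (state in column 1, PID in column 3, command name in column 14). The function name stuck_fuser_pids is hypothetical and is not part of Ambari.

```shell
# Hypothetical helper (not part of Ambari): reads `ps -flye` output on stdin
# and prints the PID of every fuser process in state D (uninterruptible sleep).
# Assumes the column layout S UID PID PPID C PRI NI RSS SZ WCHAN STIME TTY TIME CMD.
stuck_fuser_pids() {
    awk '$1 == "D" && $14 == "fuser" { print $3 }'
}

# Usage on a live host:
#   ps -flye | stuck_fuser_pids
```

Matching on the command column skips the /bin/sh -c wrapper processes and the grep process itself, which also contain the word fuser but are not in 'D' state.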
This is an issue in the OS related to NFS. In this scenario, the affected host has NFS problems.
The Ambari Agent startup uses the fuser command to obtain the PID of the Ambari Agent. Since this command hangs indefinitely due to a bug, the startup process for the Ambari Agent never completes.