Pivotal HD (PHD) cluster deploy fails when icm_client fails with "Could not start Service nmon"

Article ID: 294560

Updated On:

Products

Services Suite

Issue/Introduction

Deploying a Pivotal HD (PHD) cluster with icm_client fails with the following entries in the log file /tmp/GPHDNodeInstaller_xxxxxx.log:
notice: nmon_conf_dir = /etc/nmon ; nmon_conf_subdir = /etc/nmon/conf ; nmon-site.xml path = /etc/nmon/conf/nmon-site.xml
notice: /Stage[main]/Mgmt_apps::Nmon/Notify[nmon_conf_dir = /etc/nmon ; nmon_conf_subdir = /etc/nmon/conf ; nmon-site.xml path = /etc/nmon/conf/nmon-site.xml]/message: defined 'message' as 'nmon_conf_dir = /etc/nmon ; nmon_conf_subdir = /etc/nmon/conf ; nmon-site.xml path = /etc/nmon/conf/nmon-site.xml'
debug: /Stage[main]/Mgmt_apps::Nmon/Notify[nmon_conf_dir = /etc/nmon ; nmon_conf_subdir = /etc/nmon/conf ; nmon-site.xml path = /etc/nmon/conf/nmon-site.xml]: The container Class[Mgmt_apps::Nmon] will propagate my refresh event
debug: Service[nmon](provider=redhat): Executing '/sbin/service nmon status'
debug: Service[nmon](provider=redhat): Executing '/sbin/service nmon start'
err: /Stage[main]/Mgmt_apps::Nmon/Service[nmon]/ensure: change from stopped to running failed: Could not start Service[nmon]: Execution of '/sbin/service nmon start' returned 1: at /var/lib/gphd/gphdmgr/puppet/modules/mgmt_apps/manifests/nmon.pp:47
The nmon service does not start and the status reports "unrecognized service":
[gpadmin@namenode-01 tmp]$ service nmon status

nmon: unrecognized service
nmon is a service bundled with Pivotal Command Center (PCC). During the deployment of Hadoop services, a different, incompatible version of the nmon package was installed on the failed hosts:
[root@namenode-01 ~]# rpm -qa|grep nmon

nmon-14i-8.el6.x86_64
The Hadoop deployment expects version "nmon-1.0.0-69.x86_64". The incompatible version of nmon was provided by a yum repository called local-epel:
[root@namenode-01 ~]# yum list nmon
Installed Packages
nmon.x86_64 14i-8.el6 @local-epel
The local-epel yum repository offers a higher version of nmon, so yum installs the local-epel package instead of the one from the Pivotal Command Center gphd.repo. The repository is enabled in epel.repo:
[gpadmin@namenode-01 yum.repos.d]$ cat epel.repo 
[local-epel]
name = epel Local
enabled = 1
baseurl = http://10.47.121.215/repos/centos/6/x86_64/epel/
gpgcheck = 0
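
The check described above can be scripted on each host. The following is a minimal sketch, not part of the PHD tooling: the function name check_nmon_version and the variable EXPECTED_NMON are hypothetical, and the expected package string is the one quoted in this article.

```shell
# Expected package string, as quoted in this article.
EXPECTED_NMON="nmon-1.0.0-69.x86_64"

# Hypothetical helper: compare an installed package string (for example the
# output of "rpm -q nmon") against the expected one.
# Prints a short verdict and returns non-zero on a mismatch.
check_nmon_version() {
    installed=$1
    if [ "$installed" = "$EXPECTED_NMON" ]; then
        echo "ok: $installed"
    else
        echo "mismatch: found $installed, expected $EXPECTED_NMON"
        return 1
    fi
}

# Example on a cluster host (requires rpm):
#   check_nmon_version "$(rpm -q nmon)"
```

On a host affected by this issue, the helper would report a mismatch for nmon-14i-8.el6.x86_64.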


Resolution

To resolve this issue, follow the steps below:

1. Disable epel.repo on the failed hosts by setting enabled = 0:
[gpadmin@namenode-01 yum.repos.d]$ cat epel.repo 
[local-epel]
name = epel Local
enabled = 0
baseurl = http://10.47.121.215/repos/centos/6/x86_64/epel/
gpgcheck = 0

2. Verify that the nmon package is provided by the gphd repository:

[gpadmin@namenode-01 ~]$ yum list nmon
Installed Packages
nmon.x86_64 1.0.0-69 @gphd

3. Uninstall the cluster and deploy it again:

[gpadmin@namenode-01 ~]$ icm_client uninstall -l <phd_cluster_name>

[gpadmin@namenode-01 ~]$ icm_client deploy -c <config_dir>
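
Steps 1 and 2 can be scripted when several hosts are affected. The sketch below is an illustration under assumptions, not a supported procedure: the helper names disable_repo_section and nmon_repo are hypothetical, the repo file is assumed to use the "key = value" layout shown above, and editing files under /etc/yum.repos.d normally requires root.

```shell
# Hypothetical helper for step 1: set "enabled = 0" inside one section of a
# yum .repo file. A backup copy is kept next to the file with a .bak suffix.
# Usage: disable_repo_section /etc/yum.repos.d/epel.repo local-epel
disable_repo_section() {
    repo_file=$1
    section=$2
    # Only lines between "[section]" and the next "[...]" header (or the end
    # of the file) are touched, so other sections keep their settings.
    sed -i.bak "/^\[$section]/,/^\[/ s/^enabled[[:space:]]*=.*/enabled = 0/" "$repo_file"
}

# Hypothetical helper for step 2: read "yum list nmon" output on stdin and
# print the repository name of the nmon package without the leading "@".
# Usage: yum list nmon | nmon_repo
nmon_repo() {
    awk '$1 ~ /^nmon\./ { repo = $3; sub(/^@/, "", repo); print repo; exit }'
}
```

With these helpers, step 2 passes when "yum list nmon | nmon_repo" prints gphd. Where the yum-utils package is installed, "yum-config-manager --disable local-epel" is an alternative to editing the file.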