pollagent v2.26 will not start up and shows max restarts errors in its log

Article ID: 38121

Products

DX Infrastructure Management NIMSOFT PROBES

Issue/Introduction

Problem:

The pollagent probe would not start.
 
The pollagent probe was displayed as green but had only a PID and no port. More importantly, discussion with the customer revealed that snmpcollector always had a PID but no port, whether pollagent was up or down. When run manually from the command line, pollagent gave no indication of any problem, even in verbose mode.

Environment:

  • CentOS 6.2
  • UIM 8.31
  • snmpcollector 2.26
  • pollagent 2.26
  • Robot 7.80
  • hub 7.80

Cause:

On the hub/robot where the pollagent and snmpcollector probes were deployed:
a) a wrong/old hostname was configured, and
b) the reverse lookup environment and related files needed adjustment.

Resolution:

1. Set snmpcollector loglevel to 5 and logsize to 5000
2. Cold start the snmpcollector probe

I then saw the following UnknownHostException error:

Feb 09 13:44:18:126 [main, snmpcollector] stderr: Caused by: java.net.UnknownHostException: emdnaphub01.e-mds.com: emdnaphub01.e-mds.com: Name or service not known
Feb 09 13:44:18:126 [main, snmpcollector] stderr:     at java.net.InetAddress.getLocalHost(Unknown Source)
Feb 09 13:44:18:126 [main, snmpcollector] stderr:     at com.nimsoft.nimbus.NimQoS.getHostAddr(NimQoS.java:875)

This rang a bell with the customer: it was an old hostname, configured when the box was originally commissioned, that was no longer valid.
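At loglevel 5 the log is large, so it can help to pull the failing name out programmatically. The following is only a sketch: the class name UnknownHostFinder and the choice to read the log from standard input are illustrative assumptions, not part of the probe or of UIM. It looks for the java.net.UnknownHostException text shown in the log lines above and prints each distinct hostname that could not be resolved.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not shipped with UIM.
class UnknownHostFinder {
    // Captures the host name that follows "java.net.UnknownHostException: ".
    private static final Pattern UNKNOWN_HOST =
            Pattern.compile("java\\.net\\.UnknownHostException: ([^:\\s]+)");

    /** Returns the unresolved host name found in a log line, if any. */
    static Optional<String> unresolvedHost(String logLine) {
        Matcher m = UNKNOWN_HOST.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Set<String> hosts = new LinkedHashSet<>();   // keeps first-seen order, drops repeats
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                unresolvedHost(line).ifPresent(hosts::add);
            }
        }
        for (String host : hosts) {
            System.out.println("Unresolvable host name: " + host);
        }
    }
}
```

After compiling, it could be run as `java UnknownHostFinder < snmpcollector.log`, with the log file path adjusted to the installation.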

Under the covers, snmpcollector determines the hostname with a standard Java call, InetAddress.getLocalHost().getHostName(). That call reads the hostname configured on the operating system and resolves it through the system's name resolution (the hosts file and DNS), so both the local hostname setting and the DNS records have to be correct. Here the lookup was still being made for the old name, which no longer resolved.
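The same call can be exercised outside the probe to see whether the machine's hostname resolves. This is a minimal sketch, not the probe's actual code: the class name LocalHostCheck and the method describeLocalHost are hypothetical, and only the InetAddress.getLocalHost() call itself is taken from the stack trace above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical stand-alone check, not part of snmpcollector.
class LocalHostCheck {
    /** Makes the same JDK call as the probe and reports the outcome as text. */
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "resolved: " + local.getHostName() + " -> " + local.getHostAddress();
        } catch (UnknownHostException e) {
            // This is the failure seen in the snmpcollector log.
            return "unresolved: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

If it is run with the same Java runtime and on the same machine as the probe, an "unresolved" result would point to the hostname problem described here.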

3. I therefore ran a reverse lookup, which returned no valid result. Further checking revealed that /etc/named.conf was also missing the reference for the reverse lookup zones.
4. The customer also checked the reverse lookup with the 'dig' command (which shows which name server is being queried and dumps the returned data). We also had to change the HOSTNAME entry in the following file, because it was set to emdnaphub01.e-mds.com:

vi /etc/sysconfig/network

NETWORKING=yes
HOSTNAME=emdnaphub01.emds.com
GATEWAY=10.xx.xx.x

5. I asked the customer to run the hostname command, and it returned emdnaphub01.e-mds.com, the old, invalid hostname that the box was originally configured with.

So we changed that hostname to the correct hostname, emdnaphub01.

6. Then we rebooted the box for the hostname change to take effect.

Once the hub/machine was back up, the pollagent and snmpcollector probes both had ports and PIDs and were working fine.
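After a change like this, name resolution can be checked in both directions before relying on the probes. The sketch below is illustrative only: the class LookupRoundTrip and its helper methods are hypothetical names. It resolves the local hostname to an address, looks that address back up, and compares the short names.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical verification helper, not part of UIM.
class LookupRoundTrip {
    /** First DNS label of a host name, e.g. "host01" for "host01.example.com". */
    static String shortName(String host) {
        int dot = host.indexOf('.');
        return dot < 0 ? host : host.substring(0, dot);
    }

    /** True when both names share the same first label, ignoring case. */
    static boolean sameShortName(String a, String b) {
        return shortName(a).equalsIgnoreCase(shortName(b));
    }

    public static void main(String[] args) {
        try {
            // Forward: configured host name -> address (the call that failed in the probe).
            InetAddress local = InetAddress.getLocalHost();
            String name = local.getHostName();
            String ip = local.getHostAddress();

            // Reverse: address -> name. getCanonicalHostName() falls back to the
            // textual IP address when no name is found for the address.
            String reverse = InetAddress.getByAddress(local.getAddress()).getCanonicalHostName();

            System.out.println("forward: " + name + " -> " + ip);
            if (reverse.equals(ip)) {
                System.out.println("reverse: no name found for " + ip);
            } else {
                System.out.println("reverse: " + ip + " -> " + reverse
                        + (sameShortName(name, reverse) ? " (matches)" : " (differs)"));
            }
        } catch (UnknownHostException e) {
            System.out.println("forward lookup failed: " + e.getMessage());
        }
    }
}
```

Comparing only the first label is a deliberate simplification, so that a short hostname such as emdnaphub01 and its fully qualified form count as a match.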

Environment

Release: CNMSPP99000-8.31-Unified Infrastructure Mgmt-Server Pack-- On Prem