Discovery is not working on a few servers

Article ID: 116987

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

Discovery stopped working for a few servers: discovery_server no longer discovers those nodes.

Environment

- UIM 8.5.1 SP1
- discovery_server 8.5.2

Cause

- The hub tunnel in a 3-tier hub setup was down, even though the hub appeared to be up and running (green, with a port and PID).

Resolution

3rd tier hub queue (discovery)
Site hub: abcw-ssss-p1 (Site Hub 1)

Queue Name: probeDiscovery
Type: ATTACH
Subject/Queue: probe_discovery
Current Status-> 5734 messages queued and slowly building; they were not being processed

2nd tier hub queue (discovery)
Proxy hub: xxxx_Proxy_hub1
Queue Name: probeDiscoveryFromSite_hub1
Type: GET
Subject/Queue: probeDiscovery
Current Status-> 0 queued and 0 sent

1st tier = Primary hub queue (discovery)
There is a GET queue on the primary hub for Proxy hub 1 discovery messages.

Queue Name: probeDiscoveryfromProxy_hub_1
Type: GET
Subject/Queue: probeDiscoveryToPrimary
Current Status-> 0 queued and 0 sent

On the Primary hub, there is also the expected default discovery_server ATTACH queue:

Queue Name: probeDiscovery
Type: ATTACH
Subject/Queue: probe_discovery
Current Status-> 0 queued and 38 sent
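The queue readings above can be compared hop by hop to see where the discovery messages stop moving. The following is a minimal sketch in Python, not a UIM API: `QueueStatus` and `find_stalled_hop` are hypothetical names, and the counts are assumed to be copied by hand from each hub's queue status. The `sent=0` value for the site hub is an assumption, since the article only states that its backlog was not being processed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueueStatus:
    """One queue reading, copied manually from a hub's queue status view."""
    hub: str
    queue_name: str
    queue_type: str  # "ATTACH" or "GET"
    queued: int
    sent: int


def find_stalled_hop(chain: List[QueueStatus]) -> Optional[QueueStatus]:
    """Return the first queue that has a backlog while the next queue
    downstream has sent nothing; `chain` is ordered from site hub to primary."""
    for upstream, downstream in zip(chain, chain[1:]):
        if upstream.queued > 0 and downstream.sent == 0:
            return upstream
    return None


# Readings from this article, ordered from the 3rd tier toward the primary hub.
chain = [
    # sent=0 is assumed; the article only says the backlog was not processed.
    QueueStatus("abcw-ssss-p1", "probeDiscovery", "ATTACH", queued=5734, sent=0),
    QueueStatus("xxxx_Proxy_hub1", "probeDiscoveryFromSite_hub1", "GET", queued=0, sent=0),
    QueueStatus("Primary hub", "probeDiscoveryfromProxy_hub_1", "GET", queued=0, sent=0),
    QueueStatus("Primary hub", "probeDiscovery", "ATTACH", queued=0, sent=38),
]

stalled = find_stalled_hop(chain)
if stalled is not None:
    print(f"Backlog on {stalled.hub} ({stalled.queue_name}): {stalled.queued} queued")
```

With these readings the sketch points at the site hub's ATTACH queue: messages pile up there while the Proxy hub's GET queue receives nothing, which suggests a problem on the link between those two hubs.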

discovery_server.log error:

04 Oct 2018 11:13:20,440 [hubWorker-3] WARN  com.nimsoft.discovery.common.nimbus.scan.AbstractHubRobotInfoFetcher - hub work exception for /xxxx/xxxx_Primary_hub/yyyW-xxxxxx-P1/hub : (4) not found, Received status (4) on response (for sendRcv) for cmd = 'nametoip' name = '/xxxx/xxxx_Primary_hub/xxxW-PRxxxxx-P1/hub'

The error above indicates that discovery_server (DS) is failing on a nametoip call while it is trying to fetch the hub/robot list. So I checked the Hubs tab and could see that Site_hub_1 was red. As it turns out, that hub is a tunnel server, and the tunnel between Site Hub 1 and Proxy Hub 1 was down, so the discovery messages could not get from the site hub to its designated Proxy hub.
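To find every hub that fails this way, discovery_server.log can be scanned for the same warning. The sketch below is an illustration only: the regular expression is derived from the single log line quoted in this article, so other log variants may not match, and `parse_hub_failures` is a hypothetical helper name.

```python
import re

# Pattern derived from the one log line shown in this article (an assumption:
# other discovery_server versions may word the message differently).
PATTERN = re.compile(
    r"hub work exception for (?P<hub>\S+) : \((?P<status>\d+)\) (?P<reason>[^,]+), "
    r".*?cmd = '(?P<cmd>[^']+)' name = '(?P<name>[^']+)'"
)

SAMPLE = (
    "04 Oct 2018 11:13:20,440 [hubWorker-3] WARN  "
    "com.nimsoft.discovery.common.nimbus.scan.AbstractHubRobotInfoFetcher - "
    "hub work exception for /xxxx/xxxx_Primary_hub/yyyW-xxxxxx-P1/hub : (4) not found, "
    "Received status (4) on response (for sendRcv) for cmd = 'nametoip' "
    "name = '/xxxx/xxxx_Primary_hub/xxxW-PRxxxxx-P1/hub'"
)


def parse_hub_failures(lines):
    """Return one dict per 'hub work exception' line: hub, status, reason, cmd, name."""
    failures = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            failures.append(match.groupdict())
    return failures


for failure in parse_hub_failures([SAMPLE]):
    print(failure["hub"], failure["cmd"], failure["status"], failure["reason"])
```

To run it against a real log, pass the open file object instead of `[SAMPLE]`. A hub address that shows up repeatedly with `cmd = 'nametoip'` and status 4 is a candidate to check in the Hubs tab.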

To resolve it, we restarted the Proxy hub and all probe discovery messages were then processed.