If you encounter this issue, do the following to troubleshoot:
-
Check the stage of discovery progress using the following commands:
./dmctl -s <domain> get ICF_TopologyManager::ICF-TopologyManager::probeInProgress
./dmctl -s <domain> get ICF_TopologyManager::ICF-TopologyManager::InPostProcess
Example
./dmctl -s TEST-AM-PM get ICF_TopologyManager::ICF-TopologyManager::probeInProgress
TRUE
./dmctl -s TEST-AM-PM get ICF_TopologyManager::ICF-TopologyManager::InPostProcess
FALSE
-
The above example output indicates that the discovery is stuck in the Probing phase.
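The check above can be sketched as a small shell helper that turns the two flag values into a phase label. This is only an illustration: the function name `discovery_phase` is hypothetical, and only the TRUE/FALSE combination (Probing) is taken from the example output; the other labels are assumptions.

```shell
# Hypothetical helper: map the two flag values returned by dmctl to a phase label.
# Only the probeInProgress=TRUE / InPostProcess=FALSE case comes from the
# example above; the remaining labels are assumptions.
discovery_phase() {
    probe="$1"   # value of probeInProgress (TRUE or FALSE)
    post="$2"    # value of InPostProcess (TRUE or FALSE)
    if [ "$probe" = "TRUE" ] && [ "$post" = "FALSE" ]; then
        echo "probing"
    elif [ "$post" = "TRUE" ]; then
        echo "post-processing (assumed)"
    else
        echo "neither flag set (assumed: no probe or post-process running)"
    fi
}

# Values from the example output:
discovery_phase TRUE FALSE
```

In practice, the two arguments would be the values printed by the two dmctl commands, assuming each command prints only TRUE or FALSE as in the example.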
-
Check the Smarts IP domain log file for the following errors:
SWFE-E-EGETNEXT-While getting next of OID .1.3.6.1.4.1.9.9.109.1.2.1.1.2.1.155,
SNMP-ERESPONSE-No response from xx.xx.xx.xx, port 161
SNMP-ETIMEOUT-Timed out
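To see whether these errors concentrate on a single device, the "No response" messages can be counted per address. The following is a minimal sketch using standard tools; the function name `count_no_response` and the log path in the usage comment are hypothetical, and the pattern assumes the message text looks exactly like the sample above.

```shell
# Count "No response" errors per device address in a log file.
# $1 = path to the domain log file (the actual path depends on the installation).
count_no_response() {
    grep 'SNMP-ERESPONSE-No response from' "$1" \
        | sed 's/.*No response from \([^,]*\),.*/\1/' \
        | sort | uniq -c | sort -rn
}

# Usage (hypothetical log path):
#   count_no_response /path/to/domain.log
```

The address with the highest count appears first. A device that accounts for nearly all of the errors is the likely cause of the delay.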
-
If the above errors recur continuously for one particular device, the discovery of that device is timing out at every OID. When this occurs, Smarts IP keeps trying each OID for the number of retries multiplied by the timeout value set in the discovery.conf file for the domain. By default, the SNMPTimeout during the Probing phase is 1000 ms and the number of retries is 3, so each unresponsive OID can hold up discovery for about 3 seconds. These retries may account for most of the discovery time.
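The retry arithmetic can be worked through as follows. This sketch uses the default values quoted above; the variable and function names are illustrative, and the OID count in the usage line is a hypothetical figure, not a measurement.

```shell
# Worst-case wait per unresponsive OID, using the defaults quoted above.
snmp_timeout_ms=1000   # SNMPTimeout during the Probing phase
snmp_retries=3         # number of retries

per_oid_wait_ms() {
    echo $(( snmp_timeout_ms * snmp_retries ))
}

# Hypothetical helper: total wait in seconds if $1 OIDs all time out.
total_wait_seconds() {
    echo $(( $1 * snmp_timeout_ms * snmp_retries / 1000 ))
}

per_oid_wait_ms          # prints 3000
total_wait_seconds 1200  # prints 3600
```

With these defaults, roughly 1200 unresponsive OIDs would add about one hour to the discovery of a single device.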
-
Check previous logs for the device generating the errors (if available) to find its total discovery time, as in the following example:
Example
DiscoveryTime = 0 14:28:18
-
The above example indicates that the device discovery is taking 14+ hours. This is typically caused by a defective SNMP agent on the device, meaning the discovery failure is a device issue.
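To compare discovery times across devices, the value can be converted to seconds. This sketch assumes the format is days followed by HH:MM:SS, which is consistent with reading the example as 14+ hours but is not confirmed here; the function name `discovery_seconds` is hypothetical.

```shell
# Convert a DiscoveryTime value to seconds.
# Assumed format: "<days> <HH>:<MM>:<SS>".
discovery_seconds() {
    echo "$1" | awk '{ split($2, t, ":"); print $1 * 86400 + t[1] * 3600 + t[2] * 60 + t[3] }'
}

discovery_seconds "0 14:28:18"   # prints 52098
```

If all of that time were spent waiting on timeouts at 3 seconds per OID (3 retries at 1000 ms each), it would correspond to about 17366 unresponsive OID requests.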
-
If the issue is found to be caused by the device's SNMP agent, stop the SNMP agent on the problem device. This allows the Smarts IP discovery to continue and the rest of the discovery process to complete faster. If this does not work, you may need to restart the Smarts IP domain so that the discovery process stops completely, and then start the discovery process again.