Primary hub probes down and data_engine test shows command not found(4)

Article ID: 269831

Updated On:

Products

  • DX Unified Infrastructure Management (Nimsoft / UIM)
  • CA Unified Infrastructure Management On-Premise (Nimsoft / UIM)
  • CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

Probes on the DX UIM primary hub are down, and we are unable to connect to the database server through the data_engine probe.

The data_engine probe failed repeatedly, even after it was recovered by deactivating and then reactivating it.

- Various probes were down (red). The data_engine log showed no errors, the probe was still inserting data into the database, and it was green with a port and PID assigned.
- The data_engine connection test returned 'command not found(4)':

STARTING TEST ...
Provider=SQLNCLI11
Initial Catalog=xxxxx
Data Source=xxx.xxxx.xxx,<port>
Network Library=dbmssocn;Language=us_english
User ID=xxxxxxxx
Password=******
Contacting data_engine probe...
Executed async test_connection on <NimBUS_Address>/data_engine, please wait...
Waiting...
test connection DONE - result:
command not found(4)

Environment

  • Release: UIM 20.3.3 or higher
  • data_engine 20.31 or higher
  • Primary hub (OS: Windows 2019 Standard)

Resolution

We had to recover or reinstall several probes, including ems, automated_deployment_engine, and distsrv, among others. This required renaming some files, but the rename kept failing with prompts indicating that another process was still using them.

Attempts at probe deployment generated 'inst_pkg_failed: SID has expired' errors.

Windows Task Manager showed that numerous leftover probe processes were still running even though the probes had been deactivated.

We used Task Manager to find the leftover native (C) and Java probe processes and forcefully ended them, e.g., distsrv.exe, automated_deployment_engine, and telemetry. After that, we were able to rename or delete files, delete distsrv jobs, and deploy probes without 'SID has expired' errors.
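
The Task Manager check can also be scripted. The following is a minimal sketch, not part of the original fix, that lists running processes with the standard Windows `tasklist` command and reports any whose image name matches a list of suspects. The contents of `SUSPECT_IMAGES` and the function names are assumptions for illustration. Java-based probes typically show up as `java.exe`, which may also host unrelated processes, so confirm each PID before ending anything.

```python
import csv
import io
import subprocess

# Hypothetical image names to look for; adjust to the probes on your hub.
# Java-based probes usually appear as java.exe rather than under the probe name.
SUSPECT_IMAGES = {"distsrv.exe", "java.exe"}


def parse_tasklist_csv(text):
    """Parse `tasklist /FO CSV /NH` output into (image_name, pid) tuples."""
    processes = []
    for row in csv.reader(io.StringIO(text)):
        # Expected columns: Image Name, PID, Session Name, Session#, Mem Usage
        if len(row) >= 2 and row[1].strip().isdigit():
            processes.append((row[0], int(row[1])))
    return processes


def find_leftovers(processes, suspects=SUSPECT_IMAGES):
    """Return the processes whose image name matches a suspect (case-insensitive)."""
    wanted = {name.lower() for name in suspects}
    return [(name, pid) for name, pid in processes if name.lower() in wanted]


def list_running_processes():
    """Run tasklist (Windows only) and return the parsed process list."""
    result = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    )
    return parse_tasklist_csv(result.stdout)


def force_kill(pid):
    """Forcefully end one process by PID, like 'End task' in Task Manager."""
    subprocess.run(["taskkill", "/PID", str(pid), "/F"], check=True)


def main():
    """Report leftover processes; killing is left as a deliberate manual step."""
    for name, pid in find_leftovers(list_running_processes()):
        print(f"{name} (PID {pid}) is still running")
        # force_kill(pid)  # enable only after confirming the PID is a leftover probe
```

Calling `main()` on the hub after deactivating the affected probes prints any matching processes that are still alive. The kill step is commented out on purpose, because matching by image name alone cannot tell a leftover probe from a healthy one.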

Primary hub
Overall, the machine seemed unstable and sluggish, even during simple tasks such as viewing files in Notepad or selecting items in folders and on the desktop.

It is recommended to run Windows Update as soon as possible, repeating until no further updates are available, and then reboot the server.

All issues were resolved.