UIM outage after Windows update


Article ID: 132143



Products

DX Infrastructure Management NIMSOFT PROBES

Issue/Introduction

After the hub machine is restarted/rebooted (in this case following a Windows update), UIM is down.

Key symptoms:

- Some probes on the Primary hub show a port and PID, but others are red and cannot be 'Activated'
- The probes that are red cannot be 'Deactivated' either
- For probes that are green and whose GUI can be opened, you cannot save ANY changes
- When you view the log file for ANY probe via right-click->View Log, nothing is being written to the log file
- When you try to edit a local probe's .cfg file directly on the filesystem, e.g., robot.cfg, you cannot save the change, and an "Access denied" error and/or an "Unable to contact /<NimBUS address>/<probe_name> permission denied(80040406)" error occurs:

<Please see attached file for image>

controller access denied error

<Please see attached file for image>

Failed save configuration data_engine

Cause

Access/permission errors were being thrown for two service accounts: a) an account used on the Primary and Secondary (HA) hubs to authenticate to the database, which is also the user the Robot is set to "Run As" on the Primary, and b) another account used on all other robots. Inspecting the UIM Nimsoft installation directory (Properties->Security) showed that ownership and access permissions had been altered or wiped out for those accounts.

According to the customer, this appears to have happened before and was caused by the customer's internal configuration management software, which schedules configuration changes for their security software. The job was apparently misconfigured.

The job made the erroneous change, but the issue did not reveal itself until the hub was restarted when the machine rebooted after the Windows update. This occurred in Production. On the assumption that the same job had also run against the Development UIM environment, we deliberately restarted the DEV hub to see whether the same symptoms would occur, and the issue was fully reproduced.
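Because the permission change stays hidden until the next hub restart, a simple write probe can surface it earlier. The following is a minimal sketch, not a supported UIM tool: it assumes the script is run as the service account in question, and the function name `can_write` and any path passed to it are illustrative only.

```python
import os
import tempfile


def can_write(directory):
    """Return True if the current user can create a file in `directory`.

    Creates and immediately removes a temporary file. Any OSError
    (access denied, missing directory, and so on) is treated as "not writable".
    """
    try:
        # delete=True removes the probe file as soon as the handle is closed.
        with tempfile.NamedTemporaryFile(dir=directory, delete=True):
            pass
        return True
    except OSError:
        return False
```

Run under the affected service account against the Nimsoft installation directory, a `False` result would point to the same condition described above before a restart exposes it.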

Environment

- UIM v8.51 
- OS: Windows 2012 
- Database: MS SQL Server Enterprise 2014 
- Windows authentication being used for the CA_UIM database
- data_engine v8.5

Resolution

- Access and permissions for the two service accounts were manually restored by the Windows Administrator.
- Going forward, the customer will either stop the job from executing or repair the misconfiguration so that the filesystem permissions for these admin accounts are no longer wiped out by the scheduled job.

Overall, the UIM service account used on the Primary and Secondary hubs requires ownership of the local filesystem and should be a member of the local Administrators group on those hubs. The other service account, used on the remote hubs and robots, requires administrative permissions that grant full access to the local filesystem on all of the other UIM hubs/robots.
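To verify the restored permissions, the output of the Windows `icacls` command for the installation directory can be checked for a Full control (`F`) entry per service account. The sketch below is one way to do that under stated assumptions: the install path and the account names (`CORP\svc_uim_hub`, `CORP\svc_uim_robot`) are hypothetical placeholders, and the parser only handles the simple rights form that `icacls <path>` normally prints.

```python
import re
import subprocess

# One ACE per line, e.g. "CORP\svc_uim_hub:(OI)(CI)(F)".
ACE_RE = re.compile(r"^(?P<principal>.+?):(?P<flags>(\([A-Za-z,]+\))+)$")


def full_control_principals(icacls_output, path):
    """Return the lower-cased principals that hold an allow-Full-control entry."""
    principals = set()
    for line in icacls_output.splitlines():
        line = line.strip()
        # The first line of icacls output is prefixed with the path itself.
        if line.startswith(path):
            line = line[len(path):].strip()
        match = ACE_RE.match(line)
        if not match:
            continue  # blank lines and the "Successfully processed" summary
        flags = re.findall(r"\(([A-Za-z,]+)\)", match.group("flags"))
        # Skip deny entries and inherit-only entries (not applied to the folder).
        if "F" in flags and "DENY" not in flags and "IO" not in flags:
            principals.add(match.group("principal").lower())
    return principals


def missing_full_control(icacls_output, path, required_accounts):
    """Return the required accounts that lack Full control, in input order."""
    granted = full_control_principals(icacls_output, path)
    return [acct for acct in required_accounts if acct.lower() not in granted]


def read_acl(path):
    """Run icacls on Windows and return its text output."""
    result = subprocess.run(["icacls", path], capture_output=True, text=True, check=True)
    return result.stdout


# Sample output with hypothetical accounts; the path is an assumed default.
SAMPLE_PATH = r"C:\Program Files (x86)\Nimsoft"
SAMPLE = r"""C:\Program Files (x86)\Nimsoft BUILTIN\Administrators:(OI)(CI)(F)
                               NT AUTHORITY\SYSTEM:(OI)(CI)(F)
                               CORP\svc_uim_hub:(OI)(CI)(F)
                               BUILTIN\Users:(OI)(CI)(RX)

Successfully processed 1 files; Failed processing 0 files
"""
```

On a hub, `missing_full_control(read_acl(path), path, accounts)` would list any service account whose Full control entry is absent. Note that this checks only explicit entries for the named accounts; access granted through group membership (for example, the local Administrators group) is not resolved by this sketch.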

Attachments

1559051371019000132143_sktwi15okjw4363ho.png
1559051368953000132143_sktwi15okjw4363hn.png