VMware Identity Manager (vIDM / WSA) service opensearch / elasticsearch will not start.
Article ID: 315176
Products
VMware Aria Suite
Issue/Introduction
The opensearch service (known as elasticsearch before 3.3.7) will not start. Running remediate or other vIDM requests through LCM may fail at the health check. In an SSH session to the vIDM nodes, opensearch is reported as not running while horizon-workspace is running:
/etc/init.d/opensearch status
Not running
/etc/init.d/horizon-workspace status
RUNNING as PID=_____
Environment
VMware Identity Manager 3.3.x
Cause
This may be caused by a stale Liquibase lock.
Resolution
First confirm that opensearch is Not Running but horizon-workspace is Running:
/etc/init.d/opensearch status
/etc/init.d/horizon-workspace status
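The check above can be sketched as a small shell function. This is a minimal sketch, not part of the product: the function name matches_symptom is hypothetical, and it assumes the status output contains the strings shown in this article ("Not running" and "RUNNING as PID=").

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when the two status strings match the
# symptom described in this article.
#   $1 = output of "/etc/init.d/opensearch status"
#   $2 = output of "/etc/init.d/horizon-workspace status"
# Assumption: the status text looks like the samples shown in this article.
matches_symptom() {
    # opensearch must report that it is not running
    case "$1" in
        *"Not running"*) ;;
        *) return 1 ;;
    esac
    # horizon-workspace must report that it is running
    case "$2" in
        *"RUNNING as PID="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a vIDM node (commented out so the sketch can be sourced safely):
# if matches_symptom "$(/etc/init.d/opensearch status 2>&1)" \
#                    "$(/etc/init.d/horizon-workspace status 2>&1)"; then
#     echo "opensearch is down while horizon-workspace is up"
# fi
```

If the function fails, the node does not show the exact symptom covered here, and the remaining steps may not apply.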
Try to simply restart opensearch:
/etc/init.d/opensearch restart
If the restart spends several minutes on Waiting for IDM, you can stop it with Ctrl+C.
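Instead of watching the restart and pressing Ctrl+C, the attempt can be given a time limit. The sketch below assumes the GNU coreutils timeout command is available on the appliance. The helper name run_with_limit and the 300-second limit are illustrative choices, not values from this article. Note that timeout sends SIGTERM by default, not the SIGINT that Ctrl+C sends.

```shell
#!/bin/sh
# Hypothetical helper: run a command, but give up after a time limit.
#   $1 = limit in seconds, remaining arguments = the command to run.
# Assumption: the coreutils "timeout" command is installed.
run_with_limit() {
    limit=$1
    shift
    # timeout returns a non-zero status if the limit is reached
    timeout "$limit" "$@"
}

# Usage on a vIDM node (commented out so the sketch can be sourced safely):
# run_with_limit 300 /etc/init.d/opensearch restart \
#     || echo "restart failed or did not finish within 300 seconds"
```

A non-zero result here means the same as having to cancel by hand: continue with the lock release procedure.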
A common cause of this issue is that the Liquibase lock cannot be acquired, for example after an unclean restart of opensearch.
Step 2 only needs to be executed once for the cluster. The remaining steps are run on every node.
1. Make sure the opensearch service is stopped on all nodes: /etc/init.d/opensearch stop
2. Release the locks (once for the cluster is enough, run on the psql primary node): /usr/sbin/hznAdminTool liquibaseOperations -forceReleaseLocks
3. Restart the main vIDM service, first on the primary, wait a minute or two, then on the other two nodes: service horizon-workspace restart
4. Start opensearch on all nodes: /etc/init.d/opensearch start
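The order of operations across the cluster can be sketched as a dry run. The function below only prints which command goes on which node and executes nothing. The function name print_recovery_plan and the node names in the usage line are hypothetical placeholders.

```shell
#!/bin/sh
# Dry-run sketch: prints the recovery commands in the order described above.
# Nothing is executed on any node.
#   $1 = psql primary node, remaining arguments = the other nodes.
print_recovery_plan() {
    if [ "$#" -lt 1 ]; then
        echo "usage: print_recovery_plan PRIMARY [OTHER_NODE...]" >&2
        return 1
    fi
    primary=$1
    shift
    # Stop opensearch on every node.
    for node in "$primary" "$@"; do
        echo "[$node] /etc/init.d/opensearch stop"
    done
    # Release the locks once, on the psql primary only.
    echo "[$primary] /usr/sbin/hznAdminTool liquibaseOperations -forceReleaseLocks"
    # Restart horizon-workspace on the primary first, then the others.
    echo "[$primary] service horizon-workspace restart"
    if [ "$#" -gt 0 ]; then
        echo "(wait a minute or two)"
    fi
    for node in "$@"; do
        echo "[$node] service horizon-workspace restart"
    done
    # Start opensearch on every node.
    for node in "$primary" "$@"; do
        echo "[$node] /etc/init.d/opensearch start"
    done
}

# Usage with placeholder node names:
# print_recovery_plan vidm-a vidm-b vidm-c
```

Printing the plan first makes it easy to confirm that the lock release appears exactly once and only against the primary before running anything by hand.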
Workaround: if forceReleaseLocks fails
If the hznAdminTool command above hangs and does not complete, there may be another lock which must be manually removed:
First confirm cluster health as per KB 367175: if hznAdminTool gives error "The connection attempt failed", this can indicate that the delegateIP needs to be assigned to the psql primary node on eth0:0.
Make sure Opensearch service is stopped on all nodes: /etc/init.d/opensearch stop
Log in to the DB on psql primary node with this command: sudo -u postgres psql -h localhost -U horizon saas
Check for a lock here: select * from saas.DatabaseChangeLogLock;
If a lock is found (locked shows t, with a date and an IP address), remove it like so: update saas.DATABASECHANGELOGLOCK SET LOCKED=false, LOCKGRANTED=null, LOCKEDBY=null where ID=1;
Log out of the database with \q and run steps 2, 3 and 4 of the Resolution procedure above: release the Liquibase locks, restart horizon-workspace, and then start opensearch.
(Note: for versions of vIDM earlier than 3.3.7, replace opensearch with elasticsearch wherever mentioned. These older versions are now EOL.)
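The lock check in the workaround can also be run non-interactively. The sketch below is an assumption-laden illustration: the helper name lock_is_held is hypothetical, the commented psql call reuses the connection options from this article with the standard psql flags -t (tuples only), -A (unaligned output) and -c (run one command), and it may still prompt for the database password.

```shell
#!/bin/sh
# Hypothetical helper: interprets the value of the LOCKED column.
# psql prints PostgreSQL booleans as "t" (true) or "f" (false).
#   $1 = value of LOCKED for the row with ID=1
lock_is_held() {
    case "$1" in
        t) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the psql primary node (commented out so the sketch can be sourced safely):
# locked=$(sudo -u postgres psql -h localhost -U horizon saas -t -A \
#     -c "select locked from saas.DatabaseChangeLogLock where id = 1;")
# if lock_is_held "$locked"; then
#     echo "Lock is held; clear it with the update statement above"
# fi
```

The function only reports the lock state. Clearing the lock is left to the update statement given in the workaround.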
Additional Information
Impact/Risks:
Brief service restart. If vIDM is actively serving user logins, users may experience a momentary disconnect.