In Aria Lifecycle Manager (vRSLCM), VMware Identity Manager (vIDM) is flagged as critical under the health status check.
Upon investigation, one of the vIDM nodes is found to be in a SHUTDOWN state, disrupting the service.
Checking the pgpool watchdog status on the primary vIDM node confirms this:
su root -c "echo -e 'password'|/usr/local/bin/pcp_watchdog_info -p 9898 -h localhost -U pgpool"
Output:
<Host1>:9999 Linux <Host1> <Host1> 9999 9000 4 MASTER
<Host2>:9999 Linux <Host2> <Host2> 9999 9000 7 SHUTDOWN
<Host3>:9999 Linux <Host3> <Host3> 9999 9000 7 STANDBY
Reviewing /var/log/pgService/pgService.log shows that the shutdown was caused by a network issue:
2025-03-18T14:03:38.680992+00:00 pgpool[21165]: [83872-1] 2025-03-18 14:03:38: pid 21165: WARNING: network IP is removed and system has no IP is assigned
2025-03-18T14:03:38.681263+00:00 pgpool[21165]: [83872-2] 2025-03-18 14:03:38: pid 21165: DETAIL: changing the state to in network trouble
2025-03-18T14:03:38.681593+00:00 pgpool[21165]: [83873-1] 2025-03-18 14:03:38: pid 21165: LOG: watchdog node state changed from [STANDBY] to [IN NETWORK TROUBLE]
2025-03-18T14:03:38.681624+00:00 pgpool[21165]: [83874-1] 2025-03-18 14:03:38: pid 21165: FATAL: system has lost the network
2025-03-18T14:03:38.681656+00:00 pgpool[21165]: [83875-1] 2025-03-18 14:03:38: pid 21165: LOG: Watchdog is shutting down
VMware Identity Manager 3.3.x
The affected vIDM node loses its network connection, which leads to its removal from the cluster and a transition to the SHUTDOWN state.
Step 1: Retrieve the pgpool Password
Open an SSH session to all three vIDM appliances and run the following command to get the password in use:
cat /usr/local/etc/pgpool.pwd
NOTE: If no value is returned, use the default password 'password' in the steps below.
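As an illustration, the retrieved value can be stored in a shell variable with a fallback to the default. This is a minimal sketch, and the PGPOOL_PWD variable name is an example, not something defined by the appliance:
# Capture the pgpool password, falling back to the default if the file is empty or missing (illustrative only)
PGPOOL_PWD=$(cat /usr/local/etc/pgpool.pwd 2>/dev/null)
PGPOOL_PWD=${PGPOOL_PWD:-password}
echo "Using pgpool password: ${PGPOOL_PWD}"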
Step 2: Verify Cluster Status
To determine the current state of the cluster, run the following command on the primary vIDM node:
su root -c "echo -e 'password'|/usr/local/bin/pcp_watchdog_info -p 9898 -h localhost -U pgpool"
Output:
<Host1>:9999 Linux <Host1> <Host1> 9999 9000 4 MASTER
<Host2>:9999 Linux <Host2> <Host2> 9999 9000 7 SHUTDOWN
<Host3>:9999 Linux <Host3> <Host3> 9999 9000 7 STANDBY
This confirms that Host2 is in SHUTDOWN mode while the other nodes remain functional.
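If Step 1 returned a non-default password, substitute it for 'password' in the command above. For example, reusing the illustrative PGPOOL_PWD variable from the Step 1 sketch (an assumption for illustration, not something the appliance defines):
su root -c "echo -e '${PGPOOL_PWD}'|/usr/local/bin/pcp_watchdog_info -p 9898 -h localhost -U pgpool"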
Step 3: Restart the Affected Node’s Database Service
To bring the affected node back online, restart its database service (pgService) using:
/etc/init.d/pgService restart
After the restart, the node rejoins the cluster. Re-run the watchdog check from Step 2 to verify its status, as shown below.
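For example, running the same verification command again should now show the previously affected node in a healthy state rather than SHUTDOWN (this example assumes the default password):
su root -c "echo -e 'password'|/usr/local/bin/pcp_watchdog_info -p 9898 -h localhost -U pgpool"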
Step 4: Enable Auto-Recovery
To prevent future occurrences, enable Auto-Recovery to allow automatic recovery of the cluster in case of similar network disruptions.
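For context, the manual recovery in Step 3 can also be expressed as a small watch script. The sketch below is purely illustrative: it is not the product's Auto-Recovery feature, it assumes the local pgpool process is named pgpool, and it only combines commands already shown above:
#!/bin/bash
# Illustrative only -- not the vIDM/vRSLCM Auto-Recovery feature.
# If the local pgpool process is no longer running, restart the database service,
# mirroring the manual recovery performed in Step 3.
if ! pgrep -x pgpool > /dev/null; then
    /etc/init.d/pgService restart
fi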