The vIDM Postgres cluster health status is critical. The Postgres status of one or more nodes is marked as down in the cluster, causing service outages.
Log excerpts:
workspace.log:
10-Feb-2025 23:13:23.143 WARNING [localhost-startStop-8] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [SAAS] appears to have started a thread named [Replication Thread] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
com.vmware.horizon.datastore.IdmRMIAsynchronousCacheReplicator.replicationThreadMain(IdmRMIAsynchronousCacheReplicator.java:89)
com.vmware.horizon.datastore.IdmRMIAsynchronousCacheReplicator$ReplicationThread.run(IdmRMIAsynchronousCacheReplicator.java:369)
horizon.log:
10-Feb-2025 23:13:23.146 WARNING [localhost-startStop-8] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [SAAS] appears to have started a thread named [Replication Thread] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
com.vmware.horizon.datastore.IdmRMIAsynchronousCacheReplicator.replicationThreadMain(IdmRMIAsynchronousCacheReplicator.java:89)
com.vmware.horizon.datastore.IdmRMIAsynchronousCacheReplicator$ReplicationThread.run(IdmRMIAsynchronousCacheReplicator.java:369)
accesscontrol.service.log:
2025-02-10 17:37:11,396 GMT WARN VIDM-FQDN:accesscontrol (localhost-startStop-1) [;;;] org.springframework.context.annotation.AnnotationConfigApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'liquibase' defined in class path resource [com/vmware/vidm/accesscontrol/db/DbDataStoreAutoConfiguration.class]: Invocation of init method failed; nested exception is liquibase.exception.DatabaseException: org.postgresql.util.PSQLException: The connection attempt failed.
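To check whether a collected log file contains the same signatures as the excerpts above, a small script along the following lines may help. This is a minimal sketch, not a supported tool: the function name `find_signatures` and the dictionary labels are hypothetical, and the two patterns are copied from the excerpts above rather than taken from any product documentation.

```python
import re
from typing import Dict, Iterable, List

# Patterns copied from the log excerpts in this article.
# The dictionary keys are illustrative labels (assumptions), not product terms.
SIGNATURES = {
    "replication_thread_not_stopped": re.compile(
        r"started a thread named \[Replication Thread\] but has failed to stop it"
    ),
    "postgres_connection_failed": re.compile(
        r"org\.postgresql\.util\.PSQLException: The connection attempt failed"
    ),
}


def find_signatures(lines: Iterable[str]) -> Dict[str, List[int]]:
    """Return, for each signature label, the 1-based line numbers that match."""
    hits: Dict[str, List[int]] = {name: [] for name in SIGNATURES}
    for number, line in enumerate(lines, start=1):
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                hits[name].append(number)
    return hits


# Example usage (the file name is whichever log you collected from the node):
# with open("accesscontrol.service.log", errors="replace") as handle:
#     print(find_signatures(handle))
```

A match only shows that the log contains the same messages as the excerpts. It does not by itself confirm the state of the Postgres cluster.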
VMware Identity Manager 3.3.x
Follow the steps below.
NOTE: Take snapshots of the vRSLCM and vIDM nodes before applying the steps below.
After completing the steps above, monitor the environment for 3 to 4 days. If the issue persists, perform the step below.
NOTE: Take snapshots of the vRSLCM and vIDM nodes before applying the step below.