VMware Identity Manager services offline with 502 error after NSX Standard to Enhanced conversion



Article ID: 421151



Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

  • You observe a "502 Bad Gateway" error when attempting to access the VMware Identity Manager (vIDM) VIP.

  • The vIDM nodes appear offline.

  • Critical services, including postgres, opensearch, rabbitmq, and horizon-workspace, are down.

  • This issue occurs immediately following a VCF cluster conversion from NSX Standard switching to NSX Enhanced switching.
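The first symptom above can be confirmed from any workstation that can reach the VIP. The following is a minimal Python sketch, not a VMware-provided tool: the helper names probe_status and is_bad_gateway and the hostname vidm-vip.example.com are assumptions made for illustration. It uses only the Python standard library.

```python
import ssl
import urllib.error
import urllib.request
from typing import Optional


def probe_status(url: str, timeout: float = 10.0,
                 context: Optional[ssl.SSLContext] = None) -> int:
    """Send a GET request to url and return the HTTP status code.

    A connection-level failure (DNS, refused connection, TLS error) is not
    caught here and surfaces as urllib.error.URLError, because it points to
    a different problem than an HTTP 502 returned by the load balancer.
    """
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout,
                                    context=context) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the status code
        # is still available on the exception object.
        return err.code


def is_bad_gateway(status: int) -> bool:
    """Return True when the status code is 502 (Bad Gateway)."""
    return status == 502


# Example usage (hypothetical VIP hostname, replace with your own):
#   status = probe_status("https://vidm-vip.example.com/")
#   print("502 Bad Gateway" if is_bad_gateway(status) else f"HTTP {status}")
```

A 502 returned here means the VIP itself answers but cannot reach a healthy vIDM node behind it, which matches the symptoms listed above. If the VIP presents a certificate that the workstation does not trust, pass a suitably configured ssl.SSLContext through the context parameter.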

Environment

 

  • VMware Cloud Foundation (VCF)

  • VMware Identity Manager (vIDM)

  • VMware NSX

 

Cause

This issue is caused by a defensive shutdown that the guest OS triggers during the network migration. The conversion process can cause the guest OS NetworkService to fail temporarily, and the services react as follows:

  • Clustered Services: Logic within auto-recovery.sh detects the network isolation and stops postgres and opensearch to prevent "Split-Brain" and data corruption.

  • Local Services: Services like rabbitmq stop because they cannot bind to the necessary communication ports when the NetworkService is unstable.

Resolution

To resolve this issue, validate the network and restart the appliances to reset the services.

1. Verify that the NSX Enhanced switching configuration is complete and the network is stable.

2. Log in to Aria Suite Lifecycle and navigate to Lifecycle Operations > Environments > Globalenvironment.


3. Perform a power off and power on for the Identity Manager:

           a. Select "VIEW DETAIL" on the Globalenvironment card.

           b. Click "..." (three dots) at the end of the "VMware Identity Manager" line to open the dropdown menu.

           c. Select "Power OFF" from the dropdown list to power off the nodes. After the nodes are powered off, select "Power ON".

4. Once the nodes are back online, verify that the services have started and that the 502 error is resolved.
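After the power cycle, the nodes can take some time to come back online. The following is a minimal Python sketch that waits for each node to accept TCP connections. It rests on assumptions: the helper names port_open and wait_for_port, the node hostnames, and the choice of port 443 (HTTPS) are illustrative and not taken from this article. A successful TCP connection only shows that a listener is up. It does not prove that postgres, opensearch, rabbitmq, or horizon-workspace are healthy, so still confirm service status on the appliance.

```python
import socket
import time


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def wait_for_port(host: str, port: int,
                  deadline_s: float = 600.0,
                  interval_s: float = 10.0) -> bool:
    """Poll host:port until it accepts connections or deadline_s elapses."""
    end = time.monotonic() + deadline_s
    while True:
        if port_open(host, port):
            return True
        remaining = end - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval_s, remaining))


# Example usage (hypothetical node names, replace with your own):
#   for node in ("vidm-node1.example.com", "vidm-node2.example.com"):
#       print(node, "online" if wait_for_port(node, 443) else "not reachable")
```

Polling with a deadline avoids declaring failure while the appliances are still booting, and it also avoids waiting indefinitely if a node never returns.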

Additional Information

The shutdown of these services is an expected protective measure during total network loss and ensures data integrity. The services do not recover on their own and must be restarted manually once the network layer is restored.