After an upgrade of VMware NSX 4.x, logical switches show a failed LogicalSwitchState (Config State) in the Manager view.
Similar log entries may be found in proton/nsxapi.log:
2024-06-03T17:15:35.160Z ERROR l2VcFullSyncScheduler1 NsxPortgroupExecuteVcUtils 86947 SWITCHING [nsx@6876 comp="nsx-manager" errorCode="MP9538" level="ERROR" subcomp="manager"] Could not find the HostSwitch [<UUID>] of type VDS
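To check how many distinct HostSwitch IDs are reported, the log can be scanned for this error code. The following is a minimal Python sketch, assuming the log lines keep the format shown above; the function name and the sample lines are illustrative and not part of any NSX tooling.

```python
import re

# Matches the MP9538 error shown above and captures the bracketed HostSwitch ID.
MP9538_RE = re.compile(
    r'errorCode="MP9538".*?Could not find the HostSwitch \[([^\]]+)\]'
)

def find_missing_hostswitches(lines):
    """Return the distinct HostSwitch IDs named in MP9538 errors, in first-seen order."""
    seen = []
    for line in lines:
        match = MP9538_RE.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen

# Sample input: the log line from this article (with its placeholder ID),
# a repeat of it, and an unrelated line that should be ignored.
error_line = (
    '2024-06-03T17:15:35.160Z ERROR l2VcFullSyncScheduler1 '
    'NsxPortgroupExecuteVcUtils 86947 SWITCHING [nsx@6876 comp="nsx-manager" '
    'errorCode="MP9538" level="ERROR" subcomp="manager"] '
    'Could not find the HostSwitch [<UUID>] of type VDS'
)
sample_lines = [error_line, "unrelated INFO line", error_line]

hostswitch_ids = find_missing_hostswitches(sample_lines)
print(hostswitch_ids)
```

On a manager node, the same function could be fed the lines of nsxapi.log opened as a text file; the full path of that file is not given in this article.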
Run the following command as root on an NSX Manager node:
#corfu_tool_runner.py -n nsx -t LogicalSwitchState -o showTable > LSSb4.txt
Entries similar to the following may be found for the logical switch:
"############-############": {
"portGroup": {
"cmId": "<CM-UUID>",
"portGroupKey": "dvportgroup-<pgID>"
},
"state": "PORT_GROUP_STATE_ENUM_FAILEDFORDELETE"
}
Notice that the ID of this logical switch is similar to "##################gga1a2d3" instead of the normal form, such as "## ## ## ## ## ## ## bb-cc dd ee ff gg a1 a2 d3".
Also, the state of the logical switch in question is "PORT_GROUP_STATE_ENUM_FAILEDFORDELETE" instead of "PORT_GROUP_STATE_ENUM_SUCCESS".
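A quick way to see how many entries are in each state is to tally the state values in the dump. The following is a minimal Python sketch, assuming the real showTable output writes the state the same way as the excerpt above (`"state": "PORT_GROUP_STATE_ENUM_..."`); the function name and the second sample entry are illustrative only.

```python
import re
from collections import Counter

# Captures the state value of each entry, as formatted in the excerpt above.
STATE_RE = re.compile(r'"state"\s*:\s*"(PORT_GROUP_STATE_ENUM_[A-Z_]+)"')

def count_port_group_states(dump_text):
    """Tally how many entries in the dump text carry each port group state."""
    return Counter(STATE_RE.findall(dump_text))

# Sample input: the failed entry from this article plus one hypothetical healthy entry.
sample_dump = '''
"############-############": {
"portGroup": {
"cmId": "<CM-UUID>",
"portGroupKey": "dvportgroup-<pgID>"
},
"state": "PORT_GROUP_STATE_ENUM_FAILEDFORDELETE"
}
"<hypothetical-healthy-id>": {
"portGroup": {
"cmId": "<CM-UUID>",
"portGroupKey": "dvportgroup-<pgID>"
},
"state": "PORT_GROUP_STATE_ENUM_SUCCESS"
}
'''

counts = count_port_group_states(sample_dump)
for state, total in sorted(counts.items()):
    print(state, total)
```

Running the same tally on LSSb4.txt before the workaround and on LSSafter.txt afterwards gives a simple before/after comparison of the failed entries.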
VMware NSX 4.x
This is a known issue impacting VMware NSX.
The following workaround is non-impacting; however, it is always recommended to perform fixes within a scheduled maintenance window.
Workaround:
Once the state is cleared in the current release, this problem will not be seen when upgrading to a higher release.
We have developed a script to fix this issue.
SSH as root into the vCenter Server VM where the logical switches are present and run the following command:
#/opt/vmware/vpostgres/current/bin/psql -U postgres -d VCDB -c "select ID, DVS_ID, DVPORTGROUP_KEY, LOGICALSWITCH_UUID from vpx_dvportgroup;" | grep dvportgroup- | awk '{print $5}' > dvpgport_info.txt
Upload the script and the dvpgport_info.txt file to the /tmp directory of any one of the NSX Manager nodes, and execute the following commands as the root user:
#corfu_tool_runner.py -n nsx -t LogicalSwitchState -o showTable > LSSb4.txt
#cd /tmp
#python3 del_dvpg_4.1.x_3407553_0916_allLS.py dvpgport_info.txt
#corfu_tool_runner.py -n nsx -t LogicalSwitchState -o showTable > LSSafter.txt
Then run start search resync all on all the manager nodes in admin mode.
If the Config State still shows failed after the resync command and the customer is running on NSX 4.2.0
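Before the script is run, the contents of dvpgport_info.txt can be sanity-checked. The following is a minimal Python sketch under two assumptions that this article does not confirm: that the awk filter in the psql step (which prints the fifth whitespace-separated field) yields the DVPORTGROUP_KEY column with psql's default aligned output, and that each key has the form dvportgroup-<number>. The function name and sample values are illustrative only.

```python
import re

# Assumed shape of a distributed port group key, e.g. "dvportgroup-1021".
KEY_RE = re.compile(r"^dvportgroup-\d+$")

def check_dvpg_keys(lines):
    """Split non-blank lines into (valid, invalid) lists based on the assumed key shape."""
    valid, invalid = [], []
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue  # ignore blank lines
        if KEY_RE.match(entry):
            valid.append(entry)
        else:
            invalid.append(entry)
    return valid, invalid

# Hypothetical file contents: two plausible keys, a stray column separator, a blank line.
sample_lines = ["dvportgroup-1021\n", "dvportgroup-1045\n", "|\n", "\n"]

valid, invalid = check_dvpg_keys(sample_lines)
print(len(valid), "valid entries; unexpected entries:", invalid)
```

Any unexpected entries would suggest that the psql output was not in the layout the awk filter assumes, which is worth resolving before the file is passed to the script.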
For fixing individual Logical Switches in NSX 4.2.x and 4.1.x, please download del_dvpg_4.1.x_3407553_0916.py and use the following syntax:
python3 del_dvpg_4.1.x_3407553_0916.py dvpgport_info.txt <logical_switch_id>
If there are issues after running the script, please open a Broadcom Support Case, attach the dvpgport_info.txt, LSSb4.txt, and LSSafter.txt files, and reference this KB article.
The same symptoms can also be seen when hitting another known issue, as explained in the KB: Logical Switch Status may incorrectly show as FAILED with no impact to realization