nsxcli -c get hyperbus connection info
Thu Dec 23 2021 UTC 20:12:31.155
VIFID                                  Connection         Status                  HostSwitchID
65101bd0-####-####-####-########a20    169.254.1.10:2345  MISS_VERSION_HANDSHAKE  50 1d 00 40 ## ## ## ##-## ## ## ## 9a 83 48 44
8afe08b9-####-####-####-########4c4    169.254.1.11:2345  MISS_VERSION_HANDSHAKE  50 1d 00 40 ## ## ## ##-## ## ## ## 9a 83 48 44
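A saved copy of this output can be scanned for affected VIFs. The following is a minimal Python sketch, assuming the output has the whitespace-separated column layout shown above (VIF ID, connection, status, host switch ID). The function name `vifs_with_status` is illustrative and not part of any NSX tooling.

```python
def vifs_with_status(output, status="MISS_VERSION_HANDSHAKE"):
    """Return (vif_id, connection) pairs whose Status column equals `status`.

    Assumes the first three whitespace-separated fields of each data row
    are VIF ID, connection and status, as in the sample output.
    """
    matches = []
    for line in output.splitlines():
        fields = line.split()
        # The timestamp and header lines never carry the status in column 3.
        if len(fields) >= 3 and fields[2] == status:
            matches.append((fields[0], fields[1]))
    return matches
```

Any row returned by this helper corresponds to a hyperbus connection that has not completed its version handshake.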
To verify the cause of this symptom, run the following commands via SSH on an affected ESXi host:
port ec781af3-####-####-####-########f54:
    com.vmware.common.port.alias = vmk50 , propType = CONFIG
    load balancing = source virtual port id
    com.vmware.common.port.volatile.vlan = VLAN 0
--
port hb-89ba14b7-####-####-####-########79e:
    com.vmware.common.port.alias = hb-89ba14b7-####-####-####-########79e , propType = CONFIG
    load balancing = source virtual port id
    com.vmware.common.port.volatile.vlan = VLAN 4094
--
port hb-########-####-####-####-########1622:
    com.vmware.common.port.alias = hb-8d943d75-####-####-####-########622 , propType = CONFIG
    load balancing = source virtual port id
    com.vmware.common.port.volatile.vlan = VLAN 4094
--
port hb-c8f41d3c-####-####-####-########0f3:
    com.vmware.common.port.alias = hb-c8f41d3c-####-####-####-########0f3 , propType = CONFIG
    load balancing = source virtual port id
    com.vmware.common.port.volatile.vlan = VLAN 4094
host properties:
    com.vmware.common.host.portset = DvsPortset-3 , propType = CONFIG
    com.vmware.nsx.vdl2.enabled = true , propType = CONFIG
    com.vmware.nsx.spf.enabled = true , propType = CONFIG
    com.vmware.nsx.kcp.enable = true , propType = CONFIG
    com.vmware.vswitch.pvlanMap:
        (4093, 4093) - promiscuous
        (4093, 4094) - isolated
        propType = RUNTIME
    com.vmware.common.opaqueDvs.status.component.vswitch = up , propType = CONFIG
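The presence of com.vmware.vswitch.pvlanMap entries in the host properties is the indicator to look for. The following is a minimal Python sketch for extracting those entries from captured output. It assumes each entry has the form "(number, number) - type" as in the sample, and that the two numbers are VLAN IDs. The function name `parse_pvlan_map` is illustrative only.

```python
import re

# One pvlanMap entry, e.g. "(4093, 4094) - isolated".
PVLAN_ENTRY_RE = re.compile(r"\((\d+),\s*(\d+)\)\s*-\s*(\w+)")
PVLAN_MARKER = "com.vmware.vswitch.pvlanMap"


def parse_pvlan_map(host_properties):
    """Return [(first_vlan, second_vlan, kind), ...] found under pvlanMap.

    Returns an empty list when the property is absent.
    """
    start = host_properties.find(PVLAN_MARKER)
    if start < 0:
        return []
    tail = host_properties[start + len(PVLAN_MARKER):]
    # Stop at the next host property so unrelated text is not scanned.
    end = tail.find("com.vmware.")
    if end >= 0:
        tail = tail[:end]
    return [(int(a), int(b), kind) for a, b, kind in PVLAN_ENTRY_RE.findall(tail)]
```

An empty result suggests the mapping is not present on the host. A non-empty result matches the condition shown in the output above.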
The issue described above can arise if the "rebootless_upgrade" flag is set to "false" for an NSX-T host upgrade to version 3.1.3.3.
If that flag is set to false, the upgrade script does not call a specific method that removes the pvlanMapping, which in turn can cause the symptom described above.
This issue is resolved in VMware NSX-T Data Center 3.2.0.1.
Workaround:
There are two workarounds to this particular symptom. The first is a proactive, preventative workaround. The second is a reactive workaround, to be applied if this symptom is encountered.
Preventative Workaround: (Before upgrade to 3.1.3.3, to prevent occurrence).
Check the "rebootless_upgrade" flag on each Host Upgrade Group, prior to NSX-T upgrade.
Find the group ID for the cluster using GET /api/v1/upgrade/upgrade-unit-groups?component_type=HOST or from the host upgrade page in the UI.
Then run GET /api/v1/upgrade/upgrade-unit-groups/<group id> and look for the "rebootless_upgrade" flag in the response.
If a Host Upgrade Group returns "false", use a PUT API call to the same URL to change the value to "true".
NOTE: The default and expected value is {"key" : "rebootless_upgrade", "value" : "true"}
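The check and the change can be sketched in Python against the JSON returned by the GET call. This is a sketch under assumptions, not a definitive implementation: it assumes the flag appears as a key/value pair inside a list named "extended_configuration" in the group object, and both function names are hypothetical. The sketch only inspects and modifies a dictionary; sending the result back with PUT is left to whichever HTTP client is in use.

```python
import copy

FLAG = "rebootless_upgrade"
CONFIG_LIST = "extended_configuration"  # assumed field name in the group object


def rebootless_disabled(group):
    """Return True if the group explicitly sets rebootless_upgrade to false."""
    for item in group.get(CONFIG_LIST, []):
        if item.get("key") == FLAG:
            return str(item.get("value")).lower() == "false"
    # Flag not listed: treat as the default ("true").
    return False


def with_rebootless_enabled(group):
    """Return a copy of the group with rebootless_upgrade set to "true".

    The input is left untouched so it can be compared or logged.
    """
    updated = copy.deepcopy(group)
    for item in updated.get(CONFIG_LIST, []):
        if item.get("key") == FLAG:
            item["value"] = "true"
    return updated
```

A typical use would be to fetch each Host Upgrade Group, call rebootless_disabled on it, and send the output of with_rebootless_enabled as the PUT body only for groups where the check returns True.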
Reactive Workaround: (After upgrade to 3.1.3.3, if encountered).
1. Run nsxdp-cli vswitch instance list on an affected ESXi host. Pay attention to the switch name in the first line of the return.
In the example below, the switch name is 'RegionA01-VDS7'
DvsPortset-0 (RegionA01-VDS7) 50 1d 00 40 ## ## ## ##-## ## ## ## 9a 83 48 44
2. Run net-dvs -l | egrep 'port |port.alias|port.volatile.vlan' | egrep 'vmk50|.alias = hb-' -A 2 -B 1 to identify the port IDs of vmk50 and each hyperbus port on the affected ESXi host.
Example Return:
port 66a8205e-####-####-####-########a34:
    com.vmware.common.port.alias = vmk50 , propType = CONFIG
    com.vmware.common.port.volatile.vlan = VLAN 4093
--
port hb-0c925e99-####-####-####-########20f:
    com.vmware.common.port.alias = hb-0c925e99-####-####-####-########20f , propType = CONFIG
    com.vmware.common.port.volatile.vlan = VLAN 4094
--
port hb-e928e034-####-####-####-########036:
    com.vmware.common.port.alias = hb-e928e034-####-####-####-########036 , propType = CONFIG
    com.vmware.common.port.volatile.vlan = VLAN 4094
3. Run the following commands to change each VLAN tag to 0 on the affected ESXi host:
net-dvs -v "0" -p 66a8205e-####-####-####-########a34 RegionA01-VDS7
*** above command clears pvlan from vmk50 port
net-dvs -v "0" -p hb-0c925e99-####-####-####-########20f RegionA01-VDS7
*** above command clears pvlan from hyperbus port
net-dvs -v "0" -p hb-e928e034-####-####-####-########036 RegionA01-VDS7
*** above command clears pvlan from hyperbus port
net-dvs -u "com.vmware.vswitch.pvlanMap" -p hostPropList RegionA01-VDS7
*** above command clears pvlanMap from ESXi host
NOTE: Each Diego Cell VM has a hyperbus port (hb-xxxxxx port) on each affected ESXi host.
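On hosts with many hyperbus ports, the three reactive steps can be assembled from captured command output rather than typed by hand. The following is a minimal Python sketch, assuming the output formats shown in the examples above. All function names are illustrative. The sketch only builds the command strings for review and does not execute anything on the host.

```python
import re
import shlex

PORT_RE = re.compile(r"^\s*port\s+(\S+):\s*$")
ALIAS_RE = re.compile(r"port\.alias\s*=\s*(\S+)")
VLAN_RE = re.compile(r"port\.volatile\.vlan\s*=\s*VLAN\s+(\d+)")


def parse_switch_name(first_line):
    """Step 1: take the switch name from the parentheses in the first line."""
    match = re.search(r"\(([^)]+)\)", first_line)
    return match.group(1) if match else None


def parse_ports(text):
    """Step 2: collect port ID, alias and VLAN from the filtered net-dvs output."""
    ports, current = [], None
    for line in text.splitlines():
        header = PORT_RE.match(line)
        if header:
            current = {"port_id": header.group(1), "alias": None, "vlan": None}
            ports.append(current)
            continue
        if current is None:
            continue
        alias = ALIAS_RE.search(line)
        if alias:
            current["alias"] = alias.group(1)
        vlan = VLAN_RE.search(line)
        if vlan:
            current["vlan"] = int(vlan.group(1))
    return ports


def build_cleanup_commands(switch_name, port_ids):
    """Step 3: one VLAN reset per port, then removal of the host pvlanMap."""
    switch = shlex.quote(switch_name)
    commands = [f'net-dvs -v "0" -p {shlex.quote(pid)} {switch}' for pid in port_ids]
    commands.append(f'net-dvs -u "com.vmware.vswitch.pvlanMap" -p hostPropList {switch}')
    return commands
```

Feeding the port IDs whose VLAN is not 0 into build_cleanup_commands yields the same command sequence as step 3, with the pvlanMap removal last. Review the generated commands before running them on the affected ESXi host.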