In /var/run/log/vmkernel.log, log messages similar to the following are observed:
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_ERR> nmlx5_core: 0000:2a:00.1: Health: NIC disabled state detected
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertVar[0] 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertVar[1] 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertVar[2] 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertVar[3] 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertVar[4] 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertExitPtr 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> assertCallra 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> firmwareVersion 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> hwId 0x00000000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> iriscIndex 0
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> synd 0x0: unrecognized error
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> extSynd 0x0000
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_INF> driver 4.23.6.5
<timestamps> UTC In(182) vmkernel: cpu56:2098438)<NMLX_WRN> nmlx5_core: 0000:2a:00.1: Health: Bad device state recovery is started
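When checking several hosts, it can help to pull out only the nmlx5_core health lines and the PCI address they refer to. The following is a minimal sketch, assuming the log text has already been read into a string. The function name find_nic_health_events and the regular expression are illustrative and not part of any VMware or NVIDIA tooling.

```python
import re

# Matches nmlx5_core health lines such as:
#   ... nmlx5_core: 0000:2a:00.1: Health: NIC disabled state detected
# Group 1 is the PCI address, group 2 is the health message.
HEALTH_RE = re.compile(r"nmlx5_core:\s+(\S+?):\s+Health:\s+(.*\S)")


def find_nic_health_events(log_text):
    """Return (pci_address, message) tuples for each nmlx5_core health line.

    Hypothetical helper: it only parses text and does not touch the host.
    """
    events = []
    for line in log_text.splitlines():
        match = HEALTH_RE.search(line)
        if match:
            events.append((match.group(1), match.group(2)))
    return events
```

A typical use would be to read /var/run/log/vmkernel.log into a string and pass it to find_nic_health_events, then look for the "NIC disabled state detected" message in the result.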
The output of net-dvs -l shows the following:
"com.vmware.common.host.dpu.failover.status" = "red fail"
/var/run/log/hostd.log shows that nsxa is not responding at that time:
hostd.8:<timestamps> UTC Er(163) Hostd[2103060]: [Originator@6876 sub=Hostsvc opID=DpuFailover-1004ef14] MessageSendHelper: Failed to send opaque network msg: opId:[DpuFailover-1004ef14-80] opCode:12
hostd.8:<timestamps> UTC Er(163) Hostd[2103060]: [Originator@6876 sub=Hostsvc opID=DpuFailover-1004ef14] MessageSendHelper: Task [DpuFailover-1004ef14-80] failed or has no response
hostd.8:<timestamps> UTC In(166) Hostd[2103060]: [Originator@6876 sub=Hostsvc opID=DpuFailover-1004ef14] Fail to launch DPU failover on NSXA. Err: No response from NSXA
esxcfg-vswitch -l
Find the name of the impacted switch in the output. Use it in place of <switch_name> in the following command:
net-dvs -u "com.vmware.common.host.dpu.failover.status" -p hostPropList <switch_name>
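If the command above needs to be prepared for more than one switch, the argument list can be assembled programmatically so that the property name is never mistyped. This is only a sketch: build_net_dvs_command is a hypothetical helper, it mirrors the command exactly as written above, and it does not run anything on the host.

```python
# Property name copied from the net-dvs -l output shown in this article.
FAILOVER_STATUS_PROP = "com.vmware.common.host.dpu.failover.status"


def build_net_dvs_command(switch_name):
    """Return the net-dvs argument list for the given switch name.

    Hypothetical helper: it only builds the list and executes nothing.
    """
    name = switch_name.strip()
    if not name:
        raise ValueError("switch name must not be empty")
    return ["net-dvs", "-u", FAILOVER_STATUS_PROP, "-p", "hostPropList", name]
```

The returned list keeps each argument separate, so a switch name containing spaces stays a single argument if the list is later passed to a process launcher.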