1. You may see "IO was aborted" errors in /var/run/log/vmkernel.log
YYYY-MM-DD:T:HH:MM:SS Al(177) vmkalert: cpu##:#######)ALERT: nmlx5_SetL2TableEntryCmd:170 command failed: IO was aborted
YYYY-MM-DD:T:HH:MM:SS In(182) vmkernel: cpu##:#######) nmlx5_core: vmnic#: nmlx5_en_L2TableIndexAdd - (nmlx5_core_en_main.c:8810) Failed to set L2 table entry (#########)
YYYY-MM-DD:T:HH:MM:SS In(182) vmkernel: cpu##:#######) nmlx5_core: vmnic#: nmlx5_en_RxQueueFiltersDbApply -(nmlx5_core_en_multiq.c:1631) Failed to add filter into L2 (Failure)
YYYY-MM-DD:T:HH:MM:SS In(182) vmkernel: cpu##:#######) nmlx5_core: vmnic#: nmlx5_en_RxQueueFiltersDbApply - (nmlx5_core_en_multiq.c:1657) donestatus: Failure
YYYY-MM-DD:T:HH:MM:SS In(182) vmkernel: cpu##:#######) nmlx5_core: vmnic#: nmlx5_en_UplinkQApplyFilter - (nmlx5_core_en_multiq.c:1839) nmlx5_en_RxQueueFiltersDbApply failed - Failure
2. Uplink issues with "lag1: not found" messages in /var/run/log/vmkernel.log
YYYY-MM-DD:T:HH:MM:SS In(182) vmkernel: cpu##:####### opID=#######)Uplink: 2703: lag1: not found
YYYY-MM-DD:T:HH:MM:SS In(182) vmkernel: cpu##:####### opID=#######)Uplink: 2703: lag1: not found
YYYY-MM-DD:T:HH:MM:SS In(14) vobd[#######]: [vmfsCorrelator] 720696188us: [vob.vmfs.heartbeat.timedout]########-####-####-############ ########-####-####-############
YYYY-MM-DD:T:HH:MM:SS In(14) vobd[#######]: [vmfsCorrelator] 720696053us: [esx.problem.vmfs.heartbeat.timedout] ########-####-####-############ ########-####-####-############

VMware vSAN 7.x
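To check a copy of vmkernel.log for the signatures quoted in the symptoms, a short script along the following lines may help. This is only a minimal sketch: the signature names, the regular expressions, and the helper functions `count_signatures` and `scan_file` are illustrative assumptions built from the log excerpts in this article, not part of any VMware tool, and a match does not by itself confirm this specific bug.

```python
import re

# Patterns derived from the log excerpts in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "io_aborted": re.compile(r"nmlx5_\w+:\d+ command failed: IO was aborted"),
    "l2_filter_failure": re.compile(
        r"Failed to (?:set L2 table entry|add filter into L2)"
    ),
    "lag_not_found": re.compile(r"Uplink: \d+: lag\d+: not found"),
    "vmfs_heartbeat_timeout": re.compile(r"vmfs\.heartbeat\.timedout"),
}


def count_signatures(lines):
    """Count how many lines match each signature pattern."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def scan_file(path="/var/run/log/vmkernel.log"):
    """Scan a log file line by line; undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return count_signatures(handle)
```

Calling `scan_file()` on the host, or `scan_file("vmkernel.log")` against a log copied from a support bundle, returns a dictionary of per-signature line counts. Non-zero counts for both the "IO was aborted" and "lag1: not found" patterns suggest the host is showing the symptoms described here.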
This is a known bug in the nmlx5 health mechanism logic, where the driver incorrectly detects that the NIC is in a faulty state.
This issue is resolved in ESXi 8.0 Update 3e (build 24674464). Refer to "PSOD: nmlx5_QueryNicVportContext:188 command failed: IO was aborted".
As a workaround, implement the steps below: