Lost network connectivity on virtual switch error in vSAN cluster

Article ID: 406815

Products

VMware vCenter Server

Issue/Introduction

Symptom:

  • During physical switch maintenance, "Lost network connectivity" events are observed on the vSAN node (see Verification below).

  • VMs went into a down state.

Verification:

  • In "/var/run/log/vobd.log", the following entries are seen:

    2025-07-18T20:23:10.002Z In(14) vobd[2097617]:  [netCorrelator] 22256873367990us: [esx.problem.net.connectivity.lost] Lost network connectivity on virtual switch "vSwitch0". Physical NIC vmnic3 is down. Affected port groups: "vSAN", "Management Network".
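
    To confirm the current link state of the uplinks, the physical NIC status can be checked from the ESXi shell (a verification sketch; vmnic names vary per environment):

      esxcli network nic list

    A healthy uplink shows "Up" in the Link Status column; in this scenario, the affected vmnics show "Down".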

Environment

VMware vSphere vSAN

Cause

  • The redundant vmnics were both connected to the same physical switch and went into a down state when that switch became inaccessible.

    In the example below, vmnic3 and vmnic5 both went down, affecting the Management and vSAN networks.

      Switch Name      Num Ports   Used Ports  Configured Ports  MTU     Uplinks
      vSwitch0         8960        18          128               1500    vmnic5,vmnic3

      PortGroup Name                            VLAN ID  Used Ports  Uplinks
      vSAN                                      54       1           vmnic5,vmnic3
      Management Network                        54       1           vmnic5,vmnic3
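
    The uplink and port group layout above can be listed on the host with "esxcfg-vswitch -l" (a verification sketch; this example uses a standard virtual switch):

      esxcfg-vswitch -l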

    In "/var/run/log/vobd.log" we see below entries -

    2025-07-18T20:23:08.266Z In(14) vobd[2097617]:  [netCorrelator] 22256871633299us: [vob.net.vmnic.linkstate.down] vmnic vmnic5 linkstate down
    [vob.net.pg.uplink.transition.down] Uplink: vmnic5 is down. Affected portgroup: vSAN. 1 uplinks up. Failed criteria: 128

    2025-07-18T20:23:08.792Z In(14) vobd[2097617]:  [netCorrelator] 22256872158978us: [vob.net.vmnic.linkstate.down] vmnic vmnic3 linkstate down
    [vob.net.pg.uplink.transition.down] Uplink: vmnic3 is down. Affected portgroup: vSAN. 0 uplinks up. Failed criteria: 128

    Note - Failed criteria 128 indicates that the driver is reporting a link state down. This can be caused by unplugging the network cable or administratively shutting down the physical switch port. If this was not an intended link outage, it is likely an issue with the driver, firmware, SFP+ module, cable, and/or switch port on the physical switch. Contact the host hardware vendor for further troubleshooting when "Failed criteria: 128" entries are seen in the vobd log.
    Refer - Network adapter (vmnic) is down or fails with a Failed Criteria Code
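
    Before engaging the hardware vendor, the driver, firmware, and link details for an affected uplink can be collected from the ESXi shell (vmnic3 is taken from the example above; substitute the affected vmnic):

      esxcli network nic get -n vmnic3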


    In "/var/run/log/vmkernel.log", we see below entries -

    2025-07-18T20:23:08.265Z In(182) vmkernel: cpu29:2098029)bnxtnet: bnxtnet_display_link:2013: [vmnic5 : 0x45215b706000] NIC Link is down
    2025-07-18T20:23:08.791Z In(182) vmkernel: cpu22:2097927)bnxtnet: bnxtnet_display_link:2013: [vmnic3 : 0x452148be0000] NIC Link is down
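
    To quickly locate these link-state transitions, both logs can be searched in one pass (the paths are the standard ESXi log locations used above):

      grep -iE "linkstate|link is down" /var/run/log/vobd.log /var/run/log/vmkernel.log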


    In parallel, in "/var/run/log/vsansystem.log", the following entries show that the vSAN node lost connectivity with the other vSAN nodes, leading to a network partition (nodeCount drops from 3 to 1):

    2025-07-18T20:07:20.008Z In(166) vsansystem[2099838]: [vSAN@6876 sub=VsanSystemProvider opId=2830adf0-49f7] Complete, cluster: 5252b9f3-####-####-####-8fb10ec970e0, nodeCount: 3, status: (vim.vsan.host.ClusterStatus) {
    2025-07-18T20:23:12.521Z In(166) vsansystem[2099848]: [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-522a] Complete, nodeCount: 1, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
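
    The membership state can be confirmed on the affected node with the command below (field names vary slightly between vSAN releases); a partitioned node typically reports "Sub-Cluster Member Count: 1", matching the nodeCount: 1 entry above:

      esxcli vsan cluster get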

Resolution

  • Connect the redundant vmnics to different physical switches, so that when one physical switch goes through a maintenance window (for example, for an upgrade), the other vmnic remains connected to a physical switch that is still accessible and carries vSAN traffic. A configuration sketch follows.
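
    As a minimal sketch on a standard virtual switch (vSwitch0, vmnic3, and vmnic5 are taken from the example above; adjust to the environment - cabling each vmnic to a separate physical switch is a physical task and cannot be set in software):

      # Attach both uplinks to the vSwitch
      esxcli network vswitch standard uplink add --uplink-name=vmnic3 --vswitch-name=vSwitch0
      esxcli network vswitch standard uplink add --uplink-name=vmnic5 --vswitch-name=vSwitch0

      # Mark both uplinks as active in the teaming/failover policy
      esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --active-uplinks=vmnic3,vmnic5

      # Verify vSAN network reachability to a peer node (vmk1 and the peer IP are placeholders)
      vmkping -I vmk1 <peer vSAN vmk IP>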

Additional Information