Multiple VMs become inaccessible on the network.

Article ID: 420014

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Multiple VMs become unreachable on the network. When checked from the console, some of them are found in a read-only state.

  • The datastore hosting the affected VMs experiences storage connectivity issues. In this case, the vmkernel log entries show heartbeat timeouts on the vSAN datastore:

    vmkernel: cpu0:2098253)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX5cf0': HB at offset 3964928 - Reclaimed heartbeat [Timeout]:
    vmkernel: cpu68:2098254)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX6180': HB at offset 3969024 - Reclaimed heartbeat [Timeout]:
    vmkernel: cpu2:2098253)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXecb8': HB at offset 3964928 - Reclaimed heartbeat [Timeout]:
    vmkernel: cpu2:2098253)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX5c88': HB at offset 3964928 - Reclaimed heartbeat [Timeout]:
    vmkernel: cpu28:79135366)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX5cf0': HB at offset 3964928 - Reclaimed heartbeat [Timeout]:

  • Around the same time, link down notifications are logged for the vmnic uplinks:

    vmkernel: cpu58:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic0: link down notification
    vmkernel: cpu60:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic1: link down notification
    vmkernel: cpu42:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic2: link down notification
    vmkernel: cpu58:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic3: link down notification
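The two symptoms above can be confirmed together by counting both kinds of entries in a collected log. The following is a minimal sketch, assuming the log lines have already been read into a list of strings; the regular expressions are derived only from the sample lines in this article and may need adjusting for other ESXi builds or drivers. The names `HB_TIMEOUT`, `LINK_EVENT`, and `summarize` are hypothetical.

```python
import re
from collections import Counter

# Patterns modeled on the sample entries in this article (an assumption:
# other builds or drivers may word these messages differently).
HB_TIMEOUT = re.compile(
    r"HBX: \d+: '(?P<volume>[^']+)': HB at offset (?P<offset>\d+)"
    r" - Reclaimed heartbeat \[Timeout\]"
)
LINK_EVENT = re.compile(
    r"NetSchedHClkNotify:\d+: (?P<nic>vmnic\d+): link (?P<state>down|up) notification"
)


def summarize(lines):
    """Count heartbeat timeouts per volume and link events per (nic, state)."""
    heartbeats = Counter()
    links = Counter()
    for line in lines:
        match = HB_TIMEOUT.search(line)
        if match:
            heartbeats[match.group("volume")] += 1
            continue
        match = LINK_EVENT.search(line)
        if match:
            links[(match.group("nic"), match.group("state"))] += 1
    return heartbeats, links


# A few of the entries quoted above, used as sample input.
SAMPLE = [
    "vmkernel: cpu0:2098253)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX5cf0': HB at offset 3964928 - Reclaimed heartbeat [Timeout]:",
    "vmkernel: cpu68:2098254)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX6180': HB at offset 3969024 - Reclaimed heartbeat [Timeout]:",
    "vmkernel: cpu28:79135366)HBX: 298: 'XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXX5cf0': HB at offset 3964928 - Reclaimed heartbeat [Timeout]:",
    "vmkernel: cpu58:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic0: link down notification",
    "vmkernel: cpu60:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic1: link down notification",
]

heartbeats, links = summarize(SAMPLE)
```

If both counters are non-empty for the same time window, the storage heartbeat timeouts and the uplink flaps likely belong to the same event.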

Environment

VMware vSphere ESXi

Cause

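The network disruption was caused by an administrative task executed to increase the Rx (Receive) buffers on all four vmnic adapters. The driver log shows both the RQ and SQ sizes changing from 1024 to 4096 on each adapter, and each change is followed by a link down and a link up notification for that adapter: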

vmkernel: cpu68:2097485)<NMLX_INF> nmlx5_core: vmnic0: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:762) Changing Ring parameters:
vmkernel: cpu68:2097485)<NMLX_INF> nmlx5_core: vmnic0: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:764) from: RQ size 1024 SQ size 1024
vmkernel: cpu68:2097485)<NMLX_INF> nmlx5_core: vmnic0: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:767) to  : RQ size 4096 SQ size 4096
vmkernel: cpu58:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic0: link down notification
vmkernel: cpu68:2097485)netschedHClk: NetSchedHClkNotify:5059: vmnic0: link up notification
[....]
vmkernel: cpu71:2097485)<NMLX_INF> nmlx5_core: vmnic1: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:762) Changing Ring parameters:
vmkernel: cpu71:2097485)<NMLX_INF> nmlx5_core: vmnic1: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:764) from: RQ size 1024 SQ size 1024
vmkernel: cpu71:2097485)<NMLX_INF> nmlx5_core: vmnic1: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:767) to  : RQ size 4096 SQ size 4096
vmkernel: cpu60:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic1: link down notification
vmkernel: cpu57:2097485)netschedHClk: NetSchedHClkNotify:5059: vmnic1: link up notification
[....]
vmkernel: cpu36:2097485)<NMLX_INF> nmlx5_core: vmnic2: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:762) Changing Ring parameters:
vmkernel: cpu36:2097485)<NMLX_INF> nmlx5_core: vmnic2: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:764) from: RQ size 1024 SQ size 1024
vmkernel: cpu36:2097485)<NMLX_INF> nmlx5_core: vmnic2: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:767) to  : RQ size 4096 SQ size 4096
vmkernel: cpu42:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic2: link down notification
vmkernel: cpu37:2097485)netschedHClk: NetSchedHClkNotify:5059: vmnic2: link up notification
[....]
vmkernel: cpu36:2097485)<NMLX_INF> nmlx5_core: vmnic3: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:762) Changing Ring parameters:
vmkernel: cpu36:2097485)<NMLX_INF> nmlx5_core: vmnic3: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:764) from: RQ size 1024 SQ size 1024
vmkernel: cpu36:2097485)<NMLX_INF> nmlx5_core: vmnic3: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:767) to  : RQ size 4096 SQ size 4096
vmkernel: cpu58:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic3: link down notification
vmkernel: cpu50:2097485)netschedHClk: NetSchedHClkNotify:5059: vmnic3: link up notification
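To check whether a ring parameter change preceded an outage, the before and after sizes can be extracted per adapter from the driver messages. This is a minimal sketch, assuming the nmlx5_core message format shown above; the pattern `RING_SIZE` and the function `ring_changes` are hypothetical names, and other drivers log ring changes differently or not at all.

```python
import re

# Matches the "from:" and "to  :" lines logged by nmlx5_en_UplinkRingParamsSet,
# based on the sample entries in this article (an assumption for other versions).
RING_SIZE = re.compile(
    r"nmlx5_core: (?P<nic>vmnic\d+): nmlx5_en_UplinkRingParamsSet .*?"
    r"(?P<which>from|to)\s*: RQ size (?P<rq>\d+) SQ size (?P<sq>\d+)"
)


def ring_changes(lines):
    """Return {nic: {"from": (rq, sq), "to": (rq, sq)}} for logged ring changes."""
    changes = {}
    for line in lines:
        match = RING_SIZE.search(line)
        if match:
            sizes = (int(match.group("rq")), int(match.group("sq")))
            changes.setdefault(match.group("nic"), {})[match.group("which")] = sizes
    return changes


# The vmnic0 entries quoted above, used as sample input.
SAMPLE = [
    "vmkernel: cpu68:2097485)<NMLX_INF> nmlx5_core: vmnic0: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:762) Changing Ring parameters:",
    "vmkernel: cpu68:2097485)<NMLX_INF> nmlx5_core: vmnic0: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:764) from: RQ size 1024 SQ size 1024",
    "vmkernel: cpu68:2097485)<NMLX_INF> nmlx5_core: vmnic0: nmlx5_en_UplinkRingParamsSet - (nmlx5_core_en_uplink.c:767) to  : RQ size 4096 SQ size 4096",
    "vmkernel: cpu58:2097485)netschedHClk: NetSchedHClkNotify:5067: vmnic0: link down notification",
]

changes = ring_changes(SAMPLE)
```

An adapter that appears in the result with different "from" and "to" values had its rings resized, which in the logs above coincides with a link down and link up notification on that adapter.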

Resolution

  • The connectivity loss was an expected result of the configuration change. Normal operation resumed once the uplinks stabilized.
  • Before making future changes to network adapter settings (such as ring buffer sizes), place the ESXi host in Maintenance Mode.
  • Schedule such maintenance tasks during low-impact windows to limit disruption to active VM workloads.
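The Maintenance Mode recommendation above can be enforced as a guard in whatever tooling prepares the change. The following is a minimal sketch, not a supported procedure: it assumes that `esxcli system maintenanceMode get` prints `Enabled` or `Disabled`, and that ring sizes are set with `esxcli network nic ring current set -n <nic> -r <rx> -t <tx>`. Verify both against the esxcli reference for the ESXi version in use. The function `ring_change_commands` is a hypothetical name, and it only builds command strings without running anything.

```python
def ring_change_commands(maintenance_mode_output, nics, rx, tx):
    """Build the esxcli ring-size commands, but only for a host in Maintenance Mode.

    maintenance_mode_output is the text captured from
    `esxcli system maintenanceMode get` (assumed to be "Enabled" or "Disabled").
    """
    if maintenance_mode_output.strip() != "Enabled":
        raise RuntimeError(
            "Host is not in Maintenance Mode; changing ring sizes resets the uplinks"
        )
    # Assumed esxcli syntax; confirm the options for your ESXi version.
    return [
        f"esxcli network nic ring current set -n {nic} -r {rx} -t {tx}"
        for nic in nics
    ]


commands = ring_change_commands("Enabled\n", ["vmnic0", "vmnic1"], 4096, 4096)
```

Generating the commands separately from running them leaves room for a review step, for example confirming the supported maximum ring sizes for each adapter before applying the change.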