Physical switch replacement causing network partition issues in vSAN cluster

Article ID: 414013


Products

VMware vSAN

Issue/Introduction

Symptoms:

Multiple virtual machines became inaccessible during the physical switch replacement.

Multiple network adapters report receive packet drops.

 NIC statistics for vmnic2:
      Packets received: 387052756615
      Packets sent: 402665577066
      Bytes received: 1169281332964393
      Bytes sent: 1413182323365846
      Receive packets dropped: 866782

NIC statistics for vmnic4:
      Packets received: 8967026944
      Packets sent: 6361985276
      Bytes received: 5985798543827
      Bytes sent: 2069983881146
      Receive packets dropped: 616977
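
The statistics above can be gathered on each ESXi host with esxcli. The vmnic names below match this example and will differ per environment:

   # Display send/receive counters, including dropped packets, for a given uplink
   esxcli network nic stats get -n vmnic2
   esxcli network nic stats get -n vmnic4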

 

Environment

VMware vSAN 7.x

VMware vSAN 8.x

 

Cause

vSAN host connectivity was impacted because both core switches were replaced simultaneously. This disrupted all network paths used for vSAN traffic, causing the nodes to enter a network partition state.
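
One way to confirm the partition from an individual host is to compare the Sub-Cluster Member Count reported by esxcli across hosts; a partitioned host reports fewer members than the expected cluster size:

   # Show this host's view of the vSAN cluster membership
   esxcli vsan cluster get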

In "/var/run/log/vsansystem.log", nodecount drops are observed below,

2025-10-07T01:39:52.205Z info vsansystem[2105596] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-56f8] Complete, nodeCount: 13, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-10-07T01:40:11.436Z info vsansystem[2105370] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-5738] Complete, nodeCount: 1, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-10-07T01:40:32.312Z info vsansystem[2105601] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSNodeUpdate-584c] Complete, nodeCount: 14, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-10-07T01:40:32.341Z info vsansystem[2105601] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSNodeUpdate-584c] Complete, nodeCount: 14, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-10-07T01:40:32.352Z info vsansystem[2105601] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSNodeUpdate-584c] Complete, nodeCount: 14, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-10-07T01:53:36.310Z info vsansystem[10852382] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSNodeUpdate-d6b8] Complete, nodeCount: 14, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-10-07T01:53:36.337Z info vsansystem[2105601] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSNodeUpdate-d6ba] Complete, nodeCount: 14, runtime info: (vim.vsan.host.VsanRuntimeInfo)
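
A simple filter can pull these membership transitions out of the log for review:

   # List CMMDS membership updates and their node counts
   grep "nodeCount" /var/run/log/vsansystem.log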

In "/var/run/log/vmkernel.log",  membership drops can be observed in the logs, where hosts are dropping out of the cluster. 

CMMDS: LeaderRemoveNodeFromMembership:7965: ########-####-####-####-############: Removing node ########-####-####-####-############(vsanNodeType: data) from the cluster membership

CMMDS: CMMDSUtil_PrintArenaEntry:100: ########-####-####-####-############: [457652584]:Adding a new Membership entry (a########-####-####-####-############) with 1 members:

CMMDS: LeaderRemoveNodeFromMembership:########-####-####-####-############: Removing node ########-####-####-####-############(vsanNodeType: data) from the cluster membership

CMMDS: CMMDSUtil_PrintArenaEntry:100: ########-####-####-####-############: [457652633]:Adding a new Membership entry (########-####-####-####-############) with 14 members:
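
The corresponding membership changes can be filtered from vmkernel.log using the CMMDS function names shown above:

   # List nodes being removed from the membership and new membership entries
   grep -E "LeaderRemoveNodeFromMembership|CMMDSUtil_PrintArenaEntry" /var/run/log/vmkernel.log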

Resolution

To ensure vSAN traffic availability during maintenance, connect the host vmnics to separate physical switches. This provides redundancy: if one switch goes down (for example, for an upgrade or replacement), the other remains active to carry vSAN traffic.
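
Before planned switch maintenance, the teaming configuration can be verified per host. A minimal check on a standard vSwitch is sketched below; the vSwitch and port group names are placeholders and depend on the environment (distributed switch uplinks are reviewed in the vSphere Client instead):

   # Show active/standby uplinks for the vSwitch carrying vSAN traffic
   esxcli network vswitch standard policy failover get -v vSwitch0

   # Show the failover order for the vSAN port group specifically
   esxcli network vswitch standard portgroup policy failover get -p "vSAN"

Confirm that the active and standby uplinks map to different physical switches so that vSAN traffic survives the loss of either switch.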