Multiple objects are in an inaccessible state due to network issues on a vSAN cluster

Article ID: 390937

Updated On: 03-30-2025

Products

VMware vSAN

Issue/Introduction

Symptoms:

- vSAN objects intermittently enter an inaccessible state.

- For example, the output below shows 5 objects as inaccessible when checked from an ESXi host that is part of the vSAN cluster.

[root@V2:~] esxcli vsan debug object health summary get
Health Status                                              Number Of Objects
---------------------------------------------------------  -----------------
remoteAccessible                                                           0
inaccessible                                                               5
reduced-availability-with-no-rebuild                                       0

- When checked again on the same host a short time later, none of the objects report as inaccessible.

[root@v2:~] esxcli vsan debug object health summary get
Health Status                                              Number Of Objects
---------------------------------------------------------  -----------------
remoteAccessible                                                           0
inaccessible                                                               0
reduced-availability-with-no-rebuild                                       0
reduced-availability-with-no-rebuild-delay-timer                         303
reduced-availability-with-policy-pending                                   0
reduced-availability-with-policy-pending-failed                            0
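Because the inaccessibility is intermittent, it can help to poll the health summary and extract just the inaccessible count so it can be correlated with network events. A minimal parsing sketch follows; the sample output from the first listing above is inlined via a here-doc so the snippet is self-contained, and on a live ESXi host you would pipe the real command into the same filter:

```shell
# Extract the "inaccessible" object count from the health summary.
# On a live host, replace the here-doc with the real command:
#   esxcli vsan debug object health summary get | awk '$1 == "inaccessible" {print $NF}'
count=$(awk '$1 == "inaccessible" {print $NF}' <<'EOF'
Health Status                                              Number Of Objects
---------------------------------------------------------  -----------------
remoteAccessible                                                           0
inaccessible                                                               5
reduced-availability-with-no-rebuild                                       0
EOF
)
echo "inaccessible objects: $count"
```

Running this periodically (for example from a cron job or a simple loop) produces a timeline of inaccessibility that can be lined up against the link-down events described later in this article.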

When reviewing /var/run/log/vsansystem.log, the cluster node count fluctuates due to the ongoing network issues:

2025-03-13T00:26:18.716Z info vsansystem[2099681] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-b825] Complete, nodeCount: 4, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-03-13T00:26:22.644Z info vsansystem[2099772] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-b859] Complete, nodeCount: 3, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-03-13T00:26:26.293Z info vsansystem[2099778] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-b8a2] Complete, nodeCount: 4, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-03-13T00:26:26.700Z info vsansystem[2099791] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSNodeUpdate-b8a3] Complete, nodeCount: 4, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-03-13T00:26:37.854Z info vsansystem[2099779] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-bc4a] Complete, nodeCount: 5, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
2025-03-13T00:26:38.102Z info vsansystem[2099781] [vSAN@6876 sub=VsanSystemProvider opId=CMMDSMembershipUpdate-bc69] Complete, nodeCount: 6, runtime info: (vim.vsan.host.VsanRuntimeInfo) {
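Membership flapping is easier to see when the distinct nodeCount values are tallied. A small counting sketch, with abridged sample lines from the log above inlined so it runs standalone (on a live host, run the same pipeline against /var/run/log/vsansystem.log):

```shell
# Summarize how often each cluster node count appears, making membership
# flapping visible at a glance. Live equivalent:
#   grep -o 'nodeCount: [0-9]*' /var/run/log/vsansystem.log | sort | uniq -c
flaps=$(grep -o 'nodeCount: [0-9]*' <<'EOF' | sort | uniq -c
2025-03-13T00:26:18.716Z info vsansystem Complete, nodeCount: 4
2025-03-13T00:26:22.644Z info vsansystem Complete, nodeCount: 3
2025-03-13T00:26:26.293Z info vsansystem Complete, nodeCount: 4
2025-03-13T00:26:26.700Z info vsansystem Complete, nodeCount: 4
2025-03-13T00:26:37.854Z info vsansystem Complete, nodeCount: 5
2025-03-13T00:26:38.102Z info vsansystem Complete, nodeCount: 6
EOF
)
echo "$flaps"
```

A healthy cluster shows a single, stable nodeCount value; multiple values within a short window, as here, indicate hosts repeatedly leaving and rejoining the CMMDS membership.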

Environment

VMware vSAN 7.x

Cause

The issue is seen when a vmnic on an ESXi host experiences errors such as CRC errors, missed receive errors (RxMissed), and general receive errors.

When a ping test is run between the hosts, packet loss is observed.

Example: pinging 192.168.x.x

1480 bytes from 192.168.x.x: icmp_seq=993 ttl=64 time=0.131 ms
1480 bytes from 192.168.x.x: icmp_seq=996 ttl=64 time=0.109 ms
1480 bytes from 192.168.x.x: icmp_seq=997 ttl=64 time=0.121 ms

--- 192.168.x.x ping statistics ---
1000 packets transmitted, 646 packets received, 35.4% packet loss
round-trip min/avg/max = 0.089/0.202/3.215 ms
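On ESXi hosts, such tests are typically run with vmkping through the vSAN-tagged vmkernel interface. The loss percentage in the summary line follows directly from the transmitted and received counts; a quick arithmetic check of the figures above:

```shell
# Sanity-check the reported loss figure:
#   loss % = (transmitted - received) / transmitted * 100
tx=1000
rx=646
loss=$(awk -v t="$tx" -v r="$rx" 'BEGIN { printf "%.1f", (t - r) * 100 / t }')
echo "${loss}% packet loss"   # → 35.4% packet loss
```

Any sustained packet loss on the vSAN network is abnormal; loss at this level (35.4%) is more than enough to cause CMMDS membership flapping and intermittent object inaccessibility.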

Network statistics:

[root@v1:/vmfs/volumes/e6e9c139-########] esxcli network nic stats get -n vmnic0
NIC statistics for vmnic0
   Packets received: 11615094673
   Packets sent: 2705982369
   Bytes received: 15157097757492
   Bytes sent: 4784107499235
   Receive packets dropped: 0
   Transmit packets dropped: 0
   Multicast packets received: 10247164
   Broadcast packets received: 29481196
   Multicast packets sent: 105222
   Broadcast packets sent: 12874
   Total receive errors: 765
   Receive length errors: 2
   Receive over errors: 0
   Receive CRC errors: 763
   Receive frame errors: 0
   Receive FIFO errors: 0
   Receive missed errors: 22506
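When reviewing NIC statistics, the signal is the non-zero error counters rather than the raw packet totals. A filtering sketch, with sample lines from the output above inlined so it runs standalone (on a live host, pipe the real esxcli output through the same awk filter):

```shell
# Reduce the NIC statistics to non-zero error counters only. Live equivalent:
#   esxcli network nic stats get -n vmnic0 | awk -F': ' '/errors/ && $2 + 0 > 0 { ... }'
nic_errors=$(awk -F': ' '/errors/ && $2 + 0 > 0 { gsub(/^ +/, "", $1); print $1 ": " $2 }' <<'EOF'
   Receive packets dropped: 0
   Total receive errors: 765
   Receive length errors: 2
   Receive CRC errors: 763
   Receive missed errors: 22506
EOF
)
echo "$nic_errors"
```

In this example the CRC errors (763) account for nearly all of the total receive errors (765), which points at the physical layer: cabling, transceivers, or switch ports.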

Link down events:

Link down events appear in /var/run/log/vmkernel.log on the ESXi host.
 
2025-03-13T01:47:24.137Z: [netCorrelator] 3148104485784us: [vob.net.vmnic.linkstate.down] vmnic vmnic1 linkstate down
2025-03-13T01:47:25.002Z: [netCorrelator] 3148103027015us: [esx.problem.net.vmnic.linkstate.down] Physical NIC vmnic1 linkstate is down
2025-03-13T01:54:44.194Z: [netCorrelator] 3148544542160us: [vob.net.vmnic.linkstate.down] vmnic vmnic1 linkstate down
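The frequency of these events can be quantified with a simple count. A sketch using two sample lines from the log above, inlined so it runs standalone (on a live host, run the grep against /var/run/log/vmkernel.log):

```shell
# Count link-down events for a given NIC from vmkernel.log entries.
# Live equivalent:
#   grep -c 'vmnic1 linkstate down' /var/run/log/vmkernel.log
downs=$(grep -c 'vmnic1 linkstate down' <<'EOF'
2025-03-13T01:47:24.137Z: [netCorrelator] [vob.net.vmnic.linkstate.down] vmnic vmnic1 linkstate down
2025-03-13T01:54:44.194Z: [netCorrelator] [vob.net.vmnic.linkstate.down] vmnic vmnic1 linkstate down
EOF
)
echo "vmnic1 link-down events: $downs"
```

Repeated link-down events within minutes of each other, as in the excerpt above, indicate link flapping rather than a single cable disconnect, and their timestamps should correlate with the nodeCount changes in vsansystem.log.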

Resolution

As the issue is with the physical network and vmnic hardware, work with your network engineering team and the hardware vendor to identify and correct the underlying link errors (CRC errors, missed receive errors, and link flapping).