A VMware vSAN 8.0.x Original Storage Architecture (OSA) or Express Storage Architecture (ESA) stretched cluster may report a persistent Network Partition error. In the vSAN Skyline Health UI, the witness node appears missing from the sub-cluster membership, often resulting in objects entering a reduced-availability-with-no-rebuild state. This condition can persist even when basic ICMP ping connectivity is established, typically due to network-level blocking of essential transport ports or an MTU mismatch along the physical routing path.
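The partition is generally caused by the Unicast Agent being unable to maintain cluster membership, due to blocked vSAN transport ports or an MTU mismatch on the path between the data nodes and the witness.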
To resolve the partition, restore bidirectional communication across all required vSAN ports and ensure MTU consistency along the entire path.
1. Validate MTU Path Consistency: Confirm that the network path can pass full-size, non-fragmented frames between the Master data node and the Witness. Run the following command from an ESXi host:
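A typical form of this check uses vmkping with the don't-fragment bit set and a payload sized to the MTU minus 28 bytes (20 bytes of IPv4 header plus 8 bytes of ICMP header). The sketch below only builds and prints the command. The interface name vmk1, the address 192.0.2.10, and the 1500-byte MTU are placeholders assumed for illustration, not values from this article.

```shell
#!/bin/sh
# Sketch: compute the largest ICMP payload that fits in one unfragmented frame.
# Overhead = 20 bytes (IPv4 header) + 8 bytes (ICMP header) = 28 bytes.
payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Hypothetical values; substitute the ones used in your environment.
VMK="vmk1"                # VMkernel interface carrying vSAN/witness traffic (assumed name)
WITNESS_IP="192.0.2.10"   # placeholder address from the documentation range
MTU=1500                  # use 9000 if jumbo frames are configured end to end

# -I selects the VMkernel interface, -s sets the payload size,
# -d sets the don't-fragment bit so an undersized hop causes a failure.
CMD="vmkping -I $VMK -s $(payload_for_mtu "$MTU") -d $WITNESS_IP"
echo "$CMD"
```

Run the printed command on the data node, and repeat it from the witness toward the data node, since the return path may differ.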
Note: If this fails, investigate physical switch configurations or routing interface bottlenecks.
2. Open Required Firewall Ports: Ensure the following ports are open bidirectionally between all vSAN data nodes and the witness node:
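The exact port list depends on the vSAN release and configuration. As a hedged starting point, the sketch below records the two ports commonly documented for data-node-to-witness traffic: TCP 2233 (vSAN transport) and UDP 12321 (Unicast Agent). Treat this list as an assumption and confirm it against the ports and protocols reference for your release. The function name is hypothetical.

```shell
#!/bin/sh
# Assumed port list for data node <-> witness traffic; verify for your release.
VSAN_WITNESS_PORTS="2233/tcp 12321/udp"

# Print one firewall requirement per port, split into port and protocol.
list_witness_ports() {
    for entry in $VSAN_WITNESS_PORTS; do
        port=${entry%/*}     # text before the slash
        proto=${entry#*/}    # text after the slash
        echo "allow $proto port $port: data nodes <-> witness (both directions)"
    done
}

list_witness_ports
```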
3. Verify Port Accessibility: Once the ports are open, verify reachability from the ESXi hosts using the nc (netcat) command:
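A minimal sketch of such checks follows, assuming the witness address 192.0.2.10 (a placeholder) and the ports TCP 2233 and UDP 12321 (an assumed list to be confirmed for your release). The helper only prints the nc commands so they can be reviewed before being run on a host.

```shell
#!/bin/sh
# Build an nc port-scan command for a host, port, and protocol (tcp or udp).
# -z scans without sending data; -u switches nc to UDP.
nc_check_cmd() {
    if [ "$3" = "udp" ]; then
        echo "nc -z -u $1 $2"
    else
        echo "nc -z $1 $2"
    fi
}

WITNESS_IP="192.0.2.10"   # hypothetical witness address

nc_check_cmd "$WITNESS_IP" 2233 tcp
nc_check_cmd "$WITNESS_IP" 12321 udp
```

Because UDP is connectionless, a successful nc result on a UDP port is not conclusive on its own, which is why a packet capture is a useful follow-up.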
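4. Verify Traffic Flow: Use the pktcap-uw command to confirm that traffic on port 12321 flows in both directions, from the data nodes to the witness and from the witness to the data nodes. In the command below, replace vmk with the name of the VMkernel interface that carries vSAN or witness traffic.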
pktcap-uw --vmk vmk --dir 2 -o - | tcpdump-uw -enr - | grep -i 12321
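If the bidirectional capture is hard to read, it can help to capture one direction at a time. The sketch below reuses the same pipeline and varies only the --dir value (0 for received, 1 for transmitted, 2 for both). The interface name vmk1 and the helper name are assumptions for illustration.

```shell
#!/bin/sh
# Build the capture pipeline for a VMkernel interface and a direction.
# Direction: 0 = received, 1 = transmitted, 2 = both.
capture_cmd() {
    echo "pktcap-uw --vmk $1 --dir $2 -o - | tcpdump-uw -enr - | grep -i 12321"
}

VMK="vmk1"   # hypothetical interface name

capture_cmd "$VMK" 0   # packets arriving on the interface
capture_cmd "$VMK" 1   # packets leaving the interface
```

Seeing transmitted packets on one node but no matching received packets on the other suggests that a device in between is dropping the traffic in that direction.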
5. Monitor Recovery: Trigger a manual Skyline Health check. The witness node should automatically rejoin the sub-cluster, and the partition alert should clear.
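Membership can also be checked from the command line. The sketch below assumes that the output of esxcli vsan cluster get contains a "Sub-Cluster Member Count" line and extracts its value. The helper name is hypothetical, and the expected count (data nodes plus the witness) depends on your cluster.

```shell
#!/bin/sh
# Read `esxcli vsan cluster get` output on stdin and print the member count.
member_count() {
    awk -F': ' '/Sub-Cluster Member Count/ {print $2}'
}

# Usage on an ESXi host (not executed here):
#   esxcli vsan cluster get | member_count
```

After recovery, the value reported on each data node should include the witness. A lower count on some nodes indicates that the partition is still present.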