After the physical switches used for vSAN networking are replaced, the hosts in the vSAN cluster appear in a network partition, split across different partition groups.
vSAN Skyline Health may report the hosts in different partition groups.
This issue occurs when the MTU is not correctly or completely configured on the new switches.
When you test connectivity, the hosts can reach their neighbor hosts with standard-size frames (MTU 1500). This confirms that the switch ports are configured with the correct VLANs.
However, when you test with MTU 9000, which is the MTU configured on the vSAN VMkernel (vmk) ports, communication fails for one or more hosts, though not necessarily for all of them. This confirms that there is an MTU configuration problem on the new switches.
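The payload sizes used when testing each MTU follow from simple arithmetic: an IPv4 ICMP echo request carries a 20-byte IP header (assuming no IP options) and an 8-byte ICMP header, so the largest payload that fits in a single unfragmented packet is the MTU minus 28 bytes. The sketch below only illustrates that calculation; the function name icmp_payload_for_mtu is hypothetical and not part of any ESXi tool.

```shell
#!/bin/sh
# Hypothetical helper (not an ESXi command): largest ICMP echo payload
# that fits in one unfragmented IPv4 packet for a given MTU.
# Assumes a 20-byte IPv4 header (no options) and an 8-byte ICMP header.
icmp_payload_for_mtu() {
    mtu="$1"
    echo $(( mtu - 20 - 8 ))
}

icmp_payload_for_mtu 1500   # standard frames: prints 1472
icmp_payload_for_mtu 9000   # jumbo frames:    prints 8972
```

These are the values passed to vmkping with -s (1472 and 8972). Combined with -d, which sets the don't-fragment bit, a packet larger than the path MTU is dropped rather than fragmented, so the test fails visibly.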
The vmkping output below shows tests with standard and jumbo frames (MTU 9000): the jumbo-frame ping succeeds for some hosts and fails for others.
------------------------------------------------------------
[root@host:~] vmkping -I vmk1 10.##.##.23 -s 1472 -d -c 3
PING 10.##.##.23 (10.##.##.23): 1472 data bytes
1480 bytes from 10.##.##.23: icmp_seq=0 ttl=64 time=0.196 ms
1480 bytes from 10.##.##.23: icmp_seq=1 ttl=64 time=0.306 ms
1480 bytes from 10.##.##.23: icmp_seq=2 ttl=64 time=0.336 ms

--- 10.##.##.23 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.196/0.279/0.336 ms
------------------------------------------------------------
[root@host:~] vmkping -I vmk1 10.##.##.23 -s 8972 -d -c 3
PING 10.##.##.23 (10.##.##.23): 8972 data bytes
8980 bytes from 10.##.##.23: icmp_seq=0 ttl=64 time=0.424 ms
8980 bytes from 10.##.##.23: icmp_seq=1 ttl=64 time=0.218 ms
8980 bytes from 10.##.##.23: icmp_seq=2 ttl=64 time=0.392 ms

--- 10.##.##.23 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.218/0.345/0.424 ms
------------------------------------------------------------
[root@host:~] vmkping -I vmk1 10.##.##.20 -s 8972 -d -c 3
PING 10.##.##.20 (10.##.##.20): 8972 data bytes

--- 10.##.##.20 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
------------------------------------------------------------
[root@host:~] vmkping -I vmk1 10.##.##.20 -s 1472 -d -c 3
PING 10.##.##.20 (10.##.##.20): 1472 data bytes
1480 bytes from 10.##.##.20: icmp_seq=0 ttl=64 time=0.385 ms
1480 bytes from 10.##.##.20: icmp_seq=1 ttl=64 time=0.301 ms
1480 bytes from 10.##.##.20: icmp_seq=2 ttl=64 time=0.300 ms

--- 10.##.##.20 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.300/0.329/0.385 ms
------------------------------------------------------------
[root@host:~] vmkping -I vmk1 10.##.##.26 -s 8972 -d -c 3
PING 10.##.##.26 (10.##.##.26): 8972 data bytes

--- 10.##.##.26 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
[root@host:~]
------------------------------------------------------------
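The same pair of tests can be repeated against every vSAN peer to see which hosts are affected. The following is a minimal sketch, not a supported tool: it assumes the vSAN VMkernel interface is vmk1 (as in the tests above), decides success by looking for "100% packet loss" in the vmkping summary line, and uses the hypothetical function names probe, check_peer and check_peers and the hypothetical result labels OK, MTU_PROBLEM and UNREACHABLE.

```shell
#!/bin/sh
# Sketch: classify each vSAN peer by testing jumbo and standard payloads.
# VMK is an assumption; set it to the vSAN VMkernel interface in use.
VMK="${VMK:-vmk1}"

# probe <peer-ip> <payload-bytes>
# Returns 0 if at least one reply came back, based on the summary line
# ("... packets received, N% packet loss") printed by vmkping.
probe() {
    out=$(vmkping -I "$VMK" "$1" -s "$2" -d -c 3 2>&1) || true
    case "$out" in
        *"100% packet loss"*) return 1 ;;   # no replies at all
        *"packets received"*) return 0 ;;   # some or all replies received
        *) return 1 ;;                      # no summary line, treat as failure
    esac
}

# check_peer <peer-ip>
# Prints "<ip> OK", "<ip> MTU_PROBLEM" or "<ip> UNREACHABLE".
check_peer() {
    if probe "$1" 8972; then
        echo "$1 OK"                 # jumbo frames pass end to end
    elif probe "$1" 1472; then
        echo "$1 MTU_PROBLEM"        # only standard-size frames pass
    else
        echo "$1 UNREACHABLE"        # fails at both sizes: VLAN/link issue
    fi
}

# check_peers <peer-ip> [<peer-ip> ...]
check_peers() {
    for p in "$@"; do
        check_peer "$p"
    done
}
```

For example, calling check_peers with the vSAN IP addresses of the other hosts prints one line per peer. Peers reported as MTU_PROBLEM match the pattern shown above for 10.##.##.20, where the 1472-byte ping succeeds and the 8972-byte ping fails. Note that partial packet loss is counted as reachable in this sketch.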
Verify whether the hosts are spread across one or more server racks, and whether the hosts in each rack connect to their own standalone top-of-rack (ToR) switches.
When replacing the switches, ensure that the required MTU is configured at every level of the network path: on the ToR switches as well as on the intermediate switches that connect them.