After firewall changes were made, multiple vSAN clusters showed many vSAN objects as inaccessible, and VMs show as invalid in vCenter or the host UI. The vSAN Skyline Health checks show alerts for vSAN cluster partition and vSAN Basic (unicast) connectivity.
ESXi and vSAN with LACP networking
A change on the firewall interrupted LACP traffic, so after a host reboot or other network interruption, the network connection to other hosts could not be reestablished. The firewall issue caused a cluster partition between various hosts in three separate vSAN clusters. Because the partitioned hosts could not communicate with each other, many data objects lost quorum and could not be accessed. Ping testing across the vSAN network confirmed loss of connectivity between hosts:
PING 53.55.202.144 (53.55.202.144): 8972 data bytes
--- 53.55.202.144 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
--
PING 53.55.202.148 (53.55.202.148): 8972 data bytes
--- 53.55.202.148 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
--
PING 53.55.202.143 (53.55.202.143): 8972 data bytes
--- 53.55.202.143 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
--
PING 53.55.202.141 (53.55.202.141): 8972 data bytes
--- 53.55.202.141 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
--
PING 53.55.202.146 (53.55.202.146): 8972 data bytes
--- 53.55.202.146 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
--
PING 53.55.202.151 (53.55.202.151): 8972 data bytes
--- 53.55.202.151 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
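When many hosts across several clusters need to be checked, it can help to summarize saved ping output rather than read it by eye. The following is a minimal sketch, not part of any VMware tooling: it assumes the output has been captured as text in the same format shown above, and the names `summarize_ping` and `unreachable_hosts` are hypothetical helpers introduced only for illustration. The 8972-byte payload seen in the output corresponds to a 9000-byte jumbo-frame MTU minus 28 bytes of IP and ICMP headers.

```python
import re

# One ping block: the "PING ..." header followed (eventually) by the
# "N packets transmitted, M packets received" summary line.
# Assumption: every block in the captured text includes its summary line.
PING_BLOCK = re.compile(
    r"PING \S+ \((?P<ip>[\d.]+)\): (?P<size>\d+) data bytes"
    r".*?(?P<sent>\d+) packets transmitted, (?P<recv>\d+) packets received",
    re.DOTALL,
)


def summarize_ping(output):
    """Return {ip: (packets_sent, packets_received)} for each ping block."""
    results = {}
    for match in PING_BLOCK.finditer(output):
        results[match.group("ip")] = (
            int(match.group("sent")),
            int(match.group("recv")),
        )
    return results


def unreachable_hosts(output):
    """Return the sorted list of IPs that answered no packets at all."""
    summary = summarize_ping(output)
    return sorted(ip for ip, (_sent, recv) in summary.items() if recv == 0)


# Sample text taken from the first two blocks of the output above.
sample = """PING 53.55.202.144 (53.55.202.144): 8972 data bytes
--- 53.55.202.144 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
--
PING 53.55.202.148 (53.55.202.148): 8972 data bytes
--- 53.55.202.148 ping statistics ---
10 packets transmitted, 0 packets received, 100% packet loss
"""

print(unreachable_hosts(sample))
```

Running the same summary again after the network team has corrected the firewall gives a quick before-and-after comparison: an empty list would indicate that every tested vSAN address responded.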
The customer must engage their network team to address the problem created by the firewall change. Once the firewall issues are corrected, host connectivity is restored, and the VMs display correctly and can be powered on.