This article outlines recommended practices for configuring network redundancy across VMware vSphere, vSAN, and NSX environments to eliminate single points of failure.
Network outages or uplink misconfigurations can lead to host isolation, vSAN network partitions, and virtual machine communication failures. Standardized redundancy and failover policies help prevent these scenarios.
Physical Uplink Redundancy: Assign a minimum of two physical NICs (vmnics) to every vSphere Standard Switch (vSS) or vSphere Distributed Switch (vDS). Distribute these uplinks across separate physical Top of Rack (ToR) switches so that the failure of a single NIC, cable, or switch does not isolate the host.
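The uplink rule above can be expressed as a simple check. The following is a minimal sketch in Python, assuming the vmnic-to-ToR mapping has already been collected by some other means (for example from an inventory export). The Uplink type, the check_uplink_redundancy function, and the switch labels are hypothetical names used for illustration only and are not part of any VMware API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uplink:
    """One physical NIC and the ToR switch it is cabled to (hypothetical model)."""
    vmnic: str        # e.g. "vmnic0"
    tor_switch: str   # illustrative label for the physical switch, e.g. "tor-a"


def check_uplink_redundancy(switch_name, uplinks):
    """Return a list of findings; an empty list means the rule is satisfied."""
    findings = []
    if len(uplinks) < 2:
        # Rule 1: at least two vmnics per virtual switch.
        findings.append(
            f"{switch_name}: needs at least 2 uplinks, found {len(uplinks)}"
        )
    elif len({u.tor_switch for u in uplinks}) < 2:
        # Rule 2: uplinks must land on more than one physical ToR switch.
        findings.append(
            f"{switch_name}: all uplinks terminate on a single ToR switch"
        )
    return findings


if __name__ == "__main__":
    good = [Uplink("vmnic0", "tor-a"), Uplink("vmnic1", "tor-b")]
    bad = [Uplink("vmnic0", "tor-a"), Uplink("vmnic1", "tor-a")]
    print(check_uplink_redundancy("vds01", good))
    print(check_uplink_redundancy("vds01", bad))
```

A check of this kind is only as accurate as the cabling data fed into it, so the mapping should be verified against the physical switch port descriptions or CDP/LLDP information.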
Teaming and Failover Policies:
For vDS, configure the load balancing policy to "Route based on physical NIC load" (Load Based Teaming / LBT) for optimal traffic distribution without requiring physical switch configuration.
For vSS, use "Route based on originating virtual port" as the default policy.
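Physical Switch Aggregation: When using multiple active uplinks for aggregate bandwidth via IP Hash, configure matching link aggregation on the physical ToR switches. On a vSS, IP Hash requires a static port channel (EtherChannel); LACP (Link Aggregation Control Protocol) is supported only on a vDS, through a Link Aggregation Group (LAG). When the uplinks terminate on two different ToR switches, use multi-chassis link aggregation (e.g., vPC, MLAG).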
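Failover Detection: Set network failover detection to "Link status only". This setting detects failure of the link between the vmnic and its directly connected physical switch port, such as a removed cable or a powered-off switch. It does not detect failures farther upstream or physical switch misconfigurations.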
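The teaming, aggregation, and failover detection recommendations above can be combined into one review function. The following is a minimal sketch in Python under stated assumptions: the policy strings are the display names used in this article (not API identifiers), and TeamingConfig, review_teaming, and the physical_aggregation flag are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

# Display names as used in this article; these are not API identifiers.
LBT = "Route based on physical NIC load"
ORIGINATING_PORT = "Route based on originating virtual port"
IP_HASH = "Route based on IP hash"
LINK_STATUS = "Link status only"


@dataclass
class TeamingConfig:
    """Teaming settings of one port group (hypothetical model)."""
    switch_type: str                    # "vDS" or "vSS"
    load_balancing: str
    failover_detection: str
    physical_aggregation: bool = False  # True if a port channel exists on the ToR side


def review_teaming(cfg):
    """Return a list of findings; an empty list means the settings match the guidance."""
    findings = []
    if cfg.load_balancing == IP_HASH:
        # IP Hash only works with matching aggregation on the physical switches.
        if not cfg.physical_aggregation:
            findings.append("IP hash requires link aggregation on the physical switches")
    else:
        expected = LBT if cfg.switch_type == "vDS" else ORIGINATING_PORT
        if cfg.load_balancing != expected:
            findings.append(
                f"expected load balancing '{expected}' for {cfg.switch_type}"
            )
    if cfg.failover_detection != LINK_STATUS:
        findings.append(f"expected failover detection '{LINK_STATUS}'")
    return findings


if __name__ == "__main__":
    cfg = TeamingConfig("vDS", ORIGINATING_PORT, "Beacon probing")
    for finding in review_teaming(cfg):
        print(finding)
```

The function treats IP Hash as acceptable only when aggregation is present and otherwise expects the default policy for the switch type; environments with deliberate exceptions would need to relax these checks.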