TeamPolicyUpDelay may not be honored when used with the 'Routing based on PNIC load' teaming policy. This can cause VMs on the ESXi host to lose connectivity.
VMware vSphere ESXi
This is a known issue impacting VMware vSphere ESXi and will be fixed in upcoming releases.
Workaround:
For the duration of any TOR (top-of-rack) switch upgrade, temporarily change the teaming policy of all DVPGs that use 'Routing based on PNIC load' to another suitable policy.
Revert to 'Routing based on PNIC load' only after the TOR upgrade has completed and the TeamPolicyUpDelay configured on the DVS has expired.
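The timing condition in the workaround can be sketched as a small check. This is only an illustration under stated assumptions: the function names, the idea of measuring the delay from the moment the uplink is seen up again, and the 30000 ms value are hypothetical and are not taken from ESXi or any VMware tool.

```python
from datetime import datetime, timedelta


def earliest_revert_time(link_up_at: datetime, up_delay_ms: int) -> datetime:
    """Return the moment at which the up-delay has fully elapsed.

    Assumption: the delay is counted from the time the uplink is
    observed up again after the TOR upgrade.
    """
    return link_up_at + timedelta(milliseconds=up_delay_ms)


def safe_to_revert(now: datetime, upgrade_completed: bool,
                   link_up_at: datetime, up_delay_ms: int) -> bool:
    """True only if the upgrade is finished AND the up-delay has expired."""
    return upgrade_completed and now >= earliest_revert_time(link_up_at, up_delay_ms)


if __name__ == "__main__":
    link_up = datetime(2024, 1, 1, 12, 0, 0)   # hypothetical timestamp
    delay_ms = 30000                           # hypothetical TeamPolicyUpDelay value
    print(earliest_revert_time(link_up, delay_ms))
    print(safe_to_revert(datetime(2024, 1, 1, 12, 0, 10), True, link_up, delay_ms))
```

Both conditions are checked together because the workaround requires the upgrade to be complete and the delay to have expired before the original policy is restored.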
During upgrades of TOR or other physical switches, an ESXi host's uplink connected to the switch being upgraded may go down. If ESXi detects that this uplink has gone down, it selects another uplink according to the teaming and failover policy. This is expected behavior.
After a reboot, the physical switch's interface may come up before the switch is actually ready to handle traffic. If ESXi starts forwarding packets through this uplink, the packets may be dropped because the physical switch is not yet ready to forward traffic. To prevent this, the TeamPolicyUpDelay feature is sometimes used. TeamPolicyUpDelay defines the time, in milliseconds, that the ESXi host waits after detecting that a pNIC has transitioned from a down state to an up state before the host starts using the pNIC again for network traffic. This delay helps ensure that the pNIC has fully recovered and stabilized before it is put back into use.
However, when used with the 'Routing based on PNIC load' teaming policy, TeamPolicyUpDelay is not honored. This is due to a known bug in ESXi.
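The intended up-delay behavior described above can be modeled in a few lines. This is a simplified sketch of the concept only, not ESXi's implementation; the `Uplink` class, the `eligible_uplinks` helper, the vmnic names, and the millisecond values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Uplink:
    name: str
    link_up: bool = True
    # Time (ms) of the last down-to-up transition; None means the link
    # has been up continuously and no delay applies.
    up_since_ms: Optional[int] = None


def is_eligible(uplink: Uplink, now_ms: int, up_delay_ms: int) -> bool:
    """An uplink may carry traffic only if it is up and, after a
    down-to-up transition, the up-delay has fully elapsed."""
    if not uplink.link_up:
        return False
    if uplink.up_since_ms is None:
        return True
    return now_ms - uplink.up_since_ms >= up_delay_ms


def eligible_uplinks(uplinks: List[Uplink], now_ms: int, up_delay_ms: int) -> List[str]:
    """Names of the uplinks a teaming policy should be allowed to pick from."""
    return [u.name for u in uplinks if is_eligible(u, now_ms, up_delay_ms)]


if __name__ == "__main__":
    team = [Uplink("vmnic0"), Uplink("vmnic1", True, 1000)]  # vmnic1 came back at t=1000 ms
    print(eligible_uplinks(team, now_ms=5000, up_delay_ms=10000))
    print(eligible_uplinks(team, now_ms=11000, up_delay_ms=10000))
```

In this model, the reported issue corresponds to a teaming policy that selects a recovered uplink without applying the `is_eligible` check, so traffic can be placed on it before the physical switch is ready.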