HCX - NE HA Failover Threshold Guidelines

Article ID: 367619

Updated On:

Products

VMware HCX

Issue/Introduction

In a busy network that is prone to occasional packet loss, HCX Network Extension High Availability (NE HA) failover can occur more often than expected. NE HA periodically transmits Bidirectional Forwarding Detection (BFD) UDP packets as a heartbeat between the active and standby NE appliances at the same local site. When sustained loss of these packets is detected, a standby-to-active NE failover is triggered.

NE HA is designed to send two independent BFD heartbeat streams between the active and standby NE appliances at the same local site: one between the NE management interfaces and one between the NE uplink interfaces. If sustained packet loss occurs on only one of these interfaces (management or uplink, but not both), no NE HA failover takes place. This avoids false positives where the NE appliance is not truly down. Both BFD streams must detect sustained packet loss simultaneously to trigger a standby-to-active NE failover.
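The two-stream rule described above can be illustrated with a minimal sketch. This is not HCX code: the function name, parameter names, and the idea of counting missed heartbeats per stream are assumptions made for illustration only. The default threshold of 6 is the value listed under Resolution.

```python
def should_fail_over(mgmt_missed: int, uplink_missed: int,
                     loss_threshold: int = 6) -> bool:
    """Illustrative only: decide whether a standby-to-active failover fires.

    mgmt_missed / uplink_missed are hypothetical counters of heartbeats
    missed in a row on the management and uplink BFD streams.
    """
    # Each stream is judged independently against the same threshold.
    mgmt_down = mgmt_missed >= loss_threshold
    uplink_down = uplink_missed >= loss_threshold
    # Failover requires BOTH streams to report sustained loss at the same time.
    return mgmt_down and uplink_down


# Loss on the uplink stream only: no failover.
print(should_fail_over(mgmt_missed=0, uplink_missed=6))
# Loss on both streams: failover.
print(should_fail_over(mgmt_missed=6, uplink_missed=6))
```

The `and` in the return statement is the point of the sketch: sustained loss on a single stream is treated as a path problem, not as a failed appliance.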

However, the dual BFD heartbeat fail-safe may not work as intended when an NE appliance uses a single shared interface for both management and uplink. In this scenario, a period of sustained packet loss can affect both BFD streams, because they share the same interface. For this reason, it is recommended to always use separate management and uplink interfaces on an NE appliance that is configured for HA.

Resolution

Starting with HCX release 4.9.1, the default BFD heartbeat values are as follows. Together they result in a 1.5-second failover detection time (6 × 250 ms) during periods of sustained packet loss:

heartbeat interval: 250ms
heartbeat loss threshold: 6 heartbeats
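The relationship between these two values and the detection time is a simple product, sketched below. The function name is hypothetical, and the second call uses made-up values purely to show how a longer interval would increase tolerance; they are not recommended settings.

```python
def detection_time_seconds(interval_ms: int, loss_threshold: int) -> float:
    """Failover detection time = heartbeat interval x loss threshold."""
    return interval_ms * loss_threshold / 1000


# Defaults from HCX 4.9.1: 250 ms interval, 6 missed heartbeats.
print(detection_time_seconds(250, 6))   # 1.5

# Hypothetical values, for illustration only: doubling the interval
# doubles the time that sustained loss must last before failover.
print(detection_time_seconds(500, 6))   # 3.0
```

A longer detection time tolerates longer bursts of packet loss, at the cost of a slower reaction when the active appliance is really down.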

In future software releases, these BFD heartbeat values will be fully configurable through the HCX Manager UI, to allow more tolerance in busy network environments that are prone to more frequent packet loss.

In the meantime, the BFD heartbeat values can be adjusted using a script. If this is required, open a case with the Broadcom Support team for further assistance.