Avi Virtual Service configured in an NSX-T cloud is unreachable after rapid state changes
Article ID: 404938
Products
VMware Avi Load Balancer
Issue/Introduction
When an Avi virtual service configured in an NSX-T cloud changes state rapidly ("flapping": going down and back up within a short period), its Virtual IP (VIP) address may become unreachable.
This can result in a service outage even if the virtual service reports an "Up" status.
You are affected by this issue if the route for the VIP is missing from the NSX-T Tier-1 (T1) router.
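The route check above can be scripted once you have the Tier-1 router's static routes as parsed JSON. The following is a minimal sketch, not an official procedure. It assumes each route is a dictionary with a "network" field in CIDR notation (as static routes are commonly represented in the NSX Policy API) and that the VIP is programmed as a host route (/32 for IPv4, /128 for IPv6). The function name vip_route_present is hypothetical. Verify both assumptions against your environment.

```python
import ipaddress


def vip_route_present(static_routes, vip):
    """Return True if a host route for `vip` exists in `static_routes`.

    Assumption: each entry is a dict with a "network" key holding a CIDR
    string, and the VIP is programmed as a host route (/32 or /128).
    """
    target = ipaddress.ip_address(vip)
    for route in static_routes:
        network = route.get("network")
        if not network:
            continue  # skip entries without a destination prefix
        net = ipaddress.ip_network(network, strict=False)
        # Only an exact host route counts; a covering subnet does not.
        if net.num_addresses == 1 and target in net:
            return True
    return False
```

If the function returns False for a virtual service that reports "Up", the symptom matches this article.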
Environment
VMware NSX-T
Cause
The loss of connectivity is caused by a race condition in the underlying network operations.
When a virtual service changes state, its VIP is either detached from (on Down) or attached to (on Up) the virtual network interface of the Avi Service Engine.
If these state changes happen too quickly, an 'attach_ip' command can be issued before the system receives a response that the previous 'detach_ip' command has completed.
This timing conflict prevents an internal flag from being cleared properly.
Consequently, the system fails to add the necessary route for the VIP to the NSX-T Tier-1 router, leading to a loss of reachability.
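The ordering problem described above can be illustrated with a toy model. This is a simplified sketch for intuition only and is not Avi's implementation. The class VipRouteModel, the flag detach_pending, and the event name detach_response are hypothetical stand-ins for the internal flag and the detach completion response mentioned in this article.

```python
class VipRouteModel:
    """Toy model of VIP attach/detach ordering (illustrative, not Avi code)."""

    def __init__(self):
        self.detach_pending = False   # hypothetical stand-in for the internal flag
        self.route_programmed = True  # VIP route present on the Tier-1 router

    def detach_ip(self):
        # Virtual service goes Down: the VIP is detached, the route is withdrawn.
        self.detach_pending = True
        self.route_programmed = False

    def detach_response(self):
        # Completion response for the earlier detach_ip.
        self.detach_pending = False

    def attach_ip(self):
        # Virtual service goes Up. In this model, an attach issued while a
        # detach is still pending skips the route add.
        if self.detach_pending:
            return
        self.route_programmed = True


def replay(events):
    """Apply a sequence of event names and report whether the route exists."""
    model = VipRouteModel()
    for name in events:
        getattr(model, name)()
    return model.route_programmed


# Normal ordering: the detach completes before the attach.
print(replay(["detach_ip", "detach_response", "attach_ip"]))  # True
# Flapping: the attach is issued before the detach response arrives.
print(replay(["detach_ip", "attach_ip", "detach_response"]))  # False
```

In the second sequence the virtual service ends up attached, yet the route is never programmed, which matches the symptom of an "Up" virtual service with an unreachable VIP.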
Resolution
A permanent fix is available in later Avi Load Balancer versions. Until you can upgrade, a temporary workaround restores service immediately.
Resolution:
This issue is fully resolved in the following Avi Load Balancer versions:
30.2.4-2p1
31.1.2
31.2.1
Upgrading to a fixed version is the recommended permanent solution. The bug ID is AV-241172.
Workaround:
To restore connectivity, toggle the affected virtual service:
Disable the virtual service.
Enable the virtual service.
This action forces a clean detach and re-attach operation, which correctly programs the route in the NSX-T router.
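If many virtual services are affected, the disable/enable toggle can be scripted. The sketch below only builds and orders the two requests. The send callable is a hypothetical wrapper around whatever authenticated HTTP client you use. The /api/virtualservice/<uuid> path and the {"replace": {"enabled": ...}} PATCH body are assumptions based on the general shape of the Avi REST API, so check them against the API guide for your version. The pause between the two calls is also an assumption, added so the toggle itself does not become another rapid state change.

```python
import time
from typing import Callable


def toggle_virtual_service(
    send: Callable[[str, str, dict], None],
    vs_uuid: str,
    settle_seconds: float = 10.0,
) -> None:
    """Disable, wait, then re-enable one virtual service.

    `send(method, path, body)` is a hypothetical callable that performs an
    authenticated request against the Avi Controller.
    """
    path = f"/api/virtualservice/{vs_uuid}"  # assumed endpoint path
    # Step 1: disable the virtual service.
    send("PATCH", path, {"replace": {"enabled": False}})
    # Give the detach time to complete before re-enabling (assumed interval).
    time.sleep(settle_seconds)
    # Step 2: enable the virtual service.
    send("PATCH", path, {"replace": {"enabled": True}})
```

After the toggle, confirm that the VIP route is present on the Tier-1 router before treating the service as restored.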