After an Edge failover, load balancer health check TCP sessions to pool members flap.
In /var/log/syslog on the Edge node, the health check status repeatedly changes between down and up:
<DATE_TIME> <HOSTNAME> NSX 141555 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="WARN"] [<UUID>] HLCK: monitor <UUID> server: <POOL_MEMBER_IP>:<POOL_MEMBER_PORT> change to down, code: 8)
<DATE_TIME> <HOSTNAME> NSX 141555 LOAD-BALANCER [nsx@6876 comp="nsx-edge" subcomp="lb" s2comp="lb" level="WARN"] [<UUID>] HLCK: monitor <UUID> server: <POOL_MEMBER_IP>:<POOL_MEMBER_PORT> change to up
/var/log/syslog on the Edge node also shows the tunnel status changing between down and up:
<DATE_TIME> <HOSTNAME> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="tunnel" level="INFO"] Tunnel <TEP_IP>:<TEP_IP>(geneve) state updated from up to down
<DATE_TIME> <HOSTNAME> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="tunnel" level="INFO"] Tunnel <TEP_IP>:<TEP_IP>(geneve) state updated from down to up
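To gauge how often the flapping occurs, the state changes can be counted from a copy of the syslog. The following is a minimal Python sketch, not a supported tool: the function name count_flaps is hypothetical, and the regular expressions assume only the two message formats quoted above.

```python
import re
from collections import Counter

# Patterns mirror the two message formats quoted in this article.
# Nothing beyond the text shown in those messages is assumed.
HLCK_RE = re.compile(r"HLCK: monitor (\S+) server: (\S+) change to (up|down)")
TUNNEL_RE = re.compile(
    r"Tunnel (\S+)\(geneve\) state updated from (up|down) to (up|down)"
)


def count_flaps(lines):
    """Count state changes per pool member and per tunnel.

    Returns two Counters keyed by (endpoint, new_state).
    """
    hlck = Counter()
    tunnel = Counter()
    for line in lines:
        # finditer handles lines that contain more than one message.
        for m in HLCK_RE.finditer(line):
            hlck[(m.group(2), m.group(3))] += 1
        for m in TUNNEL_RE.finditer(line):
            tunnel[(m.group(1), m.group(3))] += 1
    return hlck, tunnel
```

Passing an open file handle for a copied syslog to count_flaps returns both counters. Many down and up transitions for the same pool member and the same tunnel in the same period would be consistent with the symptom described here.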
VMware NSX
Checking the ESX host of the pool member IP shows duplicate MAC addresses from different Edge nodes, which trigger the tunnel flapping.
Output of: localcli network ip neighbor list -N vxlan
Neighbor  Mac Address  Vmknic  Expiry  State  Type
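Spotting a duplicate MAC address by eye is error-prone when the neighbor list is long. The following Python sketch flags MAC addresses that appear behind more than one neighbor IP in saved command output. It rests on two assumptions: each data row starts with the Neighbor and Mac Address columns, in the order of the header above, and a "duplicate" means the same MAC listed for different neighbor IPs. The function name find_duplicate_macs is hypothetical.

```python
import re
from collections import defaultdict

# A MAC address written as six colon-separated hex octets.
MAC_RE = re.compile(r"^[0-9a-fA-F]{2}(:[0-9a-fA-F]{2}){5}$")


def find_duplicate_macs(output):
    """Return {mac: sorted neighbor IPs} for MACs seen behind several neighbors."""
    neighbors_by_mac = defaultdict(set)
    for line in output.splitlines():
        fields = line.split()
        # Assumption: column 1 is the neighbor IP, column 2 is the MAC.
        # Header and separator rows fail the MAC check and are skipped.
        if len(fields) >= 2 and MAC_RE.match(fields[1]):
            neighbors_by_mac[fields[1].lower()].add(fields[0])
    return {
        mac: sorted(ips)
        for mac, ips in neighbors_by_mac.items()
        if len(ips) > 1
    }
```

An empty result means no MAC address was listed for more than one neighbor in the captured output. Any entry returned is only a candidate to compare against the Edge nodes' TEP addresses.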
The resolution can be found in the following KB article:
https://knowledge.broadcom.com/external/article/345804