BFD tunnels flap during log collection on NSX Edge

Article ID: 431892

Updated On:

Products

VMware NSX

Issue/Introduction

  • Log collection is running on an Edge.
  • Core dump files may be present on the system, and the collector may be configured to gather them as well.
  • The amount of log data to be collected is several gigabytes.
  • During the log collection, a brief flap of the Edge's BFD tunnels is observed.
  • The BFD tunnels recover automatically after a few seconds.

  • In the Edge's /var/log/syslog, the tunnel flap is logged between the start and the completion of log collection:
    2026-02-13T13:09:00.001Z <edge-name> Log bundle collection started
    2026-02-13T13:13:13.102Z <edge-name> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="tunnel" level="INFO"] Tunnel 10.1.2.3:10.1.2.4(geneve) state updated from up to down
    2026-02-13T13:13:21.684Z <edge-name> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="tunnel" level="INFO"] Tunnel 10.1.2.3:10.1.2.4(geneve) state updated from down to up
    2026-02-13T13:23:13.085Z <edge-name> Log bundle collection completed

  • A high number of interrupts is observed on a fast path CPU. In the example below, IRQ 35 shows 168811611 interrupts on CPU1:
    # cat /proc/interrupts | grep -E "CPU|ioc|megasas"
                CPU0       CPU1       CPU2       CPU3
       0:          7          0          0          0
       8:          0          0          0          0
       9:          0          0          0          0
      10:          0          0          0          0
      16:          0          0          0          0
      24:          0          0          0          0
      25:          0          0          0          0
      27:          0          0          0          0
      28:          0          0          0          0
      30:          0          0          0          0
      31:          0          0          0          0
      32:          0          0          0          0
      34:          0          0          0          0
      35:          0  168811611          0          0

  • To confirm which CPUs are allocated to the fast path, run the following command:
    # cat /proc/cmdline | grep -o 'isolcpus=[^ ]*'
    The output will be similar to:
    isolcpus=0,1

  • The RX_MISSES counter on the fast path interfaces (fp-ethX) shows a high number of drops. In monitoring solutions (e.g. Aria Operations for Networks), this appears as a burst of dropped packets. To check the counter:
    1. SSH to the Edge as admin.
    2. Check the statistics on the fp-eth interfaces:
      edge01> get physical-port fp-eth0 stats
      Thu Mar 05 2026 UTC 14:12:52.862
      Physical Port Stats
      NAME              : fp-eth0
      RX_BYTES          : 2382435250
      RX_ERRORS         : 0
      RX_MISSES         : 80235254
      RX_NOMBUFS        : 0
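
A single RX_MISSES reading does not show when the drops occurred, so it can help to compare a reading taken before log collection with one taken afterwards. The following is a minimal sketch, assuming the output of "get physical-port fp-eth0 stats" has been saved to two text files. The file names before.txt and after.txt and the function names rx_misses and rx_misses_delta are hypothetical and are not part of NSX.

```shell
#!/bin/sh
# Hypothetical helpers: they only parse text saved from
# "get physical-port <port> stats"; they do not query the Edge.

# Print the RX_MISSES value found in one saved stats output.
rx_misses() {
    awk -F: '$1 ~ /^[[:space:]]*RX_MISSES[[:space:]]*$/ {
        gsub(/[[:space:]]/, "", $2)   # strip padding around the number
        print $2
        exit
    }' "$1"
}

# Print the increase in RX_MISSES between two saved outputs (before, after).
rx_misses_delta() {
    before=$(rx_misses "$1")
    after=$(rx_misses "$2")
    echo $((after - before))
}
```

Usage: rx_misses_delta before.txt after.txt. A large increase over the collection window is consistent with the symptom described above.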

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
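
To see how long each flap lasted, the tunnel state messages can be paired up. The following is a minimal sketch, assuming syslog lines in the format of the example in this article (timestamp in the first field, followed later by "Tunnel <id> state updated from ..."). The function name flap_durations is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog lines on stdin and prints one line per
# tunnel flap with the down time, the up time, and the duration in seconds.
flap_durations() {
    awk '
    # Convert a timestamp such as 2026-02-13T13:13:13.102Z to seconds from a
    # fixed origin. Only differences between two results are meaningful.
    function epoch(ts,    p, y, m, d) {
        split(ts, p, /[-T:Z]/)
        y = p[1] + 0; m = p[2] + 0; d = p[3] + 0
        if (m <= 2) { y--; m += 12 }   # count January and February as months 13 and 14
        return (365 * y + int(y / 4) - int(y / 100) + int(y / 400) \
                + int((153 * (m - 3) + 2) / 5) + d) * 86400 \
               + p[4] * 3600 + p[5] * 60 + p[6]
    }
    /state updated from (up to down|down to up)/ && match($0, /Tunnel [^ ]+/) {
        tunnel = substr($0, RSTART + 7, RLENGTH - 7)
        if ($0 ~ /from up to down/) {
            down[tunnel] = $1          # remember when this tunnel went down
        } else if (tunnel in down) {
            printf "%s down=%s up=%s duration=%.3fs\n", tunnel, down[tunnel], $1, epoch($1) - epoch(down[tunnel])
            delete down[tunnel]
        }
    }'
}
```

Usage: flap_durations < /var/log/syslog. The reported times can then be compared with the start and completion times of the log collection.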

Environment

VMware NSX

Cause

Storage interrupts are mapped to fast path CPUs. Storage-intensive activity such as log collection raises a large number of these interrupts and stresses the fast path CPU, which may lead to the loss of some network traffic.
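
The overlap described above can be checked by comparing /proc/interrupts with the isolcpus value. The following is a read-only sketch that changes nothing on the Edge. It assumes that storage interrupts can be identified by the same "ioc|megasas" pattern used earlier in this article and that the isolcpus value is a plain list such as 0,1 or a range such as 2-3. The function name storage_irqs_on_isolated_cpus is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads /proc/interrupts on stdin and prints
# "IRQ <n> CPU<k> <count>" for every storage interrupt (ioc or megasas)
# that has a non-zero count on a CPU listed in the isolcpus value ($1).
storage_irqs_on_isolated_cpus() {
    awk -v cpus="$1" '
    BEGIN {
        # Expand a value such as "0,1" or "2-3" into the set iso[cpu].
        n = split(cpus, parts, ",")
        for (i = 1; i <= n; i++) {
            if (parts[i] ~ /^[0-9]+$/) {
                iso[parts[i] + 0] = 1
            } else if (parts[i] ~ /^[0-9]+-[0-9]+$/) {
                split(parts[i], r, "-")
                for (c = r[1] + 0; c <= r[2] + 0; c++) iso[c] = 1
            }
        }
    }
    # Header row: map each column to its CPU number (CPU0, CPU1, ...).
    NR == 1 {
        ncol = NF
        for (i = 1; i <= NF; i++) col_cpu[i] = substr($i, 4) + 0
        next
    }
    /ioc|megasas/ {
        irq = $1
        sub(/:$/, "", irq)
        for (i = 1; i <= ncol; i++)
            if ((col_cpu[i] in iso) && $(i + 1) + 0 > 0)
                printf "IRQ %s CPU%d %s\n", irq, col_cpu[i], $(i + 1)
    }'
}
```

Usage: storage_irqs_on_isolated_cpus 0,1 < /proc/interrupts, where 0,1 is the isolcpus value reported by /proc/cmdline. Any line printed indicates a storage interrupt counted on an isolated CPU.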

Resolution

This is a known issue impacting VMware NSX.

 

Workaround:

If you believe you have encountered this issue and require a workaround, open a support case with Broadcom Support and refer to this KB article.

For more information, see Creating and managing Broadcom support cases.

Additional Information

Note: This issue may impact an NSX Edge deployed as a virtual machine (any form factor) as well as a Bare Metal Edge.