NSX Global Manager Alarm "Queue Occupancy Threshold Exceeded"

Article ID: 414606

Products

VMware NSX

Issue/Introduction

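  • The environment uses NSX Federation.
  • The Global Manager (GM) raises the "Queue Occupancy Threshold Exceeded" alarm for the "LM_2_GM_NOTIFICATION" queue.
  • The GM log ar.log contains entries such as: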
    ar.log:2025-09-22T21:10:20.094Z  WARN EventReportProcessor-1-3 EventReportSyslogSender 77561 MONITORING [nsx@6876 comp="global-manager" entId="ID" eventFeatureName="federation" eventSev="warning" eventState="On" eventType="queue_occupancy_threshold_exceeded" level="WARNING" subcomp="async-replicator"] Queue 
  • In the GM UI, System > Location Manager shows "Delta Sync".
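
To confirm the log symptom, the alarm events can be searched for by the eventType value shown in the sample entry above. The following is a minimal sketch, not an official procedure: the function names are hypothetical, and the path to ar.log is passed as an argument because its location on the appliance is not stated in this article.

```shell
# Hypothetical helpers: search a log file for the queue-occupancy alarm event.
# The log path is an assumption supplied by the caller (the location of
# ar.log on the Global Manager is not given in this article).

# Print every line that carries the alarm's eventType.
find_queue_alarms() {
    log_file="$1"
    grep -F 'eventType="queue_occupancy_threshold_exceeded"' "$log_file"
}

# Print only the number of matching lines.
count_queue_alarms() {
    log_file="$1"
    grep -c -F 'eventType="queue_occupancy_threshold_exceeded"' "$log_file"
}
```

For example, `count_queue_alarms /path/to/ar.log` prints how many times the event was logged. A count that keeps growing over time is consistent with the symptom described here.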

Environment

NSX 4.2.x.x

Cause

When a user starts Traceflow from the UI, the system creates a Traceflow Observation and a Traceflow Status, along with the Traceflow Config that the user defined.

Traceflow Config holds information about the source and destination MAC and IP.
Traceflow Observation holds information about each hop.
Traceflow Status holds information about the status of the current trace.

Every two hours, the system initiates a cleanup of the Traceflow Config. This process also removes the corresponding Traceflow Observations and Traceflow Status entries. During this cleanup, the system populates the cache on the Global Manager (GM) to ensure that Traceflow Observations are sent only to the appropriate Local Managers (LMs). The expected outcome is that all observations are successfully cleaned up.

However, when the queue size continues to increase, we observe that Traceflow Observations persist in the system even after the associated Traceflow Config and Traceflow Status have been deleted. This behavior causes inconsistencies in the GM cache, as the DELETE notifications from the LMs are not handled correctly by the GM.

Resolution

Workaround:

  1. Log in via SSH to the active Global Manager as the "root" user.
  2. Run this command:
          corfu_tool_runner.py -n nsx -t TraceflowObservation -o clearTable
  3. Refresh the NSX UI and check "Delta Sync". The message queue count should be decreasing.
  4. After a few minutes, the "Queue Occupancy Threshold Exceeded" alarm should clear automatically.
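
Because step 2 clears a table, it may help to wrap the command in a small guard that refuses to run unless the tool is available and the session is root. This is only a sketch under those assumptions: the function name is hypothetical, and the command it runs is exactly the one given in step 2.

```shell
# Hypothetical wrapper around the workaround command from step 2.
# It adds only pre-checks; the corfu_tool_runner.py invocation is unchanged.
clear_traceflow_observations() {
    # Stop early if the tool is not on PATH (for example, on the wrong host).
    if ! command -v corfu_tool_runner.py >/dev/null 2>&1; then
        echo "corfu_tool_runner.py not found on PATH" >&2
        return 1
    fi
    # The workaround is documented for the root user.
    if [ "$(id -u)" -ne 0 ]; then
        echo "this command must be run as root" >&2
        return 1
    fi
    corfu_tool_runner.py -n nsx -t TraceflowObservation -o clearTable
}
```

Calling `clear_traceflow_observations` on the active Global Manager then behaves like step 2, and returns a non-zero status without changing anything if either pre-check fails.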