Network Latency High alert persists between NSX manager and Transport Node

Article ID: 391288

Products

VMware NSX

Issue/Introduction

  • NSX Manager reports an alarm "Network Latency High between Manager Node and Transport Node".
  • Test pings between the affected Transport Node and any NSX Manager show neither latency near or above 150 milliseconds nor any packet loss.
  • After the alarm is resolved manually in the NSX UI, it is raised again within a few minutes.
  • On the NSX Manager, the /var/log/messaging-manager/messaging-manager.log file shows periodic entries such as:
    2025-03-06T12:40:27.170Z  INFO nsx-rpc:unix:///var/run/vmware/nsx-opsagent/alarms-provider-service.sock:user-executor-0 EventSource - MONITORING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="messaging"] EventSource: Sync triggered. featureName: communication, eventType: network_latency_high, entityId: <Transport Node ID>, status: true, context: {"transport_node_name":"<FQDN>","transport_node_address":"<TN IP>"}
  • No other Transport Node is affected by the same persistent alarm.
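
To confirm that only one Transport Node keeps producing these entries, the log can be summarized per entity ID. The following is a minimal sketch, assuming the log lines follow the format shown above; the function name count_latency_events is hypothetical and not part of NSX.

```shell
# Hypothetical helper: count "network_latency_high" entries per entityId.
# Assumes each matching line contains "entityId: <ID>," as in the sample entry.
count_latency_events() {
    awk '/eventType: network_latency_high/ {
        if (match($0, /entityId: [^,]+/)) {
            # Skip the 10-character "entityId: " prefix to keep only the ID.
            id = substr($0, RSTART + 10, RLENGTH - 10)
            n[id]++
        }
    }
    END { for (id in n) print n[id], id }' "$1" | sort -rn
}
```

Run on an NSX Manager as `count_latency_events /var/log/messaging-manager/messaging-manager.log`, it prints one line per Transport Node ID with its entry count, highest first. A single ID dominating the output is consistent with the symptom described here.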

 

Environment

VMware NSX v4.1.x

Cause

NSX v4.1 has a known issue: when a high network latency condition is detected for a specific host (which can happen occasionally) at the same time as a leadership change in the Management cluster, the condition can remain uncleared.

Resolution

This issue is resolved in VMware NSX 4.2.0 and later, available at Broadcom downloads. If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

In earlier versions, connect to the NSX management VIP through the CLI as root and run the following command:

systemctl restart messaging-manager
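
After the restart, one way to check that the stale condition is gone is to look at the timestamp of the most recent matching log entry. This is a sketch only: it assumes the periodic entries with "status: true" stop once the stale data is cleared, which the article does not state explicitly, and the function name last_latency_event is hypothetical.

```shell
# Hypothetical helper: print the timestamp of the newest
# "network_latency_high" entry reported with status: true.
# Assumes the timestamp is the first whitespace-separated field of each line.
last_latency_event() {
    awk '/eventType: network_latency_high/ && /status: true/ { ts = $1 }
         END { if (ts != "") print ts }' "$1"
}
```

If `last_latency_event /var/log/messaging-manager/messaging-manager.log` keeps returning a timestamp older than the restart, no new entries have been logged since the service came back.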

Additional Information

A similar condition may cause recurrent "Management Channel to Transport Node Down" alarms that repeat every few days. In that case, the host is listed with one or more tunnels down in the general Host Transport Nodes view, but no tunnels are shown as down in its detailed view.

Such unwarranted alarms resolve on their own, and the host returns to status 'Up' after clicking SYNC or refreshing the view in the browser.

The same resolution applies: restarting the messaging-manager service on all NSX Management nodes purges the stale monitoring data that triggers this issue.
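
Restarting the service on every Management node can be scripted from an admin workstation. The sketch below assumes SSH access as root is enabled on each node, which may not be the case in your environment; the function name, the RUN override, and the node names in the usage line are all hypothetical.

```shell
# Hypothetical helper: restart messaging-manager on each node given as an argument.
# Assumes root SSH login is permitted on the NSX Manager nodes.
# Set RUN=echo for a dry run that prints the remote command instead of running it.
restart_messaging_manager() {
    for node in "$@"; do
        echo "Restarting messaging-manager on ${node}"
        ${RUN:-ssh} "root@${node}" systemctl restart messaging-manager || return 1
    done
}
```

For example, `RUN=echo restart_messaging_manager nsx-mgr-01 nsx-mgr-02 nsx-mgr-03` shows what would be executed. Dropping RUN=echo runs the restarts one node at a time and stops at the first failure.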