Heavy volume of communication.control_channel_to_transport_node_down_long alarms

Article ID: 428911

Products

VMware NSX

Issue/Introduction

  • The following alarm appears in the NSX UI with severity 'Critical':

    Due to heavy volume of communication.control_channel_to_transport_node_down_long alarms, the alarm service has temporarily stopped reporting alarms of this type. The NSX UI and GET /api/v1/alarms NSX API are not reporting new instances of these alarms; however, syslog entries and SNMP traps (if enabled) are still being emitted reporting the underlying event details. When the underlying issues causing the heavy volume of communication.control_channel_to_transport_node_down_long alarms are addressed, the alarm service will start reporting new communication.control_channel_to_transport_node_down_long alarms when new issues are detected again. 

  • Along with the above alarm, hundreds of 'control_channel_to_transport_node_down_long' alarms are present as 'open' alarms.

Environment

VMware NSX 4.x

Cause

The 'heavy volume of communication.control_channel_to_transport_node_down_long' alarm is raised when a large number (usually more than 100) of 'control_channel_to_transport_node_down_long' alarms remain open, neither resolved nor suppressed, for a long period of time.
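
As a rough illustration of the condition described above, the following Python sketch counts the open alarms of this type in a response body that has already been retrieved from GET /api/v1/alarms with any HTTP client. The field names `results`, `event_type`, and `status`, the status value `OPEN`, and the threshold of 100 are assumptions drawn from the description in this article, not confirmed limits; check them against the NSX API reference for your release.

```python
# Assumption: each alarm object carries "event_type" and "status" fields and
# open alarms have status "OPEN". Verify against the NSX API reference.
ALARM_EVENT_TYPE = "control_channel_to_transport_node_down_long"

# "Usually over 100" per this article; treat it as an approximate guide only.
HEAVY_VOLUME_THRESHOLD = 100


def count_open_alarms(alarms_response, event_type=ALARM_EVENT_TYPE):
    """Count alarms of the given type that are still open."""
    results = alarms_response.get("results", [])
    return sum(
        1
        for alarm in results
        if alarm.get("event_type") == event_type
        and alarm.get("status") == "OPEN"
    )


def is_heavy_volume_likely(alarms_response, threshold=HEAVY_VOLUME_THRESHOLD):
    """Return True when the open-alarm count exceeds the approximate threshold."""
    return count_open_alarms(alarms_response) > threshold
```

Passing the parsed JSON body to `is_heavy_volume_likely` gives a quick indication of whether the backlog of open alarms is in the range where the heavy volume alarm is typically seen.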

One possible reason these 'control_channel_to_transport_node_down' alarms are not addressed or investigated is that they are generated by stale transport nodes, for example hosts that were removed from the vCenter inventory or powered down permanently without first being properly unprepared through NSX Manager. Stale transport nodes are listed under Fabric > Hosts > 'Standalone' tab.
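
To help compare the open alarms against the hosts listed under the 'Standalone' tab, the sketch below groups open alarms by the identifier of the object they refer to. It works on an already-retrieved GET /api/v1/alarms response body. The field names `results`, `event_type`, `status`, and `entity_id`, and the status value `OPEN`, are assumptions; the `id_field` parameter is hypothetical and exists only so a different field can be substituted if your NSX release reports the transport node elsewhere.

```python
from collections import Counter

# Assumption: alarm objects expose "event_type", "status", and "entity_id".
ALARM_EVENT_TYPE = "control_channel_to_transport_node_down_long"


def open_alarms_per_entity(alarms_response, event_type=ALARM_EVENT_TYPE,
                           id_field="entity_id"):
    """Count open alarms of the given type per referenced object ID."""
    counts = Counter()
    for alarm in alarms_response.get("results", []):
        if alarm.get("event_type") != event_type:
            continue
        if alarm.get("status") != "OPEN":
            continue
        # Alarms without the expected field are grouped under "unknown".
        counts[alarm.get(id_field, "unknown")] += 1
    return counts


def candidate_node_ids(alarms_response):
    """Return the sorted IDs that currently have at least one open alarm."""
    return sorted(open_alarms_per_entity(alarms_response))
```

The resulting IDs are only candidates: each one still needs to be checked in the NSX UI to confirm that the host is actually stale before any removal is attempted.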

Resolution

  1. If stale transport nodes are generating the high number of communication.control_channel_to_transport_node_down_long alarms, remove these stale transport nodes from NSX Manager. The following KB article can be used to complete this task:

    Script for removing large number of stale transport nodes from NSX

  2. If it is confirmed that there are no stale transport nodes, investigate why the control channel is down by consulting the following KB article:

    Control Channel To Transport Node Down Long Alarm

Additional Information

For heavy volume alarms raised for other alarm types, refer to the following KB article:

Heavy Volume of Alarms alarm in NSX Manager