NSX-T Alert "Management Channel To Manager Node Down", "Management Channel To Manager Node Down Long" is reported in NSX Manager

Article ID: 377799

Products

VMware NSX-T Data Center

Issue/Introduction

  • On the NSX-T Manager, the alarm Management Channel To Manager Node Down or Management Channel To Manager Node Down Long is reported. These alarms are raised when a Transport node cannot connect to the Central Control Plane (CCP).
  • These alarms are observed on connected and working transport nodes.
  • There is no impact on the services.
  • Rebooting NSX-T Managers does not resolve the alert.

The NSX Manager log /var/log/syslog contains entries like the following:

syslog.49:2024-08-12T00:28:04.628Z NSX 20100 MONITORING [nsx@6876 alarmId="2f3####a-ac##-4##0-80##-f8db######40" alarmState="OPEN" comp="nsx-manager" entId="ae1a####-e##c-4##a-b9##-437a######38" eventFeatureName="communication" eventSev="MEDIUM" eventState="On" eventType="management_channel_to_transport_node_down" level="WARNING" nodeId="22b5####-##c0-48##-97##-235e########" subcomp="monitoring"] Management channel to Transport Node ##d4-v##-esxi0#.##.services (X.X.X.X) is down for 5 minutes.
syslog.49:2024-08-12T00:31:04.646Z NSX 20100 MONITORING [nsx@6876 alarmId="6c####fc-b4##-41##-b##8-b551######d1" alarmState="OPEN" comp="nsx-manager" entId="f83####4-6##e-4#b#-8f##-61c64f######" eventFeatureName="communication" eventSev="MEDIUM" eventState="On" eventType="management_channel_to_transport_node_down" level="WARNING" nodeId="22b5####-bd##-##f0-97##-235e####fa15" subcomp="monitoring"] Management channel to Transport Node ##d4-v##-esxi0#.##.services (X.X.X.X) is down for 5 minutes.

2024-08-11T13:08:14.325Z  INFO Thread-503 EventSource - MONITORING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="messaging"] Sending management_channel_to_transport_node_down_long's event instances with true status to provider/collector when full sync happens.
2024-08-11T17:08:14.257Z  INFO Thread-504 EventSource - MONITORING [nsx@6876 comp="nsx-manager" level="INFO" subcomp="messaging"] Sending management_channel_to_transport_node_down's event instances with true status to provider/collector when full sync happens.
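
To pull only these alarm events out of the log, a filter along the following lines may help. This is a minimal sketch, not a VMware-provided tool: the function name summarize_channel_alarms is hypothetical, and it assumes the alarm lines carry the alarmState and eventType fields in the order shown in the excerpts above. Matching the "_long" event type in these fields is also an assumption.

```shell
# Hypothetical helper: reads NSX Manager syslog lines on stdin and prints
# "<timestamp> <alarmState> <eventType>" for each management-channel alarm event.
summarize_channel_alarms() {
  # Keep only lines whose eventType is one of the two management-channel alarms.
  grep -E 'eventType="management_channel_to_transport_node_down(_long)?"' |
    # Drop an optional "syslog.N:" prefix (added by grep across rotated files),
    # then keep the timestamp, the alarm state, and the event type.
    sed -E 's/^(syslog[^:]*:)?([^ ]+) .*alarmState="([^"]*)".*eventType="([^"]*)".*$/\2 \3 \4/'
}

# Example usage on an NSX Manager node (path taken from the article):
#   summarize_channel_alarms < /var/log/syslog
```

Each output line gives the time, state, and type of one alarm event, which makes it easier to see whether the alarms recur at a fixed interval.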

Environment

VMware NSX-T

Cause

Migrating CCP (Central Control Plane) data from NSX-T Data Center 3.0.x/3.1.x to a later release may leave conflicting records. Because this alarm feature was introduced in NSX-T 3.2.0, these records may generate the alarms.

Management Channel To Manager Node Down:

>>  This alarm is raised when the messaging channel between the Manager node and the Transport node is down for 5 minutes.

Management Channel To Transport Node Down Long:

>>  This alarm is raised when the messaging channel between the Manager node and the Transport node is down for more than 15 minutes.

Title: Control channel to manager node down too long
Event ID: control_channel_to_manager_node_down_too_long

Alarm Description

  • Purpose: Transport node's control plane connection to the Manager node is down.
  • Impact: In this scenario, no new configuration can be pushed down to the Transport node from the Control plane and features like vMotion will not be available.

Resolution

Steps to resolve (NSX-T 3.2.0 and higher):

>> When this alarm is raised, check the connectivity between the Transport node and the Control plane.

   On an ESXi node:  localcli network ip connection list | grep 1235
   On an Edge node:  netstat -anp | grep 1235

>> Also ensure no firewalls are blocking traffic between the nodes.

>> Ensure the messaging manager service is running on the Manager nodes by running the following command:

   /etc/init.d/messaging-manager status

>> If the messaging manager is not running, restart it with the following command:

   /etc/init.d/messaging-manager restart
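
The connectivity and service checks above can be sketched as two small shell functions. This is a sketch under stated assumptions, not an official procedure. The function names and the MESSAGING_MANAGER_INIT variable are hypothetical. It assumes that the connection listing shows the remote address followed by the state column, and that the init script's status action exits non-zero when the service is not running.

```shell
# Run on a transport node. Succeeds if stdin contains an ESTABLISHED connection
# whose remote port is exactly 1235. A plain "grep 1235" would also match ports
# such as 51235 or unrelated numeric columns.
has_ccp_connection() {
  grep -E ':1235[[:space:]]+ESTABLISHED' >/dev/null
}

# Run on a Manager node. Checks the messaging manager and restarts it only if
# the status check fails. The script path comes from the article; the override
# variable is hypothetical and exists only to allow a dry run.
ensure_messaging_manager() {
  if "${MESSAGING_MANAGER_INIT:-/etc/init.d/messaging-manager}" status >/dev/null 2>&1; then
    echo "messaging-manager is running"
  else
    echo "messaging-manager is not running; restarting"
    "${MESSAGING_MANAGER_INIT:-/etc/init.d/messaging-manager}" restart
  fi
}

# Example usage:
#   localcli network ip connection list | has_ccp_connection || echo "no session on port 1235"   # ESXi node
#   netstat -anp | has_ccp_connection || echo "no session on port 1235"                          # Edge node
#   ensure_messaging_manager                                                                     # Manager node
```

If the status action on a given build reports state only through its text output, the exit-code test will not work and the output should be inspected by hand instead.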

 

Maintenance window required for remediation? No

Additional Information