Symptoms:
- On the NSX-T Manager, the alarm "Control Channel To Transport Node Down" is reported.
- There is no impact on the services or VMs running on the Transport Node.
- The Transport Node is connected to the Manager and Controller services:
esx-05.corp.local> get controllers
Wed Jul 21 2021 UTC 08:16:30.127
Controller IP    Port  SSL      Status     Is Physical Master  Session State  Controller FQDN
192.168.120.1    1235  enabled  connected  true                up             NA
192.168.120.2    1235  enabled  connected  false               up             NA
192.168.120.3    1235  enabled  connected  false               up             NA
esx-05.corp.local> get managers
Wed Jul 21 2021 UTC 08:18:43.213
- 192.168.120.1 Connected (NSX-RPC) *
- 192.168.120.2 Connected (NSX-RPC)
- 192.168.120.3 Connected (NSX-RPC)
- As soon as the alarm is marked "resolved" in the NSX-T web interface, it reappears within a few minutes.
- On the NSX-T Manager, the following behavior may be seen in the logs:
The Transport Node is reported as disconnected in /var/log/cloudnet/nsx-ccp.log:
2021-07-09T08:56:30.719Z WARN pool-79-thread-1 EventReportSyslogSender 19972 MONITORING [nsx@6876 comp="nsx-manager" entId="aabbccdd-0011-5a11-8052-b12acf4a1234" eventFeatureName="communication" eventSev="warning" eventState="On" eventType="control_channel_to_transport_node_down" level="WARNING" subcomp="ccp"] Controller service 301ba123-0123-12d1-123a-1cf123e12c12 to Transport node aabbccdd-0011-5a11-8052-b12acf4a1234 down for at least three minutes from Controller service's point of view.
Later, the connection is restored, but the alarm is not cleared:
2021-07-09T09:57:40.719Z WARN pool-79-thread-1 EventReportSyslogSender 19972 MONITORING [nsx@6876 comp="nsx-manager" entId="aabbccdd-0011-5a11-8052-b12acf4a1234" eventFeatureName="communication" eventSev="warning" eventState="Off" eventType="control_channel_to_transport_node_down" level="WARNING" subcomp="ccp"] Controller service 301ba123-0123-12d1-123a-1cf123e12c12 restores connection to Transport node aabbccdd-0011-5a11-8052-b12acf4a1234.
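
To check whether a given Transport Node matches this pattern, the On/Off events in nsx-ccp.log can be compared per node. The following is a minimal sketch, not a VMware-provided tool: the function name `last_alarm_states` is hypothetical, and the regular expression assumes the field order shown in the two sample log lines above (entId, then eventState, then eventType).

```python
import re

# Assumption: fields appear in the same order as in the sample log lines
# (entId ... eventState ... eventType). Other NSX-T versions may differ.
EVENT_RE = re.compile(
    r'^(?P<ts>\S+)\s.*?'
    r'entId="(?P<node>[^"]+)".*?'
    r'eventState="(?P<state>On|Off)".*?'
    r'eventType="control_channel_to_transport_node_down"'
)


def last_alarm_states(lines):
    """Return {transport_node_id: (timestamp, state)} for the latest event per node.

    Hypothetical helper: later lines overwrite earlier ones, so the input
    is expected in chronological order, as in the log file.
    """
    latest = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            latest[match.group("node")] = (match.group("ts"), match.group("state"))
    return latest
```

One way to use it is to pass the open file object for /var/log/cloudnet/nsx-ccp.log as `lines`. If the latest state for a node is "Off" while the alarm is still open in the web interface, that is consistent with the symptom described here.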