Some NSX prepared hosts lose controller connectivity

Article ID: 322531

Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:
  • You're running VMware NSX-T Data Center 3.2.1 through 3.2.3.x or VMware NSX 4.x.
  • You're running a three-node NSX Manager cluster.
  • Some of the NSX-prepared hosts lose controller connectivity.
  • The affected hosts are all connected to a single NSX Manager, which you can verify by running the following command on each host: nsxcli -c get controllers
  • You observe entries similar to the following in the NSX Manager log files:
/var/log/cloudnet/nsx-ccp-*.log
2023-06-14T19:35:01.202Z WARN CorfuRuntime-0 CorfuRuntime 12591 Tried to get layout from <nsx-manager-ip>:9000 but failed by timeout
....
2023-06-14T19:35:04.759Z ERROR org.corfudb.runtime.collections.streaming.StreamPollingScheduler-worker-0 ResumeStreamListener 12591 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP4" level="ERROR" subcomp="ccp"] Exception caught during streaming processing. Re-subscribe this listener to latest timestamp
....
2023-06-14T19:35:07.191Z ERROR org.corfudb.runtime.collections.streaming.StreamPollingScheduler-worker-0 ResumeStreamListener 12591 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP4" level="ERROR" subcomp="ccp"] Failed to re-subscribe [tag:ccp] monitoring$[[PimConfigInternal, RedistributionInternalConfig, InternalIpSecVpnIkeProfile, LogicalDhcpServer, ContextProfileInternal, BgpConfig, LogicalRouterConfig,
....
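To check a manager node for these signatures, the log files can be searched for the three messages shown above. The following is a minimal sketch, not an official diagnostic: the function name scan_ccp_logs is hypothetical, and the patterns are taken only from the sample entries in this article, so the wording may differ in other builds.

```shell
#!/bin/sh
# Sketch: count how often each log signature from this article appears
# in the given files. scan_ccp_logs is a hypothetical helper name.
scan_ccp_logs() {
    if [ "$#" -eq 0 ]; then
        echo "usage: scan_ccp_logs LOGFILE..." >&2
        return 1
    fi
    for pattern in \
        'Tried to get layout from .* but failed by timeout' \
        'Exception caught during streaming processing' \
        'Failed to re-subscribe'
    do
        # grep -h prints matching lines without file names; wc -l counts them.
        hits=$(grep -h -e "$pattern" "$@" 2>/dev/null | wc -l | tr -d ' ')
        printf '%s\t%s\n' "$hits" "$pattern"
    done
}

# Example (run on the NSX Manager node):
# scan_ccp_logs /var/log/cloudnet/nsx-ccp-*.log
```

Each output line is a count followed by the pattern it refers to. Non-zero counts for the "Failed to re-subscribe" pattern on the manager that the affected hosts are connected to would be consistent with this issue.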


Environment

VMware NSX-T Data Center

Cause

This issue occurs in NSX-T Data Center 3.2.1 when the manager node that acts as the controller for the impacted ESXi hosts is unable to re-subscribe to the Corfu DB.

Resolution

This issue is resolved in NSX 3.2.4 and NSX 4.1.1. To download these releases, refer to Download Broadcom products and software.

Workaround:

  • Run the following command on the ESXi hosts to identify which manager they are connected to:
nsxcli -c get controllers
  • Restart the controller service on the manager node identified above by running the following command as the root user:
/etc/init.d/nsx-ccp restart
  • If this does not resolve the issue, restart the identified manager node. Before restarting the NSX-T manager, confirm that the cluster is healthy by running the get cluster status command as the admin user.
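When many hosts are affected, it can help to tally which manager each host reports before restarting anything. The following is a rough sketch under stated assumptions: the function name count_hosts_per_controller and the input format (one "host controller_ip" pair per line, recorded by hand from the nsxcli -c get controllers output on each host) are hypothetical and are not produced by any NSX command.

```shell
#!/bin/sh
# Sketch: tally affected hosts per controller (manager) IP.
# Assumed input: one "host controller_ip" pair per line, written by hand
# after running `nsxcli -c get controllers` on each affected ESXi host.
count_hosts_per_controller() {
    awk 'NF >= 2 { count[$2]++ }
         END { for (ip in count) print count[ip], ip }' "$@" | sort -rn
}

# Example:
# count_hosts_per_controller affected_hosts.txt
```

The manager IP with the highest count appears first. If every affected host lists the same IP, that manager is the node on which to restart the nsx-ccp service.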