VMs running on ESXi hosts managed by the Secondary NSX Manager become unreachable

Article ID: 326352

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • VMs running on ESXi hosts managed by the Secondary NSX Manager become unreachable.
 
  • NSX Host preparation health check shows a Warning for the connection between the ESXi hosts managed by the Secondary NSX Manager and the NSX controller.
 
  • ESXi hosts managed by the Secondary NSX Manager are not connected to the NSX Controllers. On an affected host, the following command shows no ESTABLISHED connections to port 1234:
#esxcli network ip connection list | grep 1234
 
Example of expected output on a healthy host:
tcp         0       0  192.168.110.51:32071              192.168.121.33:1234   ESTABLISHED   2101695  newreno  netcpa
tcp         0       0  192.168.110.51:12212              192.168.121.31:1234   ESTABLISHED   2101695  newreno  netcpa
tcp         0       0  192.168.110.51:52938              192.168.121.32:1234   ESTABLISHED   2101695  newreno  netcpa

 
  • On the ESXi hosts managed by the Secondary NSX Manager, the file /etc/vmware/netcpa/config-by-vsm.xml does not contain any NSX Controller information:
#grep -E 'connectionList|server|port' /etc/vmware/netcpa/config-by-vsm.xml
        <connectionList>
        </connectionList>


Example of expected output on a healthy host:
        <connectionList>
                        <server>192.168.121.31</server>
                        <port>1234</port>
                        <server>192.168.121.32</server>
                        <port>1234</port>
                        <server>192.168.121.33</server>
                        <port>1234</port>
        </connectionList>

 
  • Secondary NSX Manager logs (vsm.log) display messages similar to the following, indicating that the NSX Manager service restarted:
vsm.log.1:2019-07-12 10:55:18.757 CEST  INFO localhost-startStop-2 VsmServletContextListener:75 - NSX Status : STOPPED
vsm.log.1:2019-07-12 10:56:03.588 CEST  INFO localhost-startStop-1 VsmServletContextListener:75 - NSX Status : STARTING
vsm.log.1:2019-07-12 10:58:41.847 CEST  INFO localhost-startStop-1 VsmServletContextListener:75 - NSX Status : RUNNING

 
  • Secondary NSX Manager logs (nsx-wrapper.log) display messages similar to the following at the time the NSX Manager service restarted:
INFO   | jvm 1    | 2019/07/12 10:55:12 | WrapperManager Error: Found 2 deadlocked threads!
STATUS | wrapper  | 2019/07/12 10:55:12 | A Thread Deadlock was detected.  Restarting JVM
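
The log symptoms above can be searched for with a short script. The following is a minimal sketch, assuming that nsx-wrapper.log and vsm.log have already been exported from the Secondary NSX Manager (for example, from a tech support bundle) into a local directory. The directory name and the function names are hypothetical, and the patterns are taken only from the sample messages shown above, so they may need adjusting for other builds.

```shell
#!/bin/sh
# Sketch: scan exported NSX Manager logs for the deadlock/restart signature.
# Assumption: the logs were copied to a local directory; the default path
# below is hypothetical and can be overridden with the first argument.

# Print lines from stdin that match the deadlock messages in nsx-wrapper.log.
find_deadlocks() {
    grep -E 'Found [0-9]+ deadlocked threads|A Thread Deadlock was detected'
}

# Print "NSX Status" transitions (STOPPED, STARTING, RUNNING) from stdin.
find_status_changes() {
    grep -E 'NSX Status : (STOPPED|STARTING|RUNNING)'
}

LOG_DIR="${1:-./nsx-manager-logs}"

# Only scan files that actually exist, so the script is a no-op elsewhere.
if [ -f "$LOG_DIR/nsx-wrapper.log" ]; then
    find_deadlocks < "$LOG_DIR/nsx-wrapper.log" || true
fi
for f in "$LOG_DIR"/vsm.log*; do
    if [ -f "$f" ]; then
        find_status_changes < "$f" || true
    fi
done
```

If a deadlock message appears shortly before a STOPPED, STARTING, RUNNING sequence, as in the 10:55 timestamps above, the logs match the symptoms described in this article.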


Environment

VMware NSX Data Center for vSphere 6.4.x
VMware NSX for vSphere 6.3.x
VMware NSX for vSphere 6.4.x

Cause

This issue is caused by a race condition that puts the Secondary NSX Manager into an inconsistent state. As a result, NSX Controller information is not pushed to the ESXi hosts managed by the Secondary NSX Manager.

Resolution

This issue is resolved in VMware NSX Data Center for vSphere 6.4.5.

Workaround:
To work around the issue, do one of the following:
  • Delete one NSX Controller on the Primary NSX Manager and deploy a new Controller.
  • Remove the Secondary role from the Secondary NSX Manager and re-add it as Secondary.
Either workaround clears the inconsistent state of the Secondary NSX Manager. The file /etc/vmware/netcpa/config-by-vsm.xml on the ESXi hosts managed by the Secondary NSX Manager is then updated with the NSX Controller information, and the connections to the NSX Controllers are restored.

Verify on the ESXi host using:
#grep -E 'connectionList|server|port' /etc/vmware/netcpa/config-by-vsm.xml
#esxcli network ip connection list | grep 1234
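
The two verification commands can be combined into one check. The following is a minimal sketch, assuming it runs in the ESXi shell. The function names are hypothetical. The rule that a host is healthy when it has at least one ESTABLISHED connection per controller listed in the file is an assumption based on the sample output in this article, not documented behavior. The pattern `:1234 ` is slightly stricter than `grep 1234`, so that unrelated numbers containing 1234 are not counted.

```shell
#!/bin/sh
# Sketch: compare the controllers listed in config-by-vsm.xml with the
# ESTABLISHED connections to port 1234. Function names are illustrative.

CONFIG=/etc/vmware/netcpa/config-by-vsm.xml

# Count <server> entries in netcpa configuration text read from stdin.
count_configured() {
    grep -c '<server>' || true
}

# Count ESTABLISHED connections to port 1234 in
# "esxcli network ip connection list" output read from stdin.
count_established() {
    grep ':1234 ' | grep -c 'ESTABLISHED' || true
}

# Print OK or WARNING. Assumption: one connection per listed controller.
report_state() {
    if [ "$1" -gt 0 ] && [ "$2" -ge "$1" ]; then
        echo "OK: $2 connection(s) for $1 configured controller(s)"
    else
        echo "WARNING: $2 connection(s) for $1 configured controller(s)"
    fi
}

# Run the check only where esxcli and the configuration file exist.
if command -v esxcli >/dev/null 2>&1 && [ -f "$CONFIG" ]; then
    configured=$(count_configured < "$CONFIG")
    established=$(esxcli network ip connection list | count_established)
    report_state "$configured" "$established"
fi
```

A WARNING result with zero configured controllers corresponds to the empty connectionList symptom described above.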