Deployment of Manager or Host transport nodes fails

Article ID: 323539

Products

VMware NSX

Issue/Introduction

Symptoms:

  • You are running NSX-T 4.0.0 to 4.0.1.
  • Adding a new Manager or Host transport node fails.
  • On the Manager node that fails to join the cluster:
/var/log/syslog
2022-11-23T03:43:08.545Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="INFO"] {10000} CMD: join <manager-IP> cluster-id ee82b2c5-2d91-45d3-9789-95494624c774 thumbprint 2165e3cfb8d593b25986653dac372fc18489ab746a5f57eac6b8d61d0b719fc1 token <token-obfuscated> force
2022-11-23T03:43:08.546Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="WARNING"] Unable to determine terminal size: [OSError] [Errno 25] Inappropriate ioctl for device
2022-11-23T03:43:08.571Z <nsx-manager-hostname> NSX 11662 - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Received event: CoordinationEvent[type=MEMBER_ADDED, source=5ecb4de3-03b9-4a3a-ace7-1a66fff7948d]
2022-11-23T03:43:08.572Z <nsx-manager-hostname> NSX 11662 - [nsx@6876 comp="nsx-manager" level="INFO" subcomp="manager"] Ignoring event MEMBER_ADDED from source CoordinationEvent[type=MEMBER_ADDED, source=5ecb4de3-03b9-4a3a-ace7-1a66fff7948d].

o: [CBM125] SSL exception when making an attach call to the destination. Please check the thumbprint.
2022-11-23T03:43:13.618Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="ERROR" errorCode="('CLI110',)"] Error processing join request, status: 500, obj: {'error_code': 36752, 'error_message': 'Operation failed. Reason: [CBM125] SSL exception when making an attach call to the destination. Please check the thumbprint.', 'module_name': 'node-services'}, err: None
2022-11-23T03:43:13.620Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="WARNING"] An error occurred while joining the specified cluster. Reason: [CBM125] SSL exception when making an attach call to the destination. Please check the thumbprint.
2022-11-23T03:43:13.621Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="INFO" audit="true"] CMD: join <manager-IP> cluster-id ee82b2c5-2d91-45d3-9789-95494624c774 thumbprint 2165e3cfb8d593b25986653dac372fc18489ab746a5f57eac6b8d61d0b719fc1 token <token-obfuscated> force (duration: 5.074s), Operation status: CMD_EXECUTED_WITH_ERROR_RESULT
2022-11-23T03:43:13.621Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="INFO"] NSX CLI stopped for user: admin
  • On the ESXi host that fails to join the NSX-T cluster:
    • In the UI: "Failed to install software on host. Failed to install software on host. Time out waiting for host to join NSX Manager."
      /var/log/nsxcli.log
      2023-03-03T11:16:25.360Z 2186011 cli.server.cli_command_service INFO {0} CMD: join management-plane <manager-IP> thumbprint 489487579181ca59a583eb65b6eae14f92c2308bf6f877544a98f80cef5d1229 token <token-obfuscated> node-uuid e4dac940-6afc-46bc-9135-21f2676737d85
      2023-03-03T11:16:25.362Z 2186011 cli.utils.render_utils WARNING Unable to determine terminal size: [OSError] [Errno 25] Inappropriate ioctl for device
      2023-03-03T11:16:26.017Z 2186011 cli.commands.host_shared.register INFO version 7.0.3 buildnum 19193900
      2023-03-03T11:16:26.019Z 2186011 cli.commands.host_shared.register INFO Tokenfile is not given
      2023-03-03T11:16:26.021Z 2186011 vmware.runcommand INFO runcommand called with: args = '['/usr/bin/openssl', 'x509', '-in', '/etc/vmware/nsx/host-cert.pem', '-subject', '-noout']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
      2023-03-03T11:16:26.040Z 2186011 vmware.runcommand INFO runcommand called with: args = '['/usr/bin/openssl', 'req', '-new', '-newkey', 'rsa:2048', '-days', '3650', '-nodes', '-x509', '-keyout', '/tmp/tmpki5pqek1', '-out', '/tmp/tmpyr9x091b', '-config', '/tmp/tmpwaoivm0c', '-extensions', 'req_ext']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
      2023-03-03T11:16:29.099Z 2186011 cli.utils.apiclient ERROR POST /api/v1/fabric/nodes/e4dac940-6afc-46bc-9135-21f267626d85?action=register_node raised exception: <class 'ssl.SSLEOFError'>
      Traceback (most recent call last):
        File "/opt/vmware/nsx-cli/bin/python/cli/utils/apiclient.py", line 87, in request
          conn.connect()
        File "/lib64/python3.8/http/client.py", line 1427, in connect
        File "/lib64/python3.8/ssl.py", line 500, in wrap_socket
        File "/lib64/python3.8/ssl.py", line 1040, in _create
        File "/lib64/python3.8/ssl.py", line 1309, in do_handshake
      ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:1125)
      2023-03-03T11:16:29.103Z 2186011 cli.commands.host_shared.register INFO Stopping nsx-proxy
      2023-03-03T11:16:29.104Z 2186011 vmware.runcommand INFO runcommand called with: args = '['/etc/init.d/nsx-proxy', 'stop']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
      2023-03-03T11:16:31.109Z 2186011 cli.commands.host_shared.register INFO Starting nsx-proxy
      2023-03-03T11:16:31.111Z 2186011 vmware.runcommand INFO runcommand called with: args = '['/etc/init.d/nsx-proxy', 'start']', outfile = 'None', returnoutput = 'True', timeout = '0.0'.
      2023-03-03T11:16:36.248Z 2186011 cli.server.cli_command_service WARNING Exception when registering host: 'Unable to connect to the API service'
      2023-03-03T11:16:36.252Z 2186011 cli.audit INFO CMD: join management-plane <manager-IP> thumbprint 489487579181ca59a583eb65b6eae14f92c2308bf6f877544a98f80cef5d1229 token <token-obfuscated> node-uuid e4dac940-6afc-46bc-9135-21f267737d85 (duration: 10.890s), Operation status: CMD_EXECUTED_WITH_ERROR_RESULT
      2023-03-03T11:16:36.253Z 2186011 cli INFO NSX CLI stopped for user: root
          
  • The thumbprint used in the join command differs from the thumbprint returned by the existing Manager node.
On the existing node:
<nsx-manager-hostname>> get certificate api thumbprint
b7006bb89be6bcb1b3f6a66816235ff24d06109fabcde590e40b6347d9fec9d4

On Manager/Host node trying to join cluster:
# /var/log/syslog
2022-11-23T03:43:08.545Z <nsx-manager-hostname> NSX 12182 - [nsx@6876 comp="nsx-manager" subcomp="cli" username="admin" level="INFO"] {10000} CMD: join <manager-IP> cluster-id ee82b2c5-2d91-45d3-9789-95494624c774 thumbprint 2165e3cfb8d593b25986653dac372fc18489ab746a5f57eac6b8d61d0b719fc1 token <token-obfuscated> force
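
The comparison above can be sketched in Python. This is only an illustration, not an NSX tool: the helper names `extract_thumbprint` and `thumbprints_match` are hypothetical, and the sketch assumes the thumbprint appears in the logged join command as 64 hexadecimal characters after the word "thumbprint", as in the excerpts in this article.

```python
import re
from typing import Optional

# Matches the 64-hex-character value that follows the word "thumbprint"
# in a logged "join ..." CLI command (format assumed from the log excerpts).
_THUMBPRINT_RE = re.compile(r"\bthumbprint\s+([0-9a-fA-F]{64})\b")


def extract_thumbprint(log_line: str) -> Optional[str]:
    """Return the thumbprint from a logged join command, lowercased, or None."""
    match = _THUMBPRINT_RE.search(log_line)
    return match.group(1).lower() if match else None


def thumbprints_match(log_line: str, api_thumbprint: str) -> bool:
    """Compare the logged thumbprint with the value reported by the existing node."""
    logged = extract_thumbprint(log_line)
    return logged is not None and logged == api_thumbprint.strip().lower()


if __name__ == "__main__":
    # Join command as logged on the node that fails to join.
    line = (
        "CMD: join <manager-IP> cluster-id ee82b2c5-2d91-45d3-9789-95494624c774 "
        "thumbprint 2165e3cfb8d593b25986653dac372fc18489ab746a5f57eac6b8d61d0b719fc1 "
        "token <token-obfuscated> force"
    )
    # Output of "get certificate api thumbprint" on the existing node.
    existing = "b7006bb89be6bcb1b3f6a66816235ff24d06109fabcde590e40b6347d9fec9d4"
    print("match" if thumbprints_match(line, existing) else "mismatch")
```

With the two values shown in this article the sketch reports a mismatch, which is the symptom described here.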



Environment

VMware NSX 4.0.0.1

Cause

  • The certificate's PEM encoding contains '\r\n' (DOS/Windows) line breaks, which makes the certificate unreadable to certain NSX services.
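
To illustrate the cause, the following Python sketch checks a PEM file's raw bytes for '\r\n' line breaks and shows what the same content looks like with Unix '\n' line breaks. It is a diagnostic illustration under stated assumptions, not a supported fix: the helper names `has_crlf` and `to_unix_newlines` are hypothetical, the PEM body is a placeholder rather than a real certificate, and the sketch does not describe how NSX stores or reads its certificates.

```python
def has_crlf(pem: bytes) -> bool:
    """Return True if the PEM data contains DOS/Windows '\\r\\n' line breaks."""
    return b"\r\n" in pem


def to_unix_newlines(pem: bytes) -> bytes:
    """Return a copy of the PEM data with '\\r\\n' replaced by '\\n'."""
    return pem.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    # Placeholder PEM content with DOS/Windows line breaks (not a real certificate).
    sample = (
        b"-----BEGIN CERTIFICATE-----\r\n"
        b"PLACEHOLDERBASE64DATA\r\n"
        b"-----END CERTIFICATE-----\r\n"
    )
    print("CRLF present:", has_crlf(sample))
    print("CRLF present after conversion:", has_crlf(to_unix_newlines(sample)))
```

A check like this only helps confirm whether a certificate file has DOS/Windows line breaks before it is imported; for an affected environment, follow the Resolution and Workaround sections of this article.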

Resolution

  • This is resolved in NSX-T 4.1.1.


Workaround:

  • If you encounter this issue, open an SR with the NSX-T team and reference this KB article (94326).