Disconnection between NSX Global Managers and NSX Local Managers post NSX upgrade from 3.x to 4.x in NSX federation setup

Article ID: 369820

Products

VMware NSX

Issue/Introduction

  • NSX-T Federation sites are being upgraded to NSX 4.1.2.3.
  • The connections between the Global Manager (GM) and the Local Managers (LM) are down, and no global objects are synced to the LM site.
  • After re-entering the LM site credentials under Location Manager, the connection status remains disconnected.
  • The UI may show the error "unable to fetch full sync status", along with errors stating that status cannot be fetched for objects such as TransportZoneListResultDto, and the error "500139".
  • On the Global Manager, /var/log/vmware/appl-proxy-rpc.log contains entries similar to the following:
    <Time Stamp> manager-node.example.com NSX 82660 - [nsx@6876 comp="global-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="82691" level="WARNING"] StreamConnection[162723 Connecting to ssl://#.#.#.#:1236 sid:162723] Couldn't connect to 'ssl://#.#.#.#:1236' (error: 336130315-wrong version number)
    <Time Stamp> manager-node.example.com NSX 82660 - [nsx@6876 comp="global-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="82691" level="WARNING"] StreamConnection[162723 Error to ssl://#.#.#.#:1236 sid:-1] Error 336130315-wrong version number
  • On the Local Manager, /var/log/vmware/appl-proxy-rpc.log contains entries similar to the following:
    2024-05-23T09:07:57.727Z <manager-node> NSX 1846 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-rpc" tid="1876" level="INFO"] Frame format is not recognized
    2024-05-23T09:07:57.727Z <manager-node> NSX 1846 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-rpc" tid="1876" level="ERROR" errorCode="RPC400"] RpcConnection[166221 Negotiating on tcp://#.#.#.#:1236 0] Frame format is not recognized
    2024-05-23T09:08:05.749Z <manager-node> NSX 1846 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="1876" level="INFO"] StreamSocket[166229 Closing f:69 i:257659720 tcp://0.0.0.0:1236 <- #.#.#.#:45080] DoClose
    2024-05-23T09:08:06.751Z <manager-node> NSX 1846 - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="1876" level="INFO"] StreamConnection[166230 Connected on tcp://0.0.0.0:1236 sid:166230] Accepted connection from tcp://#.#.#.#:45094
  • Check whether the following lines are present in /etc/vmware/nsx-appl-proxy/appl-proxy.xml on both the GM and the LM:
    <applProxyPublicCfgFile>/etc/vmware/nsx-appl-proxy/appl-proxy-public-cfg.xml</applProxyPublicCfgFile>
    <applProxyPrivateKeyFile>/etc/vmware/nsx-appl-proxy/appl-proxy-privkey.pem</applProxyPrivateKeyFile>
    <applProxyCertificateFile>/etc/vmware/nsx-appl-proxy/appl-proxy-cert.pem</applProxyCertificateFile>
    <applProxyArPrivateKeyFile>/etc/vmware/nsx-appl-proxy/appl-proxy-ar-privkey.pem</applProxyArPrivateKeyFile>
    <applProxyArCertificateFile>/etc/vmware/nsx-appl-proxy/appl-proxy-ar-cert.pem</applProxyArCertificateFile>

    <external_ar>
    <ip>0.0.0.0</ip>
    <ipv6>::</ipv6>
    <!-- <fqdn>localhost</fqdn> -->
    <!-- <fqdnv6>localhost</fqdnv6> -->
    <port>1236</port>
    <!-- <path>unix:///tmp/aphexternal.sock</path> -->
    <sslEnabled>true</sslEnabled>
    </external_ar>
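
The two log signatures listed in this section can be searched for programmatically. The following is a minimal Python sketch, not part of the documented procedure: it counts lines of appl-proxy-rpc.log that contain either message. The function names and the signature labels are hypothetical; only the message strings and the log path come from this article.

```python
# Message substrings taken from the log excerpts in this article.
# The dictionary keys are hypothetical labels chosen for this sketch.
SIGNATURES = {
    "gm_wrong_version_number": "wrong version number",
    "lm_frame_not_recognized": "Frame format is not recognized",
}


def count_signatures(lines):
    """Return how many of the given lines contain each known signature."""
    counts = {label: 0 for label in SIGNATURES}
    for line in lines:
        for label, needle in SIGNATURES.items():
            if needle in line:
                counts[label] += 1
    return counts


def scan_log(path="/var/log/vmware/appl-proxy-rpc.log"):
    """Scan a log file on the manager node; undecodable bytes are replaced."""
    with open(path, errors="replace") as handle:
        return count_signatures(handle)
```

Calling scan_log() on a Global Manager and on a Local Manager returns a dictionary of counts. A non-zero count only indicates that the message appears in the log; it does not by itself confirm the cause.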

Environment

VMware NSX-T Data Center 3.x
VMware NSX 4.1.2.3

Cause

The configuration file /etc/vmware/nsx-appl-proxy/appl-proxy.xml is not updated correctly during the upgrade from 3.x to 4.x. In NSX 4.x, the sslEnabled value for the external_ar service is read directly from this file. If the file is missing this configuration, the value defaults to sslEnabled=false, which prevents the SSL-negotiated sync between the Global and Local Managers.
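
To check a node for this condition, the sslEnabled setting can be read from a copy of the file. The following is a minimal Python sketch under stated assumptions: the file is well-formed XML with a single root element and no XML namespaces, and the element names external_ar and sslEnabled are as shown in this article. The function names are hypothetical, and treating a missing element as false only mirrors the default described above.

```python
import xml.etree.ElementTree as ET


def ssl_enabled_in_root(root):
    """Return True only if an <external_ar> block has <sslEnabled>true</sslEnabled>."""
    # iter() matches <external_ar> at any nesting depth; XML comments are ignored.
    for block in root.iter("external_ar"):
        value = block.findtext("sslEnabled")
        if value is not None and value.strip().lower() == "true":
            return True
    # Missing block or missing element: treated as false, as described above.
    return False


def external_ar_ssl_enabled(xml_text):
    """Check XML supplied as a string."""
    return ssl_enabled_in_root(ET.fromstring(xml_text))


def check_file(path="/etc/vmware/nsx-appl-proxy/appl-proxy.xml"):
    """Check the file in place on a manager node (read-only)."""
    return ssl_enabled_in_root(ET.parse(path).getroot())
```

A result of False on either the GM or the LM is consistent with the cause described here, but it should be confirmed against the symptoms and logs before any change is made.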

Resolution

This issue is resolved in VMware NSX 4.2.1, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

Workaround:

Note: Replace /etc/vmware/nsx-appl-proxy/appl-proxy.xml with the file "appl-proxy.xml_4.1.2.3" attached to this KB. Rename the attached file to appl-proxy.xml so that it keeps the original file name.
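
The file replacement can be sketched in Python as follows. This is an illustration only, not the documented procedure: taking a backup first and the ".bak" suffix are assumptions of this sketch, the function name is hypothetical, and the article does not state whether any further step is needed after the file is replaced.

```python
import shutil
from pathlib import Path


def replace_with_backup(target, replacement):
    """Back up `target`, then overwrite it with the contents of `replacement`.

    Returns the path of the backup copy. The ".bak" suffix is an arbitrary
    choice for this sketch.
    """
    target = Path(target)
    replacement = Path(replacement)
    if not replacement.is_file():
        raise FileNotFoundError(replacement)
    backup = target.with_name(target.name + ".bak")
    if target.exists():
        # Keep the original file so the change can be rolled back.
        shutil.copy2(target, backup)
    # Only the contents are copied, so the result keeps the target's file name.
    shutil.copyfile(replacement, target)
    return backup
```

For example, with the attachment downloaded to a hypothetical location such as /tmp/appl-proxy.xml_4.1.2.3, the call would be replace_with_backup("/etc/vmware/nsx-appl-proxy/appl-proxy.xml", "/tmp/appl-proxy.xml_4.1.2.3").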

If the workaround does not resolve the issue, or if the affected version is not 4.1.2.3, contact Broadcom Support and upload the following files to the case:

  • All Global Manager support bundles (including Active and Standby)
  • All Local Manager support bundles (including all Locations)
  • The /etc/vmware/nsx-appl-proxy/appl-proxy.xml file from all of the above nodes

Note: If you are contacting Broadcom support about this issue, please provide the following:

  • NSX Manager support bundles.
  • ESXi host support bundles for hosts that are failing to configure as transport nodes.
  • Text of any error messages seen in NSX GUI or command lines pertinent to the investigation.

For details on submitting logs, see the KB "Handling Log Bundles for offline review with Broadcom support".

Attachments

appl-proxy.xml_4.1.2.3