MPA connectivity down on NSX Edge VM after certificate replacement

Article ID: 389595

Products

VMware NSX

Issue/Introduction

  1. After the manager certificate is replaced, an 'MPA Connect' error is displayed on the Edge node.

  2. On the manager node, entries similar to the following appear in the /var/log/proton/nsxapi.log file:
####-##-##T##:##:##.###Z  INFO UfoIndexer-BatchExecutor-search_manager-2 EdgeTNValidationUtils 5296 FABRIC [nsx@#### comp="nsx-manager" level="INFO" subcomp="manager"] Set FN state error MPA disconnected TRANSPORT_NODE_SYNC_PENDING
####-##-##T##:##:##.###Z  INFO UfoIndexer-BatchExecutor-search_manager-2 EdgeTNValidationUtils 5296 FABRIC [nsx@#### comp="nsx-manager" level="INFO" subcomp="manager"] [entId=/infra/sites/default/enforcement-points/default/edge-transport-node/0000-0000-0000-00] Edge either in error state, not ready or mpa disconnected, failure code: 0,state:MPA_DISCONNECTED, mpa_connection: false
####-##-##T##:##:##.###Z ERROR WrapperStartStopAppMain TrustStoreServiceImpl ###### SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP100" level="ERROR" subcomp="manager"] Failed to sync certificate between DB and disk for profile: profileName: Message Bus Client for K8S Platform, serviceType: K8S_MSG_CLIENT, preProcessor: com.vmware.nsx.management.cloudnative.pre_processor.KafkaMsgClientCertPreProcessor, postProcessor: null, uniqueUse: false, clusterCertificate: true, requiresPrivateKey: true, nodeTypes: [global-manager, nsx-manager, nsx-shared], alias: k8s-msg-client, keyStorePath: /home/secureall/secureall/.store/.bluelane_keystore, keyStorePasswordPath: /config/http/.http_cert_pw

  3. On the manager node, the files under /etc/vmware/nsx-appl-proxy/ have ownership and permissions similar to the following (output of ls -la /etc/vmware/nsx-appl-proxy/):
-rw-r-----  1 uproton    uproton    1.7K XXX XX 16:22 appl-proxy-privkey.pem
-rw-r-----  1 uproton    uproton    1.7K XXX XX 22:20 appl-proxy-privkey.pem.
-rw-r-----  1 uproton    uproton    1.7K XXX XX 22:15 appl-proxy-privkey.pem.
-rw-rw-r--  1 appl-proxy appl-proxy 1.3K XXX XX 22:15 appl-proxy-ar-cert.pem
-rw-r-----  1 uproton    uproton    1.3K XXX XX 22:15 appl-proxy-ar-cert.pem.
-rw-rw-r--  1 appl-proxy appl-proxy 1.7K XXX XX 22:15 appl-proxy-ar-privkey.pem
-rw-r-----  1 uproton    uproton    1.7K XXX XX 22:15 appl-proxy-ar-privkey.pem

  4. On the faulty Edge node, messages similar to the following appear in the /var/log/syslog file:

####-##-##T##:##:##.###Z  NSX - [nsx@xxx comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-net" tid="####" level="INFO"] StreamSocket[754 Open f:64 i:############? -> ssl://#.#.#.#:1234] on_connect ############-certificate verify failed (SSL routines)

####-##-##T##:##:##.###Z  NSX - [nsx@xxx comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-net" tid="####" level="WARNING"] StreamConnection[754 Connecting to ssl://#.#.#.#:1234 sid:754] Couldn't connect to 'ssl://<ip_of_the_manager> (error: xxxxxxxxxxx-certificate verify failed (SSL routines))

####-##-##T##:##:##.###Z NSX - [nsx@xxxx comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-net" tid="####" level="WARNING"] StreamConnection[754 Error to ssl://#.#.#.#:1234 sid:-1] Error ############-certificate verify failed (SSL routines)

####-##-##T##:##:##.###Z NSX- [nsx@xxx comp="nsx-edge" subcomp="nsx-proxy" s2comp="nsx-rpc" tid="####" level="WARNING"] RpcConnection[754 Connecting to ssl://#.#.#.#:1234 0] Couldn't connect to ssl://#.#.#.#:1234 (error: ############-certificate verify failed (SSL routines))
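
The log symptoms listed above can be checked for with a small script. The following is a minimal sketch, not a supported tool: the function names are hypothetical, and the search strings are copied from the sample messages in this article, so they may not match every NSX release.

```python
# Substrings copied from the sample log lines in this article (assumption:
# the wording of these messages is the same on the release being checked).
MANAGER_PATTERNS = (
    "MPA disconnected",
    "MPA_DISCONNECTED",
    "Failed to sync certificate between DB and disk",
)
EDGE_PATTERNS = ("certificate verify failed",)


def find_symptoms(lines, patterns):
    """Return every line that contains at least one of the given substrings."""
    return [line.rstrip("\n") for line in lines if any(p in line for p in patterns)]


def scan_file(path, patterns):
    """Scan a log file on disk; errors='replace' tolerates undecodable bytes."""
    with open(path, "r", errors="replace") as handle:
        return find_symptoms(handle, patterns)
```

On a manager node this would be called as scan_file("/var/log/proton/nsxapi.log", MANAGER_PATTERNS), and on the Edge node as scan_file("/var/log/syslog", EDGE_PATTERNS). An empty result only means that none of these strings were found in that file.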

 

Environment

VMware NSX

Cause

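Replacing the certificates on the manager nodes may leave a mismatch between the manager certificate thumbprint held by the Edge node and the certificate that the manager nodes now present. The Edge node's connection to the manager then fails with "certificate verify failed", as shown in the syslog messages above.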
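
To illustrate what a thumbprint mismatch means, the sketch below computes a certificate thumbprint and compares it with an expected value. It assumes the thumbprint is the SHA-256 digest of the DER-encoded certificate, written as hex; the function names are hypothetical and not part of any NSX tool.

```python
import hashlib
import ssl


def sha256_thumbprint(pem_cert: str) -> str:
    """Return the lowercase hex SHA-256 digest of a PEM certificate's DER bytes."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    return hashlib.sha256(der).hexdigest()


def normalize(thumbprint: str) -> str:
    """Drop colons and whitespace and lowercase, so formatting differences are ignored."""
    return thumbprint.replace(":", "").strip().lower()


def thumbprints_match(pem_cert: str, expected: str) -> bool:
    """True if the certificate's thumbprint equals the expected thumbprint."""
    return sha256_thumbprint(pem_cert) == normalize(expected)
```

The PEM text could be obtained with ssl.get_server_certificate((host, port)) from the Python standard library. Which host and port to query is an assumption that depends on the deployment. A False result for the thumbprint the Edge was given would be consistent with the discrepancy described here.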

Resolution

Workaround

  1. Get the API certificate thumbprint from each manager node:
    Manager> get certificate api thumbprint
  2. SSH to the faulty edge node as the admin user.
  3. Run the following commands:
    push host-certificate <manager-IP-FQDN> username <username> thumbprint <cert-api-thumbprint-of-manager> password <password>

    sync-aph-certificates <manager-IP-FQDN> username <username> thumbprint <cert-api-thumbprint-of-manager> password <password>
  4. Repeat step 3 for each manager node, using that node's thumbprint.
  5. Switch to root (run st en and enter the root password when prompted).
  6. Run the following command to restart the nsx-proxy service:
    /etc/init.d/nsx-proxy restart
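
Because step 3 is repeated once per manager node, it can help to generate the command lines up front and paste them into the Edge CLI. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, each manager address is assumed to pair with its own thumbprint from step 1, and the password is left as a placeholder rather than embedded.

```python
from typing import Dict, List


def edge_commands(manager: str, username: str, thumbprint: str) -> List[str]:
    """Build the two step-3 commands for one manager node."""
    # The password stays a placeholder so that it is typed at the CLI, not stored.
    args = f"{manager} username {username} thumbprint {thumbprint} password <password>"
    return [f"push host-certificate {args}", f"sync-aph-certificates {args}"]


def workaround_plan(managers: Dict[str, str], username: str) -> List[str]:
    """managers maps each manager IP or FQDN to its API certificate thumbprint."""
    plan: List[str] = []
    for manager, thumbprint in managers.items():
        plan.extend(edge_commands(manager, username, thumbprint))
    return plan
```

For example, workaround_plan({"<manager-IP-FQDN>": "<cert-api-thumbprint-of-manager>"}, "admin") returns the two commands from step 3 for that manager. The script only formats text; it does not connect to the Edge node.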

Note: After completing the workaround, synchronize the Edge configuration from the NSX GUI if required. Go to System > Nodes > Edge transport nodes, select the Edge node, and then choose Actions > Sync Edge Node Configuration from the drop-down menu. After the sync with NSX Manager, confirm that the "Configuration State" status shows "Success".

It may also be necessary to perform a rolling reboot of NSX Managers for the changes to take effect. If the Edge remains disconnected, restart the Manager appliances one at a time. Wait for each rebooted NSX Manager to appear as available in the NSX Manager interface before proceeding to reboot the next one.
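
The one-at-a-time ordering of the rolling reboot can be sketched as a simple loop. This is an illustration only: restart and is_available are hypothetical callbacks supplied by the operator (for example, wrappers around whatever reboot and health-check method is in use), and the timeout and polling values are arbitrary assumptions.

```python
import time
from typing import Callable, Iterable


def rolling_restart(
    nodes: Iterable[str],
    restart: Callable[[str], None],
    is_available: Callable[[str], bool],
    timeout_s: float = 1800.0,
    poll_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Restart nodes one at a time, waiting for each before moving on."""
    for node in nodes:
        restart(node)
        deadline = clock() + timeout_s
        # is_available must report False until the node has actually gone
        # down and come back; otherwise the loop moves on too early.
        while not is_available(node):
            if clock() >= deadline:
                raise TimeoutError(f"{node} did not become available within {timeout_s} s")
            sleep(poll_s)
```

The loop stops with TimeoutError instead of continuing to the next manager, so no more than one manager is restarted at a time while an earlier one is still unavailable.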
