NSX 4.1.x reverse-proxy fails to load API certificate

Article ID: 314345

Products

VMware NSX

Issue/Introduction

  • The NSX UI and API may work only when accessed through the VIP IP/FQDN.
  • API calls to individual Manager Cluster nodes may fail.
  • TLS handshakes to the node on TCP/443 fail, even when attempted locally. With curl, you may see the following error when making an API call against the node:
    curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to <FQDN>:443
  • Collecting support bundles from the NSX UI may fail for every manager node except the one that currently holds the VIP.
  • Running /etc/init.d/envoy status as root on the NSX manager node shows log lines, reported by systemd, similar to the following:
    /home/secureall/secureall/.store/.tomcat_cert.pem should start with -----BEGIN
  • In /var/log/proxy/envoy.log, you see log lines similar to the following:
    https-node-v4-local: Failed to load certificate chain from <inline>
  • In /var/log/proton/nsxapi.log, after hitting the failure condition and attempting to update the API certificate, you see stack traces similar to the following. They indicate that the code was comparing certificate thumbprints to determine whether a different certificate was in use before overwriting the file, but failed to read or parse the content of the existing file.
    2023-07-18T18:22:02.587Z ERROR org.corfudb.runtime.collections.streaming.StreamPollingScheduler-worker-3 ResumeStreamListener 59976 SYSTEM [nsx@6876 comp="nsx-manager" errorCode="MP4" level="ERROR" subcomp="manager"] Exception caught during streaming processing. Re-subscribe this listener to latest timestamp
    java.lang.NullPointerException: null
    at com.vmware.nsx.management.common.trust.TrustUtil.getThumbprint(TrustUtil.java:58) ~[nsx-common-util.jar:?]
  • If the same certificate is used for all 3 nodes in the NSX manager cluster, the UI/API may be unavailable to those IP/FQDNs and may only be accepting requests via the VIP IP/FQDN.
  • Support bundle collection via the UI for any non-VIP manager nodes affected may fail.
  • During NSX upgrades, repo_sync may fail to complete, which prevents the upgrade from proceeding.

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
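
As a rough triage aid, the log signatures listed above can be searched for in text copied from the envoy status output, /var/log/proxy/envoy.log, and /var/log/proton/nsxapi.log. The following Python sketch is illustrative only: the function name match_signatures and the signature labels are hypothetical, and the search strings are taken from the example excerpts in this article, so they may not match every environment.

```python
# Substrings taken from the example excerpts in this article.
# The dictionary keys are hypothetical labels, not NSX terminology.
SIGNATURES = {
    "envoy_status": "should start with -----BEGIN",
    "envoy_log": "Failed to load certificate chain from",
    "nsxapi_log": "TrustUtil.getThumbprint",
}


def match_signatures(log_text):
    """Return the sorted labels of all signatures found in log_text."""
    found = set()
    for line in log_text.splitlines():
        for label, needle in SIGNATURES.items():
            if needle in line:
                found.add(label)
    return sorted(found)


if __name__ == "__main__":
    sample = (
        "https-node-v4-local: Failed to load certificate chain from <inline>\n"
    )
    print(match_signatures(sample))
```

A match suggests that the node may be affected, but it is not a definitive diagnosis. Confirm against the full symptom list.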

Environment

VMware NSX 4.x

Cause

  • When a certificate that contains extra information (extra attributes) outside the PEM encoding is imported and applied as the API certificate, the NSX Manager cannot parse the new certificate correctly.
  • If the Envoy service is then restarted, the UI and API endpoints stop accepting requests. Once the system is in this state, applying a different certificate does not resolve the issue: the API reports that the new certificate has been applied, but Envoy does not pick it up.
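
To illustrate what "extra information outside of PEM encoding" can look like, the following Python sketch lists any non-blank lines in a certificate file that fall outside the -----BEGIN/-----END blocks. A common example is the "Bag Attributes" text that openssl pkcs12 writes ahead of each certificate. This is a minimal pre-import check under stated assumptions, not the parsing logic NSX uses: the function name find_extra_content is hypothetical, and the sketch does not validate the Base64 body or the certificate itself.

```python
def find_extra_content(pem_text):
    """Return non-blank lines that sit outside any PEM BEGIN/END block."""
    extra = []
    inside_block = False
    for line in pem_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("-----BEGIN ") and stripped.endswith("-----"):
            inside_block = True
        elif stripped.startswith("-----END ") and stripped.endswith("-----"):
            inside_block = False
        elif not inside_block and stripped:
            # Anything here is text outside the PEM encoding.
            extra.append(stripped)
    return extra


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python check_pem.py mycert.pem
    with open(sys.argv[1], "r", encoding="utf-8") as handle:
        leftovers = find_extra_content(handle.read())
    for item in leftovers:
        print("Outside PEM block:", item)
```

If the script prints nothing, no text was found outside the PEM blocks. Any printed lines are candidates to remove from the file before importing the certificate.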

Resolution

This issue is resolved in VMware NSX 4.1.2, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.



Workaround

To implement the workaround steps, open a service request with VMware GSS NSX support and reference this article.

Additional Information

If you are contacting Broadcom support about this issue, please provide the following:

  • NSX Manager log bundles
  • The text of any error messages seen in the NSX UI or on the command line that are pertinent to the investigation

Handling Log Bundles for offline review with Broadcom support