NSX-T reverse proxy server fails to start: 'Failed to load trusted CA certificates from <inline>'

Article ID: 322423

Products

VMware NSX

Issue/Introduction

Symptoms:

  • You have VMware NSX 4.0 or later installed.
  • You have a Principal Identity (PI) configured with a certificate attached to it, and this certificate has the service_type CLIENT_AUTH.
  • You may also have a Federation environment. VMware NSX Federation creates a PI account for connections between sites.
  • One of the following applies:
    • You are upgrading VMware NSX-T and the Management Plane (MP) upgrade is stuck at 11% and is not progressing.
    • You have stopped the reverse proxy service and it will not start again.
  • The NSX-T Manager node log /var/log/proxy/envoy.log contains the following warning:
[warning][config] [source/common/config/filesystem_subscription_impl.cc:43] Filesystem config update rejected: Error adding/updating listener(s) https-node-v4-local: Failed to load trusted CA certificates from <inline>
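
To confirm this symptom on a manager node, the log can be searched for the warning text. The following is a minimal Python sketch, not a VMware-provided tool. The function names are hypothetical, and it assumes the message text appears exactly as quoted above.

```python
# Marker text copied from the warning quoted in this article.
MARKER = "Failed to load trusted CA certificates from <inline>"


def find_ca_load_failures(log_text: str) -> list[str]:
    """Return every log line that contains the CA-load failure warning."""
    return [line for line in log_text.splitlines() if MARKER in line]


def scan_log(path: str = "/var/log/proxy/envoy.log") -> list[str]:
    """Read the envoy log (path taken from this article) and return matching lines."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_ca_load_failures(handle.read())
```

Calling scan_log() on the manager node (reading the log typically requires elevated privileges) returns an empty list when the warning is absent.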

Environment

VMware NSX-T

Cause

  • NSX-T fails to load a certificate whose length is a multiple of 253. This is due to an underlying issue with the envoy component used for the reverse proxy service on the NSX-T Manager since VMware NSX 4.0.
  • The issue manifests when the service is restarted. When the reverse proxy service starts, it attempts to load all certificates from the certificate store. If the store contains such a certificate, the service fails to start and generates the log entry shown above.
  • This can occur during an MP upgrade (because the manager nodes restart during an upgrade), after a manager restart, or after a reverse proxy service restart.
  • Note: Federation environments use a PI with certificates that have the service_type CLIENT_AUTH, so this issue may also be seen in a Federation setup even if you have not configured any PI accounts yourself.
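
The length condition can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the check performed by envoy itself. The function names are hypothetical. The sketch assumes that "length" means the number of characters between the BEGIN and END markers with line breaks and other whitespace not counted, and that the input holds a single certificate.

```python
BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def pem_body_length(pem: str) -> int:
    """Count the characters between the BEGIN and END markers.

    Assumption: whitespace and line breaks are not counted.
    Raises ValueError if either marker is missing.
    """
    start = pem.index(BEGIN) + len(BEGIN)
    end = pem.index(END, start)
    return len("".join(pem[start:end].split()))


def is_affected_length(pem: str) -> bool:
    """Return True when the body length is a non-zero multiple of 253."""
    length = pem_body_length(pem)
    return length > 0 and length % 253 == 0
```

For example, a body of 1012 characters (4 × 253) would be flagged, while a body of 1013 characters would not.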

Resolution

This is a known issue impacting VMware NSX.

Workaround:

  • Identify the certificate that is preventing the reverse proxy service from starting. You can use the following API to retrieve all certificates:
GET /api/v1/trust-management/certificates
  • Note which certificates use the service_type CLIENT_AUTH.
  • For each of these, check the length of the certificate body, that is, all characters between the "-----BEGIN CERTIFICATE-----" and "-----END CERTIFICATE-----" markers, excluding the markers themselves.
  • If you find one whose length is a multiple of 253, remove it. You can use the following API as the root user on the NSX-T Manager to delete the certificate:
curl -H "x-nsx-username: admin" -X DELETE http://127.0.0.1:7440/nsxapi/api/v1/trust-management/certificates/<cert-id>
  • Here '<cert-id>' is the ID of the certificate identified as using service_type CLIENT_AUTH and having a length that is a multiple of 253.
  • Once the certificate is removed, the reverse proxy service should start again.
  • If the issue occurred during an upgrade, you should now be able to proceed with the upgrade.
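
The identification steps above can be sketched in Python against a saved copy of the GET response. This is a rough sketch, not an official procedure. The function name is hypothetical, and the JSON field names used here (results, id, pem_encoded, used_by, service_types) are assumptions about the response layout that should be checked against the actual output. It also assumes each entry holds a single certificate and that whitespace is not counted toward the length.

```python
import json

BEGIN = "-----BEGIN CERTIFICATE-----"
END = "-----END CERTIFICATE-----"


def client_auth_candidates(response_text: str) -> list[tuple[str, int]]:
    """Return (certificate id, body length) pairs for CLIENT_AUTH certificates
    whose body length is a non-zero multiple of 253."""
    hits = []
    for cert in json.loads(response_text).get("results", []):
        # Assumed layout: used_by is a list of objects, each with a service_types list.
        service_types = {
            service_type
            for user in cert.get("used_by", [])
            for service_type in user.get("service_types", [])
        }
        if "CLIENT_AUTH" not in service_types:
            continue
        # Drop the markers, then ignore all whitespace when counting.
        body = cert.get("pem_encoded", "").replace(BEGIN, "").replace(END, "")
        length = len("".join(body.split()))
        if length > 0 and length % 253 == 0:
            hits.append((cert.get("id", ""), length))
    return hits
```

If the response body was saved to a file, passing its contents to client_auth_candidates() returns the IDs to review. Before deleting anything, confirm that the certificate is not assigned to a Federation site PI.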

Note: If you are using Federation and the certificate is assigned to a PI account used by one of the sites, do not use the delete API above. Instead, follow the administration guide to replace the site certificate. This automatically updates the certificate used by the PI for that site.