VMware NSX Federation LOCAL_MANAGER or GLOBAL_MANAGER replaced expired certificate still alerting as expired

Article ID: 314332

Updated On:

Products

VMware NSX

Issue/Introduction

  • NSX Federation is in use.
  • Federation was deployed on NSX-T 3.1.x with a Global Manager certificate and later upgraded to NSX-T 3.2.x.
  • An alarm reports that the LOCAL_MANAGER and/or GLOBAL_MANAGER certificate(s) have expired.
  • In the UI (System > Certificates), these certificate(s) still show a 'Where Used' count greater than 0.
  • Step 6 of the Replace Certificates section of the NSX-T administration guide was used to replace the Federation Principal Identity (PI) certificates.
  • The expired certificate(s) cannot be removed because the system reports that they are still in use.
  • The following API call (take note of the certificate UUID from the UI) shows that the alerting certificate has "service_types" : [ "CLIENT_AUTH" ] only, with no GLOBAL_MANAGER or LOCAL_MANAGER entry co-located on that certificate.
GET api/v1/trust-management/certificates/<certificate_UUID>
{
  "pem_encoded" : "-----BEGIN CERTIFICATE-----
  ...
  -----END CERTIFICATE-----",
  "has_private_key" : false,
  "used_by" : [ {
    "node_id" : "{name: 'localmanageridentity',node_id: '<node_using_certificate_UUID>',certificate_id: '<certificate_UUID_from_UI>'}",
    "service_types" : [ "CLIENT_AUTH" ]
  } ],
  "resource_type" : "certificate_self_signed",
  "id" : "bbf6a54f-####-####-####-296f3e6634d4",
  "display_name" : "<certificate_UUID_from_UI>",
...
}
  • Checking the PIs with the following API call shows that they now use the new certificate supplied in the replacement API call from Replace Certificates step 6, not the stale certificate identified in the previous step.
    • Refer to the section at the bottom of the page titled: Principal Identity (PI) Users for NSX Federation.
GET api/v1/trust-management/principal-identities
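
The two checks above can be combined once both API responses have been saved as JSON. The following is a minimal sketch, not part of the documented procedure. It assumes the principal-identities response is a list wrapper with a "results" array whose items carry "name" and "certificate_id" fields. The function names and sample values are hypothetical.

```python
def client_auth_only(cert: dict) -> bool:
    """True if the certificate is in use and every used_by entry lists only CLIENT_AUTH."""
    used_by = cert.get("used_by", [])
    if not used_by:
        return False
    return all(set(e.get("service_types", [])) == {"CLIENT_AUTH"} for e in used_by)


def pis_referencing(cert_id: str, pi_list: dict) -> list:
    """Names of principal identities whose certificate_id equals cert_id.

    Assumption: the list response has a "results" array of PI objects.
    """
    return [
        pi.get("name")
        for pi in pi_list.get("results", [])
        if pi.get("certificate_id") == cert_id
    ]


def matches_symptoms(cert_id: str, cert: dict, pi_list: dict) -> bool:
    """CLIENT_AUTH-only usage on the alerting cert, and no PI still points at it."""
    return client_auth_only(cert) and not pis_referencing(cert_id, pi_list)


# Hypothetical sample data shaped like the responses shown in this article.
sample_cert = {
    "id": "old-cert",
    "used_by": [
        {
            "node_id": "{name: 'localmanageridentity',node_id: 'n1',certificate_id: 'old-cert'}",
            "service_types": ["CLIENT_AUTH"],
        }
    ],
}
sample_pis = {"results": [{"name": "LocalManagerIdentity", "certificate_id": "new-cert"}]}

print(matches_symptoms("old-cert", sample_cert, sample_pis))
```

A True result only means these two symptoms are present. The remaining symptoms in this list still need to be confirmed before treating the entry as stale.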

 

Note: The presence of "service_types": ["CLIENT_AUTH"] on a certificate is not, on its own, sufficient to indicate a match to this issue.

The CLIENT_AUTH service_type indicates a PI certificate, but not necessarily a Federation PI certificate. If PI certificate(s) that are not Federation PI certificate(s) need to be updated, see the PI section of Certificates for NSX Federation.

A Federation PI certificate is characterized by one of the following:
  • Presence of both CLIENT_AUTH and LOCAL_MANAGER (or GLOBAL_MANAGER) service_types on a single certificate, with a "node_id" containing name: 'localmanageridentity' or name: 'globalmanageridentity'.
    • These entries have been observed to be stale when all of the symptoms listed above this note are present.
  • Presence of the LOCAL_MANAGER or GLOBAL_MANAGER service_type alone.
    • This is the certificate entry on the cluster/site that holds the private key for this certificate. These entries have not been observed to be stale and should not be released.
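
The rules in this note can be expressed as a small classifier over a single "used_by" entry. This is an illustrative sketch only: the function name and the returned labels are hypothetical, and the match on the "node_id" string assumes it has the same layout as in the sample response above.

```python
FEDERATION_TYPES = {"LOCAL_MANAGER", "GLOBAL_MANAGER"}
FEDERATION_PI_NAMES = (
    "name: 'localmanageridentity'",
    "name: 'globalmanageridentity'",
)


def classify_used_by(entry: dict) -> str:
    """Map one used_by entry to a label that follows the note's rules."""
    types = set(entry.get("service_types", []))
    node_id = (entry.get("node_id") or "").lower()
    has_pi_name = any(name in node_id for name in FEDERATION_PI_NAMES)

    # CLIENT_AUTH together with LOCAL_MANAGER/GLOBAL_MANAGER and a Federation
    # PI name: Federation PI entry, possibly stale if all symptoms are present.
    if "CLIENT_AUTH" in types and types & FEDERATION_TYPES and has_pi_name:
        return "federation-pi-possibly-stale"
    # LOCAL_MANAGER or GLOBAL_MANAGER alone: the private-key holder entry.
    if len(types) == 1 and types <= FEDERATION_TYPES:
        return "federation-key-holder-do-not-release"
    # CLIENT_AUTH alone: a PI certificate, not confirmed as a Federation PI.
    if types == {"CLIENT_AUTH"}:
        return "pi-not-confirmed-federation"
    return "other"
```

Each "used_by" entry of a certificate can be passed through `classify_used_by` separately. The label is a reading aid and does not replace the symptom checks described above.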

Environment

VMware NSX-T Data Center

Cause

  • In a Federation environment, each manager generates its own certificate, which is used for communication.
  • Each manager sends this certificate to the other sites in the environment, which then use that certificate for the PI that is used to configure the other managers.
  • When an expired certificate is replaced, the other managers should replace the certificate used by the PI that communicates with the manager whose certificate was renewed.
  • This replacement is an automatic process carried out when the certificate is renewed.
  • Due to an issue with the automatic release process, a stale entry remains and the manager believes the certificate is still held by one of the other managers.
  • This occurs in Federation environments that were deployed in 3.1 or prior and upgraded to 3.2.x.

Resolution

This is a known issue impacting VMware NSX.
To work around this issue, use the CARR script attached to the KB article: Using Certificate Analyzer, Results and Recovery (CARR) Script to fix certificate related issues in NSX