NAPP registration failing at 70% with error due to stale Global NSX Manager Principal Identity


Article ID: 390357


Products

VMware vDefend Firewall

Issue/Introduction

NAPP deployment fails (registration stops at 70%) because a stale Global Manager principal identity is still present on the NSX Manager.

Environment

NAPP 4.2.0

Cause

  • When NAPP is deployed on an NSX Manager that was previously part of an NSX Federation, a Global Manager principal identity may have been left behind even though its corresponding certificate no longer exists. The site onboarding service cannot find that certificate, so the deployment workflow fails because it cannot verify the existing principal identities and their associated certificates.

  • During principal identity (PI) creation, an API call fetches the site certificates. While loading them, the call fails with a certificate "not found" error.

For example, /var/log/proton/nsxapi.log shows this failure during principal identity creation: the site-certificate lookup cannot find the certificate that the stale principal identity references.

2025-02-26T19:42:08.658Z  INFO http-nio-127.0.0.1-7440-exec-18 PreAuthenticationFilter 76623 PreAuthenticationFilter setting username to nsx-opsagent from x-nsx-username header.
2025-02-26T19:42:08.662Z  INFO http-nio-127.0.0.1-7440-exec-18 SiteCertificateServiceImpl 76623 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" reqId="2ef23d8a-6ab5-4bad-8efd-cd3321611941" subcomp="manager" username="nsx-opsagent"] Collecting all federation certificates.
2025-02-26T19:42:08.662Z  INFO http-nio-127.0.0.1-7440-exec-18 SiteCertificateServiceImpl 76623 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" reqId="2ef23d8a-6ab5-4bad-8efd-cd3321611941" subcomp="manager" username="nsx-opsagent"] Creating metadata for local certs.
2025-02-26T19:42:08.670Z  WARN http-nio-127.0.0.1-7440-exec-18 MultiVersionObject 76623 SnapshotProxy[778a] encountered trimmed addresses [] during sync to 4507780385 on attempt 1 of 2
2025-02-26T19:42:08.710Z  WARN http-nio-127.0.0.1-7440-exec-18 MultiVersionObject 76623 SnapshotProxy[595e] encountered trimmed addresses [] during sync to 4507780385 on attempt 1 of 2
2025-02-26T19:42:08.722Z  WARN http-nio-127.0.0.1-7440-exec-18 TrustStoreServiceImpl 76623 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" reqId="2ef23d8a-6ab5-4bad-8efd-cd3321611941" subcomp="manager" username="nsx-opsagent"] Certificate object with id 'd2e2f7d0-43a1-4796-9153-a8d7bd16da97' cannot be found
2025-02-26T19:42:08.722Z  WARN http-nio-127.0.0.1-7440-exec-18 SiteCertificateServiceImpl 76623 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" reqId="2ef23d8a-6ab5-4bad-8efd-cd3321611941" subcomp="manager" username="nsx-opsagent"] Unable to match site-certificate to its actual certificate.
com.vmware.nsx.management.common.exceptions.ObjectNotFoundException: null
        at com.vmware.nsx.management.truststore.service.impl.TrustStoreServiceImpl.getCertificateInternal(TrustStoreServiceImpl.java:793) ~[?:?]
        at com.vmware.nsx.management.truststore.service.impl.TrustStoreServiceImpl.getCertificate(TrustStoreServiceImpl.java:785) ~[?:?]
        at com.vmware.nsx.management.truststore.service.impl.SiteCertificateServiceImpl.createSiteCertificateData(SiteCertificateServiceImpl.java:994) ~[?:?]
        at com.vmware.nsx.management.truststore.service.impl.SiteCertificateServiceImpl.getSiteCertificates(SiteCertificateServiceImpl.java:1132) ~[?:?]
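To find which certificate IDs the log reports as missing, the warning line shown above can be filtered out of nsxapi.log. The following is a minimal sketch, not an official procedure: the function name `missing_cert_ids` is hypothetical, and it assumes the warning text matches the excerpt above exactly.

```shell
# Print the unique certificate IDs that the log reports as "cannot be found".
# Assumption: the message format matches the nsxapi.log excerpt above.
# missing_cert_ids is a hypothetical helper name, not an NSX command.
missing_cert_ids() {
  local log_file="${1:-/var/log/proton/nsxapi.log}"
  grep -oE "Certificate object with id '[0-9a-fA-F-]+' cannot be found" "$log_file" \
    | sed -E "s/.*id '([^']+)'.*/\1/" \
    | sort -u
}

# Usage on the NSX Manager appliance:
# missing_cert_ids /var/log/proton/nsxapi.log
```

Each ID printed can then be compared with the certificate_id values of the principal identities listed in the Resolution section.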

Resolution

1. Confirm that the NSX Manager is no longer part of the federation. List the principal identities with GET /api/v1/trust-management/principal-identities, and note the stray Global Manager principal identity's ID and the ID of the certificate associated with it. Verify that this certificate no longer exists.

Get all principal identities
curl 'https://<nsx-manager>/api/v1/trust-management/principal-identities' -k -u admin
{
  "results" : [ {
    "name" : "GlobalManagerIdentity-cab594d9-a31c-49b3-9148-32d4929ff472",
    "node_id" : "cab594d9-a31c-49b3-9148-32d4929ff472",
    "certificate_id" : "d2e2f7d0-43a1-4796-9153-a8d7bd16da97",               <----- Verify non-existence of this certificate
    "roles_for_paths" : [ {
      "path" : "/",
      "roles" : [ {
        "role" : "enterprise_admin"
      } ],
      "delete_path" : false
    } ],
    "is_protected" : true,
    "resource_type" : "PrincipalIdentity",
    "id" : "8ecdc212-3d71-401d-8e99-15d3e0de4612",                           <------ Principal identity's ID
    "display_name" : "8ecdc212-3d71-401d-8e99-15d3e0de4612",
    "_system_owned" : false,
    "_protection" : "NOT_PROTECTED",
    "_create_time" : 1745913329847,
    "_create_user" : "admin",
    "_last_modified_time" : 1745913329847,
    "_last_modified_user" : "admin",
    "_revision" : 0
  } ]
}
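One way to verify that the certificate is gone is to request it by ID and look at the HTTP status. The following is a sketch under stated assumptions: it uses GET /api/v1/trust-management/certificates/<certificate-id>, and it assumes NSX answers 404 for an unknown certificate ID and 200 for an existing one. The helper names `cert_state` and `check_cert` are hypothetical.

```shell
# Map the HTTP status of the certificate lookup to a verdict.
# Assumption: 200 means the certificate exists, 404 means it does not.
cert_state() {
  case "$1" in
    200) echo "present" ;;
    404) echo "missing" ;;
    *)   echo "unknown" ;;
  esac
}

# Look up one certificate and print its state.
# curl prompts for the admin password; -k skips TLS verification,
# as in the other commands in this article.
check_cert() {
  local manager="$1" cert_id="$2" code
  code=$(curl -k -s -u admin -o /dev/null -w '%{http_code}' \
    "https://${manager}/api/v1/trust-management/certificates/${cert_id}")
  cert_state "$code"
}

# Usage:
# check_cert <nsx-manager> d2e2f7d0-43a1-4796-9153-a8d7bd16da97
```

A result of "missing" matches the condition described in this article. If the result is "unknown", inspect the full response before deleting anything.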

2. Once verified, delete the stale federation principal identity from the NSX Manager UI. Go to System → User Management, open the User Role Assignment tab, click the vertical three-dot menu next to the stale principal identity, and select Delete.

Alternatively, delete the principal identity by sending a DELETE request to the manager endpoint /api/v1/trust-management/principal-identities/<principal-identity-uuid>:

curl -X DELETE 'https://<nsx-manager>/api/v1/trust-management/principal-identities/8ecdc212-3d71-401d-8e99-15d3e0de4612' -k -u admin
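After the deletion, the principal identity list can be fetched again to confirm that no Global Manager identity remains. This is an optional check sketched here, not a step from the original procedure. It assumes that Global Manager identities carry the name prefix "GlobalManagerIdentity-" seen in the sample output above; `stale_gm_identities` is a hypothetical helper name.

```shell
# Read a principal-identities JSON response on stdin and print the names of
# any remaining Global Manager identities; prints nothing when none are left.
# Assumption: such identities are named "GlobalManagerIdentity-<node-id>".
stale_gm_identities() {
  grep -oE '"name" *: *"GlobalManagerIdentity-[^"]+"' \
    | sed -E 's/.*"(GlobalManagerIdentity-[^"]+)"/\1/'
}

# Usage:
# curl -k -s -u admin 'https://<nsx-manager>/api/v1/trust-management/principal-identities' | stale_gm_identities
```

Empty output suggests the stale identity was removed, after which the NAPP deployment can be retried.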