VCF workload domain deployment fails at NSX Transport Node Collection due to invalid API certificate

Article ID: 434521

Products

VMware NSX, VMware Cloud Foundation, VMware SDDC Manager / VCF Installer

Issue/Introduction

During a VMware Cloud Foundation (VCF) workload domain deployment, the orchestration fails at the "Create NSX Transport Node Collection" task. The SDDC Manager UI and logs report that transport node realization failed for the compute collection.

Symptoms:

  • SDDC Manager UI Error:
    Message: One or more transport node(s) realization failed while creating the transport node collection.....through NSX Manager <nsx-manager-vip-fqdn>.
    
    Remediation Message: Check the error(s) on NSX Manager <nsx-manager-vip-fqdn> and resolve them before restarting the failed task.
    
    Cause: Certificate for <nsx-manager-vip-fqdn> doesn't match any of the subject alternative names: [<nsx-manager-fqdn>, <nsx-manager-IP>]

     

  • Error in /var/log/vmware/vcf/domainmanager/domainmanager.log in SDDC Manager:
    DEBUG [vcf_dm,69b.............] [c.v.v.c.f.p.n.h.NsxtCommonOperations,dm-exec-19]  Waiting 2700000 ms for NSX cluster prepartion to be completed
    DEBUG [vcf_dm,69b.............] [c.v.v.c.f.p.n.p.s.i.PolicyTransportNodeCollectionServiceImpl,dm-exec-19]  Current state of cluster preparation FAILED_TO_REALIZE

     

    ERROR [vcf_dm,69b.............] [c.v.v.c.f.p.n.p.a.CreateTransportNodeCollectionAction,dm-exec-19]  Exception occurred while creating transport node collection using profile <NSX-Profile-ID> on the cluster(with compute collection) <NSX-Cluster-ID>:domain-<Domain-MOID>
    java.lang.RuntimeException: Failed to realize transport node. Please refer logs.

     

    DEBUG [vcf_dm,69b.............] [c.v.v.c.f.p.n.p.a.TransportNodeCollectionResolver,dm-exec-19]  computeCollectionId: <NSX-Cluster-ID>:domain-<Domain-MOID>, and memberStatusList: [HostNodeStatus (com.vmware.nsx.model.host_node_status) => {
        configStatus = pending,
        deploymentStatus = INSTALL_FAILED,

     

  • Error in /var/log/proton/nsxapi.log on NSX Manager:
    Failed to install software on host. NSX Manager <nsx-manager-fqdn> has invalid API certificate. Error: (28) Failed to connect to <nsx-manager-fqdn> port 443: [Errno -2] Name or service not known. Fix the certificate issue on NSX Manager and retry the operation.

Environment

  • VMware Cloud Foundation (VCF)
  • VMware NSX (formerly NSX-T)
  • VMware SDDC Manager

Cause

The failure is caused by an invalid or mismatched API certificate on the NSX Manager. Even when DNS forward and reverse records (A and PTR) are correctly configured and resolvable, the NSX API service may reject a certificate if its Subject Alternative Name (SAN) field is incomplete or if the certificate does not match the node's FQDN during the realization check. For details, refer to the article "Not allowed to apply a service-type API to a CA-signed certificate without hostname check passing".
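The SAN mismatch described above can be illustrated with a small check. This is a minimal sketch, not NSX's or SDDC Manager's actual validation logic: the function names `san_covers` and `missing_from_san` are hypothetical, and the hostnames and IP addresses are placeholders. It assumes the common rule that an IP entry must match exactly and that a wildcard covers only the leftmost DNS label.

```python
import ipaddress


def _is_ip(value: str) -> bool:
    """Return True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def san_covers(name: str, san_entries: list[str]) -> bool:
    """Return True if `name` (FQDN or IP) is covered by any SAN entry.

    Hypothetical helper for illustration only.
    """
    name = name.strip().lower().rstrip(".")
    for entry in san_entries:
        entry = entry.strip().lower().rstrip(".")
        if _is_ip(name) or _is_ip(entry):
            # IP addresses must match exactly; wildcards do not apply.
            if (_is_ip(name) and _is_ip(entry)
                    and ipaddress.ip_address(name) == ipaddress.ip_address(entry)):
                return True
            continue
        if entry == name:
            return True
        # Assumed wildcard rule: "*.example.com" covers one leftmost label only.
        if entry.startswith("*.") and "." in name:
            if name.split(".", 1)[1] == entry[2:]:
                return True
    return False


def missing_from_san(required: list[str], san_entries: list[str]) -> list[str]:
    """Return the required names that no SAN entry covers."""
    return [n for n in required if not san_covers(n, san_entries)]


if __name__ == "__main__":
    # Placeholder values mirroring the symptom: the VIP FQDN is absent from the SAN list.
    sans = ["nsx-mgr-01.example.com", "10.0.0.11"]
    required = ["nsx-vip.example.com", "nsx-mgr-01.example.com", "10.0.0.11"]
    print(missing_from_san(required, sans))
```

With these placeholder values, the VIP FQDN is reported as missing, which corresponds to the "doesn't match any of the subject alternative names" message shown in the SDDC Manager UI.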

Resolution

To resolve this issue, renew and correctly apply the NSX Manager node certificate.

Follow these steps:

  1. Verify DNS integrity: Run nslookup from the SDDC Manager or NSX Manager to confirm that the NSX VIP FQDN and each NSX Manager node FQDN resolve to their assigned IP addresses, and that these names and addresses appear in the SAN field of the intended certificate.
  2. Renew the certificate with its private key: Generate and apply a new certificate for the NSX Manager node. Ensure the certificate is generated with a matching private key and includes all required FQDNs and IP addresses in the SAN field. Refer to "Create a Self-Signed Certificate for NSX".
  3. Apply via API (if the UI fails): If applying the certificate in the UI returns "Not allowed to apply a service-type API to a CA-signed certificate without hostname check passing", use the NSX API to force the certificate application. Refer to "Replacing NSX Certificates".
  4. Retry the deployment: Once the API certificate is replaced and shown as "Active", return to the SDDC Manager UI and retry the failed task.
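The DNS verification in step 1 can also be scripted when several FQDNs need checking. The following is a minimal sketch using only Python's standard `socket` module. The function name `check_forward_reverse`, the injectable `forward`/`reverse` parameters, and the example names and addresses are assumptions made for illustration; they are not part of any VMware tooling.

```python
import socket


def check_forward_reverse(expected, forward=socket.gethostbyname, reverse=None):
    """Check A and PTR records for each FQDN -> IP pair in `expected`.

    Returns a list of human-readable problems (empty if all checks pass).
    `forward` and `reverse` are injectable so the logic can be exercised
    without live DNS.
    """
    if reverse is None:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        reverse = lambda ip: socket.gethostbyaddr(ip)[0]

    problems = []
    for fqdn, ip in expected.items():
        # Forward (A record) lookup.
        try:
            resolved = forward(fqdn)
        except OSError as exc:
            problems.append(f"{fqdn}: forward lookup failed ({exc})")
            continue
        if resolved != ip:
            problems.append(f"{fqdn}: resolves to {resolved}, expected {ip}")

        # Reverse (PTR record) lookup.
        try:
            ptr = reverse(ip)
        except OSError as exc:
            problems.append(f"{ip}: reverse lookup failed ({exc})")
            continue
        if ptr.rstrip(".").lower() != fqdn.rstrip(".").lower():
            problems.append(f"{ip}: PTR is {ptr}, expected {fqdn}")
    return problems


if __name__ == "__main__":
    # Placeholder names and addresses; replace with the NSX VIP and node values.
    expected = {
        "nsx-vip.example.com": "10.0.0.10",
        "nsx-mgr-01.example.com": "10.0.0.11",
    }
    for line in check_forward_reverse(expected) or ["All DNS checks passed"]:
        print(line)
```

Passing DNS checks alone do not guarantee a valid certificate: the same FQDNs and IP addresses must still be present in the certificate's SAN field, as described in step 2.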