NSX Manager upgrade fails due to APH_TN certificate

Article ID: 433731

Products

VMware NSX

Issue/Introduction

  • The NSX Manager upgrade fails with the following message, and retrying the upgrade fails with the same message.
    [MPP] Node upgrade failed : org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 Internal Server Error: "{"error_code": 36580, "error_message": "Error proxying request to: <uuid>.", "module_name": "node-services"}".


  • The same error is also logged in /var/log/upgrade-coordinator/upgrade-coordinator.log on the NSX Manager.
    <timestamp>  INFO http-nio-127.0.0.1-7442-exec-9 UpgradeCoordinatorFacadeImpl <pid> SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="upgrade-coordinator"] Component: MP, status: FAILED, % complete: 24.0, details: [MPP] Node upgrade failed : org.springframework.web.client.HttpServerErrorException$InternalServerError: 500 Internal Server Error: "{"error_code": 36580, "error_message": "Error proxying request to: <uuid>.", "module_name": "node-services"}"., canSkip: false

 

  • The Common Name (CN) of the APH_TN certificate is not unique across the NSX Managers. To verify this, log in to the CLI of any NSX Manager as root and run the following command. The example output below indicates the issue because all three CNs are identical.
    root@manager:~# curl -sk -u 'admin:<password>' https://localhost/api/v1/trust-management/certificates | python3 -c 'import sys,json,subprocess; data=json.load(sys.stdin);
    for r in data.get("results", []):
        if any("APH_TN" in u.get("service_types", []) for u in r.get("used_by", [])):
            p=subprocess.run(["openssl","x509","-noout","-subject"], input=r["pem_encoded"], text=True, capture_output=True)
            print(r.get("id"), p.stdout.strip())
    '
    <cert#1_uuid> subject=C = US, CN = VMware-NSX-ApplProxyHub
    <cert#2_uuid> subject=C = US, CN = VMware-NSX-ApplProxyHub
    <cert#3_uuid> subject=C = US, CN = VMware-NSX-ApplProxyHub

 

  • The following error regarding the above certificates is logged in /var/log/vmware/appl-proxy-rpc.log on each NSX Manager.
    <timestamp> <manager_name> NSX <pid> - [nsx@6876 comp="nsx-manager" subcomp="appl-proxy" s2comp="nsx-net" tid="<tid>" level="ERROR" errorCode="NET1111"] Certificate validation failed: 18-self-signed certificate <snip>
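
The CN uniqueness check described above can also be automated once the subject lines have been collected. The following is a minimal sketch, not a supported tool: it assumes you have already gathered the certificate IDs and their `openssl x509 -noout -subject` lines (for example, from the command above), and the helper names `extract_cn` and `find_duplicate_cns` are hypothetical names chosen for illustration.

```python
import re
from collections import defaultdict

# Matches the CN field in an `openssl x509 -noout -subject` line,
# e.g. "subject=C = US, CN = VMware-NSX-ApplProxyHub".
CN_PATTERN = re.compile(r"\bCN\s*=\s*([^,/]+)")


def extract_cn(subject_line):
    """Return the CN value from a subject line, or None if no CN is present."""
    match = CN_PATTERN.search(subject_line)
    return match.group(1).strip() if match else None


def find_duplicate_cns(cert_subjects):
    """Group certificate IDs by CN and return only CNs used by more than one certificate.

    cert_subjects: dict mapping certificate ID -> subject line (hypothetical input shape).
    """
    ids_by_cn = defaultdict(list)
    for cert_id, subject_line in cert_subjects.items():
        cn = extract_cn(subject_line)
        if cn is not None:
            ids_by_cn[cn].append(cert_id)
    return {cn: ids for cn, ids in ids_by_cn.items() if len(ids) > 1}


if __name__ == "__main__":
    # Placeholder IDs; substitute the IDs and subject lines from your environment.
    sample = {
        "cert-1": "subject=C = US, CN = VMware-NSX-ApplProxyHub",
        "cert-2": "subject=C = US, CN = VMware-NSX-ApplProxyHub",
        "cert-3": "subject=C = US, CN = VMware-NSX-ApplProxyHub",
    }
    for cn, cert_ids in find_duplicate_cns(sample).items():
        print(f"CN '{cn}' is shared by: {', '.join(cert_ids)}")
```

An empty result means every APH_TN certificate that was checked has a distinct CN. Any entry in the result corresponds to the condition described in this article.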

Environment

VMware NSX

Cause

Because of the issue described in KB#373270, communication and processing between NSX Managers fail, which causes the upgrade to fail.

Resolution

Update the APH_TN certificate by following the procedure outlined in the Resolution section of KB#373270, and then retry the upgrade.

Additional Information

KB#373270: After replacing APH-TN or APH-AR certificates, connections between Manager nodes or between GM and LM are disconnected