Transport node upgrade fails when service account has expired - Error code:400

Article ID: 322437

Products

VMware NSX, VMware vCenter Server, VMware Cloud Foundation

Issue/Introduction

  • The NSX environment is being upgraded from one NSX 4.x release to another NSX 4.x release.
  • The host cluster is managed using vSphere Lifecycle Manager.
  • The upgrade coordinator has been upgraded and the transport nodes are staged for installation.
  • The host upgrade fails with "General error (Error code:400)".
  • The following error is reported in the upgrade-coordinator log on the NSX Manager:

/var/log/upgrade-coordinator/upgrade-coordinator.log shows

ERROR http-nio-127.0.0.1-7442-exec-5 InventoryRestClientImpl 14418 FABRIC [nsx@6876 comp="nsx-manager" errorCode="MP7044" level="ERROR" subcomp="upgrade-coordinator"] Error in rest call url= /cm-inventory/api/v1/fabric/compute-managers/<uuid>?action=change-service-account-password , method= POST , response= {
    "module_name" : "common-services",
    "error_message" : "The requested with given URI, HTTP method and set of parameters cannot be processed.",
    "error_code" : 269
}
 , error= [{"errorMessage":"The requested with given URI, HTTP method and set of parameters cannot be processed.","errorData":{"moduleName":"common-services","errorCode":"269","statusCode":"NOT_FOUND"}}]
org.springframework.web.client.HttpClientErrorException$NotFound: 404 : "{<EOL> "module_name" : "common-services",<EOL> "error_message" : "The requested with given URI, HTTP method and set of parameters cannot be processed.",<EOL> "error_code" : 269<EOL>}<EOL>"
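
To confirm that a manager is hitting this signature, the log can be searched for the failed "change-service-account-password" call. The following is a minimal Python sketch, not a supported tool: the function names are hypothetical, and the pattern is derived only from the log excerpt above, so it may need adjusting if the log format differs in your release.

```python
import re

# Pattern derived from the log excerpt above (an assumption, not a documented format):
# a failed REST call to the change-service-account-password action on a compute manager.
SIGNATURE = re.compile(
    r"Error in rest call url=\s*\S*/fabric/compute-managers/(?P<cm_id>[^?\s]+)"
    r"\?action=change-service-account-password"
)


def find_service_account_failures(lines):
    """Return the compute manager IDs for which the password-change call failed."""
    ids = []
    for line in lines:
        match = SIGNATURE.search(line)
        if match:
            ids.append(match.group("cm_id"))
    return ids


def scan_log(path="/var/log/upgrade-coordinator/upgrade-coordinator.log"):
    """Scan the upgrade-coordinator log on an NSX Manager (hypothetical helper)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_service_account_failures(handle)
```

A non-empty result from scan_log() suggests that the manager attempted to rotate the expired service account password and the call was rejected.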

/var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log shows

YYYY-MM-DDThh:mm:ss.665Z error vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] Response from localhost/external-tp/http1/<NSX-Manager-FQDN>/443/##########################################/api/v1/vlcm/esx/health/cluster/perspectives/ready-for-apply/status?action=check: HTTP Status:500 'Internal Server Error'
YYYY-MM-DDThh:mm:ss.665Z warning vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] Retrying on next NSX-T node due to HTTP 500.
YYYY-MM-DDThh:mm:ss.665Z info vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] Current NSX-T node: https://<NSX-Manager-FQDN>:443
YYYY-MM-DDThh:mm:ss.666Z info vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] NSX-T version: 4.x.x.x
YYYY-MM-DDThh:mm:ss.666Z info vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] Calling NSX-T API /api/v1/vlcm/esx/health/cluster/perspectives/ready-for-apply/status?action=check (/external-tp/http1/<NSX-Manager-FQDN>/443/##########################################/api/v1/vlcm/esx/health/cluster/perspectives/ready-for-apply/status?action=check).
YYYY-MM-DDThh:mm:ss.009Z error vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] Response from localhost/external-tp/http1/<NSX-Manager-FQDN>/443/##########################################/api/v1/vlcm/esx/health/cluster/perspectives/ready-for-apply/status?action=check: HTTP Status:500 'Internal Server Error'
YYYY-MM-DDThh:mm:ss.009Z warning vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] Retrying on next NSX-T node due to HTTP 500.
YYYY-MM-DDThh:mm:ss.010Z error vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] No reachable NSX-T node found.
YYYY-MM-DDThh:mm:ss.010Z info vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] [domain-c8] A provider [nsxt_rest] has finished (0 remaining).
YYYY-MM-DDThh:mm:ss.010Z info vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] [domain-c8] All providers have finished. Elapsed time (sec): 17
YYYY-MM-DDThh:mm:ss.010Z info vmware-vum-server[#######] [Originator@#### sub=EHP opID=########-####-####-####-############] [domain-c8] [vSphere HA] [com.vmware.vpxd.healthPerspectives.ready_for_apply.ha] returned status: OK
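
The vCenter side of the failure can be summarized in a similar way. This sketch counts the HTTP 500 responses to the ready-for-apply check and reports whether vSphere Lifecycle Manager gave up with "No reachable NSX-T node found". The function name and the returned dictionary keys are hypothetical, and the matching strings are taken only from the excerpt above.

```python
def summarize_vum_log(lines):
    """Summarize ready-for-apply failures seen in vmware-vum-server.log lines."""
    summary = {"http_500": 0, "no_reachable_node": False}
    for line in lines:
        # HTTP 500 returned by an NSX Manager for the ready-for-apply health check.
        if "ready-for-apply/status?action=check" in line and "HTTP Status:500" in line:
            summary["http_500"] += 1
        # Logged once every NSX Manager node has been tried without success.
        if "No reachable NSX-T node found" in line:
            summary["no_reachable_node"] = True
    return summary


def summarize_vum_file(path="/var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server.log"):
    """Read the log from its location on the vCenter Server and summarize it."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return summarize_vum_log(handle)
```

Repeated HTTP 500 responses followed by no_reachable_node being True match the pattern shown in the excerpt.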

Environment

  • VMware NSX 4.x
  • vCenter 8.x
  • VCF 5.2.x

Cause

  • During the upgrade process, new VIBs are installed on the host transport node. This requires vCenter authentication.
  • If the service account password is detected as expired, the NSX Manager calls the "change-service-account-password" API.
  • This API call fails with the error "NotFound: 404".

Resolution

  • To resolve this issue, reset the expired service account password by re-entering the compute manager credentials:

    1. In the NSX UI, go to 'Compute manager' under 'System'.
    2. Click Edit next to 'FQDN or IP address' of the compute manager.
    3. Re-enter the vSphere username and password that were used to register it with NSX-T.
    4. This re-creates the service account and triggers a full inventory sync from vCenter.

  • If the above steps do not resolve the error, restart the proton service on all the NSX Managers. This clears the cache and forces each NSX Manager to make a new API call for a vAPI token.

    1. To restart the proton service on an NSX Manager, run the following command from the "root" CLI:
      • /etc/init.d/proton restart
    2. Run "get cluster status" to check that the cluster status is healthy, and proceed to the next step only if it is.
    3. As the admin user in the CLI of the first manager, run the command: "restart service manager".
    4. Once the restart has completed, run "get cluster status" again to check that the cluster status is healthy.
    5. Repeat on all the NSX Managers.

  • If the error persists, restart the upgrade coordinator service:

    1. From the admin CLI on each manager, run the command: "restart service install-upgrade".
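
The proton restart procedure above is done one manager at a time, with a cluster health check before and after each restart. The following Python sketch shows only that ordering. The restart and is_cluster_healthy callables are hypothetical placeholders for however you run the restart command and "get cluster status" in your environment (for example, over SSH), and the retry count and delay are arbitrary assumptions.

```python
import time


def rolling_restart(managers, restart, is_cluster_healthy,
                    retries=30, delay=10.0, sleep=time.sleep):
    """Restart one manager at a time, gating each step on cluster health.

    restart(manager) and is_cluster_healthy() are supplied by the caller;
    this sketch does not connect to NSX itself.
    """
    restarted = []
    for manager in managers:
        # Do not start a restart while the cluster is already degraded.
        if not is_cluster_healthy():
            raise RuntimeError(f"cluster not healthy before restarting {manager}")
        restart(manager)
        # Poll until the cluster reports healthy again, or give up.
        for _ in range(retries):
            if is_cluster_healthy():
                break
            sleep(delay)
        else:
            raise RuntimeError(f"cluster did not recover after restarting {manager}")
        restarted.append(manager)
    return restarted
```

Stopping at the first unhealthy check keeps the remaining managers untouched, which mirrors the instruction to proceed only when the cluster status is healthy.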