Primary web upgrade fails on RegisterCatalogTypes

Article ID: 314908

Products

VMware Aria Suite

Issue/Introduction

Symptoms:
  • The automatic IaaS upgrade fails on the primary web node at the upgrade-server command.
  • In the VAMI Cluster tab, all components on the primary IaaS web node are displayed with the new version.
  • The Management Agent's All.log file on the primary web node shows that the upgrade-server command failed with an error similar to the following:
Executing:C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe RegisterCatalogTypes -v
Catalog registration starting.
[00:00:00.000] VMware.Cafe: [sub-thread-Id="1" context="" token=""] Setting from config CafeClientCacheDuration: 05:00:00
[00:00:00.000] VMware.Cafe: [sub-thread-Id="1" context="" token=""] (1) GET endpoints/types/sso
[00:00:00.000] VMware.Cafe: [sub-thread-Id="8" context="" token=""] (1) Response: OK 0:00.080
[00:00:00.000] VMware.Cafe: [sub-thread-Id="5" context="" token=""] (2) POST SAAS/t/vsphere.local/auth/oauthtoken?grant_type=client_credentials
[00:00:00.000] VMware.Cafe: [sub-thread-Id="7" context="" token=""] (2) Response: OK 0:00.055
[00:00:00.000] VMware.Cafe: [sub-thread-Id="5" context="" token=""] (3) GET endpoints/types/com.vmware.csp.component.iaas.proxy.provider.api/default
[00:00:00.000] VMware.Cafe: [sub-thread-Id="8" context="" token=""] (3) Response: OK 0:00.026
[00:00:00.000] VMware.Cafe: [sub-thread-Id="1" context="" token=""] (4) POST services/<UUID>
[00:00:00.000] VMware.Cafe: [sub-thread-Id="7" context="" token=""] (4) Response: InternalServerError 0:01.780
[00:00:00.000] VMware.Cafe: [sub-thread-Id="9" context="" token=""] Json Error: (System exception.)
VMware.Cafe.JsonResponseException: System exception.
at VMware.Cafe.JsonRestClient.<HandleErrorResponse>d__2`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at DynamicOps.Common.Client.RestClient.<HandleErrorResponse>d__aa`1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at DynamicOps.Common.Client.RestClient.<PostResourceInternal>d__c4.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at DynamicOps.Common.Client.RestClient.<PostResource>d__8f.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at VMware.Cafe.Client.Registration.<RegisterWithServiceCatalog>d__3.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at VMware.VcacConfig.ServiceCatalogCommands.RegisterCatalogTypes.Execute(CommandLineParser parser)
Warning: Non-zero return code. Command failed.
  • Around the same time, the /var/log/vmware/vcac/catalina.out log of one of the VAs (if the VAs are load balanced) contains an error similar to:
[UTC:YYYY-MM-DD HH:MM:SS, Local:YYYY-MM-DD HH:MM:SS] vcac: [component="cafe:endpoint-configuration-service" priority="ERROR" thread="tomcat-http--4" tenant="vsphere.local" context="QzT5BZiA" parent="QzT5BZiA" token="JCHw90KN"] com.vmware.vcac.platform.rest.client.error.ResponseErrorHandler.handleRestError:113 - [Rest Error]: {Status code: 500}, {Error code: 50505} , {Error Source: null}, {Error Msg: System exception.}, {System Msg: org.apache.http.NoHttpResponseException: sc2-rdops-vm02-dhcp-73-203.eng.vmware.com:443 failed to respond}
[UTC:2018-11-02 02:56:37,501 Local:2018-11-02 02:56:37,501] vcac: [component="cafe:endpoint-configuration-service" priority="ERROR" thread="tomcat-http--4" tenant="vsphere.local" context="QzT5BZiA" parent="QzT5BZiA" token="JCHw90KN"] com.vmware.vcac.core.endpointconfiguration.controller.EndpointTypeController.update:233 - Endpoint type with id: [vSphere] update failed: System exception.
com.vmware.vcac.platform.rest.client.error.RestException: System exception.
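
To check quickly whether a catalina.out file contains this signature, a small script can scan it for the NoHttpResponseException message. The following is a minimal sketch, not a VMware tool: the function name and the regular expression are illustrative assumptions derived only from the sample line above, and other builds may word the message differently.

```python
import re
from typing import Iterable, List

# Pattern derived from the sample log line in this article (an assumption;
# it is not taken from any VMware documentation).
NO_RESPONSE = re.compile(r"NoHttpResponseException:\s+(\S+)\s+failed to respond")


def find_unresponsive_targets(lines: Iterable[str]) -> List[str]:
    """Return the host:port targets that failed to respond, in log order."""
    targets = []
    for line in lines:
        match = NO_RESPONSE.search(line)
        if match:
            targets.append(match.group(1))
    return targets


# Example usage on a VA (log path taken from this article):
# with open("/var/log/vmware/vcac/catalina.out", errors="replace") as log:
#     print(find_unresponsive_targets(log))
```

If the reported target is the IaaS web load balancer address, that is consistent with the cause described in this article.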

Environment

VMware vRealize Automation 7.5.x

Cause

  • The WAPI component of the primary IaaS website starts responding right after it is upgraded, so the primary web node initiates the RegisterCatalogTypes procedure in CAFE. This procedure relies on a healthy connection to IaaS through the IaaS web load balancer.
  • The subsequent call fails intermittently if it is made before the IaaS web load balancer health check monitor (/wapi/api/status/web) has polled the primary web node and updated its status to UP.
  • All secondary web nodes (if any) report the status "in maintenance" during the automatic IaaS upgrade.
With no web node marked UP yet, the call through the load balancer fails on the CAFE side, which causes the upgrade-server command to fail.
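
The timing dependency described above can be illustrated with a polling loop that waits until the health check URL answers through the load balancer before anything else is attempted. This is a minimal sketch under stated assumptions: the host name iaas-web-lb.example.com is hypothetical, treating HTTP 200 as "UP" is an assumption (the article does not state what the monitor expects), and the attempt count and delay are arbitrary.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on connection or HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are subclasses of OSError
        return False


def wait_until_up(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call probe() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example usage (hypothetical load balancer host name):
# url = "https://iaas-web-lb.example.com/wapi/api/status/web"
# if not wait_until_up(lambda: http_probe(url)):
#     print("Load balancer still does not report the web node as reachable")
```

The probe is passed in as a callable so the waiting logic can be exercised without a network. Environments with self-signed certificates would need additional TLS handling that this sketch does not cover.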

Resolution

To work around the issue:
  1. Log on to the primary IaaS web node and run the following commands one at a time, in the order shown. Run each command only after the previous one has succeeded.
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" RegisterCatalogTypes -v
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" UpgradeArchiveDayTo62 -v
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" UpgradeGovernance -v
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" UpgradeEndpoints -v
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" UpgradeVirtualMachineSecurity -v
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" UpgradeBugFixes -v
"C:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe" TurnUpgradeFlagOff -v
 
  2. After these commands complete successfully, the primary web node can be considered upgraded. Retrigger the automatic IaaS upgrade by running the ./upgrade script (in /usr/lib/vcac/tools/upgrade on the primary VA).
The primary web node is skipped and the automatic upgrade continues with the rest of the IaaS nodes.
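
The "run in order, continue only on success" rule from step 1 can be sketched as a small wrapper. This is an illustration only, assuming Python is available on the primary IaaS web node, which the article does not require. The helper names run_step and run_in_order are hypothetical. The executable path and command names are taken from the list above, and a non-zero return code is treated as failure, as in the All.log excerpt.

```python
import subprocess
from typing import Callable, Optional, Sequence

VCAC_CONFIG = (
    r"C:\Program Files (x86)\VMware\vCAC\Server"
    r"\Model Manager Data\Cafe\Vcac-Config.exe"
)

# Commands in the order given in this article.
STEPS = [
    "RegisterCatalogTypes",
    "UpgradeArchiveDayTo62",
    "UpgradeGovernance",
    "UpgradeEndpoints",
    "UpgradeVirtualMachineSecurity",
    "UpgradeBugFixes",
    "TurnUpgradeFlagOff",
]


def run_step(step: str) -> int:
    """Run one Vcac-Config.exe command with -v and return its exit code."""
    return subprocess.run([VCAC_CONFIG, step, "-v"]).returncode


def run_in_order(
    steps: Sequence[str], run: Callable[[str], int] = run_step
) -> Optional[str]:
    """Run steps sequentially; return the first failing step, or None if all succeed."""
    for step in steps:
        if run(step) != 0:
            return step  # stop here so later steps never run after a failure
    return None


# Example usage:
# failed = run_in_order(STEPS)
# print("All commands succeeded" if failed is None else f"Stopped at {failed}")
```

Stopping at the first failure keeps the sequence intact, so the failed command can be investigated and rerun before the remaining ones are attempted.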

Note:
The commands above are idempotent. If a command fails, check the catalina.out log for details and retry the command. If a command fails consistently and the failure cannot be resolved, revert the primary IaaS web node and the IaaS MS SQL database to their pre-upgrade snapshots, then retrigger the automatic IaaS upgrade.