HCX - Service Mesh Upgrade Failure


Article ID: 321642


Products

VMware HCX, VMware Cloud on AWS

Issue/Introduction

This article describes HCX Service Mesh upgrade failures and how to recover from them.


An upgrade workflow on Service Mesh (SM) fleet appliances may fail for the following reasons.
The exceptions below can be seen in the source/target HCX App Engine logs during the failure:

1. Operation not allowed as appliance has ongoing Protections.


Exceptions in the Logs:
Service Mesh modification failed. Process Service Mesh failed. Redeploy check failed. Appliance Name: Test-IX-I1, Blocking Service: VshpereReplicationService,Reason:Operation not allowed as appliance has ongoing Protections.

2. The provided network mapping between OVF networks and the system network is not supported by any host.



Exceptions in the Logs:
DeployAppliance failed, errorCode:null. stacktrace:null, errorMessage:Interconnect Service Workflow OvfUpload failed. Error: Errors encountered during ImportSpec creation: com.vmware.vim.binding.vim.fault.OvfNetworkMappingNotSupported:The provided network mapping between OVF networks and the system network is not supported by any host.. Cause: null.

Location of App Engine log:
  • HCX Manager : /common/log/admin/app.log
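As a quick triage aid, the two failure signatures above can be matched in app.log with standard shell tools. The sketch below is illustrative only: the function name and output labels are not part of HCX, and the patterns are taken from the exception text shown above.

```shell
#!/bin/sh
# Classify a Service Mesh upgrade failure from the HCX App Engine log.
# Matches the two exception signatures described in this article.
classify_sm_failure() {
  log="$1"   # path to app.log, e.g. /common/log/admin/app.log
  if grep -q "ongoing Protections" "$log"; then
    echo "dr-protections"        # Cause 1: active DR protections on the IX appliance
  elif grep -q "OvfNetworkMappingNotSupported" "$log"; then
    echo "ovf-network-mapping"   # Cause 2: OVF/DVPG network backing mismatch
  else
    echo "unknown"
  fi
}
```

Usage on the HCX Manager: `classify_sm_failure /common/log/admin/app.log`.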

Cause

1. Issues related to ongoing DR protections:

While DR protections are active on a given IX appliance, the standard upgrade workflow for the IX appliance may not be serviced if sync or delta-sync operations are ongoing between the source and target sites for a VM or group of VMs.
Note: IX appliance upgrade is supported during active DR protections, but ideally the initial sync should already have completed. This is not a mandatory step.


2. Issues related to OvfNetworkMappingNotSupported:

There are several situations in which the "OvfNetworkMappingNotSupported" exception can be seen during a Service Mesh/appliance deployment or upgrade workflow, as highlighted below:
  • Under normal circumstances, the clusters/ESXi hosts configured as the deployment cluster in the HCX Connector/Cloud Compute Profile must have a DVPG network backing association at the vCenter level for the corresponding Uplink/Management/vSR/vMotion segments specified in the HCX Network Profile.
Note: If multiple clusters are added to the HCX Connector/Cloud Compute Profile, network backings should be validated for all ESXi hosts in every cluster specified in the deployment cluster.
  • If the network backing on the cloud side was migrated from N-VDS to C-VDS, the OVF upload process may be impacted because the fleet appliances carry a different network backing configuration after the C-VDS upgrade on the Connector/Cloud side.
IMPORTANT: This mainly applies to environments where N-VDS has been migrated to C-VDS in NSX-T. In general, vCenter uses an Opaque Network for N-VDS-backed ports, whereas for C-VDS the backing is the same as a DVPG.
Note: This condition also applies when the on-premises/source HCX environment runs NSX-T and a similar transition of the network backing ports has happened there.
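To validate the first condition, the network names visible to each host in the deployment cluster can be compared against the DVPGs named in the HCX Network Profiles. The helper below is a hypothetical sketch, not an HCX tool: it assumes you have exported each host's visible network names to a file (for example via `govc` or from the vCenter UI), and the function name and output format are illustrative.

```shell
#!/bin/sh
# check_host_backing FILE DVPG...
#   FILE    - one visible network name per line for a given ESXi host
#   DVPG... - the portgroup names used in the HCX Network Profiles
# Prints "MISSING: <name>" for each required DVPG the host cannot see.
check_host_backing() {
  nets="$1"; shift
  for pg in "$@"; do
    # -x: match the whole line, -F: literal (portgroup names may contain dots)
    grep -qxF "$pg" "$nets" || echo "MISSING: $pg"
  done
}
```

Run it once per host in every cluster that is part of the deployment cluster; any MISSING output points at the backing gap that can trigger OvfNetworkMappingNotSupported.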

Resolution

1. In environments with running DR protections, choose the "force redeploy" option to upgrade the IX appliance.

2. To update the correct DVPG information on the HCX cloud side, first perform an SM "resync" operation from the connector/source side; this updates the correct network backing DVPG information on the target side. After that, the fleet appliances can be upgraded using the standard procedure.
Note: Any redeploy/upgrade/sync event should be triggered from the on-premises/Connector/source side.

IMPORTANT: DO NOT perform a "force sync" operation on the target IX/NE appliances, as it will not update the DVPG or network backing information. The force sync operation is mainly intended to incorporate certain configuration changes specific to an individual fleet appliance or group of fleet appliances, not to the SM as a whole.

3. If the above does not work, disable the WAN Optimization appliance, attempt the upgrade again, and then re-enable WAN Optimization.


Workaround:
NA

Additional Information

Impact/Risks:
1. When DR protections are running on the IX appliance, only the IX appliance upgrade workflow is impacted.

2. If the network backing on the Connector/Cloud side is missing on the ESXi hosts/cluster, the upgrade workflow for all IX/NE/SGW/SDR appliances is impacted.

3. There is NO impact to the WAN-OPT appliance, as it is not part of the SM upgrade workflow.