The /var/log/vrlcm/vmware_vrlcm.log file on the Fleet Management appliance shows a message similar to:
YYYY-MM-DDTHH:MM:SS.XXXZ ERROR vrlcm[1614] [pool-3-thread-9] [c.v.v.l.v.p.t.VmspMountDiskTask] -- Unable to mount disk due to exception:
com.vmware.vrealize.lcm.vmsp.common.exception.RestClientException: API failed with response: {"statusCode":218,"responseMessage":"status code 218","outputData":"{\"error\":\"job failed: BackoffLimitExceeded; Job has reached the specified backoff limit\",\"statusCode\":218,\"output\":\"Warning: Permanently added '###.###.###.###' (ED25519) to the list of known hosts.\\nWelcome to Photon 5.0 (\\\\m) - Kernel \\\\r (\\\\l)\\nnode/<VCFA node> labeled\\nSOURCE_VM_NAME is <Aria Automation VM name>\\ngovc: please specify a datacenter\\nWarning: Permanently added '###.###.###.###' (ED25519) to the list of known hosts.\\nWelcome to Photon 5.0 (\\\\m) - Kernel \\\\r (\\\\l)\\nnode/<VCFA node> not labeled\\nSOURCE_VM_NAME is <Aria Automation VM name> \\ngovc: please specify a datacenter\\n\",\"running\":false,\"completedAt\":\"YYYY-MM-DDTHH:MM:SSZ\"}","token":null,"contentLength":649,"allHeaders":null}
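To confirm that a failed upgrade matches this symptom, the log can be searched for the two distinctive strings in the message above. The following is a minimal sketch, not a supported tool: the function name has_datacenter_error is hypothetical, and it assumes the log is at the path named above unless another path is passed as the first argument.

```shell
# Hypothetical helper: returns 0 when the log contains both the
# VmspMountDiskTask failure and the govc datacenter error quoted above.
has_datacenter_error() {
    log_file="${1:-/var/log/vrlcm/vmware_vrlcm.log}"
    grep -q 'VmspMountDiskTask' "$log_file" &&
        grep -q 'govc: please specify a datacenter' "$log_file"
}

# Example call on the Fleet Management appliance:
#   has_datacenter_error && echo "Symptom found in vmware_vrlcm.log"
```

If the function returns a non-zero status, the failure is likely caused by something else and the steps in this article may not apply.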
VMware Aria Automation 8.18.1
VCF Automation 9.0.0
This issue will be resolved in a future release of VCF Automation. Until then, use the workaround below to proceed with the upgrade.
Workaround
1. SSH to the VCF-OPS Lifecycle Manager appliance (Fleet Management appliance) using the root account.
2. Change to the /tmp directory:
cd /tmp
3. Copy the kubeconfig file from any of the VMSP nodes over SSH, using the vmware-system-user account:
ssh vmware-system-user@<VMSP VM IP> "sudo cat /etc/kubernetes/admin.conf" > kubeconfig.conf
4. Set the KUBECONFIG environment variable:
export KUBECONFIG=/tmp/kubeconfig.conf
5. Disable drift detection on the hookset:
kubectl label hookset vcenter-disk-mount -n vmsp-platform helm.toolkit.fluxcd.io/driftDetection=disabled
6. Edit the hookset for the vCenter disk mount API:
kubectl edit hookset vcenter-disk-mount -n vmsp-platform
In the editor that opens, perform the following tasks:
a. Find the section .spec.hooks.k8sJobs.mount and add the following entry to the list of environment variables under env (use spaces, not tabs, for indentation):
- name: GOVC_DATACENTER
  value: '{{ k8sLookup "PackageDeployment.v1alpha1.releases.vmsp.vmware.com" "vmsp-platform" "vmsp-platform" | jsonPath ".spec.values.provider.vsphere.datacenter" }}'
b. Add the same variable to env under the .spec.hooks.k8sJobs.unmount section as well.
c. Save the changes and exit the editor: press Esc, then type :wq! and press Enter.
7. Retrigger the failed upgrade from VCF Operations UI > Fleet Management > Lifecycle > VCF Management > Tasks.
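Before retriggering the upgrade, it can help to check that the edits in steps 5 and 6 were saved. The following is a minimal sketch, assuming KUBECONFIG is still exported as in step 4; the function name count_govc_datacenter and the file /tmp/hookset.yaml are hypothetical names chosen for illustration.

```shell
# Hypothetical helper: counts the GOVC_DATACENTER env entries in a saved
# hookset manifest. Prints 0 when none are found.
count_govc_datacenter() {
    grep -c 'name: GOVC_DATACENTER' "$1" || true
}

# Example calls on the Fleet Management appliance:
#   kubectl get hookset vcenter-disk-mount -n vmsp-platform --show-labels
#   kubectl get hookset vcenter-disk-mount -n vmsp-platform -o yaml > /tmp/hookset.yaml
#   count_govc_datacenter /tmp/hookset.yaml
```

The first command should list the helm.toolkit.fluxcd.io/driftDetection=disabled label. The count is expected to be 2 if the variable was added under both the mount and unmount sections; a lower value suggests one of the edits was not saved.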