This KB provides instructions to work around update failures caused due to
The SDDC Manager update to VCF 4.5.0.0 fails at the "VMware Cloud Foundation Service and Platform Upgrades" step. The following error is reported in the SDDC Manager UI:
or
The log file /var/log/vmware/capengine/cap-update/workflow.log indicates "Task validate failed" due to unexpected free space in a volume group.
(OR)
Check the below two log files
for errors in reclaiming snapshot disks (example error messages below)
Task Failed Error
2022/10/31 09:19:49.463490 validate.go:99: Debug: vgname:[data_vg] actualVFreeSize: [24996] vFreeSize:[26214] toleranceAllowed:[3932]
2022/10/31 09:19:49.527247 validate.go:99: Debug: vgname:[lcmmount_vg] actualVFreeSize: [124568] vFreeSize:[104857] toleranceAllowed:[15728]
2022/10/31 09:19:49.527298 progress.go:11: Validate failed. VFree size of the volume group lcmmount_vg mismtaches the expectation. Actual: [124568] Expected: [104857].
2022/10/31 09:19:49.527490 task_progress.go:24: Validate failed. VFree size of the volume group lcmmount_vg mismtaches the expectation. Actual: [124568] Expected: [104857].
2022/10/31 09:19:49.556785 workflow_manager.go:198: Task validate failed. Error: Validate failed. VFree size of the volume group lcmmount_vg mismtaches the expectation. Actual: [124568] Expected: [104857].
2022/10/31 09:19:49.556950 workflow_manager.go:138: Stopping workflow execution as task validate failed

2022/11/03 21:12:26.914537 reclaimvfree.go:242: Executing command: vgreduce data_vg /dev/sdg1
2022/11/03 21:12:27.014444 reclaimvfree.go:253: Executing command: pvremove -y -ff /dev/sdg1
2022/11/03 21:12:27.126447 reclaimvfree.go:264: Executing command: parted -s -a opt /dev/sdg rm 1
2022/11/03 21:12:27.167333 progress.go:11: Reclaimed snapshot /dev/sdg1
2022/11/03 21:12:27.167401 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sdg2
2022/11/03 21:12:27.167730 task_progress.go:24: Reclaimed snapshot /dev/sdg1
2022/11/03 21:12:27.286985 reclaimvfree.go:253: Executing command: pvremove -y -ff /dev/sdg2
2022/11/03 21:12:27.374610 reclaimvfree.go:264: Executing command: parted -s -a opt /dev/sdg rm 2
2022/11/03 21:12:27.400884 progress.go:11: Reclaimed snapshot /dev/sdg2
2022/11/03 21:12:27.401049 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sdg2
2022/11/03 21:12:27.401154 task_progress.go:24: Reclaimed snapshot /dev/sdg2
2022/11/03 21:12:27.478621 progress.go:11: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
2022/11/03 21:12:27.478859 task_progress.go:24: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
2022/11/03 21:12:27.491478 workflow_manager.go:198: Task reclaim-vfree failed. Error: Failed to reclaim snapshot disk /dev/sdg2 from VG lcmmount_vg. Error : exit status 5
2022/11/03 21:12:27.491630 workflow_manager.go:138: Stopping workflow execution as task reclaim-vfree failed

2022/11/03 20:40:06.100186 reclaimvfree.go:242: Executing command: vgreduce data_vg /dev/sdg1
2022/11/03 20:40:06.292377 reclaimvfree.go:253: Executing command: pvremove -y -ff /dev/sdg1
2022/11/03 20:40:06.444020 reclaimvfree.go:264: Executing command: parted -s -a opt /dev/sdg rm 1
2022/11/03 20:40:06.538938 progress.go:11: Reclaimed snapshot /dev/sdg1
2022/11/03 20:40:06.539027 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sde /dev/sdg2
2022/11/03 20:40:06.539239 task_progress.go:24: Reclaimed snapshot /dev/sdg1
2022/11/03 20:40:06.772812 progress.go:11: Failed to reclaim snapshot disk /dev/sde /dev/sdg2 from VG lcmmount_vg. Error : exit status 126
2022/11/03 20:40:06.773629 task_progress.go:24: Failed to reclaim snapshot disk /dev/sde /dev/sdg2 from VG lcmmount_vg. Error : exit status 126
2022/11/03 20:40:06.819900 workflow_manager.go:198: Task reclaim-vfree failed. Error: Failed to reclaim snapshot disk /dev/sde /dev/sdg2 from VG lcmmount_vg. Error : exit status 126
2022/11/03 20:40:06.819970 workflow_manager.go:138: Stopping workflow execution as task reclaim-vfree failed

2022/11/07 09:35:18.875054 reclaimvfree.go:242: Executing command: vgreduce lcmmount_vg /dev/sdc /dev/sdg2
2022/11/07 09:35:18.875229 task_progress.go:24: Reclaimed snapshot /dev/sdg2
2022/11/07 09:35:18.941316 progress.go:11: Failed to reclaim snapshot disk /dev/sdc /dev/sdg2 from VG lcmmount_vg. Error : exit status 127
2022/11/07 09:35:18.941490 task_progress.go:24: Failed to reclaim snapshot disk /dev/sdc /dev/sdg2 from VG lcmmount_vg. Error : exit status 127
2022/11/07 09:35:18.959857 workflow_manager.go:198: Task reclaim-vfree failed. Error: Failed to reclaim snapshot disk /dev/sdc /dev/sdg2 from VG lcmmount_vg. Error : exit status 127
2022/11/07 09:35:18.959911 workflow_manager.go:138: Stopping workflow execution as task reclaim-vfree failed
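The validate failure in the first sample can be reproduced by hand from the Debug entries. The sketch below is not part of the KB or of the workaround script. It assumes the check passes when the absolute difference between actualVFreeSize and vFreeSize is no larger than toleranceAllowed, a rule inferred from the sample log rather than from VMware documentation. The function name check_vfree is hypothetical.

```shell
#!/bin/sh
# Reads cap-update workflow.log lines on stdin. For every validate.go
# Debug entry it prints the volume group, the absolute difference between
# the actual and expected free size, and whether that difference fits
# inside toleranceAllowed.
# ASSUMPTION: pass/fail rule is |actual - expected| <= toleranceAllowed,
# inferred from the sample log only.
check_vfree() {
    sed -n 's/.*vgname:\[\([^]]*\)] actualVFreeSize: \[\([0-9]*\)] vFreeSize:\[\([0-9]*\)] toleranceAllowed:\[\([0-9]*\)].*/\1 \2 \3 \4/p' |
    awk '{
        diff = $2 - $3
        if (diff < 0) diff = -diff
        verdict = (diff <= $4) ? "within tolerance" : "outside tolerance"
        printf "%s diff=%d tolerance=%d %s\n", $1, diff, $4, verdict
    }'
}

# Usage on the appliance:
#   check_vfree < /var/log/vmware/capengine/cap-update/workflow.log
```

For the sample above, data_vg differs from the expectation by 1218 against a tolerance of 3932, while lcmmount_vg differs by 19711 against a tolerance of 15728, which matches the volume group named in the "Validate failed" message.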
The presence of multiple PVs in a volume group causes this failure. To assert this,
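One possible way to check for multiple PVs per volume group is sketched below. This is not the KB's original check. It assumes the standard LVM `pvs` command is available on the appliance and is run as root. The helper name vgs_with_multiple_pvs is hypothetical.

```shell
#!/bin/sh
# Reads "vg_name pv_name" pairs on stdin (one PV per line) and prints each
# volume group that is backed by more than one physical volume, followed
# by its PVs. Lines without a VG name (unassigned PVs) are ignored.
vgs_with_multiple_pvs() {
    awk 'NF == 2 { count[$1]++; members[$1] = members[$1] " " $2 }
         END { for (vg in count) if (count[vg] > 1) print vg ":" members[vg] }' |
    sort
}

# Usage on the appliance (as root):
#   pvs --noheadings -o vg_name,pv_name | vgs_with_multiple_pvs
```

Any volume group printed by this helper has more than one PV. An empty result means every volume group is backed by a single PV.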
Currently there is no resolution. We are working on this.
Pre-requisite:
Procedure:
Assign execute permission to the script using the following command
cd /home/vcf
chmod +x update_failure_workaround.sh
Run the below command to identify the Snapshot Device Name
grep "Configured" /var/log/vmware/capengine/cap-required-hardware-addition/workflow.log | grep "/storage/lvm_snapshot"
example output:
Configured disk "/dev/sdg" in the appliance and mounted on /storage/lvm_snapshot
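To avoid copying the device name by hand, it can be pulled out of the matching log line. The sketch below is not part of the KB. It assumes the log line has the same shape as the example output above, and the function name snapshot_device is hypothetical.

```shell
#!/bin/sh
# Reads the cap-required-hardware-addition workflow.log on stdin and prints
# the device named in the most recent
# 'Configured disk "<device>" ... /storage/lvm_snapshot' entry.
# ASSUMPTION: the log line format matches the example output in this KB.
snapshot_device() {
    grep "Configured" | grep "/storage/lvm_snapshot" |
        sed -n 's/.*Configured disk "\([^"]*\)".*/\1/p' |
        tail -n 1
}

# Usage on the appliance:
#   snapshot_device < /var/log/vmware/capengine/cap-required-hardware-addition/workflow.log
```

Confirm that the printed value is the expected snapshot disk before passing it to the workaround script.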
Perform the cleanup using the following command
./update_failure_workaround.sh <Snapshot Device>
example usage:
./update_failure_workaround.sh /dev/sdg
example output (check for "Success" on the last line):
INFO Remove Snapshots if present....
INFO Mount all filesystems mentioned in fstab
INFO lvm_snapshot is mounted successfully
INFO Cleanup Done.
INFO altered cap update workflows
INFO Success
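The check for "Success" can also be scripted. The sketch below is not part of the KB. It assumes the script's final non-empty output line contains "INFO Success", as in the example output. The function name workaround_succeeded and the file /tmp/workaround.out are hypothetical.

```shell
#!/bin/sh
# Reads the workaround script's output on stdin and returns 0 only if the
# last non-empty line contains "INFO Success".
# ASSUMPTION: the script ends its output as shown in the KB's example.
workaround_succeeded() {
    last=$(grep -v '^[[:space:]]*$' | tail -n 1)
    case "$last" in
        *"INFO Success"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on the appliance (the /tmp path is only an illustration):
#   ./update_failure_workaround.sh /dev/sdg | tee /tmp/workaround.out
#   workaround_succeeded < /tmp/workaround.out && echo "cleanup OK"
```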
Once the update finishes, remove the workaround script by running the below command
rm /home/vcf/update_failure_workaround.sh