This KB article addresses a failure to upgrade VCF Automation (VCFA) from 9.0.0 to 9.0.1. The upgrade task in Fleet Manager (Ops LCM) fails repeatedly.
Fleet Manager logs suggest that the 2-hour timeout is reached. Retrying the upgrade does not resolve the issue.
VCF Automation 9.0.x
An underlying issue with the logging system causes the /var/log/messages file on the VCFA cluster VMs to grow significantly. Over time, this file can reach tens of gigabytes (up to 55 GB).
The growth occurs because the size of this file is not managed and audit logging generates a high volume of log entries.
The 9.0.1 release addresses this by capping the file size at 50 MB.
Note: The upgrade process may still be running and retrying within the VCFA cluster. In that case, performing the remediation below allows the upgrade to continue automatically.
Log in to each of the VCFA node VMs over SSH using the vmware-system-user account, then run the following commands on each node:
ssh vmware-system-user@<NODE IP>
sudo su -
ls -lh /var/log/messages
-rw-r----- 1 root root 4.7M Jan 13 14:32 /var/log/messages
truncate /var/log/messages -s 0
ls -lh /var/log/messages
-rw-r----- 1 root root 0 Jan 13 14:32 /var/log/messages
Retry the upgrade.
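The check-and-truncate steps above can also be wrapped in a small helper, so the file is emptied only when it is actually oversized. This is a minimal sketch and not part of the official procedure: the function name truncate_if_large and the 52428800-byte threshold (50 MB, mirroring the cap introduced in 9.0.1) are illustrative assumptions. It relies only on standard wc and the same truncate command used above, and must be run as root on each node.

```shell
#!/bin/sh
# Hypothetical helper (not from the KB): empty a log file only when it
# exceeds a size limit given in bytes.
truncate_if_large() {
    file="$1"
    limit_bytes="$2"

    # wc -c reports the size in bytes; fail if the file cannot be read.
    size=$(wc -c < "$file") || return 1
    # Normalize the value (some wc implementations pad it with spaces).
    size=$((size))

    if [ "$size" -gt "$limit_bytes" ]; then
        # Same operation as the manual step: shrink the file to 0 bytes.
        truncate -s 0 "$file"
        echo "truncated $file (was $size bytes)"
    else
        echo "left $file unchanged ($size bytes)"
    fi
}

# Example usage, assuming a 50 MB threshold (50 * 1024 * 1024 bytes):
# truncate_if_large /var/log/messages 52428800
```

Truncating in place, rather than deleting the file, keeps the existing file and its permissions intact for any process that still has it open.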