This is a known issue affecting VMware Cloud Foundation. This issue is less likely to occur in versions 2.2 and later.
Workaround:
To work around the issue, increase the amount of time that the LCM process will wait for a host to complete its upgrade operation.
For VMware Cloud Foundation 2.1.x:
- Log in to the VRM virtual machine as the root user.
- Make a copy of the /home/vrack/lcm/lcm-app/conf/application-prod.properties file:
cp /home/vrack/lcm/lcm-app/conf/application-prod.properties /tmp/
- Open the /home/vrack/lcm/lcm-app/conf/application-prod.properties file using a text editor:
vi /home/vrack/lcm/lcm-app/conf/application-prod.properties
- Find the line that reads:
lcm.upgrade.esx.maxallowed.increase.endTime=14400000
and change 14400000 to a higher value.
Note: 14400000 milliseconds corresponds to four hours.
- Save and close the file.
- Restart the lcm-init service:
service lcm-init restart
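When choosing the higher value, it can help to convert a target number of hours into the unit the property uses. The following is a minimal sketch, assuming the value is in milliseconds (which is what 14400000 for four hours implies). The helper name hours_to_ms is hypothetical, and six hours is only an illustrative figure, not a recommended setting.

```shell
# Hypothetical helper: convert a timeout in whole hours to milliseconds.
# Assumption: the property value is in milliseconds (14400000 = 4 hours).
hours_to_ms() {
    echo $(( $1 * 60 * 60 * 1000 ))
}

hours_to_ms 4   # prints 14400000, the default value
hours_to_ms 6   # prints 21600000, an example of a higher value
```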
For VMware Cloud Foundation 2.2.x and higher:
- Log in to the SDDC Manager Controller virtual machine as the root user.
- Make a copy of the /home/vrack/lcm/lcm-app/conf/application-evo.properties file:
cp /home/vrack/lcm/lcm-app/conf/application-evo.properties /tmp/
- Open the /home/vrack/lcm/lcm-app/conf/application-evo.properties file using a text editor:
vi /home/vrack/lcm/lcm-app/conf/application-evo.properties
- Find the line that reads:
esx.upgrade.timeout=14400000
and change 14400000 to a higher value.
Note: 14400000 milliseconds corresponds to four hours.
- Save and close the file.
- Restart the lcm service:
systemctl restart lcm
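The manual edit in the steps above could also be scripted instead of using a text editor. The following is a rough sketch only, not a documented VMware procedure. It assumes GNU sed (for the -i option) and that the property appears once, at the start of a line. The helper name set_property is hypothetical, and the example runs against a scratch file rather than the live configuration.

```shell
# Hypothetical helper: set KEY=VALUE in a properties file, in place.
# Assumptions: GNU sed is available and KEY starts at the beginning of a line.
set_property() {
    file=$1; key=$2; value=$3
    # Escape the dots in the key so they match literally in the regex.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    sed -i "s/^${pattern}=.*/${key}=${value}/" "$file"
}

# Demonstration on a scratch file, not on application-evo.properties itself.
demo=$(mktemp)
printf 'esx.upgrade.timeout=14400000\n' > "$demo"
set_property "$demo" esx.upgrade.timeout 21600000
cat "$demo"   # prints esx.upgrade.timeout=21600000
```

If a helper like this were used on the real file, the backup copy and the service restart from the steps above would still apply.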
Note: The preceding steps may not resolve the issue, and manual intervention on the affected ESXi host(s) may be required. It may be necessary to reboot the host.