Settings which can be tuned to prevent timeouts during upgrades in VMware Cloud Foundation

Article ID: 316978

Products

VMware Cloud Foundation

Issue/Introduction

The purpose of this article is to describe tunable settings that can be used to prevent timeouts during upgrades of VMware Cloud Foundation.

Symptoms:
When Deduplication/Compression is enabled on a cluster, hosts can take longer than normal to enter maintenance mode or to reboot. This can exceed the timers in LCM or in VMware Update Manager and cause LCM upgrades to fail. The settings below can be tuned to prevent these failures.

Environment

VMware Cloud Foundation 2.3.x
VMware Cloud Foundation 2.2.x

Resolution

  • vSAN/CLOMD Repair Delay: This is the most important timer to consider changing before an upgrade. The default repair delay is 1 hour; consider increasing it to at least 4 hours. If a host does not exit maintenance mode before the repair delay expires, vSAN Object Resynchronization will start.
Note: vSAN Object Resynchronization may take hours or even days to complete on a busy cluster with Deduplication/Compression enabled.
  1. Open an SSH session to each host in the cluster.
  2. Issue the following command to check the current setting:
esxcli system settings advanced list -o /VSAN/ClomRepairDelay
  3. Increase the setting by issuing a command similar to the following:
esxcli system settings advanced set -o /VSAN/ClomRepairDelay -i <value in minutes>

Note: For example, to set the Repair Delay to 4 hours (240 minutes), issue the following command:

esxcli system settings advanced set -o /VSAN/ClomRepairDelay -i 240
  4. Issue the following command to restart the clomd daemon so that the setting takes effect:
/etc/init.d/clomd restart
  • LCM Maintenance Mode Timeout: When an upgrade is performed through LCM, ESXi hosts have a default timeout of 4 hours to enter maintenance mode. This can be increased by following the steps in the article "VMware Cloud Foundation upgrade fails while putting an ESXi host into maintenance mode" (listed under Additional Information).
  • VUM Upgrade Task Timeout: In VMware Cloud Foundation 2.2 and higher, VMware Update Manager (VUM) is used to upgrade ESXi hosts. VUM has a built-in timer of 7200 seconds (2 hours) for upgrade tasks. This can be increased via the following steps:
    1. Open an SSH session to the vCenter Server Appliance VM.
    2. Open the /usr/lib/vmware-updatemgr/bin/vci-integrity.xml file in a text editor.
    3. Find the <HostUpgradeConfig> section and add the following entry to increase the timeout to 5 hours:
<RebootTaskTimeoutSec>18000</RebootTaskTimeoutSec>
 
Note: The following is an example of the relevant section:
 
<HostUpgradeConfig>
  <MinHostVersion>5.5.0</MinHostVersion>
  <PackageVersions>6.5.0</PackageVersions>
  <RebootTaskTimeoutSec>18000</RebootTaskTimeoutSec>
</HostUpgradeConfig>
    4. Save and close the file.
    5. Issue the following commands to restart the VUM service:
service-control --stop vmware-updatemgr
service-control --start vmware-updatemgr
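
For the vSAN repair delay steps above, the commands have to be repeated on every host in the cluster. The following is a minimal sketch of how they could be generated per host, not a supported tool. It only prints the commands so they can be reviewed before being run. The host names, the root login over SSH, and the helper names hours_to_minutes and print_repair_delay_cmds are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
# The host names below are placeholders, not values from this article.

# Convert a repair delay given in hours to the minutes that -i expects.
hours_to_minutes() {
    echo $(( $1 * 60 ))
}

# Print the check, set, and restart commands for one host.
# Assumes SSH access as root, which may differ in your environment.
print_repair_delay_cmds() {
    host="$1"
    minutes="$(hours_to_minutes "$2")"
    echo "ssh root@${host} esxcli system settings advanced list -o /VSAN/ClomRepairDelay"
    echo "ssh root@${host} esxcli system settings advanced set -o /VSAN/ClomRepairDelay -i ${minutes}"
    echo "ssh root@${host} /etc/init.d/clomd restart"
}

# Generate the commands for a 4 hour repair delay on each placeholder host.
for host_name in esxi-01.example.local esxi-02.example.local; do
    print_repair_delay_cmds "$host_name" 4
done
```

Printing rather than executing keeps the sketch safe to run anywhere; the output can be pasted into a session once the host list and the delay value have been confirmed.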

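For the VUM timeout steps above, the value of RebootTaskTimeoutSec is given in seconds, so 5 hours corresponds to 18000. The sketch below shows the conversion and one possible way to read the value back from the file after editing. The helper names are hypothetical, and the sed expression assumes the element sits on a single line as in the example section shown above.

```shell
#!/bin/sh
# Convert a timeout in hours to the seconds used by RebootTaskTimeoutSec.
hours_to_seconds() {
    echo $(( $1 * 3600 ))
}

# Print the XML entry for a timeout given in hours.
reboot_timeout_entry() {
    echo "<RebootTaskTimeoutSec>$(hours_to_seconds "$1")</RebootTaskTimeoutSec>"
}

# Print the timeout (in seconds) found in the given file, if any.
# Assumes the opening and closing tags are on the same line.
current_reboot_timeout() {
    sed -n 's:.*<RebootTaskTimeoutSec>\([0-9][0-9]*\)</RebootTaskTimeoutSec>.*:\1:p' "$1"
}

# Entry for a 5 hour timeout, matching the example in the steps above.
reboot_timeout_entry 5
```

On the vCenter Server Appliance, calling current_reboot_timeout with the path /usr/lib/vmware-updatemgr/bin/vci-integrity.xml after saving the file would be one way to confirm that the edit is in place before restarting the VUM service.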

Additional Information

VMware Cloud Foundation upgrade fails while putting an ESXi host into maintenance mode

Impact/Risks:
Always make a backup copy of configuration files before making any changes.
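
As one possible way to take such a backup, the sketch below copies a file to a timestamped name in the same directory. The function name backup_config and the .bak.<timestamp> naming are assumptions made for illustration, not a convention from this article.

```shell
#!/bin/sh
# Copy a file to <name>.bak.<timestamp> in the same directory,
# preserving mode and timestamps, and print the backup path on success.
backup_config() {
    backup_path="$1.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$1" "$backup_path" && echo "$backup_path"
}

# Example (run on the vCenter Server Appliance before editing the file):
# backup_config /usr/lib/vmware-updatemgr/bin/vci-integrity.xml
```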