Auto Deploy Cache Corruption When Using Multiple ESXi Images

Article ID: 390917

Products

VMware vSphere ESXi

Issue/Introduction

The Auto Deploy cache becomes corrupted when multiple ESXi images are used in a vCenter 8.0 environment.

  • ESXi hosts managed by Auto Deploy fail to boot 3-4 days after multiple ESXi images are added
  • Error messages similar to the following appear in /var/log/vmware/rbd/rbd-syslog.log:
Failed to repair the cache: Something went wrong while converting items to pxe profile:'NoneType' object has no attribute 'key'
  • Running Repair-DeployImageCache resolves the issue temporarily, but the problem recurs after 3-4 days
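
To confirm that a host boot failure matches this symptom, the log can be searched for the error signature. The following is a minimal sketch, not an official diagnostic: the function name `count_cache_repair_failures` is hypothetical, and the default path is the log file named above.

```shell
#!/bin/sh
# Count occurrences of the cache-repair failure signature in an Auto Deploy log.
# Assumption: the default path is the log file named in this article;
# pass a different path as the first argument to override it.
count_cache_repair_failures() {
    log_file="${1:-/var/log/vmware/rbd/rbd-syslog.log}"
    # -c prints the number of matching lines; -F treats the pattern as a fixed string.
    grep -cF "Failed to repair the cache" "$log_file"
}
```

Run `count_cache_repair_failures` with no argument on the vCenter Server Appliance to read the default log. A non-zero count suggests the cache repair has failed at least once.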

Environment

After upgrading to vCenter 8.0.x, when multiple ESXi images (7.x and 8.x) are added to Auto Deploy via PowerCLI, the Auto Deploy cache becomes corrupted after 3-4 days. This corruption prevents ESXi hosts from booting, and the cache must be repaired regularly with the Repair-DeployImageCache cmdlet.

Cause

The primary causes of this issue are:

  • The Auto Deploy cache size in the configuration file may not update correctly when changed through the vCenter UI after an upgrade from 7.0.x to 8.0.x
  • The default cache size (2 GB) is insufficient to handle multiple ESXi images, especially when mixing 7.x and 8.x images
  • The issue affects PowerCLI-created image profiles; UI-created profiles do not experience the same corruption
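
To see whether the configured limit is too small, the value in the configuration file can be compared with the space the cache currently uses. This is a rough sketch under stated assumptions: the function names are hypothetical, the `<maxSize>` element is assumed to sit on a single line as in the excerpt under Resolution, and the cache directory is assumed to be `/var/lib/rbd/cache`, the path that appears in the service log line shown in the Resolution section.

```shell
#!/bin/sh
# Print the <maxSize> value (in GB) from an Auto Deploy setup file.
# Assumption: the element is written on one line, e.g. <maxSize>2</maxSize>.
configured_max_size() {
    conf_file="${1:-/etc/vmware-rbd/autodeploy-setup.xml}"
    # -n with the p flag prints only lines where the substitution matched.
    sed -n 's:.*<maxSize>\([0-9][0-9]*\)</maxSize>.*:\1:p' "$conf_file" | head -n 1
}

# Print the space used by a cache directory, in KiB.
# Assumption: /var/lib/rbd/cache is the cache location on this appliance.
cache_usage_kib() {
    cache_dir="${1:-/var/lib/rbd/cache}"
    du -sk "$cache_dir" | awk '{print $1}'
}
```

If `cache_usage_kib` reports a value close to the configured size (1 GB is 1048576 KiB), the limit is a plausible contributor to the corruption.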

Resolution

  1. Manually edit the Auto Deploy configuration file to increase the cache size:

    a. SSH into the vCenter Server Appliance

    b. Open the Auto Deploy configuration file using vi or another text editor:

    vi /etc/vmware-rbd/autodeploy-setup.xml
    

    c. Locate the <maxSize> parameter under <defaultValues> and change its value from the default (likely 2) to a higher value (8 or more is recommended). The value is expressed in GB:

    <defaultValues>
        <port>6501</port>
        <portAdd>6502</portAdd>
        <maxSize>8</maxSize>
    </defaultValues>
    

    d. Save the file and exit the editor

    e. Restart the Auto Deploy service:

    service-control --restart rbd
    

    f. Verify that the cache size has been updated correctly:

    grep -rin /var/log/vmware/rbd -e "cacher starting"
    

    g. The output should include a line similar to:

    INFO:rbd_cached:cacher starting (/var/lib/rbd/cache, 8589934592)...
    

    The value 8589934592 is the cache limit in bytes (8 × 1024³ = 8 GB), confirming the increased cache size.

  2. Alternatives to modifying the configuration file:

    a. Use the vCenter UI to add ESXi images to Software Depots instead of using PowerCLI

    b. Use only one ESXi version at a time in Auto Deploy via PowerCLI

    c. Regularly run the Repair-DeployImageCache command if the above solutions cannot be implemented
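
The edit in step 1c and the check in step 1g can also be scripted. The sketch below is illustrative only: `set_max_size` and `cacher_limit_gib` are hypothetical helper names, the `<maxSize>` element is assumed to be on one line, and the log line format is assumed to match the sample in step 1g.

```shell
#!/bin/sh
# Set <maxSize> in an Auto Deploy setup file, keeping a .bak copy of the original.
# Usage: set_max_size /etc/vmware-rbd/autodeploy-setup.xml 8
set_max_size() {
    conf_file="$1"
    new_size="$2"
    # Refuse anything that is not a whole number.
    case "$new_size" in
        ''|*[!0-9]*) echo "maxSize must be a whole number" >&2; return 1 ;;
    esac
    cp -p "$conf_file" "$conf_file.bak" || return 1
    # Rewrite only the number between the maxSize tags; all other lines pass through.
    sed "s:<maxSize>[0-9]*</maxSize>:<maxSize>${new_size}</maxSize>:" \
        "$conf_file.bak" > "$conf_file"
}

# Convert the byte count in a "cacher starting" log line to GB (1024^3 bytes).
# Expects a line such as:
#   INFO:rbd_cached:cacher starting (/var/lib/rbd/cache, 8589934592)...
cacher_limit_gib() {
    bytes=$(printf '%s\n' "$1" |
        sed -n 's:.*cacher starting (.*, \([0-9][0-9]*\)).*:\1:p')
    # Fail if the line did not contain a byte count in the expected position.
    [ -n "$bytes" ] || return 1
    echo $((bytes / 1024 / 1024 / 1024))
}
```

After `set_max_size` runs, the service still has to be restarted with `service-control --restart rbd` as in step 1e. Passing the line found in step 1f to `cacher_limit_gib` should then print the new size.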

Additional Information

The issue appears to be related to how the cache is managed after an upgrade from vCenter 7.0.x to vCenter 8.0.x. The UI setting for cache size may not properly update the underlying configuration file.

The problem is more likely to occur when using both 7.x and 8.x images simultaneously via PowerCLI because:

  • ESXi 8.0 images are larger than 7.0 images due to unified depots
  • PowerCLI-created rules store VIBs differently than UI-created Software Depot entries

For related Auto Deploy booting issues see: Auto deploy is not booting hosts and web ui is not responding