Remediation failed with 'Failed to reboot Host' and Host is not connected error while upgrading ESXi hosts through vLCM

Article ID: 423010

Products

VMware vSphere ESXi VMware vCenter Server

Issue/Introduction

  • When upgrading ESXi hosts through vLCM by selecting 'REMEDIATE ALL', the host does not reboot or exit maintenance mode even after the image has been applied.

  • The task fails with 'Remediation failed' and 'Failed to reboot host' because the host is not connected, or with the following message:

    Failed to remediate firmwares list: Lenovo Storage Linux Firmware Maintenance Mode OS is in booting progress or previous Maintenance Mode OS exit abnormally. You can use command "./OneCli restorebmu --bmc <BMC connection information>" to force finish it. Please collect BMC service data and LXCI log if you want to request Lenovo Service Support.

 

  • The key indicator of this issue is the message 'Failed to remediate firmwares list: Lenovo...', highlighted in the red box in the screenshot.

  • The following log entries appear in vmware-vum-server-x.log when the issue occurs.

    Log location: /var/log/vmware/vmware-updatemgr/vum-server/vmware-vum-server-x.log

    [DATE/TIME] info vmware-vum-server[938890] [Originator@6876 sub=Hsl::RemediationManager opID=OPERATION_ID] [RemediationManager 386] ACTION [POST_IMAGE_UPDATE] was invoked on HSM [com.lenovo.hsm]. Task ID is [TASK_ID].

    [DATE/TIME] info vmware-vum-server[938890] [Originator@6876 sub=Hsl::TaskManager opID=OPERATION_ID] [TaskManager 359] Task is still in progress on hsm: com.lenovo.hsm for taskId: TASK_ID is RUNNING.. Waiting.

    [DATE/TIME] info vmware-vum-server[938890] [Originator@6876 sub=ServiceProvider opID=OPERATION_ID] [EmbeddedPyServiceProvider 1834] HSM Task Info JSON String: {"id": "TASK_ID", "description": "", "messages": ["Failed to remediate firmwares list:\n  Lenovo RAID Adapter Linux Firmware:BMC RDOC space not enough, Please remove other mounted device from BMC's RDOC.\nIntel Networking Adapter/Device Linux Firmware:BMC RDOC space not enough, Please remove other mounted device from BMC's RDOC.\nLenovo System UEFI/BIOS Firmware:Update doesn't take effect after successful flash. Please collect BMC service data and LXCI log if you want to request Lenovo Service Support."], "action": "POST_IMAGE_UPDATE", "startTime": "DATE/TIME", "status": "FAILED", "hosts": ["host-ID"], "estimatedTimeRemaining": 0, "progress": 100}

    [DATE/TIME] info vmware-vum-server[938890] [Originator@6876 sub=Hsl::TaskManager opID=OPERATION_ID] [TaskManager 406] Task: TASK_ID on hsm: com.lenovo.hsm is FAILED

    [DATE/TIME] info vmware-vum-server[938890] [Originator@6876 sub=RemediateClusterTask opID=OPERATION_ID] [ApplyHelpers 1245] Updating status with failure. -originator = vSphere Lifecycle Manager - retriable = false

    [DATE/TIME] error vmware-vum-server[938890] [Originator@6876 sub=HostOps opID=OPERATION_ID] [HostOps 672] Host(ID: host-ID, Name: HOST_FQDN) is not connected
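To pick the failed HSM task out of a large log, the entries above can be filtered with a short script. The following is a minimal sketch, not a VMware or Lenovo tool: it assumes each "HSM Task Info JSON String:" payload sits on a single line of the log, as in the excerpt above, and the function names (extract_hsm_failures, summarize, print_failures) are hypothetical names chosen for this illustration.

```python
import json

# Marker text taken from the log excerpt above. Assumption: the JSON payload
# follows the marker on the same line of the log file.
MARKER = "HSM Task Info JSON String:"


def extract_hsm_failures(lines):
    """Return the parsed HSM task payloads whose status is FAILED."""
    failures = []
    for line in lines:
        _, found, payload = line.partition(MARKER)
        if not found:
            continue  # not an HSM task info line
        try:
            task = json.loads(payload.strip())
        except json.JSONDecodeError:
            # Payload is truncated or wrapped across lines; skip it.
            continue
        if task.get("status") == "FAILED":
            failures.append(task)
    return failures


def summarize(task):
    """Split the task's messages into one entry per line of text."""
    entries = []
    for message in task.get("messages", []):
        entries.extend(part.strip() for part in message.split("\n") if part.strip())
    return entries


def print_failures(path):
    """Print the ID, action, hosts, and messages of each failed HSM task."""
    with open(path, errors="replace") as log_file:
        for task in extract_hsm_failures(log_file):
            print(task.get("id"), task.get("action"), task.get("hosts"))
            for entry in summarize(task):
                print("  " + entry)
```

Calling print_failures with the path of a copy of vmware-vum-server-x.log lists each failed task followed by its per-component firmware messages, which is the information Lenovo support is likely to ask for.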

Environment

vCenter 8.X

Cause

The firmware update performed by the Lenovo Hardware Support Manager (HSM) failed during the POST_IMAGE_UPDATE action.

When any host update fails, the cluster update is aborted.

Resolution

Contact Lenovo support to determine why the firmware remediation is failing. As the error message notes, collect the BMC service data and the LXCI log before opening the request.