DPM fails to wake up hosts from standby mode in vSphere 8.0.x despite high cluster resource usage

Article ID: 406334

Products

VMware vCenter Server

Issue/Introduction

  • The Distributed Power Management (DPM) feature in vCenter Server might fail to move hosts out of standby mode despite high resource usage in the cluster.
  • vSphere Client shows error messages such as:

Cannot perform power operation on host. DRS cannot move <hostname> out of standby mode 

and:

Alarm 'Exit standby error' on <hostname> triggered by event 451990711 'DRS cannot move <hostname> out of standby mode'

 

  • With enhanced IPMI logging enabled for vpxd, /var/log/vmware/vpxd/vpxd-<number>.log contains messages similar to the ones below:
    <timestamp> info vpxd[34441] [Originator@6876 sub=vpxLro opID=CdrsLoadBalancer-########-########-01] [VpxLRO] -- BEGIN task-### -- <host-fqdn> -- Drm.ExitStandbyLRO --
    <timestamp> error vpxd[34441] [Originator@6876 sub=Default opID=CdrsLoadBalancer-########-########-01] IPMILIB - too many open connections

 

  • To enable enhanced IPMI logging, restart the vpxd service with the command "service-control --restart vpxd" after ensuring that the following settings are present in the vpxd.cfg file in /etc/vmware-vpx/:
    <config>
      ..
      <ipmi>
        <debugLevel>3</debugLevel>
        <recvWaitTimeout>10000</recvWaitTimeout>
      </ipmi>
      ..
      <log>
        ..
        <level>verbose</level>
        ..
      </log>
    </config>

 

  • In some cases, the log might also contain BMC_MISSING_SUPPORT messages, such as:
    <timestamp> error vpxd[33822] [Originator@6876 sub=Default opID=CdrsLoadBalancer-########-########-01] [VpxLRO] -- ERROR task-### --  -- <host-fqdn> -- Drm.ExitStandbyLRO: :vim.fault.HostPowerOpFailed\n--> Result:\n--> (vim.fault.HostPowerOpFailed) {\n-->    faultCause = (vmodl.MethodFault) null, \n-->    faultMessage = <unset>\n-->    msg = ""\n--> }\n--> Args:\n-->
    <timestamp> error vpxd[33822] [Originator@6876 sub=MoHost opID=CdrsLoadBalancer-########-########-01] ILO lib: GetPowerState call failed; error: 66 ('BMC_MISSING_SUPPORT')
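
The log signatures listed above can be searched for programmatically. The following is a minimal Python sketch, not an official tool: the function name scan_vpxd_log and the signature labels are hypothetical, and the two patterns are copied from the log excerpts in this article. It only counts matching lines and does not interpret them.

```python
import re
from collections import Counter

# Patterns copied from the log excerpts in this article.
# The dictionary keys are illustrative labels, not VMware terminology.
SIGNATURES = {
    "ipmi_connection_limit": re.compile(r"IPMILIB - too many open connections"),
    "bmc_missing_support": re.compile(r"BMC_MISSING_SUPPORT"),
}


def scan_vpxd_log(lines):
    """Count how many lines match each known signature.

    `lines` is any iterable of strings, for example an open file object.
    """
    counts = Counter()
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


# Example usage on a vCenter Server appliance (path pattern from this article):
#
#   import glob
#   for path in glob.glob("/var/log/vmware/vpxd/vpxd-*.log"):
#       with open(path, encoding="utf-8", errors="replace") as handle:
#           print(path, dict(scan_vpxd_log(handle)))
```

A non-zero count for either label suggests that the vpxd log contains the messages described in this article. The messages only appear when enhanced IPMI logging is enabled.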

 
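
The vpxd.cfg settings for enhanced IPMI logging described above can be verified before restarting the service. The sketch below is an illustration under stated assumptions: it assumes that vpxd.cfg parses as well-formed XML with <config> as the root element, as in the excerpt above, and the function name check_ipmi_logging is hypothetical. It reads the file content only and does not modify it.

```python
import xml.etree.ElementTree as ET

# Expected values taken from the configuration excerpt in this article.
# Paths are relative to the <config> root element.
EXPECTED = {
    "ipmi/debugLevel": "3",
    "ipmi/recvWaitTimeout": "10000",
    "log/level": "verbose",
}


def check_ipmi_logging(cfg_text):
    """Return a dict of settings that differ from the expected values.

    Keys are the setting paths; values are the text found in the file,
    or None when the element is missing. An empty dict means that all
    three settings match.
    """
    root = ET.fromstring(cfg_text)  # root is assumed to be <config>
    mismatches = {}
    for path, expected in EXPECTED.items():
        actual = root.findtext(path)
        if actual is not None:
            actual = actual.strip()
        if actual != expected:
            mismatches[path] = actual
    return mismatches


# Example usage (file location from this article):
#
#   with open("/etc/vmware-vpx/vpxd.cfg", encoding="utf-8") as handle:
#       print(check_ipmi_logging(handle.read()))
```

An empty result means that the three settings are already present with the values shown in this article.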

Environment

VMware vCenter Server 8.0.x

Cause

This issue is caused by a limitation in the IPMI library that vCenter Server uses to communicate with standby hosts. The library code currently has a 1024-descriptor limit, which is prone to exhaustion in DPM setups.

Resolution

There is currently no resolution available for this issue.

Subscribe to this article to be notified when a fix is released.