vCenter Server triggered an alarm: ESX Agent Manager (EAM) status changed from Green to Red, then recovered to Green within 30 minutes.

Article ID: 414085

Updated On:

Products

VMware vCenter Server

Issue/Introduction

vCenter Server raises an alarm indicating that the ESX Agent Manager (EAM) status changed from Green to Red. The status then recovers to Green within 30 minutes.

  * Event: eam status changed from green to red
  * Alarm: Alarm 'ESX Agent Manager Health Alarm' on Datacenters changed from Green to Red

Environment

VMware vCenter Server 7.0
VMware vCenter Server 8.0

Resolution

When the vmon health check of EAM runs while a Full Garbage Collection (Full GC) is in progress on the EAM side, EAM cannot respond before the health check times out, and the health status turns Red.

This event occurs rarely and requires no action. The EAM health status typically returns to Green when the health check runs again after 30 minutes.

Example logs of the vmon health check and the EAM Full GC (log output partially omitted):

 - /var/log/vmware/vmon/vmon.log

YYYY-MM-DDT17:50:05.223Z In(05) host-#### <eam> Running the API Health command as user eam
YYYY-MM-DDT17:50:05.223Z In(05) host-#### <eam-healthcmd> Constructed command: /usr/bin/python -B /usr/lib/vmware-eam/watchdog/vmon/healthCommandVmon.py /etc/vmware/../vmware-eam/catalina.properties 
YYYY-MM-DDT17:50:15.459Z Wa(03) host-#### <eam> Service api-health command's stderr: Exception while retrieving health xml from url http://localhost:15005/eam/healthstatus. Exception: timed out
YYYY-MM-DDT17:50:15.459Z Wa(03)+ host-#### 
YYYY-MM-DDT17:50:15.474Z Wa(03) host-#### <eam> Health of service failed. Health data: {"localizable_msgs": [{"id": "com.vmware.vmon.svc_health_fail", "default_message": "Failed to retrieve service health.", "args": []}]}
YYYY-MM-DDT17:50:15.474Z In(05) host-#### <eam> Recover from service api health check failure. Fail count 0

 - /var/log/vmware/eam/vmware-eam-gc.log

YYYY-MM-DDT17:49:50.197+0000: 59446095.329: [Full GC (Ergonomics) ... 28.7186387 secs] [Times: user=0.31 sys=0.25, real=28.72 secs]

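To determine whether an alarm is due to this issue, check whether these two log entries were written at about the same time. In the example above, the Full GC is logged at 17:49:50 and takes 28.72 seconds, so it is still running when the health check starts at 17:50:05 and times out at 17:50:15.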
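The comparison can also be scripted. The following is a minimal Python sketch, not a VMware-provided tool. It assumes that the log lines look like the excerpts above, that the GC log timestamp marks the start of the collection, and that the `real=` value is the pause length in seconds. The function names and regular expressions are hypothetical and may need adjusting for real logs.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed line shapes, taken only from the excerpts in this article.
# GC log: "<timestamp>+0000: ... [Full GC ... real=<seconds> secs]"
GC_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})\+0000: "
    r".*\[Full GC .*real=([\d.]+) secs\]"
)
# vmon log: "<timestamp>Z ... <eam> ... Exception: timed out"
VMON_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z "
    r".*<eam> .*Exception: timed out"
)


def _parse_ts(text):
    """Parse a millisecond-precision timestamp and treat it as UTC."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def full_gc_windows(gc_lines):
    """Return a (start, end) pair for every Full GC line found."""
    windows = []
    for line in gc_lines:
        match = GC_RE.search(line)
        if match:
            start = _parse_ts(match.group(1))
            # Assumption: the timestamp is the GC start, real= is its duration.
            end = start + timedelta(seconds=float(match.group(2)))
            windows.append((start, end))
    return windows


def timeouts_during_full_gc(vmon_lines, gc_lines):
    """Return (timeout, gc_start, gc_end) for each EAM health check
    timeout that falls inside a Full GC window."""
    windows = full_gc_windows(gc_lines)
    hits = []
    for line in vmon_lines:
        match = VMON_RE.search(line)
        if not match:
            continue
        timeout_at = _parse_ts(match.group(1))
        for start, end in windows:
            if start <= timeout_at <= end:
                hits.append((timeout_at, start, end))
    return hits
```

To use it, read the lines of /var/log/vmware/vmon/vmon.log and /var/log/vmware/eam/vmware-eam-gc.log into two lists and pass them to `timeouts_during_full_gc`. A non-empty result suggests that the alarm matches the pattern described here. An empty result does not rule the issue out, because the log format may differ from these assumptions.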

Additional Information

If the alarms occur frequently, there may be an underlying problem such as insufficient heap memory for EAM. In that case, contact support.