Collection of metric times out intermittently causing high priority alerts in VMware vFabric Hyperic Server 4.x

Article ID: 342452

Products

VMware

Issue/Introduction

Symptoms:
  • Metric values that take too long to collect arrive at the VMware vFabric Hyperic server later than expected.
  • Metric values that arrive late at the vFabric Hyperic server result in unexpected alert behavior.
  • You may see this error in the Hyperic Administrator agent.log:

    2012-05-11 03:20:44,840 ERROR [pool-1-thread-1] [ScheduleThread] Metric 'java.lang:type=MemoryPool,name=PS Old Gen:Availability:jmx.url=service%3Ajmx%3Armi%3A///jndi/rmi%3A//localhost%3A9096/jmxrmi,jmx.username=,jmx.password=' took too long to run (31438ms), cancelled (result=true)

    Note: This error indicates the metric took too long to collect and the Agent ended the attempt to collect that metric value.
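    To see how long the cancelled collections actually ran, the durations can be pulled out of agent.log. The following is a minimal Python sketch, not part of the product: the function name slow_metric_durations is hypothetical, and the pattern is derived only from the single sample line above, so it may need adjusting if your log entries are worded differently.

```python
import re

# Pattern derived from the single sample error line in this article
# (an assumption: other agent versions may word the message differently).
SLOW_METRIC = re.compile(
    r"\[ScheduleThread\] Metric '(?P<metric>.*)' took too long to run "
    r"\((?P<ms>\d+)ms\)"
)


def slow_metric_durations(lines):
    """Return (metric, milliseconds) for each 'took too long to run' entry."""
    found = []
    for line in lines:
        match = SLOW_METRIC.search(line)
        if match:
            found.append((match.group("metric"), int(match.group("ms"))))
    return found
```

    Passing an open agent.log file object (or any list of log lines) to slow_metric_durations gives the reported run times, which can help when choosing a new timeout value.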
     


Environment

VMware vFabric Hyperic Server 4.6.x
VMware vFabric Hyperic Server 4.5.x
VMware vFabric Hyperic Server 4.4.x

Cause

The scheduleThread.cancelTimeout entry in the agent.properties file is set to 5000 milliseconds by default. Metric collections that run longer than this are cancelled by the Agent, so this timeout may need to be increased.

Resolution

The scheduleThread.cancelTimeout property sets the maximum time, in milliseconds, that the ScheduleThread allows a metric collection process to run before attempting to interrupt it. When the timeout is exceeded, collection of the metric is interrupted, provided it is in a state where it can be interrupted, such as a wait(), sleep() or non-blocking read() state. To raise the timeout, open the agent.properties file and look for this entry:

scheduleThread.cancelTimeout=5000

Change the value to a higher number of milliseconds. For instance, if the value is 5000, change it to 9000.
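
If the change has to be made on many agents, the edit can be scripted. The following is a minimal Python sketch under stated assumptions: the helper name set_cancel_timeout is hypothetical, it works on the text of agent.properties rather than on the file itself, and it appends the entry when no uncommented scheduleThread.cancelTimeout line is found.

```python
import re

KEY = "scheduleThread.cancelTimeout"


def set_cancel_timeout(properties_text, timeout_ms):
    """Return properties_text with scheduleThread.cancelTimeout set to timeout_ms."""
    new_line = f"{KEY}={timeout_ms}"
    # Match an uncommented "key=value" line; commented-out entries are left alone.
    pattern = re.compile(rf"^[ \t]*{re.escape(KEY)}[ \t]*=.*$", re.MULTILINE)
    if pattern.search(properties_text):
        return pattern.sub(new_line, properties_text)
    # Key not present: append it, adding a newline first if the text lacks one.
    needs_newline = bool(properties_text) and not properties_text.endswith("\n")
    separator = "\n" if needs_newline else ""
    return f"{properties_text}{separator}{new_line}\n"
```

For example, set_cancel_timeout(text, 9000) applied to the contents of agent.properties returns the same text with the timeout entry set to 9000; writing the result back to the file is left to the caller.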