Reboot events repeating


Article ID: 127812




CA Infrastructure Management, CA Performance Management - Usage and Administration


The same devices keep displaying Rebooted events every poll cycle.

These devices have not been rebooted.
Event Type = reconfiguration
Event SubType = rebooted
Description = A device reboot was detected during this poll period.  The device last restarted x:xx:xx.xx ago.


The Availability metric family polls sysUpTime and expects the value to be in timeticks (hundredths of a second).
If currentSysUpTime falls short of previousSysUpTime plus timeTicksSincePreviousPoll by more than the allowed tolerance, a reboot event is generated.

Enable detailed poll logging for the problem device, wait at least one poll cycle or until the event occurs, then collect all logs for this IP.
Check for this message:

date: The device rebooted, but we are not dropping the response because this is the Availability poll response: previousSysUpTime=42537758, currentSysUpTime=42538058, timeTicksSincePreviousPoll=30000, response=SnmpPollResponse [itemID=454340, deviceItemId=454340, internetAddress=x.x.x.x, pollGroupId=4090, cycleTimestamp=1550763000000
  readTimestamp=1550763229444, duration=300000, pollRate=-1, error=SUCCESS, errorIndex=-1, rowData=[
            SnmpResponseVariable [oid=, type=TIME_TICKS, value=42538058, isDelta=true, isList=false, error=SUCCESS, isDynamicIndex=false, indexList=null]
Actual difference = currentSysUpTime - previousSysUpTime = 42538058 - 42537758 = 300
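The comparison above can be repeated for any collected log line. The following is a minimal sketch, assuming the message keeps the `name=value` layout shown in the example; the function names `parse_uptime_fields` and `tick_shortfall` are hypothetical helpers and not part of the product.

```python
import re

# Excerpt of the log message shown above (values copied from the article).
LOG_LINE = (
    "The device rebooted, but we are not dropping the response because this is "
    "the Availability poll response: previousSysUpTime=42537758, "
    "currentSysUpTime=42538058, timeTicksSincePreviousPoll=30000"
)

FIELDS = ("previousSysUpTime", "currentSysUpTime", "timeTicksSincePreviousPoll")


def parse_uptime_fields(line: str) -> dict:
    """Extract the three integer fields from a reboot log message."""
    values = {}
    for name in FIELDS:
        match = re.search(rf"\b{name}=(\d+)", line)
        if match is None:
            raise ValueError(f"field {name!r} not found in log line")
        values[name] = int(match.group(1))
    return values


def tick_shortfall(values: dict) -> int:
    """Return the expected ticks minus the ticks the device actually advanced."""
    actual = values["currentSysUpTime"] - values["previousSysUpTime"]
    return values["timeTicksSincePreviousPoll"] - actual


fields = parse_uptime_fields(LOG_LINE)
actual_difference = fields["currentSysUpTime"] - fields["previousSysUpTime"]
shortfall = tick_shortfall(fields)
print(f"actual difference = {actual_difference}, shortfall = {shortfall}")
```

A shortfall far above 1000 timeticks on every poll, with no real restart, suggests the device's sysUpTime is advancing too slowly rather than being reset.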


PM allows an offset of only 1000 timeticks between timeTicksSincePreviousPoll and the actual difference. In this example the Data Collector (DC) correctly advanced time by 5 minutes of timeticks (30000), but the device's sysUpTime advanced by only 300 over the same amount of real time. The 1000-timetick tolerance cannot be adjusted.

Note:  ( prevSysUpTime + timeTicksSincePrevious - REBOOT_TOLERANCE ) > currentSysUpTime

That is, take the previous sysUpTime that was read, add the number of timeticks the DC has advanced, and subtract the 1000-timetick tolerance. If the result is greater than the currentSysUpTime read in the new poll, the device is considered to have rebooted.
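The check described in the note can be written out as a short sketch and applied to the values from the log message. This illustrates the inequality only and is not the product's actual code; the function name `looks_like_reboot` is hypothetical, and counter wrap-around is deliberately ignored.

```python
REBOOT_TOLERANCE = 1000  # timeticks; the fixed tolerance stated in the article


def looks_like_reboot(prev_sys_uptime: int,
                      ticks_since_previous_poll: int,
                      current_sys_uptime: int,
                      tolerance: int = REBOOT_TOLERANCE) -> bool:
    """Apply the inequality from the note:
    (prevSysUpTime + timeTicksSincePrevious - REBOOT_TOLERANCE) > currentSysUpTime
    """
    expected_minimum = prev_sys_uptime + ticks_since_previous_poll - tolerance
    return expected_minimum > current_sys_uptime


# Values from the log message: the device advanced only 300 ticks.
flagged = looks_like_reboot(42537758, 30000, 42538058)

# A device that advanced the full 30000 ticks is not flagged.
healthy = looks_like_reboot(42537758, 30000, 42537758 + 30000)

print(f"problem device flagged: {flagged}, healthy device flagged: {healthy}")
```

With the logged values, the left side is 42537758 + 30000 - 1000 = 42566758, which is greater than 42538058, so the poll is treated as a reboot.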

The root cause is that sysUpTime on the device is not advancing fast enough relative to real time.
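To quantify how slowly the device's counter is moving, the observed tick rate can be estimated from the same three logged values. This is a rough sketch, assuming timeTicksSincePreviousPoll reflects real elapsed time at the standard 100 timeticks per second; `observed_ticks_per_second` is a hypothetical helper name.

```python
STANDARD_TICKS_PER_SECOND = 100  # SNMP TimeTicks are hundredths of a second


def observed_ticks_per_second(prev_sys_uptime: int,
                              current_sys_uptime: int,
                              ticks_since_previous_poll: int) -> float:
    """Estimate how many ticks the device's sysUpTime gains per real second."""
    elapsed_seconds = ticks_since_previous_poll / STANDARD_TICKS_PER_SECOND
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return (current_sys_uptime - prev_sys_uptime) / elapsed_seconds


# Values from the log message: 300 ticks gained over 30000 expected ticks (300 s).
rate = observed_ticks_per_second(42537758, 42538058, 30000)
fraction_of_expected = rate / STANDARD_TICKS_PER_SECOND

print(f"observed rate: {rate} ticks/s ({fraction_of_expected:.0%} of expected)")
```

For the logged values this gives 1 tick per second, which is 1% of the expected rate, so the device will be flagged on every poll cycle.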



CAPC 3.x on Linux


The device is reporting the wrong value for sysUpTime.
There are 2 options:
  1. Contact the vendor to find out why sysUpTime is not reported in 1/100-second timeticks (in the example log it advanced only 300 ticks in 300 seconds, about 1 tick per second).
  2. Create a group and add all devices except the problem ones. Then remove the Availability Monitoring Profile (MP) from the All Manageable Devices collection and apply it to your custom group.

Additional Information