alarm_enrichment max restarts error due to java.lang.AssertionError

Article ID: 214336

Products

- DX Unified Infrastructure Management (Nimsoft / UIM)
- Unified Infrastructure Management for Mainframe
- CA Unified Infrastructure Management SaaS (Nimsoft / UIM)

Issue/Introduction

Pre-processing rules have not run for the last few hours. alarm_enrichment then reaches its maximum number of restarts and turns red, while nas keeps running.

The alarm_enrichment hub queue keeps growing.

nas and alarm_enrichment version 9.31 had been running without problems since January. Today alarm_enrichment turned red and would not stay up, and it could be neither deactivated nor restarted.

Redeploying the probe let it recover for about a minute, but it then turned red again. The log showed a java.lang.AssertionError raised while many TOT (Time Over Threshold) rules were being processed. In Java, an AssertionError is thrown when an assert statement fails, that is, when its condition evaluates to false.

May 05 14:51:03:705 [main, alarm_enrichment] MapDb being opened
May 05 14:51:03:864 [main, alarm_enrichment] java.lang.AssertionError
at org.mapdb.SerializerPojo.deserializeUnknownHeader(SerializerPojo.java:444)
at org.mapdb.SerializerBase.deserialize3(SerializerBase.java:1216)
at org.mapdb.SerializerBase.deserialize(SerializerBase.java:1132)
at org.mapdb.SerializerBase.deserialize(SerializerBase.java:867)
at org.mapdb.SerializerPojo.deserialize(SerializerPojo.java:701)
at org.mapdb.HTreeMap$2.deserialize(HTreeMap.java:135)
at org.mapdb.HTreeMap$2.deserialize(HTreeMap.java:121)
at org.mapdb.Store.deserialize(Store.java:297)
at org.mapdb.StoreDirect.get2(StoreDirect.java:486)
at org.mapdb.StoreDirect.get(StoreDirect.java:439)
at org.mapdb.Caches$HashTable.get(Caches.java:246)
at org.mapdb.EngineWrapper.get(EngineWrapper.java:58)
at org.mapdb.HTreeMap$HashIterator.findNextLinkedNodeRecur(HTreeMap.java:1089)
at org.mapdb.HTreeMap$HashIterator.findNextLinkedNode(HTreeMap.java:1054)
at org.mapdb.HTreeMap$HashIterator.advance(HTreeMap.java:1041)
at org.mapdb.HTreeMap$HashIterator.moveToNext(HTreeMap.java:1000)
at org.mapdb.HTreeMap$EntryIterator.next(HTreeMap.java:1147)
at org.mapdb.HTreeMap$EntryIterator.next(HTreeMap.java:1140)
at com.nimsoft.probe.service.nas.timeoverthreshold.TOTAlarmCandidateTable.importDataFromAlarmCandidatesPersistentMap(TOTAlarmCandidateTable.java:80)
at com.nimsoft.probe.service.nas.timeoverthreshold.TOTAlarmCandidateTable.<init>(TOTAlarmCandidateTable.java:73)
at com.nimsoft.probe.service.nas.timeoverthreshold.TOTAlarmCandidateTable.getInstance(TOTAlarmCandidateTable.java:57)
at com.nimsoft.probe.service.nas.timeoverthreshold.TimeOverThresholdService.<init>(TimeOverThresholdService.java:78)
at com.nimsoft.probe.service.nas.Nas.postLoginInitialization(Nas.java:488)
at com.nimsoft.probe.service.nas.Nas.login(Nas.java:418)
at com.nimsoft.nimbus.NimProbeBase.nimAttachProbe(NimProbeBase.java:303)
at com.nimsoft.nimbus.NimProbe.initSubscribe(NimProbe.java:572)
at com.nimsoft.nimbus.NimProbe.doForever(NimProbe.java:416)
at com.nimsoft.probe.service.nas.Nas.run(Nas.java:611)
at com.nimsoft.probe.service.nas.Nas.main(Nas.java:394)
May 5 14:51:05:105 [9308] Controller: Max. restarts reached for probe 'alarm_enrichment' (command = <startup java>)
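
To check whether another log shows the same failure, the excerpt above can be reduced to a simple signature: a java.lang.AssertionError line followed by stack frames that include both the org.mapdb package and the TOTAlarmCandidateTable class. The following is a minimal Python sketch of that check. The function name is hypothetical, and it assumes each stack frame sits on a single line beginning with "at".

```python
# Markers copied from the stack trace in this article.
ASSERTION_MARKER = "java.lang.AssertionError"
MAPDB_PACKAGE = "org.mapdb."
TOT_CLASS = "TOTAlarmCandidateTable"


def has_tot_db_corruption_signature(log_text):
    """Return True if an AssertionError is followed by MapDB and TOT frames.

    Hypothetical helper: it only matches text patterns, it does not
    inspect the db files themselves.
    """
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        if ASSERTION_MARKER not in line:
            continue
        saw_mapdb = False
        saw_tot = False
        # Walk the stack frames that directly follow the exception line.
        for frame in lines[index + 1:]:
            frame = frame.strip()
            if not frame.startswith("at "):
                break  # first non-frame line ends this stack trace
            saw_mapdb = saw_mapdb or MAPDB_PACKAGE in frame
            saw_tot = saw_tot or TOT_CLASS in frame
        if saw_mapdb and saw_tot:
            return True
    return False
```

A match only suggests that the log resembles the one in this article. It is not proof that the db files are corrupt.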

Environment

Release: Any UIM release

Component: UIM OPERATOR CONSOLE - ALARM VIEWER
Component: UIM - ALARM_ENRICHMENT

Cause

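- Corruption of the TOTAlarmCandidateDb files. The stack trace shows MapDB failing to deserialize a record while alarm_enrichment imports the persisted TOT alarm candidates during startup.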

Resolution

1. Deactivate nas
2. Deactivate alarm_enrichment
3. On the UIM primary hub, go to the alarm_enrichment db folder:
   Windows: ...\probes\service\nas\alarm_enrichment\db
   Linux/UNIX: .../probes/service/nas/alarm_enrichment/db

   and delete these two files:

   a. TOTAlarmCandidateDb.ser
   b. TOTAlarmCandidateDb.ser.p

4. Activate alarm_enrichment
5. Activate nas

The TOT db files are automatically rebuilt.
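
Step 3 can also be scripted. The sketch below is a minimal Python illustration, not part of the product. The helper name is hypothetical, and the caller must supply the full path to the db folder for their installation. It deletes only the two files named in step 3 and leaves everything else in the folder alone.

```python
from pathlib import Path

# The two file names listed in step 3 of this article.
TOT_DB_FILES = ("TOTAlarmCandidateDb.ser", "TOTAlarmCandidateDb.ser.p")


def remove_tot_db_files(db_dir):
    """Delete the TOT candidate db files in db_dir and return the names removed.

    Hypothetical helper. db_dir is the alarm_enrichment db folder of the
    local installation; this function does not guess the install prefix.
    """
    db_path = Path(db_dir)
    if not db_path.is_dir():
        raise NotADirectoryError(f"db folder not found: {db_path}")
    removed = []
    for name in TOT_DB_FILES:
        target = db_path / name
        if target.is_file():
            target.unlink()  # remove only the files named in the article
            removed.append(name)
    return removed
```

Run it only after nas and alarm_enrichment have been deactivated (steps 1 and 2), then continue with steps 4 and 5. The script does not deactivate or activate the probes itself.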

Check the hub Status Tab and make sure that the nas, alarm_enrichment and TOT queues are processing messages.

Check the alarm_enrichment log to make sure there are no more errors.
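
The final log check can be narrowed to entries written after the probes were reactivated. The following Python sketch assumes the timestamp layout seen in the excerpt in this article (for example "May 05 14:51:03:705") and looks for the two error strings shown there. The function names are hypothetical. The timestamps carry no year, so the comparison is only meaningful within one calendar year.

```python
import re

# Timestamp prefix as it appears in the log excerpt, e.g. "May 05 14:51:03:705".
TIMESTAMP = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}:\d{2}:\d{2}:\d{3})\b")
MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}
# Error strings taken from the excerpt in this article.
ERROR_MARKERS = ("java.lang.AssertionError", "Max. restarts reached")


def parse_timestamp(line):
    """Return (month, day, hour, minute, second, millisecond) or None."""
    match = TIMESTAMP.match(line)
    if match is None or match.group(1) not in MONTHS:
        return None
    hour, minute, second, milli = (int(p) for p in match.group(3).split(":"))
    return (MONTHS[match.group(1)], int(match.group(2)), hour, minute, second, milli)


def errors_since(log_text, since):
    """Return timestamped lines at or after `since` that contain an error marker."""
    hits = []
    for line in log_text.splitlines():
        stamp = parse_timestamp(line)
        if stamp is None or stamp < since:
            continue  # stack frames and older entries are skipped
        if any(marker in line for marker in ERROR_MARKERS):
            hits.append(line)
    return hits
```

For example, errors_since(log_text, parse_timestamp("May 05 15:00:00:000")) returns an empty list when no matching error was logged at or after 15:00 on May 5.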