UIM logmon probe is sending alarms for old log entries

Article ID: 110047

Updated On:

Products

DX Unified Infrastructure Management (Nimsoft / UIM)

Issue/Introduction

The logmon probe is being used to monitor the Dell OMSA log dcsys32.xml for hardware failures. The probe is configured for Mode 'updates', which should read only the data newly added to the file. Yet with each monitoring cycle the probe generates alarms for the same old log entries.

Environment

Release:


Component:

Cause

The updates to dcsys32.xml are made near the end of the file. However, all the LogEntry records sit between the <EventLog> and </EventLog> tags. When a new LogEntry is added, it is inserted before the closing </EventLog> tag, not after it. Since the file always ends with </EventLog>, logmon sees the update as a change that was not made at the end of the file. So it knows the file has changed but cannot locate the previous end of file (EOF) from which to resume reading.
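The effect can be reproduced with a small simulation. This is only a sketch and not logmon's actual implementation: the function names, the 16-byte reference size, and the shortened LogEntry records are all assumptions made for illustration. It remembers the file size and the last bytes of the file, then checks whether the same bytes are still present at that position after the file has grown.

```python
# Minimal sketch (NOT logmon's real code): remember the tail of a file and
# later check whether that tail is still where it used to be.

def take_reference(content: bytes, ref_size: int):
    """Return the current size and the last ref_size bytes of the file."""
    return len(content), content[-ref_size:]

def old_end_still_matches(content: bytes, saved_size: int, ref_buf: bytes) -> bool:
    """True if the bytes that ended the old file are unchanged in the new one."""
    start = saved_size - len(ref_buf)
    return content[start:saved_size] == ref_buf

# Shortened, hypothetical records standing in for real LogEntry elements.
entry1 = b"<LogEntry><ID>2242</ID></LogEntry>"
entry2 = b"<LogEntry><ID>2180</ID></LogEntry>"

# Case 1: a plain log that only grows by appending at the very end.
plain_old = entry1 + b"\n"
plain_new = plain_old + entry2 + b"\n"
size, ref = take_reference(plain_old, 16)
append_ok = old_end_still_matches(plain_new, size, ref)

# Case 2: dcsys32.xml style, new entry inserted before the closing tag.
xml_old = b"<EventLog>" + entry1 + b"</EventLog>"
xml_new = b"<EventLog>" + entry1 + entry2 + b"</EventLog>"
size, ref = take_reference(xml_old, 16)
xml_ok = old_end_still_matches(xml_new, size, ref)

print("append-only file, old end found:", append_ok)
print("wrapped XML file, old end found:", xml_ok)
```

In the append-only case the old tail is untouched, so a reader could continue from the saved position. In the wrapped case the bytes at the old position now belong to the new LogEntry, because the closing </EventLog> tag has moved further down the file.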
 
Here are samples of the file endings from two successive copies of dcsys32.xml.
</TimeStamp><DateTime>Wed Aug 01 02:30:17 2018</DateTime><ComputerName>MENTS3</ComputerName><Type>4</Type><ID>2242</ID><Link>help/hip/en/msgguide/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Messages_Guide&amp;topic=2242</Link><UserInfo></UserInfo><Source>Server Administrator</Source><Category>Storage Service</Category><Description>The Patrol Read has started.:  Controller 0 (PERC H700 Integrated) </Description><Data></Data></LogEntry></EventLog>

</TimeStamp><DateTime>Fri Aug 03 00:28:39 2018</DateTime><ComputerName>MENTS3</ComputerName><Type>4</Type><ID>2180</ID><Link>help/hip/en/msgguide/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Messages_Guide&amp;topic=2180</Link><UserInfo></UserInfo><Source>Server Administrator</Source><Category>Storage Service</Category><Description>The controller battery Learn cycle will start in 4 days.:  Battery 0 Controller 0</Description><Data></Data></LogEntry></EventLog>

Here is the corresponding excerpt from the logmon log:
Aug  3 00:31:45:146 [12904] logmon: getFromDB: refbuf /Data)(/LogEntry)(LogEntry)(TimeStamp)1533114625(/TimeStamp)(DateTime)Wed Aug 01 04:10:25 2018(/DateTime)(ComputerName)MENTS3(/ComputerName)(Type)4(/Type)(ID)2243(/ID)(Link)help/hip/en/msgguide/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Messages_Guide&topic=2243(/Link)(UserInfo)(/UserInfo)(Source)Server Administrator(/Source)(Category)Storage Service(/Category)(Description)The Patrol Read has stopped.:  Controller 0 (PERC H700 Integrated) (/Description)(Data)(/Data)(/LogEntry)(/EventLog)

Aug  3 00:31:45:146 [12904] logmon: FileOpen:h->save.eof 576754 h->curr.eof 577239
Aug  3 00:31:45:146 [12904] logmon: locateEOF pos 576255 h->save.refEOF 576753 h->save.refSize 498
Aug  3 00:31:45:146 [12904] logmon: locateEOF: Last char: 60
Aug  3 00:31:45:146 [12904] logmon: locateEOF: First char: 47
Aug  3 00:31:45:146 [12904] logmon: locateEOF: h->save.refSize: 498
Aug  3 00:31:45:146 [12904] logmon: locateEOF: Match not found

This shows that logmon has detected the increase in file size ("h->save.eof 576754 h->curr.eof 577239", a growth of 485 bytes).
Yet because of the way the file is updated, logmon cannot find its saved end-of-file reference in the new file ("locateEOF: Match not found").
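The numbers in the log excerpt can be checked with a few lines of arithmetic. The variable names below simply mirror the fields printed in the log. Reading "Last char" and "First char" as ASCII codes is an assumption, as is the idea that the reported position is the reference EOF minus the reference size.

```python
# Values copied from the logmon log excerpt above.
save_eof = 576754   # h->save.eof
curr_eof = 577239   # h->curr.eof
ref_eof = 576753    # h->save.refEOF
ref_size = 498      # h->save.refSize
logged_pos = 576255 # "locateEOF pos"

growth = curr_eof - save_eof    # bytes added since the previous scan
ref_start = ref_eof - ref_size  # assumed start of the reference comparison

# Assumption: the "Last char" / "First char" values are ASCII codes.
last_char = chr(60)
first_char = chr(47)

print("file grew by", growth, "bytes")
print("reference start", ref_start, "matches logged pos:", ref_start == logged_pos)
print("Last char:", last_char, "First char:", first_char)
```

Under these assumptions the file grew by 485 bytes, the computed start position agrees with the logged value 576255, and the two character codes correspond to '<' and '/'.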


 

Resolution

With this type of file update, logmon always has to read the whole file, and there is no provision to prevent it from generating new alarms for the older log entries each time the file is read.
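Since logmon itself offers no such provision, one possible approach outside the probe is a separate preprocessing step that extracts only the records newer than the last one already handled. The following is a hypothetical sketch, not a documented logmon feature or a supported solution. It assumes the file is well-formed XML with LogEntry elements carrying a numeric TimeStamp, as in the samples above; the function name and the output layout are invented for illustration.

```python
import xml.etree.ElementTree as ET

def new_entries(xml_text: str, last_seen: int) -> list:
    """Return (timestamp, line) pairs for LogEntry records newer than last_seen."""
    root = ET.fromstring(xml_text)
    result = []
    for entry in root.iter("LogEntry"):
        ts = int(entry.findtext("TimeStamp", default="0"))
        if ts > last_seen:  # strict comparison: equal timestamps are skipped
            fields = [
                entry.findtext(tag, default="").strip()
                for tag in ("DateTime", "ComputerName", "ID", "Description")
            ]
            result.append((ts, " | ".join(fields)))
    return result

# Shortened sample built from values that appear in the article's log excerpts.
sample = (
    "<EventLog>"
    "<LogEntry><TimeStamp>1533114625</TimeStamp>"
    "<DateTime>Wed Aug 01 04:10:25 2018</DateTime>"
    "<ComputerName>MENTS3</ComputerName><ID>2243</ID>"
    "<Description>The Patrol Read has stopped.:  Controller 0 (PERC H700 Integrated) </Description>"
    "</LogEntry>"
    "<LogEntry><TimeStamp>1533274119</TimeStamp>"
    "<DateTime>Fri Aug 03 00:28:39 2018</DateTime>"
    "<ComputerName>MENTS3</ComputerName><ID>2180</ID>"
    "<Description>The controller battery Learn cycle will start in 4 days.:  Battery 0 Controller 0</Description>"
    "</LogEntry>"
    "</EventLog>"
)

fresh = new_entries(sample, last_seen=1533114625)
for ts, line in fresh:
    print(ts, line)
```

The returned lines could be appended to a plain text file that only ever grows at its end, which is the kind of file that Mode 'updates' is described as handling. Whether that suits a given environment, and how the last seen timestamp is persisted between runs, is left open here.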

Additional Information

For reference and better understanding, here is an example from the logmon log for the case where logmon finds that the log file size has not changed.

Aug  3 01:41:45:859 [12904] logmon: getFromDB: refbuf ry><LogEntry><TimeStamp>1533274119</TimeStamp><DateTime>Fri Aug 03 00:28:39 2018</DateTime><ComputerName>MENTS3</ComputerName><Type>4</Type><ID>2180</ID><Link>help/hip/en/msgguide/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Messages_Guide&amp;topic=2180</Link><UserInfo></UserInfo><Source>Server Administrator</Source><Category>Storage Service</Category><Description>The controller battery Learn cycle will start in 4 days.:  Battery 0 Controller 0</Description><Data></Data></LogEntry></EventLog>
Aug  3 01:41:45:859 [12904] logmon: getFromDB: refEOF 577238


Aug  3 01:51:45:509 [12904] logmon: getFromDB: refbuf ry><LogEntry><TimeStamp>1533274119</TimeStamp><DateTime>Fri Aug 03 00:28:39 2018</DateTime><ComputerName>MENTS3</ComputerName><Type>4</Type><ID>2180</ID><Link>help/hip/en/msgguide/wwhelp/wwhimpl/common/html/wwhelp.htm?context=Messages_Guide&amp;topic=2180</Link><UserInfo></UserInfo><Source>Server Administrator</Source><Category>Storage Service</Category><Description>The controller battery Learn cycle will start in 4 days.:  Battery 0 Controller 0</Description><Data></Data></LogEntry></EventLog>
Aug  3 01:51:45:509 [12904] logmon: getFromDB: refEOF 577238

Aug  3 01:51:45:509 [12904] logmon: (ptScanPclose) - closing Dell_OMSA_logmonitoring not modified