Justification for collection of the OS Logs for the CARE packages

Article ID: 430941

Products

Network Observability CA Performance Management

Issue/Introduction

Justification for collection of the OS Logs for the CARE (Remote Engineer) packages

Environment

DX NetOps Performance Management releases

Resolution

With CARE (Remote Engineer), we collect the following information:

  • /var/log/messages: General system activity and error messages.
  • /var/<registry xml file>: Created by the InstallAnywhere software installer; it can be key to troubleshooting install/upgrade issues.
  • /etc/security/limits.conf: User resource limits (ulimit).
  • /etc/systemd/system/<services>.service: Service definitions used by the various components of PM.
  • /etc/<os release related files>: OS release version details.
  • Output from a small set of commands that capture the specifications of the processes run by PM and the server sizing specifications.

The product itself does not read the messages logs. The re.sh script gathers them as a troubleshooting resource; they are invaluable and have proven so repeatedly.
These logs are where Support first finds evidence of the kernel OOM killer being invoked. They are also where we have found evidence of management utilities like Boks locking things down where they shouldn't, of MTU misalignment, of SELinux blocking application capabilities, and of time skew issues.
To summarize, /var/log/messages contains many error messages that help us diagnose why the OS may have killed the application, why the system time shifted, and other environmental issues.
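
As a rough illustration of the kind of triage described above, the following Python sketch groups log lines into a few environmental categories. It is not part of CARE or re.sh, and it is not the procedure Support follows. The function names (classify_lines, scan_file) and the search patterns are assumptions chosen for illustration; the exact message wording varies by kernel version, distribution, and time service.

```python
import re
from collections import defaultdict

# Illustrative patterns only (assumptions). Real message text differs across
# kernel versions, distributions, and time-sync daemons.
PATTERNS = {
    "oom_killer": re.compile(r"invoked oom-killer|Out of memory: Kill(ed)? process", re.I),
    "selinux": re.compile(r"SELinux is preventing|avc:\s+denied", re.I),
    "time_skew": re.compile(r"System clock wrong by|time jump detected|time reset", re.I),
}


def classify_lines(lines):
    """Return a dict mapping each category to the log lines that matched it."""
    hits = defaultdict(list)
    for line in lines:
        for category, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[category].append(line.rstrip("\n"))
    return dict(hits)


def scan_file(path="/var/log/messages"):
    """Classify every line of a log file (hypothetical helper; usually needs root)."""
    with open(path, errors="replace") as handle:
        return classify_lines(handle)


if __name__ == "__main__":
    # Made-up sample lines, for demonstration only.
    sample = [
        "Jan  1 00:00:01 host kernel: java invoked oom-killer: gfp_mask=0x201da, order=0",
        "Jan  1 00:00:02 host chronyd[812]: System clock wrong by 3.2 seconds, adjustment started",
        "Jan  1 00:00:03 host sshd[99]: Accepted publickey for admin",
    ]
    for category, matched in classify_lines(sample).items():
        print(f"{category}: {len(matched)} line(s)")
```

A match in one of these categories suggests an environmental cause worth ruling out before investigating the application itself.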

The OS logs are therefore collected to separate "Environmental" defects from "Application" defects, particularly around time synchronization and resource availability (CPU/memory), both of which are critical for the time-series databases used in NetOps Performance Management.