Investigating core dump issues for Monitor Solution for UNIX

Article ID: 181143

Products

IT Management Suite Monitor Solution

Issue/Introduction

You are experiencing problems with Monitor Solution for UNIX and want to determine what could be the cause.

How do you investigate core dump issues for Monitor Solution for UNIX?

Environment

ITMS 8.x

Resolution

The following steps gather the information needed to diagnose core dump issues with Monitor Solution for UNIX. Some of the commands (such as lsattr and chdev) are AIX-specific, but they can be adapted for other Unix/Linux platforms.

  1. Stop the monitor agent:

    # /opt/altiris/notification/monitor/bin/rcscript stop

  2. Remove old monitor logs:

    # rm /opt/altiris/notification/monitor/var/monitor.log*

  3. Rename the old config file:

    # mv /opt/altiris/notification/monitor/etc/Config.xml /opt/altiris/notification/monitor/etc/Config.xml.old

  4. Check and record the current value of the fullcore variable:

    # lsattr -El sys0 | grep fullcore

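    The output should look like this (the second column is the current value):

    fullcore     true          Enable full CORE dump                             True

  5. Set the variable to true and remove the core file size limit. The ulimit setting applies only to the current shell and its child processes, so run these commands in the same session that will start the agent in step 7: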

    # chdev -l sys0 -a fullcore=true
    # ulimit -c unlimited

  6. Check that the variable is set correctly (the value in the second column should be true):

    # lsattr -El sys0 | grep fullcore

    fullcore     true          Enable full CORE dump                             True

  7. Start the monitor agent with detailed logging and a log size threshold of 100 MB:

    # aex-metricprovider -l v -logsize 100

  8. Wait until the monitor service refreshes its configuration and starts monitoring, then check whether the problem occurs again.
  9. Once the problem has been reproduced, proceed with the following steps.
  10. Print the CPU usage information:

    # ps gu | grep aex

  11. Print the thread information:

    # ps -mp `ps -ef | grep aex-metricprovider | grep -v grep | awk '{print $2}'` -o THREAD

  12. Terminate the process with the ILL signal so that it writes a core file:

    # kill -ILL `ps -ef | grep aex-metricprovider | grep -v grep | awk '{print $2}'`

  13. Make sure a core file was generated in the .../monitor/bin directory. If it is not there, look in the current directory.
  14. Restore the fullcore variable to the value recorded in step 4:

    # chdev -l sys0 -a fullcore=<previous>

  15. Provide the following information:
    • Generated core file (please make sure it's transferred in binary mode)
    • Everything printed on the screen
    • Log files from /opt/altiris/notification/monitor/var/monitor.log*
    • Config file /opt/altiris/notification/monitor/etc/Config.xml
    • Output of the following commands (some of the files may be missing; this is OK):
      • ls -l /usr/tmp/dhcpsd.log
      • ls -l /var/log/messages
      • ls -l /tmp/syslog.out
      • ls -l /home/cognos/c8/logs/cogserver.log
      • ls -l /var/log/apache2/access_log
      • ls -l /var/opt/freeware/apache/logs/access_log
      • ls -l /home/cognos/c8/bin/core
      • ls -l /var/adm/syslog/syslog.log
      • ls -l /etc/httpd/logs/access_log
      • ls -l /usr/local/apache/adm/access_log
      • ls -l /opt/hpws/apache/logs/access_log
      • ls -l /usr/local/apache2/logs/access_log
      • ls -l /usr/local/apache/logs/access_log
      • ls -l /var/adm/messages
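
The file listing requested in the last item of step 15 can be scripted so that missing files do not interrupt the run. The following is a minimal sketch and not part of the product: the function name list_files, the OUT variable, and the default output path /tmp/monitor_file_listing.txt are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: record `ls -l` output for each file named in step 15.
# Missing files are expected; they are noted and do not stop the run.

# list_files OUTFILE FILE...
# Appends to OUTFILE either the `ls -l` output for FILE or a
# "MISSING" note when FILE does not exist.
list_files() {
    out=$1
    shift
    for f in "$@"; do
        if [ -e "$f" ]; then
            ls -l "$f" >> "$out" 2>&1
        else
            echo "MISSING: $f" >> "$out"
        fi
    done
}

# Output path is an assumption; set OUT in the environment to override it.
OUT=${OUT:-/tmp/monitor_file_listing.txt}
: > "$OUT"   # start with an empty file

list_files "$OUT" \
    /usr/tmp/dhcpsd.log \
    /var/log/messages \
    /tmp/syslog.out \
    /home/cognos/c8/logs/cogserver.log \
    /var/log/apache2/access_log \
    /var/opt/freeware/apache/logs/access_log \
    /home/cognos/c8/bin/core \
    /var/adm/syslog/syslog.log \
    /etc/httpd/logs/access_log \
    /usr/local/apache/adm/access_log \
    /opt/hpws/apache/logs/access_log \
    /usr/local/apache2/logs/access_log \
    /usr/local/apache/logs/access_log \
    /var/adm/messages

echo "File listing written to $OUT"
```

The resulting file can then be provided together with the core file, logs, and config file listed above.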