How to collect basic information for VMware GemFire issues

Article ID: 294112

Products

VMware Tanzu GemFire

Issue/Introduction

This document contains guidelines for collecting information such as logs, statistics files, thread dumps, and heap dumps for Tanzu GemFire-related issues. In addition to the artifacts, provide a timeline or overview of the issue, with details on its impact and the actions taken.

Cause

You can use the gfsh command "export logs" to collect artifacts such as logs and statistics from a GemFire cluster, as described in the "export logs" section of the gfsh documentation.

It is recommended to use GemFire Management Console (GMC) to identify the correct incident time period by searching the GemFire logs. Once the period is identified, GMC can export the matching logs and statistics using filtering.

What to collect when

  • For all GemFire issues, support will need logs and statistics from all cluster members (locators and servers) covering the time period when the issue occurred.
  • For issues where members are hung (any time an unscheduled "restart" becomes necessary), support will also need thread dumps from, at a minimum, the members that appear unresponsive. It is important that more than one thread dump is taken on each host.
  • For GC tuning issues, support will also require GC logs.
  • For out-of-memory issues and memory leaks, a heap dump will be required (if this is not feasible, a heap histogram is better than nothing).

Resolution

1. Logs:

  1. Locator and Server

    Copy logs from the location configured by the log-file property in the gemfire.properties file or given as a parameter in your startup script. For example (from the gemfire.properties file):

    log-level=config
    log-file=log/cacheserver1.log
    

    Make sure to provide complete logs, including those that cover the header and startup information. This information is valuable when investigating an issue.

  2. Java Client logs

    Copy any client logs from the location defined in the Java client code or gemfire.properties file. For example (in code):

    ClientCache cache = new ClientCacheFactory()
    .set("name", "CqClient")
    .set("cache-xml-file", "xml/CqClient.xml")
    .set("log-level", "config")
    .set("log-file", "cqclient.log")
    .create();
    
  3. Native Client logs

    Copy logs from the path defined in the log-file property of the gfcpp.properties file or defined in native code. For example, in gfcpp.properties:

    log-level=config
    log-file=log/nativeclient1.log
    
  4. Pulse logs

    For details, see "How to Configure GemFire Pulse Logging in an Embedded Mode". Note that Pulse is deprecated in GemFire 10.1 and will be removed in a future release. Use GemFire Management Console instead.

  5. Security logs

    Copy the logs from the path defined by the security-log-file property of the gemfire.properties or gfsecurity.properties file. For example, gemfire.properties:

    security-log-file=log/locatorsecurity.log
    
  6. GFSH logs

    By default, gfsh session logging is disabled. To enable it, set the Java system property -Dgfsh.log-level=<desired_log_level>, where desired_log_level is one of the following values: severe, warning, info, config, fine, finer, finest.
    For example, in Linux:

    $ export JAVA_ARGS=-Dgfsh.log-level=config
    

    Then, start gfsh.

    Copy any logs from the directory in which the gfsh command was run. For example, if the gfsh command was run from /home/user1/GemWorkdir1, the gfsh log would be in a file similar to the following:

    /home/user1/GemWorkdir1/gfsh-2013-12-31_17-36-25.log
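To gather the gfsh session logs from a working directory, a small helper along the following lines may be useful. This is only a sketch: the class name GfshLogFinder and the glob pattern "gfsh-*.log" are assumptions derived from the example file name above, not part of GemFire.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of GemFire.
class GfshLogFinder {

    // Returns the files directly inside workDir that match "gfsh-*.log",
    // sorted by path. The pattern is an assumption based on the example name.
    static List<Path> findGfshLogs(Path workDir) throws IOException {
        List<Path> logs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(workDir, "gfsh-*.log")) {
            for (Path p : stream) {
                logs.add(p);
            }
        }
        Collections.sort(logs);
        return logs;
    }

    public static void main(String[] args) throws IOException {
        // Pass the directory gfsh was started from, or default to the current one.
        Path workDir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path log : findGfshLogs(workDir)) {
            System.out.println(log.toAbsolutePath());
        }
    }
}
```

Because the timestamp in the file name runs from year down to second, sorting by name also lists the sessions in chronological order.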
    
  7. GC Logs

    GC logging is enabled with startup parameters added to the JVM. Add the following parameters to enable GC logging for CMS GC (for JDK 11+ with G1GC or ZGC, use unified logging instead, for example -Xlog:gc*):

    -XX:+PrintGC (or the alias: -verbose:gc)
    -XX:+PrintGCDetails
    -XX:+PrintGCTimeStamps
    -XX:+PrintAdaptiveSizePolicy
    -XX:+PrintTenuringDistribution
    -Xloggc:[file path and name]
    -XX:+UseGCLogFileRotation
    -XX:NumberOfGCLogFiles=100
    -XX:GCLogFileSize=1m
    

    Use -Xloggc to specify the path and file name of the GC log file; the default is standard output. If you are restarting the cluster, collect the GC logs before restarting, because the JVM can overwrite them on startup.

    The overhead of GC logging is usually small, so it is generally recommended to keep it enabled. However, the decision does not have to be made at JVM startup. The JVM has a category of flags called "manageable", whose values can be changed at run time. All the GC logging flags that start with "PrintGC" belong to this category, so GC logging can be activated or deactivated on a running JVM.

    Manageable flags can be set with the use of a JMX client calling the setVMOption operation of the HotSpotDiagnostic MXBean or using the jinfo tool shipped with the JDK.
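The following sketch shows one way to call the HotSpotDiagnostic MXBean from Java. It runs against its own JVM for simplicity; for a GemFire member you would obtain the same MXBean through a remote JMX connection. The class and method names (ManageableFlags, setIfManageable) are illustrative. The set of manageable flags differs between JDK versions, so the code checks the list before changing anything.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import com.sun.management.VMOption;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch; assumes a HotSpot-based JVM.
class ManageableFlags {

    private static HotSpotDiagnosticMXBean bean() {
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    // Names of the flags that this JVM reports as writeable at run time.
    static List<String> manageableFlagNames() {
        List<String> names = new ArrayList<>();
        for (VMOption option : bean().getDiagnosticOptions()) {
            if (option.isWriteable()) {
                names.add(option.getName());
            }
        }
        return names;
    }

    // Sets the flag only if this JVM lists it as manageable.
    // Returns true if the value was set, false if the flag is not manageable here.
    static boolean setIfManageable(String flag, String value) {
        if (!manageableFlagNames().contains(flag)) {
            return false;
        }
        bean().setVMOption(flag, value);
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Manageable flags: " + manageableFlagNames());
        // PrintGC is expected to be manageable only on older JDKs (JDK 8 era).
        System.out.println("PrintGC enabled: " + setIfManageable("PrintGC", "true"));
    }
}
```

Checking the list first avoids the IllegalArgumentException that setVMOption throws for a flag that is unknown or not writeable on the running JVM.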
     

2. Statistics files:

  1. Locator Statistics files, CacheServer Statistics files

    Copy any stats files from the location defined by the statistic-archive-file property of gemfire.properties. For example, gemfire.properties:

    statistic-sampling-enabled=true
    statistic-archive-file=myStatisticsArchiveFile.gfs
    enable-time-statistics=false
    

    The statistic-sample-rate property can be changed from its default of 1000 milliseconds, but this should not be necessary because the impact of sampling is very small.

    Note: Time statistics should only be enabled in dev and QA environments and not in production as this setting has a relatively large impact on VMware GemFire performance.

    To set up rolling of statistics files, use the following parameters:

    archive-disk-space-limit=1000
    archive-file-size-limit=100
    

    This makes the .gfs files roll when they reach 100 MB and limits the archived files to 1000 MB in total, which keeps roughly the last 10 files.
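The relationship between the two limits is a simple division. The sketch below reads the two properties and estimates how many rolled files are retained. The class name StatArchiveRolling is hypothetical, and the result is an approximation because the exact point at which old files are deleted is up to GemFire.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for estimating how many statistics archives are kept.
class StatArchiveRolling {

    // Both limits are in megabytes. Estimate = total disk limit / per-file limit.
    static int approxRetainedFiles(Properties props) {
        int diskLimitMb = Integer.parseInt(props.getProperty("archive-disk-space-limit", "0"));
        int fileLimitMb = Integer.parseInt(props.getProperty("archive-file-size-limit", "0"));
        if (diskLimitMb <= 0 || fileLimitMb <= 0) {
            // This sketch only handles the case where both limits are configured.
            throw new IllegalArgumentException("both limits must be greater than 0");
        }
        return diskLimitMb / fileLimitMb;
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(
                "archive-disk-space-limit=1000\n"
                + "archive-file-size-limit=100\n"));
        System.out.println(approxRetainedFiles(props)); // prints 10
    }
}
```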

  2. Java Client Statistics files

    Copy any client-side stats files from the path defined in code or in the gemfire.properties file. For example (in code):

    ClientCache cache = new ClientCacheFactory()
    .set("name", "CqClient")
    .set("cache-xml-file", "xml/CqClient.xml")
    .set("log-level", "config")
    .set("log-file", "cqclient.log")
    .set("statistic-archive-file", "myClientStats.gfs")
    .set("statistic-sampling-enabled", "true")
    .create();
    
  3. Native Client Statistics files

    Copy any stats files from the location defined in the statistic-archive-file property of the gfcpp.properties file or defined in native code. For example, gfcpp.properties:

    statistic-sampling-enabled=true
    statistic-archive-file=myClientStats.gfs
    

     

3. Thread dumps

For some issues, such as hung systems or performance problems, thread dumps from the server or client are essential for analysis. It is very important to take multiple thread dumps periodically (for example, every few seconds) over a period of time.

Thread dumps can be taken using the following procedure:

  • Step 1. Find the process ID of the relevant VMware GemFire process, for example:
    $ jps -l
    7904 sun.tools.jps.Jps
    5388 sample.JClient
    
  • Step 2. Generate the thread dump(s).

    On Solaris, Linux, and other Unix platforms, sending a SIGQUIT signal to the VMware GemFire Java process generates a thread dump on the standard output of that process, for example:

    kill -QUIT <pid>
    

    On Windows, press Ctrl+Break in the command shell where the VMware GemFire Java process was started.

    Alternatively, these tools can be used to generate the thread dump:

    1. jstack command: jstack <pid>
    2. Java VisualVM (jvisualvm)
    3. jconsole
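Where running an external tool is not practical, for example inside a Java client application, thread dumps can also be taken programmatically through the JDK's ThreadMXBean. The sketch below takes several dumps at a fixed interval, matching the advice above to collect more than one. The class name ThreadDumpSampler and its methods are illustrative and not part of GemFire, and the output format is simplified compared with jstack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: periodic thread dumps of the current JVM.
class ThreadDumpSampler {

    // Builds a text dump with name, id, state, and the full stack of every live thread.
    static String dumpAllThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip lock ownership details, which keeps the call cheap
        // and supported on every JVM. jstack reports more lock information.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append('"')
               .append(" id=").append(info.getThreadId())
               .append(' ').append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    // Takes 'count' dumps, sleeping 'intervalMillis' between consecutive dumps.
    static List<String> sample(int count, long intervalMillis) throws InterruptedException {
        List<String> dumps = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            dumps.add("=== dump " + (i + 1) + " at " + Instant.now() + " ===\n" + dumpAllThreads());
            if (i < count - 1) {
                Thread.sleep(intervalMillis);
            }
        }
        return dumps;
    }

    public static void main(String[] args) throws InterruptedException {
        // Three dumps, five seconds apart.
        for (String dump : sample(3, 5000)) {
            System.out.println(dump);
        }
    }
}
```

Comparing the same thread across consecutive dumps shows whether it is making progress or stuck at the same frame.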

4. Heap dump

For out-of-memory errors and memory leaks, a heap dump helps track down the underlying cause.

Generate a heap dump of the GemFire process as follows:

  • Step 1. First, identify the GemFire process ID using a command like jps (as in the procedure for getting thread dumps).
  • Step 2. Generate the heap dump.
    1. Using the jmap command:
      [JDK_INSTALLATION]/bin/jmap -dump:live,format=b,file=heap.dump.out <pid>
      
    2. Using Java VisualVM (jvisualvm):
      • Run [JDK_INSTALLATION]/bin/jvisualvm
      • Select the target process, choose [Application] menu-->[Heap Dump], select the generated heap dump, and then choose [Save As] to save it to local disk.
    3. Getting a Heap Dump Automatically on an "Out Of Memory" error:

      Add the following JVM parameters to the Java command line used to start the process:

      -XX:+HeapDumpOnOutOfMemoryError
      -XX:HeapDumpPath=[directory or file path]
      

      For example, on Windows:

      JAVA_OPTS=%JAVA_OPTS% "-XX:+HeapDumpOnOutOfMemoryError" "-XX:HeapDumpPath=C:\TEMP"
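A heap dump can also be requested programmatically through the HotSpotDiagnostic MXBean, which is the same MXBean that JMX clients use. The sketch below dumps the heap of its own JVM; for a GemFire member the same operation would be invoked over a remote JMX connection. The class name HeapDumpSketch and the file naming scheme are assumptions made for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative sketch; assumes a HotSpot-based JVM.
class HeapDumpSketch {

    // Writes a heap dump of the current JVM into 'directory' and returns the file.
    // liveOnly=true dumps only reachable objects, similar to jmap -dump:live.
    static Path dumpHeap(Path directory, boolean liveOnly) throws IOException {
        // The target file must not exist yet; the .hprof suffix is required by newer JDKs.
        Path file = directory.resolve("heap-" + System.currentTimeMillis() + ".hprof");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toAbsolutePath().toString(), liveOnly);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        Path dump = dumpHeap(dir, true);
        System.out.println("Wrote " + dump.toAbsolutePath() + " (" + Files.size(dump) + " bytes)");
    }
}
```

As with jmap, the dump file can be as large as the heap itself, so make sure the target directory has enough free space.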
      


