How to Catch and Generate a Core Dump for Apps Running on Windows Diego Cell

Article ID: 297661

Products

VMware Tanzu Application Service for VMs

Issue/Introduction

This article explains how to trace a running Windows Diego Cell application and trigger a core dump of it. It does not cover how to review the dump once it has been collected. Triggering a core dump is particularly useful when an app that has been running for some time suddenly crashes with an error such as "Access Violation" and you want to determine why the fault occurred.

What will you Need?

What are the symptoms?

In this example, we describe symptoms whose root cause cannot be determined without a core dump. Assume you have a .NET application running in a Windows Diego container and the app suddenly crashes with this error:

2017-02-02T15:51:14.64-0800 [API/0]      OUT App instance exited with guid cf9f685d-2562-4cb4-a9b0-834451b88c13 payload: {"instance"=>"", "index"=>0, "reason"=>"CRASHED", "exit_description"=>"2 error(s) occurred:\n\n* Exited with status -1073741819\n* cancelled", "crash_count"=>4, "crash_timestamp"=>1486079474619372130, "version"=>"276d4084-18ed-48fe-9aee-1b29e6525a8d"}

The interesting part of this error is the application exit status, "Exited with status -1073741819". Interpreted as an unsigned 32-bit value, -1073741819 is 0xc0000005 in hex, which means "Access Violation". An Access Violation is usually some form of memory access fault or another I/O-related issue. We can get more information about the error by checking the Windows Application Event Log.
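The conversion from the signed exit status to the hex code can be checked with a few lines of Python. This is a minimal sketch, not part of any Cloud Foundry tooling: the function names `extract_exit_status` and `to_ntstatus` are hypothetical, and the sample string is a shortened copy of the log line above.

```python
import re
from typing import Optional

def extract_exit_status(log_line: str) -> Optional[int]:
    """Pull the signed exit status out of an 'Exited with status N' message."""
    match = re.search(r"Exited with status (-?\d+)", log_line)
    return int(match.group(1)) if match else None

def to_ntstatus(exit_status: int) -> str:
    """Reinterpret a signed 32-bit exit status as an unsigned hex code."""
    # Masking with 0xFFFFFFFF maps the negative value onto its
    # two's-complement unsigned 32-bit representation.
    return f"0x{exit_status & 0xFFFFFFFF:08X}"

# Shortened sample taken from the crash log shown above.
sample = r'"exit_description"=>"2 error(s) occurred:\n\n* Exited with status -1073741819\n* cancelled"'
status = extract_exit_status(sample)
print(status, to_ntstatus(status))
```

The resulting hex value is the code to look up in the Windows status code tables.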

The event log entry shows that there was an access violation in iisfreb.dll at offset 0x67da. Using windbg, we can open that DLL and find the function that contains offset 0x67da:

  1. Launch windbg
     'C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\windbg.exe' -z C:\Windows\System32\inetsrv\iisfreb.dll
  2. Get the module version and make sure it matches the version reported in the Windows event log
     0:000> lm -v
         CheckSum:         00037232
         ImageSize:        0002C000
         File version:     8.5.9600.16384
         Product version:  8.5.9600.16384
  3. Get the start address of the module so you can inspect offset 0x67da
     0:000> lm
     start             end                 module name
     00000001`80000000 00000001`8002c000   iisfreb    (pdb symbols)          C:\Program Files (x86)\Windows Kits\10\Debuggers\x64\sym\iisfreb.pdb\89CF8B470B1B48BA829E6B4C6A27A7391\iisfreb.pdb
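  4. Then look up the nearest symbol by adding the offset reported in the event log to the module start address. The output shows the function that triggered the violation. The +0xae suffix means the fault occurred 0xae (174) bytes past the start of that function; it is a byte offset, not a source line number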
    0:000> ln 180000000+67da
    (00000001`8000672c)   iisfreb!FREB_REQUEST_CONTEXT::FilterWWWServerAreasAndVerbosity+0xae   |  (00000001`80006844)   iisfreb!FREB_REQUEST_CONTEXT::SerializeAllTraceEventsToLogDataString
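The address arithmetic behind the `ln` lookup can be reproduced by hand. The sketch below uses only the values shown in the windbg output above; the helper names are hypothetical and exist only for illustration.

```python
def fault_address(module_base: int, offset: int) -> int:
    """Absolute address of the fault: module start plus event log offset."""
    return module_base + offset

def offset_in_function(address: int, symbol_start: int) -> int:
    """Distance in bytes from the start of the nearest preceding symbol."""
    return address - symbol_start

MODULE_BASE = 0x1_8000_0000    # start of iisfreb, from the `lm` output
EVENT_LOG_OFFSET = 0x67DA      # fault offset reported in the event log
SYMBOL_START = 0x1_8000_672C   # FilterWWWServerAreasAndVerbosity, from `ln`

address = fault_address(MODULE_BASE, EVENT_LOG_OFFSET)
delta = offset_in_function(address, SYMBOL_START)
print(hex(address), hex(delta), delta)
```

The difference matches the +0xae suffix that `ln` prints after the function name.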

In some cases this is enough information to determine the root cause. If further information is required, we can apply the DebugDiag procedure below to trigger a core dump when the access violation occurs.

 


Environment


Resolution

Using DebugDiag tool to trigger a core dump

In this example we use a small .NET app called CpuBurner and show how to enable tracing for access violation errors on its process.

1. First, make sure we are targeting the right application. We can use the cf CLI to get the app GUID:

$ cf app cpuburner --guid
91f18699-87f9-43a3-95da-e06a8844795d

2. On the Windows cell, open Windows Task Manager -> right click the app process -> Open File Location

3. Windows Explorer opens a path like C:\containerizer\BCC2AB46FF4649B4FE\user\app. The directory name BCC2AB46FF4649B4FE is the username created for this container. Garden Windows creates a new user for each app container, and all processes that run in that container use this user account.
4. If we open the ASCII text file C:\containerizer\BCC2AB46FF4649B4FE\private\properties with Notepad, we can see that the app GUID in "network.app_id":"91f18699-87f9-43a3-95da-e06a8844795d" matches the one returned by the CLI, so we know we are working with the correct app
5. Using Task Manager, look up the process ID of the cpuburner app and make a note of it
6. Launch "DebugDiag 2 Collections" and use the crash wizard to start tracing the cpuburner process



7. Then choose to trace a specific process

8. Select the CpuBurner.exe entry that matches the process ID we found in Task Manager

9. On the next prompt, click Exceptions -> Add Exception, then populate the Configure Exception form to trigger the "Full Userdump" action when the Access Violation error code 0xc0000005 is encountered


10. DebugDiag writes all core dumps and trace logs to the directory "C:\Program Files\DebugDiag\Logs\Crash rule for all instances of CpuBurner.exe"

11. Once the rule is activated, either run the steps that reproduce the fault or wait for the problem to resurface. The developer can then use Microsoft tools to analyze the core dump and determine the root cause of the fault.
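Steps 1 through 4 above come down to comparing the GUID printed by `cf app cpuburner --guid` with the "network.app_id" value in the container's properties file. The check can be scripted; this is a minimal sketch that assumes the properties text contains the key/value pair in the quoted form shown in step 4. The function names are hypothetical, and the sample text is an illustrative fragment, not the full contents of a real properties file.

```python
import re
from typing import Optional

# Matches "network.app_id":"<guid>" as it appears in step 4.
APP_ID_PATTERN = re.compile(r'"network\.app_id"\s*:\s*"([^"]+)"')

def container_app_id(properties_text: str) -> Optional[str]:
    """Return the app GUID recorded in the properties text, if present."""
    match = APP_ID_PATTERN.search(properties_text)
    return match.group(1) if match else None

def is_target_container(properties_text: str, expected_guid: str) -> bool:
    """True when the container belongs to the app with the given GUID."""
    found = container_app_id(properties_text)
    return found is not None and found.lower() == expected_guid.strip().lower()

# Illustrative fragment only; a real file would be read from
# C:\containerizer\<container user>\private\properties.
sample_properties = '{"network.app_id":"91f18699-87f9-43a3-95da-e06a8844795d"}'
cli_guid = "91f18699-87f9-43a3-95da-e06a8844795d"  # output of cf app cpuburner --guid
print(is_target_container(sample_properties, cli_guid))
```

A regular expression is used instead of a full parser so the check does not depend on the rest of the file's format, which the article does not describe.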

 

 

Additional Information

Some helpful links are below:

  1. There are many different releases of DebugDiag for Windows, and not every release works on every version of Windows. The Microsoft DebugDiag blog should have the most recent information available: https://blogs.msdn.microsoft.com/debugdiag/
  2. Refer to the official site for more windbg information http://www.windbg.org/