Troubleshooting High Memory Usage on Windows VMs: Data Collection Guide

Article ID: 424332

Products

VMware Tanzu Application Service

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

High memory consumption on a Windows VM can lead to severe performance degradation, application failures, and system unresponsiveness. To investigate memory exhaustion, it is critical to capture diagnostic data while the issue is occurring. This article outlines the step-by-step procedure for collecting that data. The resulting artifacts help identify whether the root cause is a driver memory leak, application resource exhaustion, or OS-level contention. The steps in this article apply to Windows VMs in a VMware Tanzu Application Service (TAS) or VMware Tanzu Kubernetes Grid Integrated Edition (TKGI) deployment.

Environment

VMware Tanzu Application Service

VMware Tanzu Kubernetes Grid Integrated Edition

Resolution

Steps:

  1. Using 'bosh', SSH into the Windows instance that is currently experiencing high memory usage. Once logged in, start 'powershell'.

    bosh -d $DEPLOYMENT ssh $WINDOWS_INSTANCE
    
    powershell
     

  2. Run the following memory diagnostic commands and save their output to files in the C:\var\vcap\sys\log folder.

    systeminfo | findstr /C:"Memory" > C:\var\vcap\sys\log\systeminfo.txt
    
    tasklist > C:\var\vcap\sys\log\tasklist.txt
    
    Get-Process | Sort-Object -Property VirtualMemorySize64 -Descending | Select-Object ProcessName, VirtualMemorySize64, PrivateMemorySize64, WorkingSet64, PagedMemorySize64, NonpagedSystemMemorySize64, PeakWorkingSet64, PeakPagedMemorySize64, HandleCount > C:\var\vcap\sys\log\get-process-vmemsize.txt
     
    Get-Process | Sort-Object WorkingSet64 -Descending | Select-Object Id, Name, WorkingSet64, WS, PM, VirtualMemorySize64, PagedMemorySize64 > C:\var\vcap\sys\log\get-process-workingset.txt
     
    Get-Counter '\Memory\*' > C:\var\vcap\sys\log\get-counter-memory.txt
     
    Get-Counter '\Paging File(*)\% Usage' > C:\var\vcap\sys\log\get-counter-paging.txt
    
    & 'C:\Program Files\VMware\VMware Tools\VMwareToolboxCmd.exe' stat balloon > C:\var\vcap\sys\log\vmwtb-stat-balloon.txt


  3. Exit the Windows instance, and then run the following bosh command to collect the logs from the instance.

    bosh -d $DEPLOYMENT logs $WINDOWS_INSTANCE

The log bundle can then be uploaded to the support case. The output files can also be reviewed to identify where the physical memory is being used. Note that Windows PowerShell's '>' redirection writes UTF-16 text by default, so the files need to be converted to ASCII (or UTF-8) text format before Linux commands like grep can parse them successfully.

  • systeminfo.txt - contains the total and available physical memory
  • tasklist.txt - lists all running processes with their current memory usage, which can be summed to get the total memory used by user space applications
  • get-process-vmemsize.txt - contains the memory usage information of all processes, sorted by virtual memory size
  • get-process-workingset.txt - contains the memory usage information of all processes, sorted by working set (physical memory usage)
  • get-counter-memory.txt - contains the memory counters, including the memory used by drivers in kernel space and by the system cache
  • get-counter-paging.txt - contains paging file usage information
  • vmwtb-stat-balloon.txt - contains the output of a VMware specific command that reports whether VMware memory ballooning is happening. A value higher than 0 MB indicates ballooning, a condition where the ESXi host is under memory pressure and reclaims memory from the Windows VM to give it to another VM that needs it more.
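The text conversion mentioned above can be done on the workstation where the log bundle was downloaded. The following is a minimal sketch, assuming the files were written with a UTF-16 little-endian byte-order mark and that 'iconv' is installed. The function name 'to_utf8' and the directory names in the usage comment are illustrative only.

```shell
#!/usr/bin/env bash
# Convert one diagnostic file to UTF-8 text with Unix line endings.
# Assumption: a file starting with the bytes ff fe is UTF-16LE with a BOM;
# any other file is treated as already being ASCII/UTF-8.
to_utf8() {
  local src="$1" dst="$2" bom
  # Read the first two bytes as hex to detect the byte-order mark.
  bom="$(head -c 2 "$src" | od -An -tx1 | tr -d ' \n')"
  if [ "$bom" = "fffe" ]; then
    # "-f UTF-16" lets iconv consume the BOM; tr drops carriage returns.
    iconv -f UTF-16 -t UTF-8 "$src" | tr -d '\r' > "$dst"
  else
    tr -d '\r' < "$src" > "$dst"
  fi
}

# Hypothetical usage, run from the folder holding the extracted files:
#   mkdir -p converted
#   for f in *.txt; do to_utf8 "$f" "converted/$f"; done
```

After conversion, tools such as grep and awk can read the files line by line.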

To get a rough breakdown of physical memory usage from the output files, fill out the following table. The categories in the table are based on the external article "Physical Memory: Where did all of the physical memory go?".

Category | Usage (GB) | Notes
Process working set memory | | From either the "get-process-workingset.txt" or the "get-process-vmemsize.txt" file (use one file, not both). Parse (e.g., grep) for the values of the "WorkingSet" field, excluding "PeakWorkingSet", and sum them up.
Kernel pool memory | | From the "get-counter-memory.txt" file (\memory\pool nonpaged bytes + \memory\pool paged resident bytes)
System cache | | From the "get-counter-memory.txt" file (\memory\system cache resident bytes)
Driver locked | | From the "vmwtb-stat-balloon.txt" file (the value is reported in MB)
Address windowing extensions (AWE) | | Can be collected with a GUI tool named "RAMMap". If the above 4 categories account for most of the used memory, this category is likely negligible and can be skipped. Otherwise, download and use "RAMMap" to gather more information.
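The first three rows of the table can be filled in with a few small helpers. This is a sketch only, assuming the files have already been converted to ASCII/UTF-8 text, that the Get-Process files are in PowerShell's list layout ("WorkingSet : 123456", one property per line), and that each Get-Counter value appears on the first non-empty line after its counter path. The function names are hypothetical, and the output layout should be checked against the actual files before relying on the numbers.

```shell
#!/usr/bin/env bash
# Sum the WorkingSet (or WorkingSet64) values, in bytes, from one
# Get-Process dump. PeakWorkingSet lines are skipped because the
# pattern is anchored at the start of the line.
sum_working_set_bytes() {
  awk -F':' '
    /^WorkingSet(64)?[ \t]*:/ { gsub(/[ \t\r]/, "", $2); total += $2 }
    END { printf "%.0f\n", total }
  ' "$1"
}

# Print the value of one \memory\ counter from a Get-Counter dump.
# $1 = file, $2 = counter name without the path, e.g. "pool nonpaged bytes".
counter_value() {
  awk -v name="$2" '
    BEGIN { key = "\\memory\\" tolower(name) }
    { line = tolower($0); sub(/[ \t\r:]+$/, "", line) }
    # The value is the first non-empty line after the matching path line.
    found && NF { gsub(/[ \t\r,]/, ""); print; exit }
    length(line) >= length(key) &&
      substr(line, length(line) - length(key) + 1) == key { found = 1 }
  ' "$1"
}

# Convert a byte count to GB with two decimals.
bytes_to_gb() {
  awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# Hypothetical usage:
#   bytes_to_gb "$(sum_working_set_bytes get-process-workingset.txt)"
#   np="$(counter_value get-counter-memory.txt "pool nonpaged bytes")"
#   pr="$(counter_value get-counter-memory.txt "pool paged resident bytes")"
#   bytes_to_gb "$((np + pr))"
#   bytes_to_gb "$(counter_value get-counter-memory.txt "system cache resident bytes")"
```

The counter lookup matches the end of the path line exactly, so "pool paged bytes" and "pool paged resident bytes" are not confused with each other.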

 

The total of the "Usage (GB)" column should be close to the amount of used physical memory. To verify this, subtract the total from the "Total Physical Memory" value in the "systeminfo.txt" file. If the difference is close to the "Available Physical Memory" value in the same file, it is reasonable to presume that the above table closely represents how the physical memory is being used.
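The check described above is simple arithmetic, sketched below. It assumes "systeminfo.txt" has been converted to ASCII/UTF-8 text and contains lines such as "Total Physical Memory:     16,383 MB". The function names are hypothetical. Because systeminfo reports MB, the table total in GB needs to be multiplied by 1024 first.

```shell
#!/usr/bin/env bash
# Read a value in MB from systeminfo.txt by its label, e.g.
# "Total Physical Memory" or "Available Physical Memory".
systeminfo_mb() {
  awk -F':' -v label="$2" '
    # Keep only the digits, dropping separators, the unit, and any CR.
    index($0, label ":") == 1 { gsub(/[^0-9]/, "", $2); print $2; exit }
  ' "$1"
}

# Memory (MB) not explained by the table: total - available - accounted.
# A result near zero suggests the table accounts for the used memory.
unaccounted_mb() {
  awk -v t="$1" -v a="$2" -v u="$3" 'BEGIN { printf "%.0f\n", t - a - u }'
}

# Hypothetical usage, with 13900 MB accounted for in the table:
#   total="$(systeminfo_mb systeminfo.txt "Total Physical Memory")"
#   avail="$(systeminfo_mb systeminfo.txt "Available Physical Memory")"
#   unaccounted_mb "$total" "$avail" 13900
```

For example, with 16383 MB total, 2048 MB available, and 13900 MB accounted for, 435 MB remains unexplained, which is small relative to the total.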

If the "Process working set memory" usage is high, review the "tasklist.txt" and "get-process-workingset.txt" files to identify which processes are consuming the most physical memory.
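One way to review "tasklist.txt" is to rank processes by their "Mem Usage" column. The sketch below assumes the default English tasklist layout: a 25-character "Image Name" column, and a memory value with comma separators followed by "K" at the end of each line. It also assumes the file has been converted to ASCII/UTF-8 text. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the top N processes by memory usage as "<KB><TAB><image name>".
# $1 = tasklist file, $2 = number of rows (defaults to 10).
top_tasklist() {
  awk '
    { sub(/\r$/, "") }                        # tolerate Windows line endings
    $NF == "K" && $(NF-1) ~ /^[0-9][0-9,]*$/ {
      kb = $(NF-1); gsub(/,/, "", kb)         # "1,234,567" -> "1234567"
      name = substr($0, 1, 25); sub(/[ \t]+$/, "", name)
      printf "%d\t%s\n", kb, name
    }
  ' "$1" | sort -rn | head -n "${2:-10}"
}

# Sum the memory usage of all listed processes, in KB.
sum_tasklist_kb() {
  awk '
    { sub(/\r$/, "") }
    $NF == "K" { kb = $(NF-1); gsub(/,/, "", kb); total += kb }
    END { printf "%.0f\n", total }
  ' "$1"
}

# Hypothetical usage:
#   top_tasklist tasklist.txt 5
#   sum_tasklist_kb tasklist.txt
```

The header and separator lines are skipped automatically because they do not end in "K".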

If the "Kernel pool memory" usage is high, then a driver could be the culprit.  To get more information, use "PoolMon" to identify the specific driver using the most memory.

If the "System cache" usage is high, then a process or a driver could be issuing a sustained, high volume of cached read requests. See the Microsoft document that describes a "System File Cache Issue".

If the "Driver locked" usage is high, then check the ESXi host that runs the VM for memory pressure.

 

Other useful tools to gather more diagnostics:

  • PoolMon
    • To install:
      1. Download the Windows Driver Kit (WDK) 25H2 (build 26100.6584) from the WDK download page.
      2. Install the WDK using the default settings/options on any existing full Windows Server machine (not the VM instance).
      3. Once the installation is completed, locate the file "poolmon.exe" in the C:\Program Files (x86)\Windows Kits\10\Tools\10.0.26100.0\x64 folder.
      4. Upload a copy of "poolmon.exe" to the Windows instance using 'bosh scp'.
    • To run:
      • Add the following commands to "Step 2" above.
        # run the following commands from the folder where poolmon is saved.
        cd C:\tmp
        .\poolmon.exe /b /n C:\var\vcap\sys\log\poolmon-b-out.txt
        .\poolmon.exe /p /p /b /n C:\var\vcap\sys\log\poolmon-ppb-out.txt
        
    • Review the output files. Each entry is identified by its pool "tag" (e.g., "CIcr"), which points to the driver that made the allocations. Search for the tag online to get more information on the driver.
      • "poolmon-b-out.txt" lists the tags for both paged and non-paged pool allocations and their memory usage, sorted by descending memory usage.
      • "poolmon-ppb-out.txt" lists the tags for paged pool allocations only and their memory usage, sorted by descending memory usage.

  • RAMMap
    • To install:
      • Download RAMMap zip file from the RAMMap download page.
      • Unzip the file and upload a copy of "RAMMap64.exe" file to the Windows instance using 'bosh scp'.
    • To run:
      • Use the vCenter web console to access the Windows GUI of the Windows instance
      • Send Ctrl-Alt-Del to start the logon process
      • Log on as Administrator.  If there is a need to reset the "Administrator" password, use bosh to ssh to the instance and change the password there.
      • Once logged on, using the command prompt, navigate to the folder where the binary was uploaded and run it from there.  The RAMMap GUI will open and the information presented can then be reviewed.