Quiescing issues in Windows VMs - what to collect for troubleshooting

Article ID: 385812

Products

VMware vSphere ESXi

Issue/Introduction

This article describes the necessary steps to enable detailed logging and collect essential data for troubleshooting quiescing failures in Windows virtual machines.

  • Quiescing failures occur when a virtual machine is unable to successfully quiesce its file system during snapshot operations. These failures typically originate within the guest operating system and may not be fully visible through standard ESXi logging.

  • By default, ESXi records only basic information about these events, which is often insufficient for root cause analysis.

Resolution

Before starting the data collection process:

  1. Verify VMware Snapshot Provider Service

    • Ensure it is installed.

    • Confirm its start-up type is Manual, or that the service is currently in the Running state. It must not be Disabled.

  2. Verify Microsoft Virtual Disk Service

    • Ensure it is set to Automatic start-up mode.

    • Confirm it is currently in the Running state.

If either service is missing or not in the required state, correct the configuration and retry the quiescing operation.

If both services are in the required state and quiescing still fails, proceed with the steps below to enable detailed logging and collect relevant data:

1. Enable Debug Logging for VMware Tools

Follow the article "Enabling debug logging for VMware Tools within a guest operating system".

2. Disable Log Throttling in VM Configuration

VMware Tools debug logging can generate a large volume of log data. To keep the hypervisor from throttling these log writes, set log.throttleBytesPerSec to 0 in the VM's .vmx file using one of the methods below.

Method 1: Using vSphere Client (GUI)

  1. Power off the VM.

  2. Access the datastore where the VM is located:

    • From the vSphere Client, right-click the VM and select Browse Datastore.

    • Navigate to the VM's folder.

  3. Download the .vmx file to your local machine.

  4. Edit the .vmx file using a text editor (e.g., Notepad++).

  5. Add the following line at the end of the file:

    log.throttleBytesPerSec = "0"

  6. Save the file.

  7. Upload the modified .vmx file back to the VM’s datastore, replacing the existing file.

  8. Right-click the VM in the vSphere Client and select Remove from Inventory.

  9. Browse the datastore, right-click the modified .vmx file, and select Register VM.

  10. Power on the VM.

Method 2: Using ESXi CLI (SSH or ESXi Shell)

  1. SSH into the ESXi host or open ESXi Shell.

  2. Locate the VM's folder on the datastore:

    cd /vmfs/volumes/<datastore_name>/<vm_name>/

  3. Verify the VM's .vmx file exists:

    ls *.vmx

  4. Create a backup of the .vmx file (optional but recommended):

    cp <vm_name>.vmx <vm_name>.vmx.bak

  5. Edit the .vmx file using vi (the ESXi Shell does not include nano):

    vi <vm_name>.vmx

  6. Add the following line at the end of the file:

    log.throttleBytesPerSec = "0"

  7. Save the file and exit the editor:

    :wq!

  8. Reload the VM configuration:

    vim-cmd vmsvc/getallvms  # Note the VM ID
    vim-cmd vmsvc/reload <VMID>

Note:
If the VM cannot be restarted right away, you can instead migrate it with vMotion to another ESXi host so that it picks up the configuration change.
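
The manual edit in Method 2 can also be scripted. The following is a minimal sketch rather than a VMware-provided tool, assuming a POSIX shell such as the ESXi Shell. The helper names set_vmx_option and vmid_by_name are hypothetical, and vmid_by_name assumes the VM name contains no spaces.

```shell
#!/bin/sh
# Hypothetical helpers (not VMware-provided) that script steps 4-8 of Method 2.

# Append key = "value" to a .vmx file unless the key is already present.
set_vmx_option() {
    vmx="$1"; key="$2"; value="$3"
    [ -f "$vmx" ] || { echo "no such file: $vmx" >&2; return 1; }
    # Escape dots so grep matches the key literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -qi "^[[:space:]]*$pattern[[:space:]]*=" "$vmx"; then
        echo "$key is already set in $vmx; review it manually" >&2
        return 0
    fi
    cp "$vmx" "$vmx.bak" || return 1   # back up before modifying
    # Add a newline first if the file does not end with one.
    if [ -n "$(tail -c 1 "$vmx")" ]; then echo >> "$vmx"; fi
    printf '%s = "%s"\n' "$key" "$value" >> "$vmx"
}

# Read `vim-cmd vmsvc/getallvms` output on stdin and print the Vmid of the
# row whose Name column equals $1 (assumes the name has no spaces).
vmid_by_name() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}

# Example use on the host (placeholders as in the steps above):
#   set_vmx_option /vmfs/volumes/<datastore_name>/<vm_name>/<vm_name>.vmx log.throttleBytesPerSec 0
#   vim-cmd vmsvc/reload "$(vim-cmd vmsvc/getallvms | vmid_by_name <vm_name>)"
```

Checking for an existing entry before appending avoids a duplicate key in the .vmx file if the script is run twice.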

3. Reproduce the Quiescing Failure

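  • Repeat the operation that failed (for example, take a snapshot with guest file system quiescing enabled, or rerun the backup job) so that the failure is captured while debug logging is active.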

4. Collect Required Logs and System Information

  • In the guest OS, run the following commands from an elevated command prompt to capture the state of the VSS writers and providers:

    vssadmin list writers

    vssadmin list providers

  • Collect the System and Application Event Logs:

    • Open Event Viewer.

    • Navigate to Windows Logs > System.

    • Right-click System and select Save All Events As....

    • Repeat for Application logs.

  • Collect an ESXi log bundle from the host where the VM resides.
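
The output of vssadmin list writers can be long, so a quick scan for unhealthy writers may help before the data is handed over. The sketch below assumes the output was saved to a text file from cmd.exe in the guest (for example, vssadmin list writers > writers.txt) and is then read on any machine with a POSIX shell. The names check_vss_writers and writers.txt are hypothetical, and the parser assumes English-language output in the usual "Writer name:" / "State:" / "Last error:" layout.

```shell
#!/bin/sh
# Hypothetical helper: read saved `vssadmin list writers` output on stdin and
# print every writer whose state is not Stable or whose last error is not
# "No error". Returns 1 if any such writer is found, 0 otherwise.
check_vss_writers() {
    # Strip Windows carriage returns, then track the fields of each writer.
    tr -d '\r' | awk '
        /^Writer name:/  { name = $0;  sub(/^Writer name: */, "", name) }
        /^ *State:/      { state = $0; sub(/^ *State: */, "", state) }
        /^ *Last error:/ {
            err = $0; sub(/^ *Last error: */, "", err)
            if (state !~ /Stable/ || err != "No error") {
                printf "%s | %s | %s\n", name, state, err
                bad++
            }
        }
        END { exit (bad > 0) }
    '
}

# Example use:
#   check_vss_writers < writers.txt
```

A writer reported here is only a candidate for closer review. Some non-Stable states can be normal while a backup is in progress, so the full output should still be collected.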

Additional Information

After troubleshooting, revert the VMware Tools debug logging and the log throttling setting to their defaults to avoid unnecessary log growth and resource usage.
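
Reverting the throttling change means removing the added line from the .vmx file and reloading the VM configuration as in Method 2. As a minimal sketch, assuming a POSIX shell such as the ESXi Shell, the removal could look like the following. The helper name remove_vmx_option and the .pre-revert backup suffix are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: delete every line that sets the given key in a .vmx
# file, keeping a copy of the previous file next to it.
remove_vmx_option() {
    vmx="$1"; key="$2"
    [ -f "$vmx" ] || { echo "no such file: $vmx" >&2; return 1; }
    # Escape dots so grep matches the key literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    cp "$vmx" "$vmx.pre-revert" || return 1
    # grep -v exits with 1 when it prints nothing; only a status above 1
    # indicates a real error.
    grep -vi "^[[:space:]]*$pattern[[:space:]]*=" "$vmx.pre-revert" > "$vmx" \
        || [ $? -eq 1 ]
}

# Example use on the host (placeholders as in the steps above):
#   remove_vmx_option /vmfs/volumes/<datastore_name>/<vm_name>/<vm_name>.vmx log.throttleBytesPerSec
```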