Windows and Citrix vGPU based virtual machines may crash with BSOD

Article ID: 369633

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Windows and Citrix-based virtual machines with a vGPU may intermittently crash with a blue screen of death (BSOD).

The vmware.log file of the affected virtual machine contains entries similar to the following:

[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: NVOS status 0x59
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: Assertion Failed at 0x69a6068b:143
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: 8 frames returned by backtrace
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /usr/lib64/vmware/plugin/libnvidia-vgx.so(_nv009089vgpu+0x35) [0x4869a61145]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /usr/lib64/vmware/plugin/libnvidia-vgx.so(_nv004974vgpu+0x1ad) [0x4869a3d93d]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /usr/lib64/vmware/plugin/libnvidia-vgx.so(_nv011578vgpu+0x1a2b) [0x4869a6068b]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /usr/lib64/vmware/plugin/libnvidia-vgx.so(_nv007372vgpu+0x3fe) [0x4869afd20e]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /usr/lib64/vmware/plugin/libnvidia-vgx.so(+0x6886d) [0x4869a0586d]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /bin/vmx(+0x850097) [0x4821b9a097]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /lib64/libpthread.so.0(+0x7d3b) [0x4864869d3b]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: /lib64/libc.so.6(clone+0x6d) [0x4864b6c16d]
[YYYY-MM-DDTHH:MM:SS] Er(02) vthread-10735611 - vmiop_log: (0x0): VGPU message 19 failed, result code: 0x59
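
To check a vmware.log file for this signature, a small script along the following lines may help. It is a minimal sketch, not an official tool: the function names are hypothetical, and the choice of which two lines from the excerpt above to treat as the signature (the NVOS status 0x59 entry and the "VGPU message ... failed, result code: 0x59" entry) is an assumption.

```python
import re

# Assumption: these two entries from the excerpt above are distinctive
# enough to identify the failure. Adjust the pattern if your logs differ.
SIGNATURE = re.compile(
    r"vmiop_log: (?:NVOS status 0x59|.*VGPU message \d+ failed, result code: 0x59)"
)


def find_vgpu_failures(log_text):
    """Return (line_number, line) pairs for log lines matching the signature."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if SIGNATURE.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_file(path):
    """Scan one vmware.log file; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_vgpu_failures(handle.read())
```

For example, `scan_file("vmware.log")` returns an empty list when neither entry is present in the file.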

Environment

vSphere ESXi 7.0u3o
This issue can also occur in early ESXi 8.0 releases (prior to ESXi 8.0u3).

Cause

After upgrading to ESXi 7.0u3o (P08), some Citrix vGPU VMs fail due to insufficient file handles. These VMs may need more than the default limit of 2048 handles.

The ESXi 7.0u3o update includes a mitigation for an ESXi denial-of-service bug. The mitigation works by reducing the maximum number of file handles for some vGPU VMs to 2048.

Resolution

Workaround:

  1. In the UI, power off the VM, then right-click it and select Edit Settings > VM Options > Advanced > EDIT CONFIGURATION > ADD CONFIGURATION PARAMS.
  2. Add the parameter vmiop.maxFileHandles with a value of 16384.
  3. Power on the VM.

     

Alternatively, edit the virtual machine's .vmx file manually:

  • Power off the virtual machine.
  • Make a backup copy of the .vmx file.
  • Edit the current .vmx file and add the following line:
    vmiop.maxFileHandles = "16384"
  • Power on the virtual machine.
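
The manual edit above could be scripted roughly as follows. This is a sketch under stated assumptions rather than a supported procedure: the helper names and the ".bak" backup suffix are hypothetical, the .vmx file is assumed to be UTF-8 text with one `key = "value"` entry per line, and the VM must already be powered off before the file is touched.

```python
import re
import shutil


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with key set to value.

    An existing entry for the key is replaced; otherwise a new line is appended.
    """
    new_line = f'{key} = "{value}"'
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    if pattern.search(vmx_text):
        # A function replacement avoids backslash processing in new_line.
        return pattern.sub(lambda _match: new_line, vmx_text)
    if vmx_text and not vmx_text.endswith("\n"):
        vmx_text += "\n"
    return vmx_text + new_line + "\n"


def update_vmx_file(path, key="vmiop.maxFileHandles", value="16384"):
    """Back up the .vmx file (hypothetical .bak suffix), then write the setting."""
    backup_path = path + ".bak"
    shutil.copy2(path, backup_path)
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(set_vmx_option(text, key, value))
    return backup_path
```

Replacing an existing entry instead of always appending keeps the file free of duplicate vmiop.maxFileHandles lines if the script is run more than once.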