Virtual Machines Shut Down Unexpectedly Due to Veeam CDP I/O Filter (veecdp) Thread Hang on ESXi

Article ID: 438563

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

Virtual machines (VMs) may shut down unexpectedly and intermittently on VMware ESXi hosts where Veeam Continuous Data Protection (CDP) is enabled. After the shutdown event, the affected VMs may require approximately 45 minutes to power on again.

Symptoms:

The following symptoms may be observed before and after the unexpected VM shutdown:

  • VMware Tools Communication Timeout: VMware Tools heartbeats time out immediately prior to the crash (vmware.log)

2026-04-27T12:31:11.825Z In(05) vmx - GuestRpcSendTimedOut: message to toolbox timed out.

  • filtmod-watchdog Triggered VMX Termination: The ESXi filtmod-watchdog may detect an unresponsive I/O filter upcall thread and forcibly terminate the associated VMX process. (/var/run/log/vmkernel.log)

2026-04-27T12:34:07.903Z Wa(180) vmkwarning: cpu56:2098056)WARNING: FiltModS: 898: Upcall thread 3541352 did not signal liveness within last 120000 ms, suspect hang up. Stopping associated cartel 3540652.

  • Unclean VM Power-Off Event: The VM may transition directly from VM_STATE_ON to VM_STATE_OFF, and the clean power-off flag may be reported as false. (/var/run/log/hostd.log)

2026-04-27T12:35:25.822Z In(166) Hostd[2099596]: [Originator@6876 sub=Vimsvc.ha-eventmgr] Event 10107 : An application (/bin/vmx) running on ESXi host has crashed (5 time(s) so far). A core file might have been created at /vmfs/volumes/<datastore uuid>/Virtual_machine/vmx-zdump.002.
2026-04-27T12:35:32.931Z In(166) Hostd[2099631]: [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/<datastore uuid>/Virtual_machine/Virtual_machine.vmx] State Transition (VM_STATE_ON -> VM_STATE_OFF)
2026-04-27T12:35:32.931Z In(166) Hostd[2099631]: [Originator@6876 sub=Vmsvc.vm:/vmfs/volumes/<datastore uuid>/Virtual_machine/Virtual_machine.vmx] Clean power off flag is false
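The symptom signatures above can be searched for programmatically. The following is a minimal Python sketch, assuming the log lines keep the format shown in the samples. The function names and dictionary keys are illustrative only and are not part of any VMware or Veeam tool.

```python
import re

# Matches the filtmod-watchdog warning from vmkernel.log as shown in the sample.
# The group names (ts, thread, timeout_ms, cartel) are illustrative.
WATCHDOG_RE = re.compile(
    r"^(?P<ts>\S+)\s.*FiltModS: \d+: Upcall thread (?P<thread>\d+) "
    r"did not signal liveness within last (?P<timeout_ms>\d+) ms, "
    r"suspect hang up\. Stopping associated cartel (?P<cartel>\d+)\."
)


def find_watchdog_events(lines):
    """Return one dict per filtmod-watchdog termination warning."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            events.append({
                "timestamp": match.group("ts"),
                "upcall_thread": int(match.group("thread")),
                "timeout_ms": int(match.group("timeout_ms")),
                "cartel": int(match.group("cartel")),
            })
    return events


def has_unclean_poweroff(lines):
    """Return True if hostd logged an unclean power-off for a VM."""
    return any("Clean power off flag is false" in line for line in lines)


vmkernel_sample = [
    "2026-04-27T12:34:07.903Z Wa(180) vmkwarning: cpu56:2098056)WARNING: "
    "FiltModS: 898: Upcall thread 3541352 did not signal liveness within "
    "last 120000 ms, suspect hang up. Stopping associated cartel 3540652.",
]
watchdog_events = find_watchdog_events(vmkernel_sample)
```

In practice the `lines` argument would come from reading /var/run/log/vmkernel.log and /var/run/log/hostd.log (or their copies in a support bundle) line by line.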

Environment

  • VMware ESXi 8.x
  • Veeam Backup & Replication using Continuous Data Protection (CDP)

Cause

The issue is caused by a synchronization failure or thread hang within the ESXi I/O filter framework involving the third-party Veeam CDP I/O filter (veecdp).

When an I/O filter upcall thread becomes unresponsive, liveness signaling to the ESXi kernel is no longer maintained. The ESXi filtmod-watchdog monitors these threads and initiates a forced termination of the associated VMX process if liveness is not detected within 120 seconds.

The condition may be triggered during periods of elevated backend storage latency.
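The liveness check described above can be pictured with a small conceptual model. This Python sketch is not the ESXi implementation. The class name, method names, and the strict "greater than" comparison are assumptions made for illustration. Only the 120000 ms timeout is taken from the log message.

```python
LIVENESS_TIMEOUT_MS = 120_000  # from the log: "within last 120000 ms"


class UpcallWatchdog:
    """Toy model of a watchdog that tracks liveness signals per thread."""

    def __init__(self, timeout_ms=LIVENESS_TIMEOUT_MS):
        self.timeout_ms = timeout_ms
        self.last_seen_ms = {}  # thread ID -> time of last liveness signal

    def signal_liveness(self, thread_id, now_ms):
        """Record that a thread reported itself alive at now_ms."""
        self.last_seen_ms[thread_id] = now_ms

    def hung_threads(self, now_ms):
        """Return the IDs of threads silent for longer than the timeout."""
        return [
            thread_id
            for thread_id, seen in self.last_seen_ms.items()
            if now_ms - seen > self.timeout_ms
        ]


watchdog = UpcallWatchdog()
watchdog.signal_liveness(3541352, now_ms=0)        # then blocks, e.g. on slow I/O
watchdog.signal_liveness(1000001, now_ms=100_000)  # hypothetical healthy thread
suspects = watchdog.hung_threads(now_ms=120_001)   # only 3541352 exceeds the timeout
```

In this model, a thread that blocks on storage I/O for longer than the timeout stops signaling liveness and is reported as hung, which corresponds to the point at which the real watchdog stops the associated cartel.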

Justification

The issue can be validated using the following indicators:

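  • Log Correlation: Backtraces are generated repeatedly for the same vmx-Upcall thread. In the example below, the thread is waiting in a file I/O call issued beneath FiltModBridge_Upcall (FSS_AsyncFileIO, Fil6_AllocateBlocks). (/var/run/log/vmkernel.log)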

2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)Log: 1640: Generating backtrace for 3541352: vmx-Upcall-a402:Virtual_machine
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)Backtrace for current CPU #91, worldID=2098056, fp=0x4539db89f000
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9af88:[0x4200190bc3c5]WorldSwitch_out_label@vmkernel#nover+0x0 stack: 0x660e2ccde8f74, 0x4539db89f100, 0x420040000000, 0x4539fdd9f100, 0x4539c341f100
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9af90:[0x4200190b8969]World_Switch@vmkernel#nover+0x222 stack: 0x4539db89f100, 0x420040000000, 0x4539fdd9f100, 0x4539c341f100, 0x420040000000
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9afe0:[0x4200190de3b7]CpuSchedDispatch@vmkernel#nover+0xa20 stack: 0x0, 0x420040001040, 0x420040001110, 0x420040001128, 0x420040001040
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b220:[0x4200190e01da]CpuSchedWait@vmkernel#nover+0x35b stack: 0x8000000000000001, 0x0, 0x101001e001c5747, 0x3f, 0x4200460014e8
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b390:[0x4200190e053a]CpuSchedTimedWait@vmkernel#nover+0xb7 stack: 0x431e9c6669c0, 0x360968, 0x4200400077a8, 0x4200400077a8, 0x4200190dbfdc
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b430:[0x420018e7426d]EventQ_TimedWaitTCGlobal@vmkernel#nover+0x86 stack: 0x0, 0x431e9c12bef0, 0x431e9c12bc00, 0x431e9c12bf00, 0x431e9c12bf03
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b480:[0x420019e7fef6]Res6AffMgrGetCluster@esx#nover+0x1e0b stack: 0x4539fdd9b600, 0x966, 0x967, 0x3ff, 0x431e00000010
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b580:[0x420019e817a1]Res6AffMgr_AllocResourcesInt@esx#nover+0x456 stack: 0x100000000, 0x431e9ca6f810, 0x431e9c12d460, 0x431e9c22e830, 0x45ba27dfb2c0
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b7f0:[0x420019e831e6]Res6AffMgr_AllocResources@esx#nover+0x1b stack: 0x0, 0x0, 0x4539fdd9b884, 0x420019e682c8, 0x45ba4fcf9700
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b830:[0x420019df1623]Fil6_AllocateBlocks@esx#nover+0x204 stack: 0x0, 0x4539fdd9b884, 0x4539fdd9b884, 0x420019df15f3, 0x1f0
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b8d0:[0x420019df3e41]Fil6_PlugFileHoleTxn@esx#nover+0x77e stack: 0x431e9c12d460, 0x10045ba41008e4e, 0xf60000004e0001, 0x45ba49eade08, 0x45ba49ead800
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9b970:[0x420019df5ef3]Fil6_FileIOInt@esx#nover+0x1d00 stack: 0x45ba002f4e80, 0x431e9ca6f810, 0x431e9c12d460, 0x45ba4892a500, 0x4539fdd9bcfc
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bad0:[0x420019df6a47]Fil6_FileIOIntWithRetry@esx#nover+0xc4 stack: 0x431af14f3c30, 0x100000000, 0xa0014000a0014, 0x32000a, 0x0
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bb80:[0x420019df6d4f]Fil6_FileIO@esx#nover+0x104 stack: 0x7946be6, 0x45ba002f4e80, 0x11080000000, 0x45ba00000001, 0x7946be8
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bc70:[0x420018a42048]FSSVec_FileIO@vmkernel#nover+0x21 stack: 0x15552ba2, 0x4200190c36b8, 0x4539fdd9bcfc, 0xffffffff00000001, 0x7946bf7
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bc90:[0x4200190c36b7]FSSFileIO@vmkernel#nover+0x17c stack: 0x7946bf7, 0x1, 0x7946bf8, 0x0, 0x4313c2c017b0
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bcf0:[0x4200190c3881]FSS_AsyncFileIO@vmkernel#nover+0xe stack: 0x7946bfd, 0x420019865366, 0x7946bfe, 0x4313c2c01680, 0x7946bff
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bd10:[0x420019865365][email protected]#1.0.0-0+0xa6 stack: 0x7946bff, 0x4531066000e0, 0x431af14f3c30, 0x45ba4892a500, 0x45ba002f4e80
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bd60:[0x4200199605bb][email protected]#1.0.0-0+0x68 stack: 0x4530fae16080, 0x431af14f3c30, 0x453105a07dc0, 0x0, 0x4521c77c8040
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bda0:[0x4200199640b3][email protected]#1.0.0-0+0xe8 stack: 0x4539fdd9f100, 0x0, 0x4539fdd9bf40, 0x1, 0x0
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9be00:[0x4200199613a5][email protected]#1.0.0-0+0x25e stack: 0x4521c6402004, 0x431af1401530, 0x420041406840, 0x431af1401460, 0x431af1401474
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bea0:[0x420018a27d7f]FiltModBridge_Upcall@vmkernel#nover+0x54 stack: 0x420041401570, 0x4539fdd9f000, 0x0, 0x43210cc1bcc0, 0x4539fdd9bf40
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bef0:[0x420018fd530f]User_UWVMK64SyscallHandler@vmkernel#nover+0x104 stack: 0x420018fd4958, 0x0, 0x0, 0x0, 0x0
2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)0x4539fdd9bf40:[0x42001909edf5]SyscallUWVMK64@vmkernel#nover+0xc1 stack: 0x0, 0x0, 0x7629ad5617, 0x75e7ae5250, 0x0

  • Hard Termination: The kernel sends signal 9 (SIGKILL) to the VMX process, with core dump enabled. (/var/run/log/vmkernel.log)

    2026-04-27T12:35:24.048Z In(182) vmkernel: cpu0:3541352)User: 3259: vmx-Upcall-a402:Virtual_machine: wantCoreDump:vmx-Upcall-a402:Virtual_machine signal:9 exitCode:-1 coredump:enabled

  • Filter Disconnect: Immediately after the crash, the I/O filter daemon reports "Failed to get I/O filter command stream: I/O filter not connected". (/var/run/log/iofilterd-veecdp.log)

    2026-04-27T12:35:48.946Z In(14) iofilterd-veecdp[2099223]: [000002155364] <00000000> 946 (E) [Coordinator::onErrorOccurred]: error: vvr-4026531849: Failed to get I/O filter command stream: I/O filter not connected, node: DAEMON, component: SOURCE_REPLICATOR, replicationId: 5033c069-####-####-####-f35######f9e_60####94-####-####-####-3a3######418

  • Backend Storage Latency: Device latency warnings indicate intermittent storage performance degradation. (/var/run/log/vmkernel.log)

2026-04-27T13:34:00.517Z Wa(180) vmkwarning: cpu89:2098354)WARNING: ScsiDeviceIO: 1781: Device naa.######################### performance has deteriorated. I/O latency increased from average value of 1326 microseconds to 43936 microseconds.
2026-04-27T13:34:00.857Z In(182) vmkernel: cpu52:2098305)ScsiDeviceIO: 1781: Device naa.######################### performance has improved. I/O latency reduced from 43936 microseconds to 500 microseconds.
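The log correlation and latency indicators above can be checked with a short script. The following Python sketch assumes the vmkernel.log line formats shown in the samples. The function names, the result keys, the 10x `min_ratio` threshold, and the device name `naa.EXAMPLE` are hypothetical choices for illustration, not VMware-defined values.

```python
import re
from collections import Counter

BACKTRACE_RE = re.compile(
    r"Generating backtrace for (?P<world>\d+): (?P<name>\S+)"
)
LATENCY_RE = re.compile(
    r"Device (?P<device>\S+) performance has deteriorated\. "
    r"I/O latency increased from average value of (?P<avg>\d+) microseconds "
    r"to (?P<cur>\d+) microseconds\."
)


def backtrace_counts(lines):
    """Count backtrace entries per (world ID, world name).

    A world name containing spaces is truncated at the first space.
    """
    counts = Counter()
    for line in lines:
        match = BACKTRACE_RE.search(line)
        if match:
            counts[(int(match.group("world")), match.group("name"))] += 1
    return counts


def latency_spikes(lines, min_ratio=10.0):
    """Return deterioration warnings whose latency rose by at least min_ratio."""
    spikes = []
    for line in lines:
        match = LATENCY_RE.search(line)
        if not match:
            continue
        avg_us, cur_us = int(match.group("avg")), int(match.group("cur"))
        ratio = cur_us / avg_us if avg_us else float("inf")
        if ratio >= min_ratio:
            spikes.append({
                "timestamp": line.split()[0],
                "device": match.group("device"),
                "avg_us": avg_us,
                "cur_us": cur_us,
                "ratio": ratio,
            })
    return spikes


sample = [
    "2026-04-27T12:32:37.902Z In(182) vmkernel: cpu91:2098056)Log: 1640: "
    "Generating backtrace for 3541352: vmx-Upcall-a402:Virtual_machine",
    "2026-04-27T13:34:00.517Z Wa(180) vmkwarning: cpu89:2098354)WARNING: "
    "ScsiDeviceIO: 1781: Device naa.EXAMPLE performance has deteriorated. "
    "I/O latency increased from average value of 1326 microseconds to "
    "43936 microseconds.",
]
traces = backtrace_counts(sample)
spikes = latency_spikes(sample)
```

A world ID that appears in several backtrace entries and also matches the upcall thread ID in the filtmod-watchdog warning supports the thread-hang diagnosis. Latency spikes on the backing device close to that time point toward the storage trigger described in the Cause section.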

Resolution

Because the failure occurs within a third-party I/O filter component, the fix must come from the vendor of that component.

The following actions are recommended:

  1. Engage Veeam Support for further investigation of the veecdp I/O filter behavior.
  2. Engage the storage vendor to investigate intermittent backend storage latency spikes observed on the affected datastore devices.

Additional Information

Virtual machine crashed during backup

Virtual machines have become unresponsive or slow resulting in VSCSI resets