Error "Failed to create swap file" seen when attempting to power on a VM on a specific ESXi host within the cluster.

Article ID: 406145


Updated On:

Products

VMware vSAN

Issue/Introduction

Powering on a VM fails with the following error:

Module MonitorLoop power on failed.
Failed to start the virtual machine.
Failed to power on VM. Could not power on virtual machine: Failure.
Failed to create swap file /vmfs/volumes/vsan:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx:.vswp: Failure

Environment

VMware vSAN 7.X 

VMware vSAN 8.X 

VMware vSAN 9.X 

Cause

There are numerous 'RPC to DOM op create returned' error messages (for example, 'No connection' or 'Socket is not connected') in /var/log/vmkernel.log on the affected host.

vmkwarning: cpu17: ##########)WARNING: VSAN: VsanIoctlCtrlNodeCommon:3138: #######-####-####-####-#########: RPC to DOM op create returned: Socket is not connected.

These errors indicate an underlying network issue from this host's perspective: the host cannot reliably communicate with the other nodes in the cluster.
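To confirm that the affected host is logging these errors, the vmkernel log can be searched directly. The following is a minimal sketch, assuming the default ESXi log locations (/var/log/vmkernel.log for the live log and /var/run/log for rotated copies):

# Search the live vmkernel log for the DOM RPC errors
grep -i "RPC to DOM op create returned" /var/log/vmkernel.log

# Also search rotated vmkernel logs, if any are present
zcat /var/run/log/vmkernel.*.gz 2>/dev/null | grep -i "RPC to DOM op create returned"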

Resolution

A network investigation needs to be done on the host where VMs fail to power on or cannot be created. Check the following:

  • Check that the MTU is consistent across the entire cluster, especially on the problematic host (see the sample commands after this list). Instructions for troubleshooting MTU connectivity can be found here.
  • Check for membership drops, heartbeat timeouts, and "Wait for RDT" on the ESXi host, to see whether any of the nodes are dropping from the network or experiencing network slowness. To check for membership drops, review /var/log/vmkernel.log; example entries follow the MTU sketch below.
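For the MTU check in the first item, a quick test can be run from the affected host. This is a minimal sketch: vmk1 stands in for the vSAN VMkernel interface and 192.168.10.12 for another node's vSAN IP address, and the 8972-byte payload assumes a 9000-byte (jumbo frame) MTU; use 1472 for a standard 1500-byte MTU.

# List VMkernel interfaces with their configured MTU
esxcli network ip interface list

# Send a large, non-fragmentable ping to another vSAN node to validate the end-to-end MTU
vmkping -I vmk1 -d -s 8972 192.168.10.12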

CMMDS: LeaderRemoveNodeFromMembership:####: ########-####-####-####-#######: Removing node #######-####-####-####-####### (vsanNodeType: data) from the cluster membership
CMMDS: CMMDSUtil_PrintArenaEntry:98: ########-####-####-####-#######: [1341898]: Adding a new Membership entry (######-#####-####-####-######) with 8 members:
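To spot these membership changes and to confirm the host's current view of the cluster, commands along these lines can be used (a sketch, assuming the default log path):

# Look for nodes being removed from the CMMDS cluster membership
grep -i "LeaderRemoveNodeFromMembership" /var/log/vmkernel.log

# Show this host's current view of the vSAN cluster membership
esxcli vsan cluster get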

In /var/log/vobd.log, check for heartbeat timeouts, which indicate that heartbeats between the nodes are timing out due to network slowness.

 [esx.problem.vmfs.heartbeat.recovered]:####: ######-####-#####-####-###### :####: #######-####-####-####-########
 [vob.vmfs.heartbeat.timedout]:####: #######-####-####-####-#######: ####: ######-####-####-####-########
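These events can be pulled out of the log quickly; the sketch below assumes the default /var/log/vobd.log location:

# List heartbeat timeout and recovery events recorded by vobd
grep -iE "heartbeat.timedout|heartbeat.recovered" /var/log/vobd.log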

To check for "Wait for RDT", which indicates slowness on the network while the vSAN protocol waits for responses from other nodes, check the vsantracesUrgent log for messages like the one below.

2025-03-10T00:16:45.439638 [3612038] [cpu22] [c826378b OWNER readWithBlkAttr5 VMDISK] DOMTraceOpTookTooLong:10304: {'op': 0x45bad8d63c80, 'objUuid': '######-####-####-####-########', 'offset-39': 138322706432, 'length-25': 65536, 'totalTimeMS': 10014, 'timeInThisPhaseMS': 10014, 'opPhase': 'Wait for RDT'

  • To verify whether there are underlying network card issues, check here (see the sample commands after this list).
  • Reboot this ESXi host to force it to re-establish its network connections. Make sure the host is in maintenance mode with the default "Ensure accessibility" option prior to the reboot (see the sketch after this list).
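For the last two items, the physical NICs can be checked for link state and error counters, and the host can be placed into maintenance mode with the default "Ensure accessibility" vSAN option before rebooting. This is a minimal sketch; vmnic0 stands in for the uplink that carries vSAN traffic:

# List physical NICs with link status, speed, and driver
esxcli network nic list

# Check receive/transmit error counters on the vSAN uplink
esxcli network nic stats get -n vmnic0

# Enter maintenance mode with "ensure accessibility" data handling, then reboot
esxcli system maintenanceMode set -e true -m ensureObjectAccessibility
reboot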

Additional Information

vSAN network troubleshooting

vSAN performance diagnostics reports: "There are errors seen in one or more physical NICs on one or more hosts"