ESXi Host and virtual machines lose network connectivity during VM backup.

Article ID: 432779

Products

VMware vSphere ESXi

Issue/Introduction

  • When a scheduled virtual machine backup is triggered, network connectivity issues occur across the ESXi host and its VMs in the cluster.
  • Continuous ICMP pings from an external jump server to the ESXi host or VMs show intermittent packet drops.
  • Backups complete successfully, but the ESXi host and VMs remain unresponsive for a brief period.
  • Network connectivity is automatically restored upon completion of the backup task.
  • vCenter Server shows multiple missed heartbeats and connection loss for the affected host.

    /var/log/vmware/vpxd/vpxd.log

    YYYY-MM-DD:T:HH:MM:SS info vpxd[#####] [Originator@68## sub=HostCnx opID=CheckforMissingHeartbeats-380####] [VpxdHostCnx] No heartbeats received from host; cnx: <UUID>, host-<ID>, time since last heartbeat: 98769ms
    YYYY-MM-DD:T:HH:MM:SS info vpxd[#####] [Originator@68## sub=HostCnx opID=CheckforMissingHeartbeats-380####] Marking the connection alive to false: <UUID>
    YYYY-MM-DD:T:HH:MM:SS info vpxd[#####] [Originator@68## sub=InvtHostCnx opID=CheckforMissingHeartbeats-380####] Got lost connection callback for host-<ID>
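To line up the missed heartbeats with the backup window, it can help to list only the relevant vpxd.log entries. The following is a minimal sketch, not an official tool: heartbeat_gaps is a hypothetical helper name, and the match pattern is taken from the sample log lines above. It prints the timestamp of each "No heartbeats received" entry together with the reported gap in milliseconds.

```shell
#!/bin/sh
# heartbeat_gaps: hypothetical helper (not part of vCenter Server).
# Prints "<timestamp> <gap in ms>" for every missed-heartbeat entry.
# Assumes the message format shown in the sample log lines.
heartbeat_gaps() {
    # $1: path to a vpxd.log file
    awk '/No heartbeats received from host/ {
        if (match($0, /time since last heartbeat: [0-9]+ms/)) {
            gap = substr($0, RSTART, RLENGTH)   # matched phrase only
            gsub(/[^0-9]/, "", gap)             # keep just the digits
            print $1, gap                       # first field is the timestamp
        }
    }' "$1"
}

# Example, run on the vCenter Server appliance:
#   heartbeat_gaps /var/log/vmware/vpxd/vpxd.log
```

If the listed timestamps fall inside the scheduled backup window, that supports the symptom pattern described in this article.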

Environment

VMware vCenter Server 8.0
VMware vSphere ESXi 8.0

Cause

This issue is caused by network congestion or traffic interception on the physical network during backup operations. High-volume data from security scanning tools or traffic mirroring across VLANs can overwhelm the network during backup windows, preventing ESXi heartbeat packets from reaching the vCenter Server.

Resolution

Engage the internal network team. This issue can be caused by a traffic-mirroring solution in the environment, although other sources of congestion on the physical network are also possible.

  • To identify the cause of network disconnects during backups, perform simultaneous packet captures on the ESXi host and vCenter Server using these steps:

    • Run a packet capture from an SSH session on the ESXi host to confirm whether the host is actively sending heartbeat packets to the vCenter Server. Replace vmnicX with the physical NIC bound to the management vmkernel port (identify this in esxtop by pressing n).

      pktcap-uw --uplink vmnicX --capture UplinkSndKernel --udpport 902 -o - | tcpdump-uw -enr - > /tmp/esxi.pcap

    • Run a packet capture from an SSH session on the vCenter Server to confirm whether the vCenter Server is receiving the heartbeats.

      tcpdump src host <esxi_host_ip_address> and udp port 902 -w /tmp/vCenter.pcap

    • Confirm whether traffic for a specific VM is being dropped by identifying the VM's switch port and running a capture on it.

      • To identify the switch port: net-stats -l

      • To perform the capture (replace [PortID] with the ID from the previous step):

        pktcap-uw --switchport [PortID] --capture VnicTx,VnicRx -s 256 -o /tmp/vm.pcap

  • Review the captures. If "port unreachable" entries appear, or if UDP 902 packets are sent by the host but not received by vCenter, the issue is likely due to physical network congestion.

    (Screenshot: packet capture for a virtual machine, taken from the ESXi host, showing the VM unreachable during the backup window.)

    (Screenshot: packet capture for a virtual machine after the backup activity has completed.)

  • Engage the internal network or firewall team to identify where the packets are being dropped.
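One way to review the two heartbeat captures side by side is to count UDP 902 packets per minute on each end and look for minutes where the host sent packets that vCenter did not receive. The sketch below is illustrative only: packets_per_minute is a hypothetical helper name, and it assumes tcpdump's default text output, where each line starts with an HH:MM:SS.ffffff timestamp. It also assumes the host and vCenter clocks are in sync and in the same time zone; otherwise the minute buckets will not line up.

```shell
#!/bin/sh
# packets_per_minute: hypothetical helper for comparing the two captures.
# Reads tcpdump text output on stdin and prints "HH:MM <count>" per minute.
# Assumes every packet line starts with tcpdump's default timestamp.
packets_per_minute() {
    awk '$1 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
             count[substr($1, 1, 5)]++        # bucket by HH:MM
         }
         END { for (m in count) print m, count[m] }' | sort
}

# Example usage (assumptions: the host command above pipes through
# tcpdump-uw without -w, so /tmp/esxi.pcap holds text; the vCenter file
# is a binary capture and must be read back with tcpdump -r):
#   packets_per_minute < /tmp/esxi.pcap
#   tcpdump -nr /tmp/vCenter.pcap | packets_per_minute
```

Minutes that appear with a normal count in the host output but are missing or much lower in the vCenter output point to loss on the physical network between the two.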

Additional Information

If the packet capture files show no dropped packets, investigate the host's unresponsive events instead. Refer to Troubleshooting an ESXi host in a "not responding"/"disconnected" state.