VM ping (communication) to physical network default gateway through the DVS using LAG/LACP is failing

Article ID: 410601

Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Using LAG/LACP on a DVS

  • VM networking is lost on DVS with LAG/LACP

  • VM ping to the default gateway fails

  • VMkernel interfaces on the DVS using LAG/LACP cannot communicate with the physical default gateway

  • vMotion fails to and from the host using LAG/LACP

  • In /var/run/log/vmkernel.log on the ESXi host, log entries similar to the following are seen:

    • 2025-08-30T22:32:57.174Z In(182) vmkernel: cpu98:2098921)lacp: LACPPduReceiver:3982: Out of memory to alloc lacp ioc data
      2025-08-30T22:32:57.305Z In(182) vmkernel: cpu156:2098755)lacp: LACPPduReceiver:3982: Out of memory to alloc lacp ioc data
      2025-08-30T22:33:27.172Z In(182) vmkernel: cpu98:2098921)lacp: LACPPduReceiver:3982: Out of memory to alloc lacp ioc data
      2025-08-30T22:33:27.305Z In(182) vmkernel: cpu144:2098755)lacp: LACPPduReceiver:3982: Out of memory to alloc lacp ioc data
  • In /var/run/log/shell.log on the ESXi host, log entries similar to the following show that, before the issue started, the user manually restarted individual services and then ran a full services.sh restart:

    • 2025-08-08T08:13:51.156Z In(14) shell[#######]: [##########]: /etc/init.d/hostd restart
      2025-08-08T08:13:51.932Z In(14) shell[#######]: [##########]: /etc/init.d/vpxa restart
      2025-08-08T08:13:53.617Z In(14) shell[#######]: [##########]: services.sh restart
  • The LACP service status check reports that the daemon is not running:

    • [root@esxi-host:~] /etc/init.d/lacp status
      LACP daemon is not running
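
The log symptom above can be checked quickly from the ESXi shell. The following is a minimal sketch, not a Broadcom-provided tool: the function name count_lacp_oom and its optional path argument are hypothetical, and only the log path and the message text come from the entries quoted above.

```shell
# Count the lines in a vmkernel log that carry the LACP allocation failure.
# count_lacp_oom is a hypothetical helper name used for illustration only.
count_lacp_oom() {
    # Default to the log path named in this article; another file
    # (for example one extracted from a log bundle) can be passed instead.
    log_file="${1:-/var/run/log/vmkernel.log}"
    grep -c 'Out of memory to alloc lacp ioc data' "$log_file"
}
```

Calling count_lacp_oom with no argument reads the live log. A non-zero count, combined with /etc/init.d/lacp status reporting that the daemon is not running, matches the symptoms described in this article.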

Environment

VMware vSphere ESXi

Cause

The services.sh restart command did not run to completion; it was interrupted partway through the script. When running this command, allow it to finish before closing the SSH session, and do not interrupt it with CTRL+C while the script is running.

Resolution

Workaround:

  • Run services.sh restart and allow the command to complete fully before closing the SSH session. Do not interrupt it with CTRL+C while the script is running.
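
One way to reduce the risk of an accidental interruption is to run the command with the relevant signals ignored. The following is a sketch under stated assumptions, not a Broadcom-documented procedure: the wrapper name run_uninterrupted and the log path in the usage note are hypothetical, and it assumes the wrapped script does not depend on receiving these signals itself.

```shell
# Run a command with SIGINT (CTRL+C) and SIGHUP (closed terminal session)
# ignored, appending its output to a log file.
# run_uninterrupted is a hypothetical helper name used for illustration only.
run_uninterrupted() {
    log_file="$1"
    shift
    (
        # Signals ignored here stay ignored in the child processes started below.
        trap '' INT HUP
        # Detach stdin and stdout/stderr from the terminal so a dropped
        # session does not cause read or write errors.
        "$@" </dev/null >>"$log_file" 2>&1
    )
}
```

For example, run_uninterrupted /tmp/services-restart.log services.sh restart would run the restart and return its exit status, with the output kept in the log file for later review. The session should still be left open until the command returns.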

Additional Information

For further root cause analysis, collect a live core dump from the impacted ESXi host before rebooting it to resolve the issue: Generating Live core dump for ESXi host

Once the live core dump has been collected, please open a case with Broadcom support:

Handling Log Bundles for offline review with Broadcom support