After a VM is rebooted, it sometimes does not get a correct IP from the DHCP server



Article ID: 418640


Products

VMware vSphere ESXi

Issue/Introduction

After a VM is rebooted, it sometimes does not get a correct IP from the DHCP server

Cause

These symptoms can have several different causes, including:

  1. Issues with the guest VM's operating system or its components

  2. Issues inside the ESXi hypervisor on which the VM is running

  3. Issues external to the vmnic (uplink) of the ESXi host on which the VM is running

Resolution

The easiest way to rule the ESXi hypervisor in or out as a possible root cause is to use the techniques in the KB Packet capture on ESXi using the pktcap-uw tool.

When a VM is rebooted, the hypervisor selects one of the physical uplinks associated with the network (port group or distributed virtual port group) to carry the VM's TCP/IP packets.

A TCP/IP packet containing the request to obtain a DHCP IP address for the VM is sent along that data path (from the virtual interface of the VM to the physical interface of the uplink) and is then handed to the physical infrastructure for forwarding to the DHCP server.

  1. Determine which ESXi host is running the VM of interest

  2. SSH into that host with root privileges

  3. Determine the PortNum associated with the VM by running a command like:
    • net-stats -l | grep <VM name> 

  4. Determine the TEAM-PNIC associated with the VM by running:
    • esxtop (then press n for "network", find the row whose PORT-ID matches the PortNum obtained in step 3, and note the vmnic (uplink) listed under TEAM-PNIC that is carrying the traffic for the VM)

  5. Use the techniques in Packet capture on ESXi using the pktcap-uw tool to capture the packets at the following capture points:
    • --switchport ######### --capture VnicTx,VnicRx (where ######### is the PortNum obtained in step 3)
    • --uplink vmnic# --capture UplinkSndKernel,UplinkRcvKernel (where # is the number of the vmnic; for example vmnic1 or vmnic2)
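Step 3 above can be scripted so that the PortNum does not have to be read off the grep output by hand. The following is a minimal sketch, not part of the referenced KB: portnum_for_vm is a hypothetical helper name, and it assumes that PortNum is the first, purely numeric column of the net-stats -l listing.

```shell
# portnum_for_vm VM_NAME
# Reads `net-stats -l` output on stdin and prints the PortNum of every
# line that mentions VM_NAME.
# Assumption: PortNum is the first column and consists only of digits,
# so the header line is skipped automatically.
portnum_for_vm() {
    awk -v vm="$1" 'index($0, vm) && $1 ~ /^[0-9]+$/ { print $1 }'
}
```

On the ESXi host this could be used as net-stats -l | portnum_for_vm <VM name>. A VM with more than one virtual NIC is expected to produce one PortNum per NIC, so pick the one that belongs to the affected interface.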

The KB Packet capture on ESXi using the pktcap-uw tool explains how to direct those captures to files that can then be reviewed using a tool like Wireshark. 
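As an illustration of step 5, the sketch below assembles the two capture commands with an output file each. It is an example under assumptions rather than the KB's own procedure: print_capture_cmds and the file naming are hypothetical, and the -o option for writing to a file should be checked against the referenced KB for your ESXi version.

```shell
# print_capture_cmds PORTNUM VMNIC OUTDIR
# Prints (does not run) one pktcap-uw command per capture location.
# PORTNUM comes from step 3, VMNIC from step 4; OUTDIR is any
# directory with enough free space (hypothetical, chosen by you).
print_capture_cmds() {
    portnum=$1
    vmnic=$2
    outdir=$3
    # Virtual NIC side: packets sent and received by the guest.
    echo "pktcap-uw --switchport $portnum --capture VnicTx,VnicRx -o $outdir/switchport-$portnum.pcap"
    # Uplink side: packets leaving and entering the host via the vmnic.
    echo "pktcap-uw --uplink $vmnic --capture UplinkSndKernel,UplinkRcvKernel -o $outdir/uplink-$vmnic.pcap"
}
```

The function only prints the commands so that they can be reviewed before being pasted into separate SSH sessions. Start both captures before rebooting the VM, and stop them once the VM has attempted to obtain its address.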

CONCLUSIONS:

  1. If the packets sent by the guest, as seen at --switchport --capture VnicTx, also appear at --uplink --capture UplinkSndKernel with no drops and no delays beyond a few dozen microseconds, then the ESXi hypervisor is doing its job of delivering packets to the physical infrastructure outside the vmnic.

  2. If the packets received by the host, as seen at --uplink --capture UplinkRcvKernel, also appear at --switchport --capture VnicRx with no drops and no delays beyond a few dozen microseconds, then the ESXi hypervisor is doing its job of delivering packets received from the physical infrastructure outside the vmnic to the guest operating system.

  3. If both of the above conditions are true, then the investigation is outside the scope of Broadcom support, and the guest operating system vendor and/or the team that manages the physical infrastructure external to the vmnics on the ESXi host must be engaged to determine the root cause of the symptoms.
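The drop and delay checks in conclusions 1 and 2 can be sketched as a small comparison of two packet lists. This is an illustration only: compare_points is a hypothetical helper, and it assumes each capture has first been exported (for example from Wireshark) to a text file with one line per packet in the form "<key> <timestamp in microseconds>", where the key is any value that identifies the same packet at both capture points. The default limit of 100 microseconds is an arbitrary stand-in for "a few dozen microseconds" and should be adjusted.

```shell
# compare_points FIRST_FILE SECOND_FILE [LIMIT_US]
# FIRST_FILE:  packets at the point they enter the host data path
#              (e.g. VnicTx, or UplinkRcvKernel for the return path).
# SECOND_FILE: the same packets at the point they leave it
#              (e.g. UplinkSndKernel, or VnicRx for the return path).
# Each line: "<key> <timestamp_us>" (hypothetical export format).
compare_points() {
    awk -v limit="${3:-100}" '
        NR == FNR { first[$1] = $2; next }   # remember first capture point
        ($1 in first) {
            d = $2 - first[$1]               # transit time through the host
            if (d > max) max = d
            if (d > limit) slow++
            seen[$1] = 1
        }
        END {
            # Packets seen at the first point but never at the second.
            for (k in first) if (!(k in seen)) missing++
            printf "missing=%d slow=%d max_delay_us=%d\n", missing, slow, max
        }
    ' "$1" "$2"
}
```

A result of missing=0 slow=0 for both directions corresponds to conditions 1 and 2 being true. Any non-zero value is a reason to look more closely at the captures rather than proof of a fault, since the outcome depends on how the keys were chosen.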