Physical interface unreachable when VM on specific host

Article ID: 412927

Products

VMware vSphere ESXi

Issue/Introduction

In a vCenter-managed cluster of ESXi hosts, a virtual machine (VM) running on one ESXi host cannot reach a network interface of a physical machine outside that host. When the same VM is migrated to another ESXi host in the cluster, it can reach the interface.

Note: An ESXi host that belongs to a cluster is sometimes referred to as a "node" of that cluster.

These symptoms may also appear on a non-clustered host.

You may see alerts on vCenter for the problematic host:

  • vSphere Distributed Switch VLAN trunked status
    Not all the configured VLANs in the vSphere Distributed Switch were trunked by the physical switch connected to uplink port # in vSphere Distributed Switch on host ###.###.### in #######.
  • vSphere Distributed Switch MTU supported status
    Not all VLAN MTU settings on the external physical switch allow the vSphere Distributed Switch maximum MTU size packets to pass on the uplink port # in vSphere Distributed Switch on host ###.###.### in #######.

The affected VM cannot ping outbound while it is running on the problematic host.

Environment

VMware vSphere ESXi

Resolution

The root cause of the symptom is not in the ESXi networking stack if both of the following are true:

  • Packets supplied to the stack by the guest VM at capture point VnicTx are delivered completely and in a timely way (within microseconds) to the physical infrastructure, as seen at capture point UplinkSndKernel.
  • Packets received by the stack from the physical infrastructure, as seen at capture point UplinkRcvKernel, are delivered completely and in a timely way (within microseconds) to the guest VM at capture point VnicRx.

The next step is to investigate the guest VM's operating system and/or internal networking stack, and/or the physical infrastructure external to the vmnic (uplink) on the ESXi host. 

If packets are seen at one end of the ESXi networking stack but not the other, perform packet captures at the intermediate capture points, as described in KB 341568 "Packet capture on ESXi using the pktcap-uw tool", to determine where the packets are being blocked.
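The reasoning above can be sketched as a small shell function. This is a minimal illustration, not part of ESXi: the function name classify_drop and its output labels are hypothetical, and it checks only whether any packets were seen at each capture point. It does not check completeness or delivery time, both of which this article also requires.

```shell
#!/bin/sh
# Hypothetical helper (not an ESXi tool): classify where traffic stops.
# Arguments are packet counts observed for one flow at each capture
# point, in path order: VnicTx UplinkSndKernel UplinkRcvKernel VnicRx.
# Simplification: only presence is checked, not completeness or latency.
classify_drop() {
    vnic_tx=$1 uplink_snd=$2 uplink_rcv=$3 vnic_rx=$4

    if [ "$vnic_tx" -eq 0 ]; then
        # Nothing entered the stack from the guest.
        echo "guest: no packets at VnicTx; check the guest OS and its networking"
    elif [ "$uplink_snd" -eq 0 ]; then
        # Entered the stack but never reached the uplink.
        echo "esxi-tx: lost between VnicTx and UplinkSndKernel; capture at intermediate points"
    elif [ "$uplink_rcv" -eq 0 ]; then
        # Left the host, but nothing came back from the wire.
        echo "physical: sent on the uplink but no replies; check the physical network"
    elif [ "$vnic_rx" -eq 0 ]; then
        # Replies reached the host but not the VM.
        echo "esxi-rx: lost between UplinkRcvKernel and VnicRx; capture at intermediate points"
    else
        echo "ok: packets seen at all four capture points"
    fi
}
```

For example, classify_drop 20 20 0 0 prints a line beginning with "physical:", which corresponds to the case where the VM's traffic reaches the uplink but no responses return from the physical network.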

  1. Identify the PortNum and MacAddress for the VM in question using the following command:

    net-stats -l
  2. Initiate Packet Capture from the Problematic Host for the VM in question:
    If necessary, start pinging from within the VM to an IP address on the network to generate traffic.

    pktcap-uw --switchport <PortNum> --capture VnicTx,VnicRx -o - | tcpdump-uw -r - -enn

    Review the packet capture results to confirm whether traffic is flowing from the VM in question toward the physical switch.


  3. Identify the vmnic (Client) for the uplink on the problematic distributed switch:

    1. Identify the possible uplinks in the appropriate vSwitch or vDS for the VM, using
      esxcfg-vswitch -l
    2. Using esxtop, identify in the "Pnic" (TEAM-PNIC) column which uplink (vmnic) is actually carrying the traffic in question.
      esxtop --> n    (run esxtop, then press n to switch to the network view)
      The network view shows either "vmnic#" (where # is the vmnic number) or "all(#)" (where # is the number of uplinks configured in an LACP LAG or EtherChannel configuration).

      If the result is "all(#)", you must capture on EACH of the uplinks simultaneously, because the physical switches to which the uplinks are connected determine which data path is used.

  4. Initiate Packet Capture from the Problematic Host for the uplink in question:

    pktcap-uw --uplink <vmnic> --capture UplinkSndKernel,UplinkRcvKernel -o - | tcpdump-uw -r - -enn | grep <VM MacAddress> | grep -i arp

Review the packet capture results to confirm whether traffic is flowing through the uplink to the physical switch.

If you see traffic flowing from the VM and being delivered to the physical network but no responses from the physical network, then the issue is occurring outside of the virtual network and the physical network configuration should be investigated to identify the problem.
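When the uplink column shows "all(#)" and several uplinks must be captured at the same time, the steps above can be scripted. The following is a minimal sketch under stated assumptions: find_vm_port and print_uplink_captures are hypothetical helper names, the net-stats -l output is assumed to list PortNum in the first column and the MAC address in the fifth, and the output directory is a placeholder you choose. The second function only prints the commands so they can be reviewed before being run. It uses the pktcap-uw --mac filter in place of the grep on the MAC address.

```shell
#!/bin/sh
# Hypothetical helper: read `net-stats -l` output on stdin and print
# "PortNum MACAddress" for every port line that mentions the VM name.
# Assumes PortNum is column 1 and the MAC address is column 5.
find_vm_port() {
    awk -v vm="$1" 'NR > 1 && index($0, vm) { print $1, $5 }'
}

# Hypothetical helper: print (do not run) one background capture command
# per uplink, so that all uplinks in a LAG can be captured simultaneously.
# Arguments: <VM MacAddress> <output directory> <vmnic> [<vmnic> ...]
print_uplink_captures() {
    mac=$1
    outdir=$2
    shift 2
    for nic in "$@"; do
        echo "pktcap-uw --uplink $nic --capture UplinkSndKernel,UplinkRcvKernel --mac $mac -o $outdir/$nic.pcap &"
    done
}
```

A possible usage on the problematic host, where myvm, the MAC address, the directory, and the vmnic names are placeholders:

    net-stats -l | find_vm_port myvm
    print_uplink_captures 00:50:56:aa:bb:cc /vmfs/volumes/<datastore>/captures vmnic0 vmnic1

Writing each uplink to its own file keeps the captures separate, so you can see which uplink carried the VM's frames and which one, if any, received the replies.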