Testing VMkernel network connectivity with the vmkping command

Article ID: 344313


Products

VMware vCenter Server, VMware vSphere ESXi

Issue/Introduction

For troubleshooting purposes, it may be necessary to test VMkernel network connectivity between ESXi hosts in your environment.

This article provides you with the steps to perform a vmkping test between your ESXi hosts.

A typical symptom is an error such as:
 
Migration to host ###.###.###.### failed with error Connection closed by remote host, possibly due to timeout (195887167).

Timed out waiting for migration data.
 
Here, ###.###.###.### is the IP address of the VMkernel interface tagged for the vMotion service on the attempted destination ESXi host.

Environment

ESXi

Cause

If previous vMotions between the source and destination ESXi hosts were successful, and nothing has changed in the ESXi configuration of either host, then the issue is likely upstream from the vmnic (Uplink) being used to carry vMotion traffic on one or both of the hosts.

Follow the step-by-step approach in the Resolution section below to gather the evidence needed to identify a root cause.

Resolution

You will need to SSH into both the source and destination hosts with a user ID that has root privileges. For more information, see Using ESXi Shell in ESXi.
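If you have local ESXi Shell access (for example, via the DCUI), the SSH service can also be enabled from the command line, as in the sketch below; it can likewise be enabled in the vSphere client under Configure > System > Services.

vim-cmd hostsvc/enable_ssh     # enables the SSH service on the local ESXi host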

STEP 1:  Identify which VMkernel interface (for example, vmk0, vmk1, vmk2, etc.) is tagged for the vMotion service on each of the source and destination ESXi hosts.

  • Using the vSphere client logged in to vCenter, in the Hosts and Clusters view, select the source and destination ESXi hosts in turn and navigate to Configure > Networking > VMkernel adapters.
  • Expand the ">>" icon next to each configured VMkernel interface.
  • Determine which vmkN interface (vmk0, vmk1, vmk2, etc.) is tagged for the vMotion service.
  • Also note the MTU size configured on the vMotion-tagged VMkernel interface on each host; you will need it in Step 4. (The same information can be gathered from the command line, as in the sketch below.)
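The following ESXi Shell commands show the same information; this is a minimal sketch, and vmk1 is only an example interface name:

esxcli network ip interface list              # lists VMkernel interfaces with their MTU and MAC addresses
esxcli network ip interface tag get -i vmk1   # shows the service tags (for example, VMotion) on vmk1
esxcli network ip interface ipv4 get          # shows the IPv4 address of each VMkernel interface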

STEP 2:  Determine which vmnic (Uplink) is being used to carry vMotion traffic on each of the source and destination ESXi hosts in turn.

  • SSH into each ESXi host with root privileges.
  • Run esxtop, then press n to switch to the network view.
  • Observe the "Pnic" column entry associated with the VMkernel interface identified in Step 1.
  • You now have the TCP/IP data path between the source and destination ESXi hosts. (A sketch for confirming the uplink from the command line follows this list.)
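The uplink's link state, speed, and MTU can be confirmed from the ESXi Shell. A sketch, assuming a standard vSwitch named vSwitch0 (for a distributed switch, use the vSphere client instead):

esxcli network nic list                                             # link state, speed, and MTU of each vmnic
esxcli network vswitch standard policy failover get -v vSwitch0     # active and standby uplinks on vSwitch0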

STEP 3:  Using CDP and/or LLDP information, determine (if possible) to which physical switch and switchport each of the vmnics (Uplinks) identified in Step 2 is connected.

  • Using the vSphere client logged in to vCenter, in the Hosts and Clusters view, select the source and destination ESXi hosts in turn and navigate to Configure > Networking > Physical adapters.
  • Select the vmnic(s) identified in Step 2, then select the CDP and/or LLDP tab below to reveal the physical switch and switchport to which each is connected.
  • The CDP and/or LLDP information displayed here is based on data the ESXi host receives from the physical switch along the data path.
  • If no CDP or LLDP information is displayed, the physical switch may not be sending it; the team that manages the physical switches can enable it.
  • In that case, note the MAC address of the VMkernel interface identified in Step 1 to help the physical-switch team trace where each vmnic (Uplink) is connected. (A sketch for retrieving the MAC address follows this list.)
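The MAC address of each VMkernel interface can be read from the ESXi Shell; a minimal sketch:

esxcfg-vmknic -l     # lists each vmk interface with its IP address, MAC address, MTU, and netstack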

STEP 4:  Test connectivity along the data path identified above using the vmkping command.

  • The vmkping command sources a ping from the local VMkernel port.
  • To test VMkernel ping connectivity, run:

vmkping -I vmkN ###.###.###.### -s SSSS -d -c 3

  • In the above command:
    • vmkN is the VMkernel interface tagged for the vMotion service on the source host.
    • ###.###.###.### is the IP address of the VMkernel interface tagged for the vMotion service on the destination host.
    • SSSS is calculated by subtracting 28 (the 20-byte IP header plus the 8-byte ICMP header) from the MTU size noted in Step 1 (see the worked examples after this list).
    • The -d parameter sets the "do not fragment" bit on the ICMP request.
    • The -c 3 parameter makes 3 ping attempts. To gather a larger sample, for example 900 attempts over a 90-second period, change this to "-c 900" and add "-i 0.1", which makes 10 attempts per second. A larger sample is useful for determining the standard deviation of the round-trip latency data points, which indicates how consistent the network performance is over the test period.
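Worked sizing examples (the destination address is a placeholder, as elsewhere in this article, and vmk1 is an example interface name): with a standard 1500-byte MTU, SSSS = 1500 - 28 = 1472; with a 9000-byte jumbo-frame MTU, SSSS = 9000 - 28 = 8972.

vmkping -I vmk1 ###.###.###.### -s 1472 -d -c 3     # standard 1500-byte MTU
vmkping -I vmk1 ###.###.###.### -s 8972 -d -c 3     # jumbo-frame 9000-byte MTU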

 

  • NOTE:  If the VMkernel interface tagged for the vMotion service is configured to use the "vmotion" TCP/IP stack, add "-S vmotion" to the end of your vmkping command.
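For example (a sketch; the interface name and destination address are placeholders):

vmkping -I vmk1 -S vmotion ###.###.###.### -s 1472 -d -c 3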

A successful ping response would appear similar to:

vmkping -I vmk0 10.0.0.1 -s 1472 -d -c 3

PING server (10.0.0.1): 1472 data bytes
1480 bytes from 10.0.0.1: icmp_seq=0 ttl=64 time=10.245 ms
1480 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.935 ms
1480 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.926 ms
--- server ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.926/4.035/10.245 ms


An unsuccessful ping response is similar to:

vmkping 10.0.0.2 -s 1472 -d -c 3

PING server (10.0.0.2): 1472 data bytes

--- server ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
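If the full-size do-not-fragment ping fails but a small ping to the same destination succeeds, an MTU mismatch along the data path is a likely cause. A diagnostic sketch (the interface name and destination address are placeholders):

vmkping -I vmk1 ###.###.###.### -d -c 3             # default 56-byte payload; tests basic reachability
vmkping -I vmk1 ###.###.###.### -s 8972 -d -c 3     # full jumbo-frame payload; tests end-to-end MTU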


  • NOTE: The commands shown above are the same for IPv6; just add the -6 option and replace the IPv4 address with an IPv6 address, for example:

    vmkping -6 -I vmkN ##:##:##:##:##:##:##:##

 

  • The full list of vmkping options is:

vmkping [args] [host]

arg              use
-4               use IPv4 (default)
-6               use IPv6
-c <count>       set packet count
-d               set DF bit (do not fragment) in IPv4, or disable fragmentation in IPv6
-D               VMkernel TCP stack debug mode
-i <interval>    set interval (secs)
-I <interface>   set outgoing interface, such as "-I vmk1"
-N <next_hop>    set IP*_NEXTHOP; bypasses routing lookup
                 (for IPv4, -I is required to use -N)
-s <size>        set the number of ICMP data bytes to be sent;
                 the default is 56, which translates to a 64-byte ICMP frame
                 once the 8-byte ICMP header is added (these sizes do not
                 include the IP header)
-t <ttl>         set IPv4 Time To Live or IPv6 Hop Limit
-v               verbose
-W <time>        set timeout to wait if no responses are received (secs)
-X               XML output format for the esxcli framework
-S <stack>       set the network stack instance name; if unspecified, the
                 default stack is used (note: works only for IPv4, not IPv6)
  • For testing TEP-to-TEP VMkernel connectivity between hosts (the "vxlan" network stack):

vmkping -I vmk10 -S vxlan <destination_host's_TEP_VMK_IP>

Example:
vmkping -I vmk10 -S vxlan #.#.#.#
PING #.#.#.# (#.#.#.#): 56 data bytes
64 bytes from #.#.#.#: icmp_seq=0 ttl=64 time=1.218 ms
64 bytes from #.#.#.#: icmp_seq=1 ttl=64 time=0.716 ms
64 bytes from #.#.#.#: icmp_seq=2 ttl=64 time=1.097 ms

--- #.#.#.# ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.716/1.010/1.218 ms

 

CONCLUSIONS:

1) If the vmkping command produces successful results, then there is likely no data path issue between the vmnics (Uplinks) in the source and destination hosts.

2) If the vmkping command does not produce successful results, then engage the team that manages the physical switches and switchports to which the vmnics (Uplinks) identified in Step 2 are connected.

  • Please supply them with all of the information you collected in STEP 1 through STEP 4, above.
  • After they check the data path and the physical switch/switchport configurations, if they cannot resolve the issue, open a support case with Broadcom.
  • If you have not yet created a Broadcom support case, follow the instructions at Creating and managing Broadcom support cases.

 

 

Additional Information