Intermittent connectivity for VMs or vSphere services

Article ID: 417584

Products

VMware vCenter Server

Issue/Introduction

VMs may seem to periodically disconnect, including after a vMotion or power-on operation.

Some vSphere services, such as the Native Key Provider (NKP), may intermittently disconnect or flap between active and inactive states.

Environment

vSphere (all versions)

Cause

This issue can occur when at least one NIC in a teaming policy is unable to communicate on the network while at least one other NIC in the team can. When a VM or host service is using the "bad" NIC, connectivity issues are seen, but only until traffic moves to another NIC.

How often the issue is seen is very environment-specific. It can depend on how many NICs in the team can communicate versus how many cannot, whether a port channel or LAG is in use, and similar factors. In other words, the issue is seen more frequently in environments where the NICs that cannot communicate are expected to be used more often.
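As a rough illustration of that relationship, the sketch below estimates the share of VM or vmkernel placements that would land on a non-communicating NIC. It assumes every placement is equally likely to use any uplink in the team, which is a simplification and not something this KB states; the function name `expected_affected_share` is hypothetical. Real teaming policies, port channels, and LAGs distribute traffic differently.

```python
from fractions import Fraction


def expected_affected_share(total_uplinks: int, bad_uplinks: int) -> Fraction:
    """Share of placements expected to land on a non-communicating uplink.

    Assumption (not from the KB): each placement is equally likely to
    use any uplink in the team.
    """
    if total_uplinks <= 0 or not 0 <= bad_uplinks <= total_uplinks:
        raise ValueError("need total_uplinks > 0 and 0 <= bad_uplinks <= total_uplinks")
    return Fraction(bad_uplinks, total_uplinks)


# Compare a few hypothetical team sizes.
for total, bad in [(2, 1), (4, 1), (4, 3)]:
    share = expected_affected_share(total, bad)
    print(f"{bad} of {total} uplinks failing: roughly {float(share):.0%} of placements affected")
```

Under this assumption, a team with three failing uplinks out of four would show the symptom far more often than a team with one failing uplink out of four.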

To determine whether the behavior follows only certain NICs, you can check in real time which NIC a given VM or vmkernel adapter is using by running the esxtop command from the CLI:

  1. Open an SSH session to the host in question.
  2. Type "esxtop", then press the Enter key.
  3. Press the "n" key to open the networking page.
  4. Note the NIC (e.g. vmnic2) in use by an item by reviewing the TEAM-PNIC column.
  5. Press "q" to exit the screen.

NOTE: If the TEAM-PNIC column shows "all", the VM/vmkernel adapter is in a portgroup that uses the load balancing policy Route based on IP hash, which typically indicates that a port channel is in use. See Configure Virtual Switch with an EtherChannel (or port channel) for more info. Similarly, if the column reports the name of a LAG created on the distributed switch, for example "lag1", the VM/vmkernel adapter is using an LACP connection. See Introduction to LACP in VMware by Broadcom for more information about this configuration.
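Once you have noted TEAM-PNIC values during both working and failing periods, a small script can help check whether failures follow particular NICs. The following is a minimal sketch, not a Broadcom tool: the observation records, item names, and function names are hypothetical, and the data is entered by hand from what you saw in esxtop. It assumes single physical NICs are named with the "vmnic" prefix, and it skips "all" and LAG names because they do not identify a single NIC.

```python
from collections import defaultdict

# Hypothetical, hand-recorded observations:
# (VM or vmkernel adapter, TEAM-PNIC value at the time, was it reachable?)
OBSERVATIONS = [
    ("vm-app01", "vmnic2", False),
    ("vm-app02", "vmnic3", True),
    ("vmk1", "vmnic2", False),
    ("vm-app01", "vmnic3", True),
    ("vm-db01", "all", True),
]


def summarize_by_uplink(observations):
    """Return {TEAM-PNIC value: (failed_count, total_count)}."""
    counts = defaultdict(lambda: [0, 0])
    for _item, team_pnic, reachable in observations:
        counts[team_pnic][1] += 1
        if not reachable:
            counts[team_pnic][0] += 1
    return {pnic: tuple(pair) for pnic, pair in counts.items()}


def suspect_uplinks(observations):
    """List single NICs for which every recorded observation failed.

    Values such as "all" or a LAG name are ignored here because they
    do not point at one physical NIC.
    """
    summary = summarize_by_uplink(observations)
    return sorted(
        pnic
        for pnic, (failed, total) in summary.items()
        if pnic.startswith("vmnic") and failed == total
    )


for pnic, (failed, total) in sorted(summarize_by_uplink(OBSERVATIONS).items()):
    print(f"{pnic}: {failed} of {total} observations failed")
print("Suspect NICs:", suspect_uplinks(OBSERVATIONS))
```

With the sample data above, only vmnic2 is flagged, since every item seen on it was unreachable while the same items worked on vmnic3. A result like this is a starting point for the networking investigation, not proof of a faulty NIC.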

Resolution

Work with your networking team to investigate the underlying reason why the impacted NICs can't communicate.

If you need help identifying the potentially problematic NIC, or any further assistance, please create a Broadcom Support Case and reference this KB. See Creating and managing Broadcom support request (SR) cases.

Additional Information

See also VMs intermittently lose network connections after vMotion