ESXi Hosts Constantly Sending DNS Queries to the DNS Server

Article ID: 385346

Products

VMware vSphere ESX 8.x

Issue/Introduction

  • ESXi hosts constantly send DNS queries to the DNS server.
  • The DNS server is kept busy by these queries and becomes slow to respond to other DNS queries.

Environment

vSphere ESXi 8.x

Cause

When a DKVS (Distributed Key-Value Store) cluster is in an error state, the replica hosts constantly retry their connections to each other, which is known to generate a large volume of DNS traffic.

Resolution

VMware is aware of this behavior and is working to fix it in a future release.

Confirmation

  • Verify whether the DKVS (Distributed Key-Value Store) service is running on the hosts by running the following command on each ESXi host:

    /usr/lib/vmware/clusterAgent/bin/clusterAdmin cluster status

    Example of DKVS Running on the ESXi host

    [root@ESXi:/] /usr/lib/vmware/clusterAgent/bin/clusterAdmin cluster status

    {
       "state": "hosted",
       "cluster_id": "############",          >>>>>>>>>>>>>>>>  DKVS is running on the host
       "is_in_alarm": false,
       "alarm_cause": "",
       "is_in_cluster": true,
       "members": {
          "available": true
       },
       "namespaces": [
          {
             "name": "root",
             "up_to_date": true,
             "members": [
                {
                   "peer_address": "##.##.##.##:##",
                   "api_address": "##.##.##.##:##",
                   "reachable": true,
                   "primary": "no",
                   "learner": false
                },
                {
                   "peer_address": "##.##.##.##:##",
                   "api_address": "##.##.##.##:##",
                   "reachable": true,
                   "primary": "yes",
                   "learner": false
                },
                {
                   "peer_address": "##.##.##.##:##",
                   "api_address": "##.##.##.##:##",
                   "reachable": true,
                   "primary": "no",
                   "learner": false
                }
             ]
          }
       ]
    }

    Example of DKVS Not Running on the ESXi host

    [root@ESXi:/] /usr/lib/vmware/clusterAgent/bin/clusterAdmin cluster status
    {
       "state": "standalone",
       "cluster_id": "",          >>>>>>>>>>>>>>>>  DKVS is not running on the host as there is no cluster id
       "is_in_alarm": false,
       "alarm_cause": "",
       "is_in_cluster": false,
       "members": {
          "available": false
       }
    }
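The check above can also be scripted when many hosts need to be reviewed. The following is a minimal sketch, not a VMware-provided tool. It assumes the command prints plain JSON shaped like the examples above (the ">>>>" annotations are commentary in this article and are not part of the output). The function name summarize_dkvs_status and the idea of treating "is_in_alarm" and unreachable members as signs of the error state are assumptions made for illustration.

```python
import json


def summarize_dkvs_status(raw):
    """Summarize the JSON text printed by `clusterAdmin cluster status` for one host."""
    status = json.loads(raw)

    # Collect the peer address of every member reported as not reachable.
    unreachable = [
        member.get("peer_address", "")
        for namespace in status.get("namespaces", [])
        for member in namespace.get("members", [])
        if not member.get("reachable", False)
    ]

    return {
        # Per the examples in this article, a non-empty cluster_id means DKVS is running.
        "dkvs_running": bool(status.get("cluster_id")),
        # Assumption: these fields are a reasonable hint that the cluster is in an error state.
        "in_alarm": bool(status.get("is_in_alarm", False)),
        "alarm_cause": status.get("alarm_cause", ""),
        "unreachable_members": unreachable,
    }


# Sample input copied from the "DKVS Not Running" example, annotation removed.
STANDALONE = (
    '{"state": "standalone", "cluster_id": "", "is_in_alarm": false, '
    '"alarm_cause": "", "is_in_cluster": false, "members": {"available": false}}'
)

print(summarize_dkvs_status(STANDALONE))
```

To use it, capture the command output from each host to a file or variable and pass the text to summarize_dkvs_status. Hosts where "dkvs_running" is true are the ones the workarounds in this article apply to.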

If DKVS is enabled and running, use one of the following three workaround options.

Workaround Options

  1. Disable DKVS in vCenter.
    • SSH to the vCenter Server as root
    • Disable DKVS

      /usr/lib/vmware-vpx/py/xmlcfg.py -f /etc/vmware-vpx/vpxd.cfg set vpxd/clusterStore/globalDisable true


    • Restart the vpxd service

      vmon-cli -r vpxd

    • After disabling DKVS on vCenter, it may be necessary to clear the DKVS settings on the ESXi hosts
      • SSH to each ESXi host in the cluster as root
      • Stop the clusterAgent service

        /etc/init.d/clusterAgent stop
         
      • Remove the clusterAgent data file

        configstorecli files datafile delete -c esx -k cluster_agent_data

      • Remove the clusterAgent data directory

        configstorecli files datadir delete -c esx -k cluster_agent_data

      • Restart the vpxd service (vmon-cli is a vCenter Server command, so run this on the vCenter Server, not on the ESXi host)

        vmon-cli -r vpxd

  2. Add the ESXi hosts to vCenter using their IP addresses instead of their FQDNs.

  3. Add mappings from the ESXi hosts' FQDNs to their IP addresses in /etc/hosts on the ESXi hosts. This must be done on each ESXi host in the cluster.

Note: The DKVS service is used when vCenter is restored from backup. It becomes the source of truth if the vCenter backup differs from the host inventory configuration (host membership in a cluster, credentials, DVS state). If this service is disabled, recovering the vCenter Server may cause host disconnects, and the affected hosts would need to be reconnected to re-sync host data and configuration.
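For workaround option 3, the same set of entries must be present on every host, so it can help to generate the lines once and reuse them. The following is a minimal sketch under stated assumptions: the host names and addresses are hypothetical placeholders (example.com and 192.0.2.x are reserved documentation values), and the helper name hosts_entries is illustrative only. It prints the lines and does not edit /etc/hosts itself.

```python
import ipaddress


def hosts_entries(mapping):
    """Build /etc/hosts lines (IP, FQDN, short name) from an FQDN -> IP mapping."""
    lines = []
    for fqdn, ip in sorted(mapping.items()):
        # Raises ValueError if the address is malformed, before any line is emitted for it.
        ipaddress.ip_address(ip)
        short = fqdn.split(".", 1)[0]
        if short != fqdn:
            lines.append(f"{ip}\t{fqdn}\t{short}")
        else:
            lines.append(f"{ip}\t{fqdn}")
    return lines


# Hypothetical cluster members; replace with the real FQDNs and IP addresses.
EXAMPLE = {
    "esx02.example.com": "192.0.2.12",
    "esx01.example.com": "192.0.2.11",
}

for line in hosts_entries(EXAMPLE):
    print(line)
```

Review the generated lines before appending them to /etc/hosts on each host. If a host's IP address changes later, the entries must be updated on every host in the cluster.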