"Admission failure in path: host/vim/vmvisor/etcd:etcd", frequent Etcd crash on the ESXi host causing too many DNS queries

Article ID: 387913

Updated On: 04-11-2025

Products

VMware vSphere ESXi 8.0

Issue/Introduction

  • A massive influx of DNS queries from one or more ESXi hosts is overloading the DNS server.
  • On the problematic host, the etcd service crashes as soon as it starts.
  • In the clusterAgent log of the problematic host, we see the error "connection reset by peer":

    /var/run/log/clusterAgent.log

     No(5) clusterAgent[3931896]: INFO  Etcd client started watch       {"opID": "kvwatch-tlspeertrust", "cli": "0xc0001e81a0", "key": "root/tlspeertrust"}
     No(5) clusterAgent[3931896]: INFO  Etcd client started watch       {"opID": "kvwatch-votingmembersupdated", "cli": "0xc0001e81a0", "key": "root/votingmembersupdated"}
     No(5) clusterAgent[3931896]: WARN  grpc: addrConn.createTransport failed to connect to {ESXi-FQDN:2379  <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: read tcp ESXi-FQDN-IP:28383->ESXi-FQDN-IP:2379: read: connection reset by peer". Reconnecting...
     No(5) clusterAgent[3931896]: WARN  grpc: addrConn.createTransport failed to connect to {ESXi-FQDN:2379  <nil> 0 <nil>}. Err :connection error: desc = "transport: authentication handshake failed: read tcp ESXi-FQDN-IP:36502->ESXi-FQDN-IP:2379: read: connection reset by peer". Reconnecting...

  • In watchdog.log, we see that etcdmain keeps restarting:


    /var/run/log/watchdog.log

watchdog[XXXXXXX]: Started etcdmain with PID=3931911
watchdog[XXXXXXX]: Restarting etcdmain
watchdog[XXXXXXX]: Started etcdmain with PID=3931929
watchdog[XXXXXXX]: Restarting etcdmain
watchdog[XXXXXXX]: Started etcdmain with PID=3931943
watchdog[XXXXXXX]: Restarting etcdmain
watchdog[XXXXXXX]: Started etcdmain with PID=3931969
watchdog[XXXXXXX]: Restarting etcdmain
watchdog[XXXXXXX]: Started etcdmain with PID=3931985
watchdog[XXXXXXX]: Restarting etcdmain
watchdog[XXXXXXX]: Started etcdmain with PID=3931999

  • We see the following entries in the etcd log:

    /var/run/log/etcd.log

In(6) etcd[XXXXXXX]: added member 9e8cfbf3dbf0e555 [https://ESXi-FQDN:2380] to cluster 2fbf9a482d65ed67
In(6) etcd[XXXXXXX]: starting peer 9e8cfbf3dbf0e555...
In(6) etcd[XXXXXXX]: started HTTP pipelining with peer 9e8cfbf3dbf0e555
In(6) etcd[XXXXXXX]: started streaming with peer 9e8cfbf3dbf0e555 (writer)
In(6) etcd[XXXXXXX]: removed member 9e8cfbf3dbf0e555 from cluster 2fbf9a482d65ed67
In(6) etcd[XXXXXXX]: stopping peer 9e8cfbf3dbf0e555...
In(6) etcd[XXXXXXX]: stopped streaming with peer 9e8cfbf3dbf0e555 (writer)
In(6) etcd[XXXXXXX]: stopped streaming with peer 9e8cfbf3dbf0e555 (writer)
In(6) etcd[XXXXXXX]: started streaming with peer 9e8cfbf3dbf0e555 (stream MsgApp v2 reader)
In(6) etcd[XXXXXXX]: stopped HTTP pipelining with peer 9e8cfbf3dbf0e555
In(6) etcd[XXXXXXX]: stopped streaming with peer 9e8cfbf3dbf0e555 (stream MsgApp v2 reader)
In(6) etcd[XXXXXXX]: started streaming with peer 9e8cfbf3dbf0e555 (stream Message reader)
In(6) etcd[XXXXXXX]: stopped streaming with peer 9e8cfbf3dbf0e555 (stream Message reader)
In(6) etcd[XXXXXXX]: stopped peer 9e8cfbf3dbf0e555
In(6) etcd[XXXXXXX]: removed peer 9e8cfbf3dbf0e555

  • At the same time, we see "Admission failure in path: host/vim/vmvisor/etcd:etcd" in vmkernel.log:

    /var/run/log/vmkernel.log

vmkernel: cpu12:3932082)Admission failure in path: host/vim/vmvisor/etcd:etcd.3932076:uw.3932076
vmkernel: cpu12:3932082)UserWorld 'etcd' 3932076 with cmdline '/usr/lib/vmware/etcd/bin/etcd --config-file=/var/cache/datafiles/esx#cluster_agent_data/etcd.yml', parent 2097917
vmkernel: cpu12:3932082)started from 'init' 2097917 with cmdline '/bin/init', parent 0
vmkernel: cpu12:3932082)uw.3932076 (10380427) requires 4096 KB, asked 4096 KB from etcd (6977) which has 193788 KB occupied and 2820 KB available.
vmkernel: cpu84:3932095)Admission failure in path: host/vim/vmvisor/etcd:etcd.3932093:uw.3932093
vmkernel: cpu84:3932095)UserWorld 'etcd' 3932093 with cmdline '/usr/lib/vmware/etcd/bin/etcd --config-file=/var/cache/datafiles/esx#cluster_agent_data/etcd.yml', parent 2097917
vmkernel: cpu84:3932095)started from 'init' 2097917 with cmdline '/bin/init', parent 0
vmkernel: cpu84:3932095)uw.3932093 (10380454) requires 4096 KB, asked 4096 KB from etcd (6977) which has 192872 KB occupied and 3736 KB available

  • The cluster status shows "reachable": false for the problematic ESXi host (ESXi3 in this example):

/usr/lib/vmware/clusterAgent/bin/clusterAdmin cluster status

"state": "hosted"
"cluster_id": "ebbbcf4f-8eae-4fe8-85e8-d197a4ffe1c7: domain-c952432",
"is_in_alarm": false,
"alarm_cause": "",
"is_in_cluster": true,
"members": {
"available": true
},
"namespaces": [
{
"name": "root",
"up_to_date": true,
"members": [
"peer_address": "ESXi1:2380",
"api_address": "ESXi1:2379",
"reachable": true,
"primary": "yes",
"learner": false
},
{
"peer_address": "ESXi2:2380",
"api_address": "ESXi2:2379",
"reachable": true,
"primary": "no",
"learner": false
},
{
"peer_address": "ESXi3:2388",
"api_address": "ESXi3:2379",
"reachable": false,
"primary": "unknown",
"learner": false
}
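
When several hosts need to be checked, the status output can be parsed instead of read by eye. The following is a minimal Python sketch, not a supported tool. It assumes the complete output of the clusterAdmin command is valid JSON with the "namespaces", "members" and "reachable" layout seen in the abbreviated excerpt above. The helper name unreachable_members and the sample text are illustrative only.

```python
import json


def unreachable_members(status_text):
    """Return (namespace, peer_address) pairs whose "reachable" flag is false.

    Assumption: status_text is the full JSON output of
    `clusterAdmin cluster status`, shaped like the excerpt in this article.
    """
    status = json.loads(status_text)
    result = []
    for namespace in status.get("namespaces", []):
        for member in namespace.get("members", []):
            # Treat a missing flag as unreachable so it gets looked at.
            if not member.get("reachable", False):
                result.append((namespace.get("name"), member.get("peer_address")))
    return result


# Illustrative input modelled on the excerpt above (not real host output).
sample_status = """
{
  "state": "hosted",
  "is_in_cluster": true,
  "namespaces": [
    {
      "name": "root",
      "up_to_date": true,
      "members": [
        {"peer_address": "ESXi1:2380", "api_address": "ESXi1:2379", "reachable": true},
        {"peer_address": "ESXi2:2380", "api_address": "ESXi2:2379", "reachable": true},
        {"peer_address": "ESXi3:2388", "api_address": "ESXi3:2379", "reachable": false}
      ]
    }
  ]
}
"""

for namespace_name, peer in unreachable_members(sample_status):
    print(f"namespace {namespace_name}: {peer} is not reachable")
```

Any host reported by this check is a candidate for the log review described in the symptoms above.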

 

Cause

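The etcd service runs out of memory and keeps crashing. As the vmkernel.log entries show, the etcd resource pool has less than 4096 KB available, so each 4096 KB request fails admission. The watchdog then restarts the service and the cycle repeats, which produces the flood of DNS queries.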
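
Both halves of this cause can be confirmed from the logs quoted in the symptoms. The following is a minimal Python sketch, assuming the vmkernel.log and watchdog.log lines have the same wording as the excerpts in this article. The function names parse_admission_failures and count_restarts are hypothetical, and the sample strings are copied from the excerpts.

```python
import re

# Memory line that vmkernel.log prints after "Admission failure in path: ...".
ADMISSION_RE = re.compile(
    r"requires \d+ KB, asked (?P<asked>\d+) KB from (?P<pool>\S+) "
    r"\(\d+\) which has (?P<occupied>\d+) KB occupied "
    r"and (?P<available>\d+) KB available"
)


def parse_admission_failures(vmkernel_text):
    """Return one dict per admission-failure memory line in the log text."""
    failures = []
    for match in ADMISSION_RE.finditer(vmkernel_text):
        entry = {
            "pool": match.group("pool"),
            "asked_kb": int(match.group("asked")),
            "occupied_kb": int(match.group("occupied")),
            "available_kb": int(match.group("available")),
        }
        # A positive shortfall means the request cannot be admitted.
        entry["shortfall_kb"] = entry["asked_kb"] - entry["available_kb"]
        failures.append(entry)
    return failures


def count_restarts(watchdog_text, service="etcdmain"):
    """Count "Restarting <service>" lines in watchdog.log text."""
    marker = "Restarting " + service
    return sum(
        1 for line in watchdog_text.splitlines() if line.rstrip().endswith(marker)
    )


vmkernel_sample = (
    "vmkernel: cpu12:3932082)uw.3932076 (10380427) requires 4096 KB, asked 4096 KB "
    "from etcd (6977) which has 193788 KB occupied and 2820 KB available.\n"
    "vmkernel: cpu84:3932095)uw.3932093 (10380454) requires 4096 KB, asked 4096 KB "
    "from etcd (6977) which has 192872 KB occupied and 3736 KB available\n"
)
watchdog_sample = (
    "watchdog[XXXXXXX]: Started etcdmain with PID=3931911\n"
    "watchdog[XXXXXXX]: Restarting etcdmain\n"
    "watchdog[XXXXXXX]: Started etcdmain with PID=3931929\n"
    "watchdog[XXXXXXX]: Restarting etcdmain\n"
)

for failure in parse_admission_failures(vmkernel_sample):
    print(failure)
print("etcdmain restarts:", count_restarts(watchdog_sample))
```

A positive shortfall_kb on every entry, together with a steadily growing restart count, matches the behaviour described in this article.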

Resolution

This issue is addressed in vSphere 8.0 U3e.
Update VMware vCenter and VMware vSphere ESXi to 8.0 U3e to resolve this issue.