"The file table of the ramdisk 'tmp' is full. As a result, the file /tmp/Go.[file_name] could not be created by the application 'etcd'" ####-##-##T##:##:##.882Z In(182) vmkernel: cpu##:9#####8)Admission failure in path: host/vim/vmvisor/etcd:etcd.9#####7:uw.9#####7
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: connect: connection refused". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: connect: connection refused". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: operation was canceled". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: operation was canceled". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: operation was canceled". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: connect: connection refused". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: ####-##-##T##:##:##.###Z WARN clientv3/retry_interceptor.go:62 retrying of unary invoker failed {"target": "endpoint://client-#####-####-###-####-########/hostfqdn:2379", "attempt": 0, "error": "rpc error: code = Unauthenticated desc = etcdserver: invalid auth token"}
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: connect: connection refused". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: ####-##-##T##:##:##.###Z WARN clientv3/retry_interceptor.go:62 retrying of unary invoker failed {"target": "endpoint://client-#####-####-###-####-########/hostfqdn:2379", "attempt": 0, "error": "rpc error: code = InvalidArgument desc = etcdserver: authentication failed, invalid user ID or password"}
####-##-##T##:##:## No(5) clusterAgent[#####]: WARN grpc: addrConn.createTransport failed to connect to {hostfqdn:2379 <nil> 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp hostip:2379: connect: connection refused". Reconnecting...
####-##-##T##:##:## No(5) clusterAgent[#####]: ####-##-##T##:##:##.###Z WARN clientv3/retry_interceptor.go:62 retrying of unary invoker failed {"target": "endpoint://client-#####-####-###-####-########/hostfqdn:2379", "attempt": 0, "error": "rpc error: code = InvalidArgument desc = etcdserver: authentication failed, invalid user ID or password"}
Er(3) etcd[######]: failed to find member ############## in cluster ##############
Er(3) etcd[######]: failed to find member ############## in cluster ##############
etcd[######]: peer ############## became inactive (message send to peer failed)
etcd[######]: failed to dial ############## on stream MsgApp v2 (peer ############## failed to find local node ##############)
etcd[######]: failed to dial ############## on stream Message (peer ############## failed to find local node ##############)
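The first symptom line indicates that the 'tmp' ramdisk has exhausted its file table, which is why etcd could not create /tmp/Go.[file_name]. As a diagnostic sketch (not a step mandated by this article), ramdisk usage on the host can be inspected with standard ESXi commands:

# Show usage of the host's ramdisks, including /tmp
vdf -h
# List the configured ramdisks with their reserved and used space
esxcli system visorfs ramdisk list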
vSphere ESXi 8.x
When a DKVS (Distributed Key-Value Store) cluster is in an error state, it is known to generate heavy DNS traffic, because the replica hosts constantly retry their connections to one another.
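This retry traffic can be observed directly on an affected host. A minimal sketch, assuming vmk0 is the management vmkernel interface (adjust the interface name for your environment):

# Capture the DNS lookups generated by the reconnect attempts
tcpdump-uw -i vmk0 -nn port 53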
This issue has been resolved in ESXi 8.0 Update 3g, Build 24859861.
To check the DKVS status, run the following command on the ESXi host:

[root@ESXi:/] /usr/lib/vmware/clusterAgent/bin/clusterAdmin cluster status
{
"state": "hosted",
If DKVS is enabled and running, below are three workaround options to resolve this issue. (A combined sketch showing where each command runs appears after Option 3.)

Option 1: On the vCenter Server, disable DKVS in the vpxd configuration and restart the vpxd service:

/usr/lib/vmware-vpx/py/xmlcfg.py -f /etc/vmware-vpx/vpxd.cfg set vpxd/clusterStore/globalDisable true
vmon-cli -r vpxd

Option 2: On each affected ESXi host, stop the clusterAgent service and delete the cluster agent data:

/etc/init.d/clusterAgent stop
configstorecli files datafile delete -c esx -k cluster_agent_data
configstorecli files datadir delete -c esx -k cluster_agent_data
Then restart the vpxd service on the vCenter Server:

vmon-cli -r vpxd

Option 3: Disable DKVS on the vCenter Server to which the affected ESXi hosts are connected, using the attached Python script.
The Python script is attached to this article. Run the following command on the vCenter Server to execute it:
python3 dkvs-cleanup.py -d disable -w all-soft -s restart
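For reference, here is a combined sketch of the manual workaround (Options 1 and 2) with the placement of each step called out. The split between vCenter Server and ESXi follows where the tools live (xmlcfg.py and vmon-cli on vCenter; clusterAgent and configstorecli on ESXi); the ordering is an assumption based on the sequence in which the commands are listed above:

# On the vCenter Server: disable DKVS in vpxd.cfg and restart vpxd
/usr/lib/vmware-vpx/py/xmlcfg.py -f /etc/vmware-vpx/vpxd.cfg set vpxd/clusterStore/globalDisable true
vmon-cli -r vpxd

# On each affected ESXi host: stop the cluster agent and remove its data
/etc/init.d/clusterAgent stop
configstorecli files datafile delete -c esx -k cluster_agent_data
configstorecli files datadir delete -c esx -k cluster_agent_data

# Back on the vCenter Server: restart vpxd once the hosts are cleaned up
vmon-cli -r vpxd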
To verify whether DKVS is disabled, run the following command on the vCenter Server:

/usr/lib/vmware-vpx/py/xmlcfg.py -f /etc/vmware-vpx/vpxd.cfg get vpxd/clusterStore/globalDisable
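Assuming the disable step above succeeded, this get command is expected to print the value that was set, i.e. true (expected output inferred from the set command, not captured from an affected environment).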
Recommendations:
Once the affected hosts are running a build that contains the fix (ESXi 8.0 Update 3g, build 24859861, or later), re-enable DKVS using the attached Python script:

python3 dkvs-cleanup.py -d enable -w actions-soft -s restart
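To confirm that DKVS is active again, the commands already shown in this article can be reused; the expected values noted in the comments are assumptions, not output captured from a live environment:

# On the vCenter Server: the flag should now read false
/usr/lib/vmware-vpx/py/xmlcfg.py -f /etc/vmware-vpx/vpxd.cfg get vpxd/clusterStore/globalDisable

# On each ESXi host: the cluster should report a healthy state
/usr/lib/vmware/clusterAgent/bin/clusterAdmin cluster status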