An ESXi host goes into the "Not Responding" state in vCenter. After the services on the host are restarted, it stays connected to vCenter for only a couple of minutes. The vCenter uses vSphere Replication.
In envoy-access.log:
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 593 455 0 0 0 ###.##.###.###:46042 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMSINT-39083032" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:45404 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:45400 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:45324 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:33096 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:45942 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 809 399 1 0 0 ###.##.###.###:46042 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMSINT-39095330" "WaitForUpdatesEx"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 1 0 0 ###.##.###.###:45416 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:33154 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:33078 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 1 0 0 ###.##.###.###:37984 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:34456 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:45952 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy-access[2098898]: POST /hbr HTTP/1.1 200 via_upstream - 586 455 0 0 0 ###.##.###.###:32994 TLSv1.2 ###.##.###.###:443 - - /var/run/vmware/proxy-hbr "########-#####-####-####-##########-HMS-PING" "Fetch"
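To gauge how often these ping requests reach the host, the matching lines can be counted. The following is a minimal sketch, not an official tool: the function name count_hms_pings is hypothetical, and the log file location is passed in as an argument because it can differ between ESXi builds.

```shell
# Count the lines in an envoy access log that contain an HMS-PING request.
# $1 is the path to the log file (location is an assumption; adjust as needed).
count_hms_pings() {
    # grep -c prints the number of matching lines, not the number of matches
    grep -c 'HMS-PING' "$1"
}
```

Example call, assuming the log was copied to the current directory: count_hms_pings ./envoy-access.log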
In envoy.log:
[YYYY-MM-DDTHH:MM:SS] In(166) envoy[2098898]: "[YYYY-MM-DDTHH:MM:SS] warning envoy[2099212] [Originator@6876 sub=filter] [C6126] remote https connections exceed max allowed: 128"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy[2098898]: "[YYYY-MM-DDTHH:MM:SS] warning envoy[2099212] [Originator@6876 sub=filter] [C6126] closing connection TCP<###.##.###.###:49516, ###.##.###.###:443>"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy[2098898]: "[YYYY-MM-DDTHH:MM:SS] warning envoy[2099212] [Originator@6876 sub=filter] [C6127] remote https connections exceed max allowed: 128"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy[2098898]: "[YYYY-MM-DDTHH:MM:SS] warning envoy[2099212] [Originator@6876 sub=filter] [C6127] closing connection TCP<###.##.###.###:49520, ###.##.###.###:443>"
[YYYY-MM-DDTHH:MM:SS] In(166) envoy[2098898]: "[YYYY-MM-DDTHH:MM:SS] warning envoy[2099213] [Originator@6876 sub=filter] [C6128] remote https connections exceed max allowed: 128"
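A quick way to confirm that a host is hitting this limit is to count the warnings shown above. This is a minimal sketch under the same assumptions as before: the function name count_conn_limit_warnings is hypothetical, and the path to envoy.log is supplied by the caller.

```shell
# Count the envoy.log lines that report the remote HTTPS connection limit
# being exceeded. $1 is the path to the log file (supplied by the caller).
count_conn_limit_warnings() {
    # The search string is copied from the warning text in the log excerpt
    grep -c 'remote https connections exceed max allowed' "$1"
}
```

Example call, assuming the log was copied to the current directory: count_conn_limit_warnings ./envoy.log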
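This happens because the host keeps trying to connect to the vSphere Replication appliance, and each attempt eventually times out after the configured session timeout. As the envoy.log entries show, the remote HTTPS connections on the host exceed the maximum allowed (128), so envoy closes further connections.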
Recommended action plan:
1. SSH to the vSphere Replication appliance. Edit /opt/vmware/hms/conf/hms-configuration.xml and change scale-out-mode to false.
2. Restart the hms service: systemctl restart hms
3. Open https://<VR_Addr>:8043/mob?moid=replica-manager&vmodl=1 and invoke the method cleanupHbrsrvuwsPersistence.
After that, hms will no longer ping the hbrsrvuw on the ESXi host.
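The file edit in step 1 can be made a little safer by backing up the configuration first and checking the setting before and after the change. The helpers below are a sketch only: the function names backup_conf and show_scale_out_mode are hypothetical, and only the file path is taken from step 1. The exact form of the scale-out-mode entry inside the file is not assumed; the helper just prints every line that mentions it.

```shell
# Path taken from step 1 of the action plan.
HMS_CONF=/opt/vmware/hms/conf/hms-configuration.xml

# Keep a timestamped copy of the file before editing it by hand.
backup_conf() {
    cp -p "$1" "$1.bak.$(date +%Y%m%d%H%M%S)"
}

# Print every line that mentions scale-out-mode, with line numbers,
# so the value can be checked before and after the edit.
show_scale_out_mode() {
    grep -n 'scale-out-mode' "$1"
}
```

On the appliance, run backup_conf "$HMS_CONF" and show_scale_out_mode "$HMS_CONF", make the edit, run show_scale_out_mode "$HMS_CONF" again to confirm the value is false, and then continue with step 2.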