/var/log/vmware/sso/websso.log:
YYYY-MM-DDTHH:MM:SS INFO websso[73:tomcat-http--27] [CorId=########-####-####-####-#########] [com.vmware.identity.samlservice.impl.ExternalIdpProvider] Got exception (sleeping before retry)
com.vmware.vapi.client.exception.TransportProtocolException: HTTP response with status code 503 (enable debug logging for details): envoy overloaded
	at com.vmware.vapi.internal.protocol.client.rpc.http.ApacheHttpUtil.validateHttpResponse(ApacheHttpUtil.java:101) ~[vapi-runtime-2.100.0.jar:?]
/var/log/vmware/vapi/endpoint/endpoint.log:
YYYY-MM-DDTHH:MM:SS| WARN | vAPI-I/O dispatcher-0 | ApiMethodSession | Error was thrown while running close session: com.vmware.vapi.endpoint.vapi.ApiMethodSession$######com.vmware.vapi.client.exception.TransportProtocolException: HTTP response with status code 503 (enable debug logging for details): envoy overloaded
Count the 503 overload responses recorded in the envoy-sidecar access logs:

zgrep "503 overload" /var/log/vmware/envoy-sidecar/envoy-access-* | wc -l

On vCenter 8.0U3, check whether the envoy-sidecar heap size has reached roughly 98% of the default 1 GiB heap limit (the command prints 1 if it has, 0 otherwise):

zgrep 'envoy_server_memory_heap_size{}' /var/cache/vmware-rhttpproxy/envoy-sidecar-stats/* | cut -d ' ' -f2 | sort -n | uniq | tail -1 | awk '{print ($1 >= 1052266987)}'
On vCenter 9.0, check whether the fixed-heap pressure reported by the overload manager has reached 98% (the command prints 1 if it has, 0 otherwise):

zgrep envoy_overload_envoy_resource_monitors_fixed_heap_pressure /var/log/vmware/vstats/metrics/ENVOY_SIDECAR* | grep -v "# TYPE" | cut -d ' ' -f2 | sort -n | uniq | tail -1 | awk '{print ($1 >= 98)}'
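On vCenter 8.x, the observed heap size can also be compared directly against the configured ceiling. The following is a minimal sketch, assuming the stats cache location used above and that max_heap_size_bytes lives in /etc/vmware-envoy-sidecar/config.yaml (as in the workaround below); adjust the paths if your deployment differs.

#!/bin/bash
# Sketch: report how close the envoy-sidecar heap has come to its configured limit.

# Largest heap-size sample (bytes) recorded in the vCenter 8.x stats cache.
heap=$(zgrep 'envoy_server_memory_heap_size{}' /var/cache/vmware-rhttpproxy/envoy-sidecar-stats/* \
         | cut -d ' ' -f2 | sort -n | tail -1)

# Configured heap ceiling (bytes) from the sidecar configuration file.
limit=$(grep -o 'max_heap_size_bytes: [0-9]*' /etc/vmware-envoy-sidecar/config.yaml \
          | head -1 | awk '{print $2}')

echo "observed heap: ${heap} bytes, configured limit: ${limit} bytes"
awk -v h="$heap" -v l="$limit" 'BEGIN { printf "heap peaked at %.1f%% of the limit\n", h / l * 100 }'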
vCenter Server 8.x
vCenter Server 9.x
Memory exhaustion in the envoy-sidecar causes vCenter internal workloads to fail, triggering 503 service errors.
Workaround:
Back up the envoy-sidecar configuration, then raise the heap limit from the default 1 GiB (1073741824 bytes) to 2 GiB and restart the service:

# cp /etc/vmware-envoy-sidecar/config.yaml /etc/vmware-envoy-sidecar/config.yaml.back
# sed -i 's/max_heap_size_bytes: 1073741824/max_heap_size_bytes: 2147483648/g' /etc/vmware-envoy-sidecar/config.yaml
# service-control --restart envoy-sidecar

If memory exhaustion recurs with the 2 GiB limit, raise it to 4 GiB and restart the service again:

# sed -i 's/max_heap_size_bytes: 2147483648/max_heap_size_bytes: 4294967296/g' /etc/vmware-envoy-sidecar/config.yaml
# service-control --restart envoy-sidecar
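After either sed command, confirm the new limit was actually written; sed silently makes no substitution if the existing value does not exactly match the pattern. A quick check, assuming the default config path used above:

# grep max_heap_size_bytes /etc/vmware-envoy-sidecar/config.yaml

The value should read 2147483648 after the first change and 4294967296 after the second.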
In rare scenarios, 4 GB of heap may still be exhausted. In such cases, we recommend completely removing the following two overload actions from the configuration:
vi /etc/vmware-envoy-sidecar/config.yaml
    - name: "envoy.overload_actions.stop_accepting_requests"
      triggers:
        - name: "envoy.resource_monitors.global_downstream_max_connections"
          threshold:
            value: 0.99
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.98
    - name: "envoy.overload_actions.reject_incoming_connections"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 1.00
After the two actions above are removed, the updated overload_manager section of the YAML file should look like this:
overload_manager:
  refresh_interval: 1s
  resource_monitors:
    - name: "envoy.resource_monitors.global_downstream_max_connections"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.resource_monitors.downstream_connections.v3.DownstreamConnectionsConfig
        max_active_downstream_connections: 8000
    - name: "envoy.resource_monitors.fixed_heap"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.resource_monitors.fixed_heap.v3.FixedHeapConfig
        max_heap_size_bytes: 4294967296 # 4GB
  actions:
    - name: "envoy.overload_actions.shrink_heap"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.75
    - name: "envoy.overload_actions.disable_http_keepalive"
      triggers:
        - name: "envoy.resource_monitors.global_downstream_max_connections"
          threshold:
            value: 0.8
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.95
    - name: "envoy.overload_actions.reduce_timeouts"
      triggers:
        - name: "envoy.resource_monitors.global_downstream_max_connections"
          scaled:
            scaling_threshold: 0.25
            saturation_threshold: 0.97
        - name: "envoy.resource_monitors.fixed_heap"
          scaled:
            scaling_threshold: 0.85
            saturation_threshold: 0.97
      typed_config:
        "@type": type.googleapis.com/envoy.config.overload.v3.ScaleTimersOverloadActionConfig
        timer_scale_factors:
          - timer: HTTP_DOWNSTREAM_CONNECTION_IDLE
            min_timeout: 2s
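Before restarting, it can help to verify that the edited file still parses as YAML, so an indentation mistake does not prevent the sidecar from starting. This is only a syntax check, not an Envoy configuration validation, and it assumes python3 with the PyYAML module is available on the appliance:

python3 -c "import yaml; yaml.safe_load(open('/etc/vmware-envoy-sidecar/config.yaml')); print('YAML OK')"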
Save the file and restart the envoy-sidecar service:

# service-control --restart envoy-sidecar
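Once the service is back up, confirm that it started cleanly and watch for new overload responses. The count from the symptoms section also includes hits in older rotated logs, so only an increase after the change indicates the problem persists:

# service-control --status envoy-sidecar
# zgrep "503 overload" /var/log/vmware/envoy-sidecar/envoy-access-* | wc -l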