The following entries appear in the /services-logs/prelude/Postgres logs:
[DEBUG] connection_ping(): result is PGRES_TUPLES_OK
[DEBUG] is_server_available(): ping status for "host=postgres-0.postgres.domainname dbname=repmgr-db user=repmgr-db passfile=/run/repmgr-db.cred connect_timeout=10keepalives=1" is PQPING_NO_RESPONSE
va#####.srv.axxxxxxxz etcd[######3]: failed to send out heartbeat on time (exceeded the 100ms timeout for 1.940391769s, to cf5f60d36079f947)
server is likely overloaded
I0408 03:13:08.363531 2418766 trace.go:236] Trace[408016628]: "Get" accept:application/vnd.kubernetes.protobuf,application/json,audit-id:8cd3ab6b-19ef-4583-b631-6deb2dc2ba9b,client:127.0.0.1,api-group:,api-version:v1,name:ccs-gateway-tracing-config,subresource:,namespace:prelude,protocol:HTTP/2.0,resource:configmaps,scope:resource,url:/api/v1/namespaces/prelude/configmaps/ccs-gateway-tracing-config,user-agent:kubelet/v1.30.2+vmware.3 (linux/amd64) kubernetes/7f6a4bf,verb:GET (08-Apr-2025 03:13:07.260) (total time: 1081ms):
Trace[408016628]: ---"Writing http response done" 1081ms (03:13:08.341)
VMware Aria Automation 8.18.x
The node exceeded 90% memory usage (make: *** [/opt/health/Makefile:73: memory-usage] Error 1), which degraded overall performance and delayed responses from the PostgreSQL pods. The current master PostgreSQL pod was presumably unable to respond to ping requests within the 10-second timeout because of memory exhaustion, which triggered an automatic failover.
To resolve the issue we have the following options:
1. Option 1: Increase the physical RAM by the amount of memory that the custom profile adds.
Current memory usage:
       total  used  free   shared  buff/cache  available
Mem:   94Gi   80Gi  2.1Gi  12Gi    24Gi        13Gi
Swap:  0B     0B    0B
The standard XL profile allocates 15 GB in total to vRO and the Polyglot runner, whereas the customer's custom profile allocates 40 GB in total to vRO and the Polyglot runner. The recommendation is therefore to increase the physical memory of every VA to 94 + 40 = 134 GB.
2. Option 2: Move to external vRO.
NOTE: Custom profile customization is supported only for external vRO, not for embedded vRO, as stated in: How to Scale the Heap Memory Size of the Automation Orchestrator Server.
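
The sizing arithmetic in Option 1 can be reproduced with a short script. This is only an illustrative sketch: the function names (to_gib, parse_mem_line, target_ram_gib) are hypothetical, the Mem: line is assumed to follow the usual six-column layout (total, used, free, shared, buff/cache, available), and Gi is treated as equivalent to GB, as in the calculation above.

```python
import re

# Conversion factors to GiB for the unit suffixes seen in the memory output.
UNITS_TO_GIB = {"B": 1 / 1024**3, "Ki": 1 / 1024**2, "Mi": 1 / 1024, "Gi": 1.0, "Ti": 1024.0}

# Assumed column order of the "Mem:" line.
MEM_COLUMNS = ("total", "used", "free", "shared", "buff_cache", "available")


def to_gib(token: str) -> float:
    """Convert a human-readable size such as '94Gi' or '0B' to GiB."""
    match = re.fullmatch(r"([0-9.]+)(B|Ki|Mi|Gi|Ti)", token)
    if match is None:
        raise ValueError(f"unrecognized size: {token!r}")
    return float(match.group(1)) * UNITS_TO_GIB[match.group(2)]


def parse_mem_line(line: str) -> dict:
    """Parse a 'Mem:' line into a dict of column name -> size in GiB."""
    fields = line.split()
    if len(fields) != 7 or fields[0] != "Mem:":
        raise ValueError("expected 'Mem:' followed by six size columns")
    return {name: to_gib(tok) for name, tok in zip(MEM_COLUMNS, fields[1:])}


def target_ram_gib(current_total_gib: float, custom_profile_gib: float) -> float:
    """Option 1 sizing: current RAM plus the memory the custom profile allocates."""
    return current_total_gib + custom_profile_gib


mem = parse_mem_line("Mem: 94Gi 80Gi 2.1Gi 12Gi 24Gi 13Gi")
target = target_ram_gib(mem["total"], 40)  # 40 = custom profile total for vRO and Polyglot runner
print(f"Target RAM per appliance: {target:.0f} GB")  # prints "Target RAM per appliance: 134 GB"
```

The same calculation applies to each VA in the cluster, because the recommendation is to resize all of them.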