In a VMware Aria Automation Orchestrator (formerly vRealize Orchestrator) cluster, one or more nodes may become unavailable. Upon investigation, the postgres pod on the affected node fails to enter a Running state. Checking the pod status from the CLI shows that pod in an Error state with repeated restarts:
kubectl get pods -n prelude
NAME         READY   STATUS    RESTARTS   AGE
postgres-0   1/1     Running   0          108s
postgres-1   1/1     Running   0          108s
postgres-2   0/1     Error     4          108s
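To pick out the failing pod and look at its most recent crash log, the status check above can be scripted roughly as follows. This is a minimal sketch: `unhealthy_postgres_pods` is a hypothetical helper name, and the `postgres-` name prefix and `prelude` namespace are taken from the output shown above.

```shell
#!/bin/sh
# Print the names of postgres pods whose STATUS column is not "Running".
# Reads `kubectl get pods` output on stdin. unhealthy_postgres_pods is a
# hypothetical helper name used only for this sketch.
unhealthy_postgres_pods() {
  awk 'NR > 1 && $1 ~ /^postgres-/ && $3 != "Running" { print $1 }'
}

# Only query the cluster when kubectl is available on this host.
if command -v kubectl >/dev/null 2>&1; then
  for pod in $(kubectl get pods -n prelude | unhealthy_postgres_pods); do
    echo "== $pod =="
    # --previous shows the log of the last terminated container instance.
    kubectl logs "$pod" -n prelude --previous --tail=50
  done
fi
```

Configuration parse errors reported by PostgreSQL at startup would normally appear in that log, which helps confirm whether the configuration file is the cause before any recovery action is taken.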
VMware Aria Automation Orchestrator 8.x
The postgresql.conf configuration file located at /data/live/postgresql.conf has become corrupted or contains invalid parameters. This corruption is often linked to storage latency or file system issues caused by maintaining multiple or aged virtual machine snapshots in the vCenter environment.
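A quick heuristic check of the file can help confirm this cause. The sketch below assumes corruption shows up as an empty file or as non-printable bytes. It does not validate PostgreSQL parameters, and it would also flag legitimate non-ASCII characters in comments. `check_pg_conf` is a hypothetical helper name, and the default path is the one given above. Run it wherever that path is accessible.

```shell
#!/bin/sh
# Heuristic corruption check for a PostgreSQL configuration file.
# Assumption: corruption appears as an empty file or as non-printable bytes.
# check_pg_conf is a hypothetical helper, not a VMware or PostgreSQL tool.
check_pg_conf() {
  conf="${1:-/data/live/postgresql.conf}"

  # -s is true only when the file exists and has a size greater than zero.
  if [ ! -s "$conf" ]; then
    echo "missing-or-empty"
    return 1
  fi

  # -a treats binary data as text; LC_ALL=C makes the byte classes predictable.
  if LC_ALL=C grep -a -q '[^[:print:][:space:]]' "$conf"; then
    echo "non-printable-bytes"
    return 1
  fi

  echo "ok"
}
```

Calling `check_pg_conf` with no argument inspects the default path, and passing a path checks a copy of the file instead. A result of `ok` only rules out the two failure modes tested here; an invalid parameter value would still need to be found from the pod log.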
To resolve this issue, you must restore the affected node to a functional state and re-sync it with the cluster:
Once the postgres pod returns to a Running state, delete any old snapshots to prevent future storage latency and potential corruption.
If a valid snapshot is not available, the alternative resolution is to deploy a new node with the same name and IP address and join it to the existing cluster.