Frequent email alerts are sent with the following message: "VMware Cloud Director cell with IP address ###.###.### is successful in reconnecting to the database."
The vmware-vcd-watchdog.log shows that the vmware-vcd-cell service is being restarted:
2025-07-25 09:16:29 | INFO | vmware-vcd-cell running
2025-07-25 09:21:30 | ALERT | vmware-vcd-cell is dead but /var/run/vmware-vcd-cell.pid exists, attempting to restart it
2025-07-25 09:21:40 | INFO | Started vmware-vcd-cell (pid=478962)
2025-07-25 09:21:40 | WARN | Server status returned HTTP/1.1 404
2025-07-25 09:22:40 | WARN | Server status returned HTTP/1.1 503
2025-07-25 09:23:40 | WARN | Server status returned HTTP/1.1 503
2025-07-25 09:24:40 | WARN | Server status returned HTTP/1.1 503
2025-07-25 09:26:41 | INFO | vmware-vcd-cell running
2025-07-25 09:31:41 | INFO | vmware-vcd-cell running
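To see how often the watchdog restarts the cell, the log can be scanned for the ALERT lines shown above. The following is a minimal sketch, assuming the log keeps the `timestamp | LEVEL | message` layout from the excerpt. The function name `find_restarts` is hypothetical, and the log path is passed as an argument because its location may differ between deployments.

```python
import re
import sys
from datetime import datetime

# Matches lines in the layout shown in the excerpt, for example:
# 2025-07-25 09:21:30 | ALERT | vmware-vcd-cell is dead but ... attempting to restart it
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \| (\w+) \| (.*)$")


def find_restarts(lines):
    """Return the timestamps of ALERT lines where the watchdog attempts a restart."""
    restarts = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the expected layout
        timestamp, level, message = match.groups()
        if level == "ALERT" and "attempting to restart" in message:
            restarts.append(datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S"))
    return restarts


if __name__ == "__main__":
    # Usage: python3 find_restarts.py <path to vmware-vcd-watchdog.log>
    with open(sys.argv[1]) as log_file:
        for restart_time in find_restarts(log_file):
            print(restart_time.isoformat(sep=" "))
```

Restart timestamps that line up with the email alerts suggest the alerts are a side effect of the cell being restarted rather than of a database problem.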
Running dmesg on the appliance shows that the kernel out-of-memory (OOM) killer was activated and killed the Cloud Director java process:
[11#######] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/vmware-vcd.service,task=java,pid=41###,uid=1###
[11#######] Out of memory: Killed process 41#### (java) total-vm:16181468kB, anon-rss:5105644kB, file-rss:0kB, shmem-rss:16kB, UID:1003 pgtables:14688kB oom_score_adj:0
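The "Out of memory: Killed process" lines can be extracted from dmesg output to confirm which process was killed and how much memory it held. This is a minimal sketch, assuming the kernel message keeps the format shown above. The function name `parse_oom_kills` is hypothetical, and the script reads dmesg output from standard input.

```python
import re
import sys

# Matches the kernel message format shown above, for example:
# Out of memory: Killed process 12345 (java) total-vm:16181468kB, anon-rss:5105644kB, ...
OOM_RE = re.compile(
    r"Out of memory: Killed process (\d+) \((\S+)\) "
    r"total-vm:(\d+)kB, anon-rss:(\d+)kB"
)

KB_PER_GIB = 1024 ** 2


def parse_oom_kills(dmesg_text):
    """Return one dict per OOM kill found in the given dmesg text."""
    kills = []
    for match in OOM_RE.finditer(dmesg_text):
        pid, process, total_vm_kb, anon_rss_kb = match.groups()
        kills.append({
            "pid": int(pid),
            "process": process,
            "total_vm_gib": int(total_vm_kb) / KB_PER_GIB,
            "anon_rss_gib": int(anon_rss_kb) / KB_PER_GIB,
        })
    return kills


if __name__ == "__main__":
    # Usage: dmesg | python3 parse_oom_kills.py
    for kill in parse_oom_kills(sys.stdin.read()):
        print(f"{kill['process']} (pid {kill['pid']}): "
              f"anon-rss {kill['anon_rss_gib']:.2f} GiB, "
              f"total-vm {kill['total_vm_gib']:.2f} GiB")
```

For the values in the excerpt, anon-rss of 5105644 kB is roughly 4.87 GiB of resident memory held by the java process when it was killed.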
VMware Cloud Director 10.x
The Cloud Director cell consumed memory faster than the appliance could supply it, so the kernel OOM killer terminated the java process. Either the appliance size needs to increase or the number of incoming requests needs to be limited.
Right-size the appliances to extra large or larger according to the sizing guidelines: https://techdocs.broadcom.com/us/en/vmware-cis/cloud-director/vmware-cloud-director/10-6/vmware-cloud-director-installation-and-upgrade-guide-10-6/deployment-uprade-and-administration-of-the-vcd-appliance-install/deployment-and-initial-configuration-of-your-vcd-appliance-install/vcd-appliance-sizing-guidelines-install.html
The procedure for resizing is here: https://knowledge.broadcom.com/external/article/387075/recommended-procedure-for-resizing-cloud.html.
If the appliances are already right-sized, the requests coming into Cloud Director need to be limited. This has to be done outside of VMware Cloud Director, at the load balancer level.
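How request limiting is configured depends entirely on the load balancer in use. As an illustration of the general idea only, the sketch below shows a token bucket, one common way to cap a request rate while still allowing short bursts. It is not load balancer configuration, and the class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow about `rate` requests per second, with bursts of up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum number of stored tokens
        self.clock = clock          # injectable clock, useful for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may pass, False if it should be rejected."""
        now = self.clock()
        # Refill tokens for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A request that is rejected here corresponds to the load balancer refusing or delaying a request instead of forwarding it to the Cloud Director cells.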