The SDDC Manager UI times out with a 504 Gateway Timeout error, and API calls fail to respond.
The output of systemctl status commonsvcs.service -l includes java.lang.OutOfMemoryError entries:
root@<SDDC_MANAGER> [ /var/log/vmware/vcf/operationsmanager ]# systemctl status commonsvcs.service -l
* commonsvcs.service - VMware Cloud Foundation Platform Services
Loaded: loaded (/etc/systemd/system/commonsvcs.service; enabled; vendor preset: enabled)
Active: active (running) since Day YYYY-MM-DDTHH:MM:SS <timezone>; x months ydays ago
Main PID: 1129 (java)
Tasks: 60 (limit: 19194)
Memory: 2.3G
CGroup: /system.slice/commonsvcs.service
`-1129 /etc/alternatives/jre/bin/java -Xmx2048m -XX:MaxPermSize=512m -classpath /opt/vmware/vcf/commonsvcs/conf:/opt/vmware/vcf/commonsvcs/lib/* -Dapp.name=commonsvcs -Dapp.pid=1129 -Dapp.repo=/opt/vmware/vcf/commonsvcs/lib -Dapp.home=/opt/vmware/vcf/commonsvcs -Dbasedir=/opt/vmware/vcf/commonsvcs com.vmware.evo.sddc.commonsvcs.Application
<SDDC_MANAGER> sudo[31473]: pam_unix(sudo:session): session closed for user root
<SDDC_MANAGER> sudo[31488]: vcf_commonsvcs : PWD=/ ; USER=root ; COMMAND=/opt/vmware/vcf/commonsvcs/scripts/nfs/refresh-nfs.sh
<SDDC_MANAGER> sudo[31488]: pam_unix(sudo:session): session opened for user root by (uid=996)
<SDDC_MANAGER> sudo[31488]: pam_unix(sudo:session): session closed for user root
<SDDC_MANAGER> sudo[31571]: vcf_commonsvcs : PWD=/ ; USER=root ; COMMAND=/opt/vmware/vcf/commonsvcs/scripts/auth/set-known_hosts.sh /etc/vmware/vcf/commonsvcs/known_hosts
<SDDC_MANAGER> sudo[31571]: pam_unix(sudo:session): session opened for user root by (uid=996)
<SDDC_MANAGER> sudo[31571]: pam_unix(sudo:session): session closed for user root
<SDDC_MANAGER> commonsvcs[1129]: Exception in thread "VLSI-client-connection-monitor-329132" Exception in thread "http-nio-###.#.#.#-####-Acceptor" java.lang.OutOfMemoryError: Java heap space
<SDDC_MANAGER> commonsvcs[1129]: Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "http-nio-###.#.#.#-####-Poller"
<SDDC_MANAGER> commonsvcs[1129]: java.lang.OutOfMemoryError: Java heap space
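To confirm the symptom without reading the full status output, the journal for the service can be filtered for the error string. The following is a minimal sketch, assuming the service logs to the systemd journal as the output above suggests; the helper name count_oom is hypothetical.

```shell
# count_oom: hypothetical helper that reads log lines on stdin and prints
# how many of them mention java.lang.OutOfMemoryError.
count_oom() {
  # -c prints the number of matching lines; -F treats the pattern as a fixed string.
  grep -cF 'java.lang.OutOfMemoryError'
}

# Example (run on the SDDC Manager appliance as root):
#   journalctl -u commonsvcs.service --no-pager | count_oom
```

A non-zero count is consistent with the heap exhaustion shown above. A count of 0 may only mean that the relevant journal entries have been rotated out.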
VCF 4.x and VCF 5.x
The java.lang.OutOfMemoryError is caused by a large number of password update tasks in the USER_CANCELLED state. Task Aggregator does not recognize this state, so SDDC Manager treats each such task as still PENDING (that is, its state is mutable). Every minute, Task Aggregator loops through the list of mutable tasks and checks whether they have completed. Because these tasks never reach a terminal state, they stay on the list indefinitely, and Task Aggregator spends significant resources monitoring them until the Java heap is exhausted.
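The effect of an unrecognized state can be illustrated with a small sketch. This is not SDDC Manager code. The function names and the terminal state names SUCCESSFUL and FAILED are assumptions made for illustration; only USER_CANCELLED and PENDING come from the description above.

```shell
# is_mutable: returns 0 (true) if a task state is NOT a recognized terminal state.
# SUCCESSFUL and FAILED are assumed names for recognized terminal states.
is_mutable() {
  case "$1" in
    SUCCESSFUL|FAILED) return 1 ;;  # recognized terminal state: stop monitoring
    *) return 0 ;;                  # anything else, including USER_CANCELLED,
                                    # is treated as still pending
  esac
}

# count_mutable: prints how many of the given task states would be re-checked
# on every polling cycle.
count_mutable() {
  n=0
  for state in "$@"; do
    if is_mutable "$state"; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}
```

In this sketch, count_mutable SUCCESSFUL USER_CANCELLED USER_CANCELLED FAILED PENDING prints 3. The two cancelled tasks are counted on every cycle and never drop off the list, which is the behavior the paragraph above describes.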
This issue is fixed in VCF 5.2.1.
To remediate the issue, restart all services on the SDDC Manager VM:
/opt/vmware/vcf/operationsmanager/scripts/cli/sddcmanager_restart_services.sh
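After the restart script finishes, it can be useful to confirm that commonsvcs has come back up before retrying the UI or API. The following is a minimal sketch, assuming a POSIX shell on the appliance. The helper name wait_for_active and the IS_ACTIVE_CMD override are hypothetical, and the retry count and delay are arbitrary defaults.

```shell
# wait_for_active: hypothetical helper that polls a systemd unit until it
# reports active, or gives up after a number of attempts.
#   $1 = unit name, $2 = attempts (default 30), $3 = seconds between attempts (default 10)
# IS_ACTIVE_CMD can override the check command; it defaults to
# "systemctl is-active --quiet", which exits 0 when the unit is active.
wait_for_active() {
  svc=$1
  tries=${2:-30}
  delay=${3:-10}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if ${IS_ACTIVE_CMD:-systemctl is-active --quiet} "$svc"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, wait_for_active commonsvcs.service && echo "commonsvcs is active" reports success once the unit is running. An active unit does not guarantee the heap is healthy, so checking the service log for new OutOfMemoryError entries remains worthwhile.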