A task suddenly fails in VCD.
If you check vcloud-container-debug.log, you will see the following error logged around the time the task failed:
2024-11-20 23:06:32,234 | ERROR | processor-Backend | DefaultActivityQueueProcessor | Unxpected error submitting activity com.vmware.ssdc.backend.services.impl.CreateDiskActivity/urn:uuid:########-####-####-####-############ to activity template ActivityTemplate [activityExecutor=com.vmware.vcloud.activity.executors.PersistentActivityExecutor@55b43c68, activityProvider=com.vmware.vcloud.activity.toolkit.SpringActivityProvider@1ec4989e]. Will not retry, clearing queue element. | java.lang.OutOfMemoryError: unable to create native thread: possibly out of memory or process/resource limits reached
	at java.base/java.lang.Thread.start0(Native Method)
	at java.base/java.lang.Thread.start(Thread.java:798)
	at java.base/java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:937)
	at java.base/java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1343)
	at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
	at com.vmware.vcloud.activity.executors.LocalActivityExecutor.submit(LocalActivityExecutor.java:368)
	at com.vmware.vcloud.activity.executors.PersistentActivityExecutor.innerSubmit(PersistentActivityExecutor.java:236)
	at com.vmware.vcloud.activity.executors.PersistentActivityExecutor.submit(PersistentActivityExecutor.java:155)
	at com.vmware.vcloud.activity.toolkit.ActivityTemplate.run(ActivityTemplate.java:245)
	at com.vmware.vcloud.activity.toolkit.ActivityTemplate.run(ActivityTemplate.java:215)
	at com.vmware.vcloud.activity.toolkit.ActivityTemplate.run(ActivityTemplate.java:151)
	at com.vmware.vcloud.activity.toolkit.queueing.DefaultActivityQueueProcessor.submitElement(DefaultActivityQueueProcessor.java:388)
	at com.vmware.vcloud.activity.toolkit.queueing.DefaultActivityQueueProcessor$1.run(DefaultActivityQueueProcessor.java:194)
Also, if you check vmware-vcd-watchdog.log, you will see that the vmware-vcd-cell process died and was restarted by the watchdog.
2024-11-20 23:05:54 | INFO | vmware-vcd-cell running
2024-11-20 23:06:55 | ALERT | vmware-vcd-cell is dead but /var/run/vmware-vcd-cell.pid exists, attempting to restart it
2024-11-20 23:07:05 | INFO | Started vmware-vcd-cell (pid=3955)
2024-11-20 23:07:06 | WARN | Server status returned HTTP/1.1 404
2024-11-20 23:08:06 | WARN | Server status returned HTTP/1.1 503
2024-11-20 23:09:06 | WARN | Server status returned HTTP/1.1 503
2024-11-20 23:11:06 | INFO | vmware-vcd-cell running
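To gauge how often this is happening on a cell, you can, for example, count the relevant entries in both logs. The paths below assume the default log directory /opt/vmware/vcloud-director/logs/; adjust them if your installation differs.
# grep -c "unable to create native thread" /opt/vmware/vcloud-director/logs/vcloud-container-debug.log
# grep -c "attempting to restart it" /opt/vmware/vcloud-director/logs/vmware-vcd-watchdog.log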
VMware Cloud Director 10.5
VMware Cloud Director 10.6
Note: The AMQP-based functionality still works in VCD 10.6, but it is deprecated as of VCD 10.6 and is no longer supported. See the product documentation for details.
When the connection between VCD (the RabbitMQ client) and the RabbitMQ server is forcibly closed, the corresponding AMQP connection thread is not cleaned up and remains in the VCD cell.
In environments where such forced disconnections occur periodically, for example because of a load balancer between VCD and RabbitMQ, these leaked threads accumulate until the cell hits "java.lang.OutOfMemoryError" and the vmware-vcd-cell service is restarted, over and over again.
How often this happens depends on the environment; in the environment where this was reported, one of the cells was affected every few days, with the affected cell changing each time.
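A simple way to watch the leak build up (a rough check, not an official procedure) is to monitor the total thread count of the cell process, since every leaked AMQP connection adds one thread:
# watch -n 60 'grep Threads /proc/$(service vmware-vcd pid cell)/status'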
Currently there is no fix.
Resolve whatever is causing the connection between VCD and the RabbitMQ server to be forcibly closed.
Alternatively, you can configure MQTT instead of AMQP.
Workaround:
Count the number of AMQP connection threads in a thread dump of the cell, and if it exceeds 1000, restart vmware-vcd-cell manually:
# /opt/vmware/vcloud-director/bin/cell-management-tool support -i $(service vmware-vcd pid cell) -t | grep -c "AMQP Connection <RabbitMQ Server IP>:5672"
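If the count calls for a restart, one possible sequence (a sketch; follow your usual maintenance procedure, and quiesce the cell first so in-flight tasks can finish) is:
# /opt/vmware/vcloud-director/bin/cell-management-tool -u administrator cell --quiesce true
# systemctl restart vmware-vcd
Here "administrator" is assumed to be a system administrator account; the tool prompts for its password.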
Mitigation:
You may be able to reduce forced disconnections by the load balancer by shortening the AMQP heartbeat interval:
# /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n amqp.heartbeat -l
# /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n amqp.heartbeat -v 30
Alternatively, remove the load balancer and connect VCD and the RabbitMQ server directly.
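The first command lists the current amqp.heartbeat value and the second sets it to 30 seconds. As a rule of thumb, choose a heartbeat shorter than the load balancer's idle timeout: for example, if the load balancer drops connections that have been idle for 60 seconds, a 30-second heartbeat keeps traffic flowing on the connection so it is no longer treated as idle.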