Slow Task Processing in VMware Cloud Director

Article ID: 325174

Products

VMware Cloud Director

Issue/Introduction

  • VCD environments gradually degrade and task processing becomes progressively slower
  • Common operations (powering a VM on or off, modifying a VM's configuration, instantiating a new VM or vApp, etc.) take exceptionally long to complete
  • Tasks that normally take seconds or a few minutes take 10 minutes or more to complete
  • The issue can usually be identified in the cell-runtime.log file by comparing the expected Artemis cluster topology against the actual Artemis cluster topology
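
As a starting point for that comparison, the Artemis-related entries can be pulled out of the log. This is a minimal sketch, not an official diagnostic: the log path is the usual default location and is an assumption for your environment, the function name `recent_artemis_lines` is hypothetical, and because this article does not give the exact wording of the topology messages, the search simply matches any line mentioning "artemis" for manual review.

```shell
#!/bin/sh
# Assumed default log location; override with LOG=/path/to/cell-runtime.log
LOG="${LOG:-/opt/vmware/vcloud-director/logs/cell-runtime.log}"

# Print the most recent log lines that mention Artemis (case-insensitive).
# $1 = log file, $2 = maximum number of lines to print (default 20)
recent_artemis_lines() {
  grep -i 'artemis' "$1" | tail -n "${2:-20}"
}

if [ -r "$LOG" ]; then
  recent_artemis_lines "$LOG"
else
  echo "Log file not readable: $LOG" >&2
fi
```

Running this on each cell and comparing the output side by side can help show whether every cell reports the same set of cluster members.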



Environment

VMware Cloud Director 10.x

Cause

  • The Artemis cluster is the internal mechanism used for cell-to-cell communication. On occasion this mechanism degrades, and cell-to-cell communication falters along with it. When that happens, task processing can take considerably longer, because the cell that handles a particular task cannot relay the update of the task's completion while the internal communication mechanism is non-functional.

Resolution

This issue is resolved in VMware Cloud Director 10.4.2, available at Broadcom Downloads.


Workaround:

This issue can be temporarily bypassed by performing a rolling reboot of all cells or, alternatively, by running the vmware-vcd service on the primary cell only.

For 10.4.x versions, the following configuration changes can be implemented to reduce the impact of this issue.

Note: The cell-management-tool commands listed below only need to be executed on the primary cell.

  1. Set the connectionTTL to 90s. The default is 60s:
    /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n "jms.cluster.connectionTTL" -v "90000"

  2. Set the clientFailureCheckPeriod to 45s:
    /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n "jms.cluster.clientFailureCheckPeriod" -v "45000"

  3. Set the Task Poller retrieval interval to 60s. This poller queries vCenter for task updates:
    /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n vc-task-completions-retrieval-timer-interval-sec -v 60

  4. Set the Activity Poller retrieval interval to 60s. This poller reads the activity table to detect completed activities:
    /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n vcloud.activities.activityRelayPollingIntervalMs -v 60000

  5. Set the VCD inventory timeout to 600s:
    /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n InventoryWait -v 600000

  6. Set the Event Processor duration to 120s:
    /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n event.processor.running.duration.millisec -v 120000


  7. Restart the vmware-vcd service on ALL cells in the environment:
    service vmware-vcd restart
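
The six configuration changes in steps 1 through 6 can be applied in one pass with a small wrapper. The following is a minimal sketch rather than an official script: the property names and values are copied from the steps above, while the `CMT` and `DRY_RUN` variables and the `set_property` and `apply_settings` function names are illustrative assumptions. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Path to the tool as used in the steps above; override with CMT=... if needed.
CMT="${CMT:-/opt/vmware/vcloud-director/bin/cell-management-tool}"
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# $1 = property name, $2 = value
set_property() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s manage-config -n "%s" -v "%s"\n' "$CMT" "$1" "$2"
  else
    "$CMT" manage-config -n "$1" -v "$2"
  fi
}

# Apply steps 1-6 in order; stop at the first command that fails.
apply_settings() {
  set_property jms.cluster.connectionTTL 90000 &&
  set_property jms.cluster.clientFailureCheckPeriod 45000 &&
  set_property vc-task-completions-retrieval-timer-interval-sec 60 &&
  set_property vcloud.activities.activityRelayPollingIntervalMs 60000 &&
  set_property InventoryWait 600000 &&
  set_property event.processor.running.duration.millisec 120000
}

apply_settings
```

Run it on the primary cell with `DRY_RUN=0` to apply the values. The service restart on all cells described in step 7 is still required afterwards and is deliberately left out of the sketch.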

Additional Information

To check whether a value is already in place, run the same command with the "-l" option instead of "-v".

Example:

# /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n "jms.cluster.connectionTTL" -l
Property "jms.cluster.connectionTTL" has value "90000"
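
The same check can be repeated for all six properties from the workaround. This is a minimal sketch under the same assumptions as the example above: the `-l` usage follows that example, while the `CMT`, `DRY_RUN`, and `PROPERTIES` variables and the `list_properties` function name are hypothetical. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Path to the tool as used in the example above; override with CMT=... if needed.
CMT="${CMT:-/opt/vmware/vcloud-director/bin/cell-management-tool}"
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 executes it.
DRY_RUN="${DRY_RUN:-1}"

# Property names from workaround steps 1-6 (none contain spaces).
PROPERTIES="jms.cluster.connectionTTL
jms.cluster.clientFailureCheckPeriod
vc-task-completions-retrieval-timer-interval-sec
vcloud.activities.activityRelayPollingIntervalMs
InventoryWait
event.processor.running.duration.millisec"

# List the current value of each property with the -l option.
list_properties() {
  for name in $PROPERTIES; do
    if [ "$DRY_RUN" = "1" ]; then
      printf '%s manage-config -n "%s" -l\n' "$CMT" "$name"
    else
      "$CMT" manage-config -n "$name" -l
    fi
  done
}

list_properties
```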

For more information, see the VMware Cloud Director 10.4.2 Release Notes.