Slow Task Processing in VMware Cloud Director

Article ID: 325174

Products

VMware Cloud Director

Issue/Introduction

  • The Artemis cluster is the internal mechanism that facilitates cell-to-cell communication. On occasion this mechanism degrades, and cell-to-cell communication degrades with it. When that happens, task processing can take considerably longer, because the cell handling a particular task cannot relay an update of the task's completion while the internal communication mechanism is non-functional


Symptoms:
  • VCD environments gradually degrade and task processing becomes progressively slower
  • Commonplace operations (power on/off a VM, modify the configuration of a VM, instantiate a new VM/vApp, etc.) take exceptionally long to complete
  • Tasks that normally take seconds or a few minutes take 10 minutes or more to complete
  • This issue can usually be identified in the cell-runtime.log file by comparing the expected Artemis cluster topology against the actual Artemis cluster topology
  • In the screenshot below, the expected Artemis cluster topology (along with the IP of the missing cell) is highlighted in yellow, and the actual Artemis cluster topology is highlighted in red; the disparity between the expected and actual values indicates that the Artemis cluster has degraded and is missing a member, which induces slow task processing
ArtemisExample2.png
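
The comparison described above starts with pulling the Artemis-related entries out of the log. The following is a minimal sketch, assuming the log sits at the default location /opt/vmware/vcloud-director/logs/cell-runtime.log and that the relevant entries contain the word "Artemis". The exact message text is not reproduced here, so the pattern is deliberately broad. The function name and the LOG_FILE variable are illustrative only.

```shell
#!/bin/sh
# Assumption: default log location; override LOG_FILE if your install differs.
LOG_FILE="${LOG_FILE:-/opt/vmware/vcloud-director/logs/cell-runtime.log}"

# Print the most recent log lines that mention Artemis (case-insensitive).
# $1 = log file, $2 = maximum number of matching lines to show (default 20)
recent_artemis_lines() {
    grep -i "artemis" "$1" | tail -n "${2:-20}"
}

if [ -r "$LOG_FILE" ]; then
    recent_artemis_lines "$LOG_FILE" 20
else
    echo "Cannot read $LOG_FILE" >&2
fi
```

Run it on each cell and compare the cell addresses in the output against the list of cells you expect to be in the cluster.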

Environment

VMware Cloud Director for Service Provider 10.x
VMware Cloud Director 10.x

Cause

  • Versions 10.3.X and 10.4.X are susceptible to degradation of the mechanisms that facilitate task processing. The Artemis cluster topology is known to lose participating members and thus induce slowness in the environment

Resolution

  • This issue persists on versions 10.3.X and 10.4.X; use the workaround below to reduce its impact


Workaround:
  • This issue can be temporarily worked around by performing a rolling reboot of all cells or, alternatively, by running the vmware-vcd service exclusively on the primary cell
  • For versions 10.4.X, the following configuration changes can be implemented to reduce the impact of this issue
  • Note: The cell-management-tool commands listed below only need to be executed on the primary cell
    • Set the connectionTTL to 90s. The default is 60s:
      /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n "jms.cluster.connectionTTL" -v "90000"
       
    • Set the clientFailureCheckPeriod to 45s:
      /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n "jms.cluster.clientFailureCheckPeriod" -v "45000"
       
    • Set the Task Poller retrieval interval to 60s - this polls vCenter for task updates:
      /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n vc-task-completions-retrieval-timer-interval-sec -v 60
       
    • Set the Activity Poller retrieval interval to 60s - this polls data from the activity table for completion of activities:
      /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n vcloud.activities.activityRelayPollingIntervalMs -v 60000
       
    • Set the VCD inventory timeout to 600s:
      /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n InventoryWait -v 600000
       
    • Set the Event Processor duration to 120s:
      /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n event.processor.running.duration.millisec -v 120000
       
    • Restart the vmware-vcd service on ALL cells in the environment:
      service vmware-vcd restart
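
The six manage-config commands above can also be applied in a single pass. The following is a sketch rather than a supported script: the property names and values are copied from the steps above, while the DRY_RUN switch, the CMT variable, and the function name are hypothetical helpers introduced for illustration. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Path to the tool as used in the steps above; overridable for testing.
CMT="${CMT:-/opt/vmware/vcloud-director/bin/cell-management-tool}"
# Hypothetical safety switch: 1 = print the commands only, 0 = execute them.
DRY_RUN="${DRY_RUN:-1}"

# name=value pairs taken from the workaround steps.
SETTINGS="
jms.cluster.connectionTTL=90000
jms.cluster.clientFailureCheckPeriod=45000
vc-task-completions-retrieval-timer-interval-sec=60
vcloud.activities.activityRelayPollingIntervalMs=60000
InventoryWait=600000
event.processor.running.duration.millisec=120000
"

apply_settings() {
    for pair in $SETTINGS; do
        name="${pair%%=*}"    # text before the first "="
        value="${pair#*=}"    # text after the first "="
        if [ "$DRY_RUN" = "1" ]; then
            echo "$CMT manage-config -n \"$name\" -v \"$value\""
        else
            "$CMT" manage-config -n "$name" -v "$value" || return 1
        fi
    done
}

apply_settings
```

Run it on the primary cell with DRY_RUN=0 once the printed commands look right, then restart the vmware-vcd service on all cells as described above.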


Additional Information

To check whether a value is already in place, run the same command with the "-l" option in place of "-v" and its value.

Example:

# /opt/vmware/vcloud-director/bin/cell-management-tool manage-config -n "jms.cluster.connectionTTL" -l
Property "jms.cluster.connectionTTL" has value "90000"
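
To check all six properties from the workaround at once, the "-l" form can be looped. This is a sketch under the same assumptions as the example above: the property names come from the workaround steps, and the CMT variable and function name are illustrative.

```shell
#!/bin/sh
# Path to the tool as used in this article; overridable for testing.
CMT="${CMT:-/opt/vmware/vcloud-director/bin/cell-management-tool}"

# Properties set by the workaround steps.
PROPERTIES="
jms.cluster.connectionTTL
jms.cluster.clientFailureCheckPeriod
vc-task-completions-retrieval-timer-interval-sec
vcloud.activities.activityRelayPollingIntervalMs
InventoryWait
event.processor.running.duration.millisec
"

# List the current value of each property using the "-l" option.
list_settings() {
    for name in $PROPERTIES; do
        "$CMT" manage-config -n "$name" -l
    done
}

if [ -x "$CMT" ]; then
    list_settings
else
    echo "cell-management-tool not found at $CMT" >&2
fi
```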


Impact/Risks:
  • Slow task processing can cause issues with failover add-ons and degrade the overall VCD user experience