Cloud Director Task Latency and Messaging Cluster Instability due to global.properties Configuration Mismatch


Article ID: 428567


Updated On:

Products

VMware Cloud Director

Issue/Introduction

The Cloud Director environment is experiencing significant performance degradation during standard operations. Specific symptoms include:

  • Task Latency: VM power operations, vApp deployments, and other blocking tasks remain in a "Running" state for extended periods or time out.
  • Messaging Cluster Instability: The messaging cluster (ActiveMQ/Artemis) reports 0 members, even when the Primary node is confirmed active.
  • Member Fluctuation: The member count reported in the cell-runtime logs fluctuates.
  • Log Verification: The cell-runtime.log on the Primary node shows connectivity or cluster membership dropping unexpectedly, coinciding with the periods of slowness.
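
To check for the membership symptoms above, a small helper can pull the most recent matching lines from the log. This is a minimal sketch: the function name `recent_matches` is hypothetical, the search term `member` is only a guess at what the relevant entries contain, and the log location shown in the comment assumes the usual /opt/vmware/vcloud-director/logs directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of Cloud Director.
# Print the last N lines of a log file that match a pattern (case-insensitive).
#   $1 = log file, $2 = pattern, $3 = number of lines to show (default 20)
recent_matches() {
    local log="$1" pattern="$2" count="${3:-20}"
    grep -i -- "$pattern" "$log" | tail -n "$count"
}

# Example (assumed log path and assumed search term; adjust both as needed):
#   recent_matches /opt/vmware/vcloud-director/logs/cell-runtime.log 'member' 50
```

Comparing the timestamps of the returned lines with the times at which tasks stalled can help confirm whether the two are related.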

Environment

  • 10.3
  • 10.4
  • 10.5
  • 10.6

Cause

The slow task processing is caused by a fractured Artemis messaging bus resulting from a configuration mismatch within the /opt/vmware/vcloud-director/etc/global.properties file across the server group.

Specific inconsistencies identified include:

  • Database Timeout Mismatch: On specific nodes the database timeout is not set to 90, which is both the default and the value required for cluster stability. An incorrect value leads to prolonged waits during database transactions.
  • Legacy Configuration: Superfluous database entries carried over from previous versions (specifically 9.7) remain in the file and interfere with current database operations and settings.
  • Inconsistent SSL Security: The ssl.protocols.disallowed entry is missing or mismatched across nodes. VCD endpoints require strict and identical SSL security settings on all cells to maintain the encrypted messaging bus required for task updates.
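
One way to spot such mismatches is to copy each cell's global.properties to a single machine and compare individual keys. The sketch below is illustrative only: `prop_value` and `compare_key` are hypothetical helper names, and the file names in the example are placeholders. It assumes simple `key=value` or `key: value` lines without line continuations.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Cloud Director.

# Print the value of one key from a properties file.
# Comment lines (# or !) are skipped; if the key occurs more than once,
# the last occurrence is printed. Prints nothing when the key is absent.
prop_value() {
    local file="$1" key="$2"
    awk -v k="$key" '
        /^[ \t]*[#!]/ { next }
        {
            line = $0
            sub(/^[ \t]+/, "", line)
            if (index(line, k) == 1) {            # literal prefix match on the key
                rest = substr(line, length(k) + 1)
                if (rest ~ /^[ \t]*[=:]/) {       # key must be followed by = or :
                    sub(/^[ \t]*[=:][ \t]*/, "", rest)
                    sub(/[ \t]+$/, "", rest)
                    val = rest
                    found = 1
                }
            }
        }
        END { if (found) print val }
    ' "$file"
}

# Print the value of one key for every file given.
# Returns 0 when all files agree, 1 when any value differs or is missing in some.
compare_key() {
    local key="$1"; shift
    local first="" value file seen=0 status=0
    for file in "$@"; do
        value="$(prop_value "$file" "$key")"
        printf '%s\t%s=%s\n' "$file" "$key" "${value:-<missing>}"
        if [ "$seen" -eq 0 ]; then
            first="$value"
            seen=1
        elif [ "$value" != "$first" ]; then
            status=1
        fi
    done
    return "$status"
}

# Example (placeholder file names, one copy per cell):
#   compare_key ssl.protocols.disallowed cell1.properties cell2.properties cell3.properties
```

A non-zero return status means at least one cell disagrees with the first file for that key. The same call can be repeated for the database timeout key once its exact name has been confirmed in the file.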

Resolution

To resolve the latency and restore messaging stability, the global.properties file must be standardised across all cells in the environment.

Prerequisites:

  • Root access to all Cloud Director appliances.
  • A maintenance window, as services will require a restart.
  • The corrected global.properties file, or a clean example from a matching version build. If you don't have one, contact Broadcom support.

Procedure:

  • Stop Cloud Director Services: Stop the cell services on all nodes in the cluster to prevent data inconsistency.
      /opt/vmware/vcloud-director/bin/cell-management-tool -u <admin_user> cell --shutdown
  • Backup Existing Configuration: On every cell, create a backup of the current configuration file.
      cp /opt/vmware/vcloud-director/etc/global.properties /opt/vmware/vcloud-director/etc/global.properties.bak
  • Replace/Update Configuration: Replace the existing global.properties file on the affected nodes with the corrected, standardised version provided by Broadcom support.
  • Restart Appliance VMs: Fully reboot the Appliance VMs to ensure the OS and application layer pick up the new networking and SSL configurations.
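
The backup and replace steps can be made easier to audit with a timestamped backup and a diff of the old file against the new one before rebooting. The following is a sketch under stated assumptions, not an official procedure: `backup_file` and `review_changes` are hypothetical helper names, and the timestamped file name is a suggested alternative to the plain .bak suffix used above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of Cloud Director.

# Copy a file to <file>.<UTC timestamp>.bak, preserving mode and timestamps,
# and print the path of the backup that was created.
backup_file() {
    local src="$1" dest
    dest="${src}.$(date -u +%Y%m%dT%H%M%SZ).bak"
    cp -p -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Show a unified diff between the backup ($1) and the replacement file ($2).
# Returns 0 when the files are identical and 1 when they differ.
review_changes() {
    diff -u -- "$1" "$2"
}

# Example, run on each cell after services are stopped:
#   bak="$(backup_file /opt/vmware/vcloud-director/etc/global.properties)"
#   ...replace global.properties with the corrected version...
#   review_changes "$bak" /opt/vmware/vcloud-director/etc/global.properties
```

Reviewing the diff on every cell before the reboot confirms that only the intended entries changed and gives a known file to restore if a cell fails to start.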