Cluster shrink stuck at 1%

Article ID: 423861

Products

VCF Operations/Automation (formerly VMware Aria Suite)

VCF Operations

Issue/Introduction

When shrinking a VCF Operations cluster to consolidate onto fewer nodes, the shrink operation stays at 1%. No data has been moved for many hours and the process appears to be stuck.

Note: Shrinking nodes out of an Operations cluster that has a lot of data to move can take multiple hours or days. We recommend removing only a single node from the cluster at a time.

Environment

Aria Operations 8.18.x

VMware VCF Operations 9.0.x

Cause

Excessive instanced metrics collected for objects have inflated the data files in the underlying metric database, and these oversized files must be transferred between the nodes in the cluster.

Resolution

Check whether large FSDB files exist in the environment:

  1. Log in to each VCF Operations analytic node (primary, replica, data) as root via SSH.
  2. Find any .dat files in the FSDB larger than 1 GB, listed largest first:
    find /storage/db/vcops/data -name "*.dat" -type f -size +1G -exec ls -lSh {} +
  3. If any such files are found, contact Broadcom Support for further assistance.
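
When the check has to be repeated on several nodes, the same search can be wrapped in a small helper that also reports a file count and total size, which makes results easier to compare between nodes. The following is a minimal sketch, not a Broadcom-supplied tool. The function name list_large_dat and its two optional arguments are hypothetical. The defaults mirror the path and the 1 GB threshold used in step 2, and the parsing assumes file paths contain no spaces.

```shell
#!/bin/sh
# Sketch only: list_large_dat is a hypothetical helper, not part of VCF Operations.
# Usage: list_large_dat [DIRECTORY] [FIND_SIZE_EXPRESSION]
list_large_dat() {
    dir="${1:-/storage/db/vcops/data}"   # FSDB data path from the article
    min_size="${2:-+1G}"                 # threshold in find(1) -size syntax

    # ls -l prints the size in bytes in column 5 and the path in the last column.
    # Sort numerically on the size, largest first, then summarize with awk.
    find "$dir" -name "*.dat" -type f -size "$min_size" -exec ls -l {} + 2>/dev/null \
        | sort -k5,5nr \
        | awk '{ printf "%s\t%s\n", $5, $NF; total += $5; n++ }
               END { printf "files=%d total_mib=%.1f\n", n, total / 1048576 }'
}
```

Called with no arguments on an analytic node, list_large_dat prints one line per matching file (size in bytes, then path) followed by a summary line. The helper is read-only, and any files it reports should still be raised with Broadcom Support as described in step 3.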