Continuous Availability (CA) cluster stuck in Initialization during Witness replacement - VCF Operations

Article ID: 437782

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

When attempting to replace an existing witness node in a VCF Operations 9.x environment configured with Continuous Availability (CA) clusters across geographically separated datacenters where vMotion is restricted, the replacement operation fails to complete. The following symptoms may be observed:

  • The cluster status remains indefinitely stuck in "Cluster initialization is in progress."
  • The new witness node is not successfully added to the cluster roles.
  • The old witness node remains present in the cluster inventory despite the replacement attempt.
  • Attempts to disable Continuous Availability (CA) while keeping all nodes result in the cluster hanging in a "Disabling CA" state.
  • The cluster may appear healthy from a service perspective, and network connectivity between nodes is validated as functional.

Environment

VCF Operations 9.0.x

Resolution

Prerequisites

Procedure

  1. Bring the Cluster Offline: If the cluster is stuck in an offline loop, use the Take Cluster Offline button in the Admin UI to force a fully offline state. 
  2. Bring the Cluster Online: Ensure all the nodes in the cluster are fully online before proceeding.
  3. Disable Continuous Availability:
    • In the Admin UI, select Disable CA.
    • Choose the option Keep all nodes. This converts the replica node back into a standard data node.
  4. Remove the Witness Node:
    • Take the cluster offline again.
    • Delete the old witness node from the inventory while the cluster is offline.
  5. Bring the Cluster Online: Click Bring Cluster Online and wait for the status to show as Online.
  6. Add New Witness Node:
    • In the Admin UI, click the plus (+) button.
    • Select the Witness Node cluster role and provide the new IP address.
  7. Enable Continuous Availability: Click the button to enable CA for the cluster. High Availability (HA) will also be enabled during this process.
  8. Verify: In the VCF Operations Admin UI, confirm that the new witness node is listed as part of the cluster.
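
The order of the steps above matters, so it can help to rehearse the sequence before touching a production cluster. The following is a minimal sketch in Python that models the procedure as an in-memory checklist. It does not call any VCF Operations API. The class `ClusterModel`, its methods, the function `replace_witness`, and the address `192.0.2.10` are all hypothetical names introduced for illustration. The precondition checks (for example, that CA must be disabled before the witness is removed) are assumptions inferred from the step ordering in this article, not documented product behavior.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterModel:
    """Hypothetical in-memory stand-in for the cluster; not a product API."""
    online: bool = False
    ca_enabled: bool = True
    witness: Optional[str] = "old-witness"
    log: List[str] = field(default_factory=list)

    def take_offline(self) -> None:
        self.online = False
        self.log.append("offline")

    def bring_online(self) -> None:
        self.online = True
        self.log.append("online")

    def disable_ca_keep_all_nodes(self) -> None:
        # Assumption from step 2: all nodes must be online first.
        if not self.online:
            raise RuntimeError("bring the cluster online before disabling CA")
        self.ca_enabled = False
        self.log.append("disable_ca")

    def remove_witness(self) -> None:
        # Assumption from steps 3 and 4: CA disabled and cluster offline.
        if self.online:
            raise RuntimeError("take the cluster offline before removing the witness")
        if self.ca_enabled:
            raise RuntimeError("disable CA before removing the witness")
        self.witness = None
        self.log.append("remove_witness")

    def add_witness(self, address: str) -> None:
        # Assumption from steps 5 and 6: cluster online, old witness gone.
        if not self.online:
            raise RuntimeError("bring the cluster online before adding a witness")
        if self.witness is not None:
            raise RuntimeError("remove the old witness before adding a new one")
        self.witness = address
        self.log.append("add_witness")

    def enable_ca(self) -> None:
        if not self.online or self.witness is None:
            raise RuntimeError("CA needs an online cluster with a witness node")
        self.ca_enabled = True
        self.log.append("enable_ca")


def replace_witness(cluster: ClusterModel, new_address: str) -> None:
    """Walk through steps 1 to 8 of the procedure in order."""
    cluster.take_offline()                 # step 1 (forces a fully offline state)
    cluster.bring_online()                 # step 2
    cluster.disable_ca_keep_all_nodes()    # step 3
    cluster.take_offline()                 # step 4
    cluster.remove_witness()               # step 4, continued
    cluster.bring_online()                 # step 5
    cluster.add_witness(new_address)       # step 6
    cluster.enable_ca()                    # step 7
    # Step 8: verify the new witness is part of the cluster.
    if not (cluster.ca_enabled and cluster.witness == new_address):
        raise RuntimeError("witness replacement could not be verified")


if __name__ == "__main__":
    model = ClusterModel()
    replace_witness(model, "192.0.2.10")  # placeholder documentation address
    print(model.log)
```

In this model, calling a step out of order (for example, `remove_witness()` while the cluster is online) raises an error instead of silently continuing. The model is only a way to reason about the sequence; the actual operations are performed in the Admin UI as described above.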