Error: "Cluster Health: READ ONLY PRIMARY" for Embedded Database Availability in VAMI UI after attempting to add a Standby Cell

Article ID: 397673

Products

VMware Cloud Director

Issue/Introduction

  • The VMware Cloud Director (VCD) cluster shows READ_ONLY_PRIMARY in the VAMI, and the standby cells are reported as not following the primary.
  • After the two failed standby cells are removed using the procedure in "Unregister a Failed VMware Cloud Director Appliance Primary or Standby Cell in a Database High Availability Cluster using API", the VAMI shows the cluster as HEALTHY. When the standby cells are added back from a new .ova deployment, the VAMI shows READ_ONLY_PRIMARY again.
  • The /var/vmware/vpostgres/current/pgdata/log/postgres-<date>.log files on the Standby Cell show WAL errors similar to the following:

    FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment #### has already been removed
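To confirm that a Standby Cell is hitting this condition, its PostgreSQL log can be searched for the error above. The following is a minimal Python sketch, not part of the product: the function name find_removed_wal_segments is hypothetical, and it only assumes that the log contains the message text quoted above.

```python
import re
from typing import Iterable, List

# Matches the standby-side error quoted above and captures the segment name as-is.
WAL_REMOVED = re.compile(r"requested WAL segment (\S+) has already been removed")


def find_removed_wal_segments(lines: Iterable[str]) -> List[str]:
    """Return the distinct WAL segment names reported as removed, in first-seen order."""
    seen: List[str] = []
    for line in lines:
        match = WAL_REMOVED.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

The function accepts any iterable of lines, so an open file handle for one of the postgres-<date>.log files can be passed to it directly. A non-empty result indicates that the standby requested at least one segment the primary no longer had.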

Environment

VMware Cloud Director 10.5.1

Cause

This issue occurs when the WAL files on the Primary Cell, located in /var/vmware/vpostgres/current/pgdata/pg_wal, are rotated before the new Standby Cell has finished copying the database and started replication. The standby then requests a WAL segment that the primary has already removed. This can happen in large or busy Cloud Director environments, where the database on the Primary Cell changes quickly.
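The timing problem described here can be illustrated with simple arithmetic. The sketch below is a rough model under stated assumptions, not a measurement of any real environment: the function names, the number of retained segments, the write rates, and the copy time are all hypothetical. Only the 16 MB segment size reflects PostgreSQL's default.

```python
def wal_retention_seconds(retained_segments: int, segment_mb: float,
                          wal_mb_per_second: float) -> float:
    """Approximate how long a WAL segment stays on the primary before it is removed."""
    if wal_mb_per_second <= 0:
        return float("inf")  # no WAL is being written, so nothing rotates out
    return retained_segments * segment_mb / wal_mb_per_second


def standby_can_catch_up(retention_seconds: float, copy_seconds: float) -> bool:
    """The standby can start replication only if its base copy finishes within the retention window."""
    return copy_seconds < retention_seconds


# Hypothetical busy primary: 64 retained segments of 16 MB, written at 2 MB/s.
busy = wal_retention_seconds(64, 16, 2.0)      # 512.0 seconds
# Hypothetical quiet primary (other cells stopped): same retention, 0.125 MB/s.
quiet = wal_retention_seconds(64, 16, 0.125)   # 8192.0 seconds

# Assume the initial database copy takes 30 minutes (1800 seconds).
print(standby_can_catch_up(busy, 1800))   # False
print(standby_can_catch_up(quiet, 1800))  # True
```

In this model, reducing the write rate on the primary lengthens the retention window enough for the copy to finish.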

Resolution

To work around this issue when adding Standby Cells in a large or busy environment, temporarily stop the Cloud Director service on all other Cells until the Standby Cell has been added successfully.
Note that this may cause downtime of the VCD services for the provider and the tenants. For how to stop and then start the vmware-vcd service on the cells, see the KB article: Steps to stop, start, or restart the VMware Cloud Director service before rebooting the nodes