Platform unavailable on CA cluster in VMware Aria Operations.

Article ID: 422723

Updated On:

Products

VCF Operations

Issue/Introduction

  • On the affected analytics node, the Analytics and Postgres services fail to start.
    To confirm this, SSH to the node as root and run the following commands:

    systemctl status analytics.service
    systemctl status vpostgres.service

  • The paired node logs the following exception in the file: /storage/log/vcops/log/analytics-uuid.log

    2025-12-08T08:09:10,486+0000 WARN  [DataForwarder]  com.vmware.vcops.analytics.gemfire.ForwardDataContainer.beforeCreate - ANALYTICS_FORWARD_DATA_REGION region exceed the maximum number (20000) of entries. Rejecting put in region

  • After running the following script: 

    /usr/lib/vmware-vcopssuite/utilities/bin/gss_troubleshooting.sh

    In the file /storage/log/vcops/log/<Node_FQDN>-gss_troubleshooting-xxxx.log on the Primary or Primary Replica node, the NODE STATE INFO section shows the affected node in a "BALANCING" state, as below:

    --> NODE STATE INFO: <--
    "stateMappings": {
      "clustername": {
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx": {
          "state": "RUNNING"
        },
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx": {
          "state": "RUNNING"
        },
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx": {
          "state": "RUNNING"
        },
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx": {
          "state": "RUNNING"
        },
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx": {
          "state": "BALANCING"
        },
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx": {
          "state": "RUNNING"

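The three symptom checks above can be scripted. The following is a minimal shell sketch, not a supported tool. The function names (check_services, count_region_warnings, list_non_running_nodes) are hypothetical, and the parsing assumes that the gss_troubleshooting log lists each node ID on the line directly before its "state" line, as in the sample above. Log file paths are passed as arguments rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper functions for the symptom checks in this article.
# The function names are illustrative only and are not part of the product.

# Print the current state of the two services from the first symptom.
# `systemctl is-active` prints a one-word state such as active, inactive or failed.
check_services() {
    for svc in analytics.service vpostgres.service; do
        state=$(systemctl is-active "$svc" 2>/dev/null)
        printf '%s: %s\n' "$svc" "${state:-unknown}"
    done
}

# Count occurrences of the region-overflow warning in the analytics log
# given as the first argument.
count_region_warnings() {
    grep -c 'ANALYTICS_FORWARD_DATA_REGION region exceed the maximum number' "$1"
}

# Print "<node id>: <state>" for every node that is not RUNNING in the
# gss_troubleshooting log given as the first argument.
# Assumption: each node ID line is followed directly by its "state" line.
list_non_running_nodes() {
    awk '
        /"state":/ {
            if ($0 !~ /"RUNNING"/) {
                split($0, s, "\"")      # s[4] holds the state value
                print prev ": " s[4]
            }
            next
        }
        /": *[{]/ {
            split($0, p, "\"")          # p[2] holds the most recent key
            prev = p[2]
        }
    ' "$1"
}
```

For example, run check_services as root on the affected node, run count_region_warnings against the analytics log on the paired node, and run list_non_running_nodes against the gss_troubleshooting log on the Primary or Primary Replica node. A node reported with BALANCING matches the symptom described in this article.
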
Environment

VMware Aria Operations 8.18.x

Resolution

  1. Take Aria Operations Offline: Log in to the Admin UI and take the cluster offline.

  2. Snapshots: Create snapshots of all analytics nodes in both fault domains. For detailed instructions, refer to: Snapshot Creation in VMware Aria Operations.

  3. Replace the Faulty Node: Follow the replacement procedure outlined in the Continuous Availability FAQs.

    • Note: This process involves significant data synchronization and may take a considerable amount of time to complete.