After restoring a degraded Continuous Availability (CA) enabled cluster to a healthy state, you observe that multiple vCenter adapter instances in Aria Operations 8.18.5 are no longer collecting data.
Symptoms:
vCenter objects show 0 Virtual Machines and 0 Hosts in the Summary tab.
On the Integrations page, vCenter adapter instances display a Warning status.
Hovering over the status reveals the message "Not Collecting."
Attempting to manually restart collection by clicking Stop Collecting and then Start Collecting from the adapter instance's ellipsis (three dots) menu changes the status to "None," and it never progresses to "Collecting."
Cloud Proxies (CPs) show as "Offline" on the Cloud Proxies page, even though the Admin UI may still show a "Healthy" status.
Console access to the Cloud Proxies reveals a reboot loop with the following errors:
Remounting filesystem read-only
EXT4-fs error (device dm-2): ext4_journal_check_start:83: comm python: Detected aborted journal
In some instances, a vCenter adapter instance may automatically move from a Cloud Proxy to a data node but still remain in a "Not Collecting" state.
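If the Cloud Proxy console output has been captured as plain text, the two error signatures listed above can be checked for mechanically. The following is a minimal sketch only: the function name and pattern list are illustrative and are not part of Aria Operations, and it assumes the console text was saved manually.

```python
import re

# Signatures taken from the console errors listed in the symptoms above.
CORRUPTION_PATTERNS = [
    re.compile(r"Remounting filesystem read-only"),
    re.compile(r"EXT4-fs error \(device [\w-]+\).*Detected aborted journal"),
]


def find_corruption_signs(console_text):
    """Return the console lines that match a known filesystem-corruption signature."""
    hits = []
    for line in console_text.splitlines():
        if any(pattern.search(line) for pattern in CORRUPTION_PATTERNS):
            hits.append(line.strip())
    return hits


if __name__ == "__main__":
    sample = (
        "Remounting filesystem read-only\n"
        "EXT4-fs error (device dm-2): ext4_journal_check_start:83: "
        "comm python: Detected aborted journal\n"
    )
    for hit in find_corruption_signs(sample):
        print(hit)
```

An empty result does not prove the filesystem is healthy; it only means neither of these two messages appeared in the captured text.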
Environment:
Aria Operations 8.18.5
Continuous Availability (CA) enabled cluster
Cloud Proxy (CP) appliances
Cause:
This issue can be caused by filesystem corruption on the Cloud Proxy appliances. The corruption triggers a read-only remount of the filesystem, followed by a boot problem such as the appliance getting stuck in Single User Mode. Combined with the previously degraded CA cluster and Fault Domain instability, this prevents the collection framework from initializing adapter instances or failing them over successfully. Even if an adapter instance moves to a healthy data node or Cloud Proxy, the instability in the underlying collection service and cluster can prevent the vCenter adapter from starting collection automatically.
Resolution:
Ensure the Continuous Availability (CA) cluster and all Fault Domains are in a Healthy state. If the cluster is still degraded, contact Broadcom Support for assistance in restoring Fault Domain health.
WARNING: Improperly shutting down or powering on Fault Domain nodes in a CA-enabled cluster can cause further issues.
Follow the KB article "Shutdown and Startup sequence for Aria Operations cluster."
If the cluster is healthy, log in to the vCenter Server and locate the affected Cloud Proxy virtual machines.
Power Off and then Power On the affected Cloud Proxy virtual machines to break the reboot loop and allow the guest OS to perform a filesystem check (if needed).
Log in to the Aria Operations UI and verify that the Cloud Proxies now show as Online.
For any Cloud Proxy that remained online throughout the event, perform the following:
Establish an SSH session to the Cloud Proxy as root and run the command: service vmware-vcops restart
Alternatively, reboot the Cloud Proxy.
In the Aria Operations UI, navigate to the Integrations page.
For each affected vCenter adapter instance:
Click the ellipses (three dots) and select Edit.
Click Validate Connection to confirm communication is established.
Select Stop Collecting and wait 1 to 2 minutes.
Select Start Collecting and wait up to 5 minutes for the status to return to OK.
NOTE: You may need to refresh the application UI/browser to see the updated status.
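The stop, wait, start, and wait sequence above can be expressed as a small polling routine. This is a sketch under stated assumptions, not an Aria Operations API: stop_collecting, start_collecting, and get_status are hypothetical callables that the caller must supply (for example, wrappers around whatever automation is already in use). The 120 second and 300 second defaults mirror the waits given in the steps; the 15 second polling interval is an arbitrary choice.

```python
import time


def restart_collection(stop_collecting, start_collecting, get_status,
                       settle_seconds=120, timeout_seconds=300,
                       poll_seconds=15, sleep=time.sleep, clock=time.monotonic):
    """Stop, wait, start, then poll until the status is "OK" or the timeout expires.

    All three callables are hypothetical hooks supplied by the caller.
    Returns True if the status reached "OK" in time, otherwise False.
    """
    stop_collecting()
    sleep(settle_seconds)      # wait 1 to 2 minutes after Stop Collecting
    start_collecting()
    deadline = clock() + timeout_seconds
    while True:
        if get_status() == "OK":
            return True
        if clock() >= deadline:
            return False       # status never returned to OK within the window
        sleep(poll_seconds)
```

Passing sleep and clock as parameters keeps the routine testable without real delays. A False result suggests going back to the earlier steps (cluster health, Cloud Proxy state) rather than repeating the stop and start.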