Certificate summary, connectivity, DNS and a few other metrics are missing for a specific vCenter server in Aria Operations

Article ID: 397610

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

VMware Infrastructure Health (VIH) metrics are missing for the workload domain but present for the management domain. Manually running the Health Summary task on the workload domain takes 7 minutes to complete successfully.
 
With DEBUG logging enabled (see Additional Information), the following errors appear in VMwareInfraHealthAdapter.log:
VCF Health Summary Task <task id> is failed with status as COMPLETED_WITH_FAILURE/Unable to download zip file, will retry again with new task id
Found conflict code: 409, Waiting 300 seconds for other task to get completed in the vcf SDDC manager and retrying again.
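
To check whether a log shows the same signature, the two messages above can be searched for programmatically. The following is a minimal Python sketch, not a supported tool: the function name summarize_vih_log is hypothetical, and the match patterns are taken only from the two messages quoted above, so other wordings of these errors would not be detected.

```python
import re

# Substrings taken from the two log messages quoted in this article.
ZIP_FAILURE = "Unable to download zip file"
CONFLICT = re.compile(r"Found conflict code: (\d+), Waiting (\d+) seconds")


def summarize_vih_log(lines):
    """Count zip-download failures and collect (http_code, wait_seconds) conflicts."""
    zip_failures = 0
    conflicts = []
    for line in lines:
        if ZIP_FAILURE in line:
            zip_failures += 1
        match = CONFLICT.search(line)
        if match:
            conflicts.append((int(match.group(1)), int(match.group(2))))
    return zip_failures, conflicts


# Sample input built from the messages shown in this article.
sample = [
    "VCF Health Summary Task <task id> is failed with status as "
    "COMPLETED_WITH_FAILURE/Unable to download zip file, will retry again with new task id",
    "Found conflict code: 409, Waiting 300 seconds for other task to get completed "
    "in the vcf SDDC manager and retrying again.",
]
print(summarize_vih_log(sample))
```

In practice the lines would come from an open file handle on VMwareInfraHealthAdapter.log instead of the sample list.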

Environment

Aria Operations 8.18.x

Cause

  • The management domain was collecting all its VIH properties, but the workload domain was not.
  • After upgrading VCF from 5.2 to 5.2.1.1, a lock was in place, preventing the health summary from running successfully.
  • Even after the lock was removed, the VIH adapter was unable to download the zip file containing the data from the workload domain, due to a 409 Conflict HTTP status code.
  • The health summary task was completing on the SDDC side, but the VIH adapter's 5-minute timeout was not sufficient for the longer task completion time in the workload domain.
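
The timing mismatch in the last point can be checked with simple arithmetic. The Python sketch below compares the 7-minute task duration reported in this article with the 5-minute (300-second) wait and with a 600-second wait. The helper fits_timeout is hypothetical and ignores the adapter's actual polling and retry logic, which this article does not describe.

```python
def fits_timeout(task_minutes, timeout_seconds):
    """Return True when a task of the given duration finishes within the timeout."""
    return task_minutes * 60 <= timeout_seconds


# 7 minutes is 420 seconds, which exceeds a 300-second wait but fits in 600 seconds.
print(fits_timeout(7, 300))
print(fits_timeout(7, 600))
```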

Resolution

  1. Increase Timeouts:
    • In the /usr/lib/vmware-vcops/user/plugins/inbound/VMwareInfrastructureHealthAdapter/conf/VMwareInfraHealth.properties file, increase the following timeouts:
      • VCF_SOS_API_HEALTH_TASK_STATUS_CHECK_WAIT_TIME_IN_SECONDS from 300 seconds to 600 seconds
      • VCF_SOS_API_HEALTH_TASK_STATUS_IN_PROGRESS_CHECK_WAIT_TIME_IN_SECONDS from 60 seconds to 120 seconds
      • VCF_SOS_API_HEALTH_TASK_OPERATION_IN_PROGRESS_WAIT_TIME_IN_SECONDS from 300 seconds to 600 seconds
  2. Restart VIH Adapter:
    • Restart the VMware Infrastructure Health Adapter instance from the Inventory Explorer.
  3. Verify Results:
    • In the next collection cycle, the missing data should be collected and displayed in Aria Operations.
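
The edit in step 1 can also be scripted. The following Python sketch rewrites the three timeout keys in the text of a properties file. It assumes the file uses one key=value pair per line, which this article does not confirm; the function name apply_timeouts and the sample excerpt are hypothetical. Keys that are not found are reported instead of being added, so the file can be reviewed manually.

```python
import re

# Target values from step 1 of the Resolution.
NEW_TIMEOUTS = {
    "VCF_SOS_API_HEALTH_TASK_STATUS_CHECK_WAIT_TIME_IN_SECONDS": 600,
    "VCF_SOS_API_HEALTH_TASK_STATUS_IN_PROGRESS_CHECK_WAIT_TIME_IN_SECONDS": 120,
    "VCF_SOS_API_HEALTH_TASK_OPERATION_IN_PROGRESS_WAIT_TIME_IN_SECONDS": 600,
}


def apply_timeouts(text, new_values=NEW_TIMEOUTS):
    """Return (updated_text, missing_keys) for properties-style text.

    Assumes one key=value pair per line. Comment lines and unrelated
    keys are passed through unchanged.
    """
    seen = set()
    out = []
    for line in text.splitlines():
        match = re.match(r"\s*([A-Za-z0-9_.]+)\s*=", line)
        key = match.group(1) if match else None
        if key in new_values:
            out.append(f"{key}={new_values[key]}")
            seen.add(key)
        else:
            out.append(line)
    missing = [key for key in new_values if key not in seen]
    return "\n".join(out) + "\n", missing


# Hypothetical excerpt; the real file may contain other keys and comments.
sample = (
    "# hypothetical excerpt\n"
    "VCF_SOS_API_HEALTH_TASK_STATUS_CHECK_WAIT_TIME_IN_SECONDS=300\n"
    "VCF_SOS_API_HEALTH_TASK_STATUS_IN_PROGRESS_CHECK_WAIT_TIME_IN_SECONDS=60\n"
)
updated, missing = apply_timeouts(sample)
print(updated)
print("Not found:", missing)
```

Back up VMwareInfraHealth.properties before writing any changes to it, and restart the adapter instance afterwards as described in step 2.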

Additional Information

  • If the issue persists, ensure all vCenter adapters, including VC, vSAN, and NSX, are on the same Cloud Proxy as the VCF adapter.
  • If the health summary takes significantly longer than the increased timeouts (e.g., on larger networks), consider further increasing the timeout values or investigating performance issues within the SDDC Manager.
  • Regularly monitor the vmware-vcops/user/plugins/inbound/VMwareInfrastructureHealthAdapter/work/ directory for potential errors or warnings.
  • Review the SDDC Manager's /var/log/vmware/vcf/sddc-support/vcf-sos-activity.log file for any further insights.
  • Use DEBUG logging for the following classes:
    • com.vmware.adapter3.vmwareinfrahealth.util.VcfSosApiUtil
    • com.vmware.adapter3.vmwareinfrahealth.managers.VCenterResourceManager
Note: This resolution applies specifically to this case. It may not address all potential causes of missing metrics in Aria Operations. Please contact VCF support for help with any other scenarios.