Resolving VM Metrics Flow Issues After 2.1.1 Upgrade

Article ID: 376402

Products

VMware Data Services Manager

Issue/Introduction

This guide explains how to resolve an issue where VM metrics are not reported after upgrading to DSM version 2.1.1.

Symptoms

  • VM metrics (such as CPU, Memory, and Data Disk usage) appear empty in the monitoring dashboard.
  • The APMS service fails to start.

Navigate to Databases → Select any database → Monitoring → Metrics panel. The charts for CPU Usage, Memory Usage, and Data Disk Usage will be empty.

When you SSH to the "Provider VM" and look at the "apms.service.log" (/var/log/tdm/provider/apms.log), you see messages like:

failed to find parent tuple for heap-only tuple at <ctid>.
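The ctid in this message is needed later in the Resolution steps. As a minimal sketch, the helper below (extract_ctids is a hypothetical name, not part of DSM) prints each distinct ctid found in the log. It assumes the ctid is logged in PostgreSQL's usual "(block,offset)" form, for example (12,3); adjust the pattern if your log shows it differently.

```shell
# Hypothetical helper: list the distinct ctids named in the APMS error messages.
# Assumes the ctid appears as "(block,offset)" right after the error text.
extract_ctids() {
  # Default to the log path given in this article; pass another path to override.
  log_file="${1:-/var/log/tdm/provider/apms.log}"
  grep -o 'heap-only tuple at ([0-9][0-9]*,[0-9][0-9]*)' "$log_file" \
    | grep -o '([0-9][0-9]*,[0-9][0-9]*)' \
    | sort -u
}
```

Running extract_ctids with no argument on the Provider VM reads the default log path. If it prints nothing, the error is either absent or logged in a different format.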

Environment

VCF and Data Services Manager 2.1.1

Cause

The Provider PostgreSQL database was corrupted.

Resolution

To resolve this known issue, proceed with the following steps:

  • SSH into the Provider VM and check /var/log/tdm/provider/apms.log to find the ctid reported in the error message.

  • Run the following commands on the Provider VM, replacing <ctid> with the value found in the log. The time required depends on the number of rows in the vmware.vm_timeseries table:

psql -d vmware -U postgres -c "begin;delete from vmware.vm_timeseries where ctid = '<ctid>';end;"
psql -d vmware -U postgres -c "VACUUM FULL vmware.vm_timeseries"
psql -d vmware -U postgres -c "REINDEX TABLE vmware.vm_timeseries"
systemctl restart apms.service

  • Verify that apms.service has started successfully. If the error recurs with a different ctid, repeat the above commands using the new ctid.

  • Once the apms.service starts, VM metrics will resume, and the monitoring dashboard will display the metrics after some time.
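
Because the steps above may need to be repeated for several ctids, they can be wrapped in a small function. This is only a sketch: repair_ctid, run, and the DRY_RUN variable are hypothetical names introduced here, and the ctid check assumes the "(block,offset)" form. The psql and systemctl commands themselves are the ones listed in this article.

```shell
# Hypothetical wrapper: with DRY_RUN=1 it prints each command instead of running it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: apply the article's repair commands for one ctid.
repair_ctid() {
  ctid="$1"
  # Accept only "(block,offset)" so that nothing unexpected reaches the SQL text.
  if ! printf '%s' "$ctid" | grep -Eq '^\([0-9]+,[0-9]+\)$'; then
    echo "invalid ctid: $ctid" >&2
    return 1
  fi
  # Stop at the first failing command so the service is not restarted early.
  run psql -d vmware -U postgres -c "begin;delete from vmware.vm_timeseries where ctid = '$ctid';end;" &&
  run psql -d vmware -U postgres -c "VACUUM FULL vmware.vm_timeseries" &&
  run psql -d vmware -U postgres -c "REINDEX TABLE vmware.vm_timeseries" &&
  run systemctl restart apms.service
}
```

Setting DRY_RUN=1 and calling repair_ctid '(12,3)' first shows the exact commands without changing anything. After a real run, systemctl status apms.service is one way to confirm the service started.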