Aria Operations upgrade fails at step 9 of 14 with "Failedresource key=pak_manager.action_failed, resource args=[run master postgres db upgrade]"

Article ID: 380047

Products

VMware Aria Suite

Issue/Introduction

The Aria Operations upgrade fails at step 9 of 14 and cannot continue. The failed status reads "Failedresource key=pak_manager.action_failed, resource args=[run master postgres db upgrade]".

The centralsqldbupgrade.log contains entries similar to the following:

2024-10-15 10:45:14,100 INFO [main] com.vmware.vcops.dbupgrade.postgres.centraldb.upgrade.v818.FixingDataShardName.upgrade - Processing key = REPORT:com.vmware.statsplatform.persistence.content.report.Report.########-####-####-####-############.CSV
2024-10-15 10:45:14,101 ERROR [main] .processJAVA - error:
java.lang.NullPointerException: null
   at org.json.JSONTokener.nextCleanInternal(JSONTokener.java:128) ~[geode-json-1.7.0.jar:?]
   at org.json.JSONTokener.nextValue(JSONTokener.java:106) ~[geode-json-1.7.0.jar:?]
   at org.json.JSONObject.<init>(JSONObject.java:164) ~[geode-json-1.7.0.jar:?]
   at org.json.JSONObject.<init>(JSONObject.java:179) ~[geode-json-1.7.0.jar:?]
   at com.vmware.statsplatform.persistence.globaldata.XStreamUtil.extractClassKeyFromJson(XStreamUtil.java:229) ~[persistence-1.0-SNAPSHOT.jar:?]
   at com.vmware.statsplatform.persistence.globaldata.XStreamUtil.extractClassKey(XStreamUtil.java:251) ~[persistence-1.0-SNAPSHOT.jar:?]
   at com.vmware.statsplatform.persistence.globaldata.XStreamUtil.deserialize(XStreamUtil.java:363) ~[persistence-1.0-SNAPSHOT.jar:?]
   at com.vmware.statsplatform.persistence.global.KvXstreamSerializer.fromString(KvXstreamSerializer.java:14) ~[persistence-1.0-SNAPSHOT.jar:?]
   at
...
Caused by: java.lang.NullPointerException
   at org.json.JSONTokener.nextCleanInternal(JSONTokener.java:128) ~[geode-json-1.7.0.jar:?]
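
The key ID used in the Resolution steps is the one on the "Processing key" line logged directly before the ERROR entry. The following is a minimal sketch for pulling that key out of the log, not a vendor-supplied tool. The function name find_failing_keys is hypothetical, and the log file path is passed as an argument because this article does not state where centralsqldbupgrade.log is located.

```shell
# Hypothetical helper: print the key from the most recent
# "Processing key = ..." line seen before each ERROR entry.
# Usage: find_failing_keys /path/to/centralsqldbupgrade.log
find_failing_keys() {
  awk '
    # Remember the key from the latest "Processing key" line.
    /Processing key = / { sub(/^.*Processing key = /, ""); key = $0; next }
    # On an ERROR entry, report the remembered key once, then reset it.
    / ERROR / && key != "" { print key; key = "" }
  ' "$1"
}
```

Keys that were processed without error are overwritten by the next "Processing key" line, so only the key that immediately precedes an ERROR entry is printed.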

Environment

Aria Operations 8.x

Cause

A report record in the kv_data_shard table contains a null value. The database upgrade fails with a NullPointerException when it tries to deserialize that record.

Resolution

  1. Take a snapshot or backup before proceeding.  These steps delete records from a database table.

    1. Log into the primary node as root.
    2. Switch to the postgres user:
      • su postgres
    3. Log into the database:
      • /opt/vmware/vpostgres/current/bin/psql -p 5433 vcopsdb
    4. Query the kv_data_shard table for the record, using the key ID from the error in the log:
      • select * from kv_data_shard where key like '%########-####-####-####-############%';
    5. This should return one record.  You will notice a few null values in the record:
      • REPORT:com.vmware.statsplatform.persistence.content.report.Report.########-####-####-####-############.CSV      \N      \N      STRING  2023-02-08 10:14:42.714 \N      \N      \N      \N
    6. Delete the record from the table by changing the same query from a select to a delete:
      • delete from kv_data_shard where key like '%########-####-####-####-############%';
    7. Verify the record is no longer in the table by running the query from step 4 again:
      • select * from kv_data_shard where key like '%########-####-####-####-############%';
    8. Verify there are no other records with null values:
      • select * from kv_data_shard where col__kv_strvalue is null or data_type is null or mark_for_delete is null or primary_shard is null;
    9. This should return 0 records.  If it returns any, clean them out using the key ID of each returned record by repeating steps 4 through 6.
    10. Once all the bad records have been removed, retry the upgrade.
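
Steps 4 through 7 run the same three statements for every bad record, so they can be generated from the key ID. The following is a minimal sketch under stated assumptions, not part of the documented procedure. The function name build_cleanup_sql is hypothetical, and it assumes the key ID is a hexadecimal UUID, matching the 8-4-4-4-12 shape shown in the masked examples above. It only prints SQL and does not connect to the database.

```shell
# Hypothetical helper: emit the select/delete/select statements from
# steps 4, 6 and 7 for one key ID.
# Assumption: the key ID is a hexadecimal UUID. Anything else is refused,
# so an empty or partial ID cannot turn the LIKE pattern into one that
# matches every row in kv_data_shard.
build_cleanup_sql() {
  key_id=$1
  uuid_re='^[0-9A-Fa-f]{8}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{4}-[0-9A-Fa-f]{12}$'
  if ! printf '%s\n' "$key_id" | grep -Eq "$uuid_re"; then
    echo "not a UUID: $key_id" >&2
    return 1
  fi
  cat <<EOF
select * from kv_data_shard where key like '%${key_id}%';
delete from kv_data_shard where key like '%${key_id}%';
select * from kv_data_shard where key like '%${key_id}%';
EOF
}
```

Review the printed statements before running them. As the postgres user, the output could then be pasted into the psql session from step 3 or piped into the same psql command, which reads statements from standard input.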