NAPP 4.2.0 Upgrade Failure During Metrics Feature Upgrade - Site-Service DB Table Not Created During Upgrade
Article ID: 373017
Products
VMware vDefend Firewall with Advanced Threat Prevention
Issue/Introduction
During a NAPP upgrade to 4.2.0, the upgrade of the Metrics feature fails because the metrics-nsx-config component does not come up. If other failures are observed during the Metrics upgrade, the cause may be different and this article may not apply.
To check whether the issue described in this article applies, do the following.
Log into the NSX Manager and use the napp-k command to look in the logs for an error like the one below:
2024-07-20T08:37:31.45241032Z stdout F 2024-07-20T08:37:31.452Z ERROR repository/repository.go:14 Error migrating DgsSites table > {"error": "FATAL: terminating connection due to administrator command (SQLSTATE 57P01)"}
If no matches are found in the logs, then the issue is likely something else.
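The log check above can be sketched as a small shell helper. This is only an illustration: it assumes the relevant log output has already been saved to a file (the name saved.log is hypothetical), and it matches only the "Error migrating DgsSites table" portion of the sample line, so timestamps and spacing do not matter.

```shell
#!/bin/sh
# Signature taken from the sample log line in this article.
SIGNATURE='Error migrating DgsSites table'

# has_migration_error FILE
# Returns 0 if FILE contains the signature, non-zero otherwise.
has_migration_error() {
  grep -qF -- "$SIGNATURE" "$1"
}

# Example (saved.log is a hypothetical file holding the saved log output):
#   if has_migration_error saved.log; then
#     echo "Signature found: this article likely applies"
#   else
#     echo "No match: the issue is likely something else"
#   fi
```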
Environment
NAPP 4.2.0
Cause
The site-service component waits until postgres is running before it starts. On startup, it connects to postgres and then creates or migrates its table in the configuration database. However, if the migration fails (for example, because postgres restarts during the migration), it is not retried, and a restart of site-service is required. Because the table is never created in postgres, downstream applications that rely on it fail.
Resolution
Log into the NSX Manager. All commands in subsequent steps are run from the NSX Manager.
All three pods from the above output should have output similar to: '1/1 Running'
Copy the script from the KB to the NSX Manager. Do not store it in the /tmp directory. The following steps assume the script is named kb_upgrade_fix.sh
Save the output of this command to a file as a precautionary backup: napp-k get job load-default-site -o yaml > load-default-site-backup.yaml
Make the script executable: chmod +x kb_upgrade_fix.sh
Run the script: ./kb_upgrade_fix.sh. If everything succeeded, the script prints "script executed successfully, please restart pods and retry upgrade"
Restart the failing pod: napp-k rollout restart deploy/metrics-nsx-config
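The two manual checks in the steps above (pods showing '1/1 Running', and the script printing its success message) can be sketched as shell helpers. This is a hedged illustration, not part of the KB script: both function names are hypothetical, and all_pods_ready assumes the pod listing uses the usual kubectl-style columns (NAME, READY, STATUS, ...).

```shell
#!/bin/sh
# all_pods_ready: reads a kubectl-style pod listing on stdin
# (assumed columns: NAME READY STATUS ...). Succeeds only if at least one
# pod row is present and every row shows READY 1/1 and STATUS Running.
all_pods_ready() {
  awk '
    $1 == "NAME" { next }                      # skip the header row
    NF == 0      { next }                      # skip blank lines
    { rows++ }
    $2 != "1/1" || $3 != "Running" { bad++ }   # count pods that are not ready
    END { if (rows > 0 && bad == 0) exit 0; exit 1 }
  '
}

# fix_succeeded: reads the fix script output on stdin and succeeds only if
# it contains the success message quoted in the steps above.
fix_succeeded() {
  grep -qF 'script executed successfully'
}
```

One possible way to use this is to save the script output first, for example ./kb_upgrade_fix.sh > fix-output.log 2>&1 (the file name is hypothetical), and then run the restart only on success: fix_succeeded < fix-output.log && napp-k rollout restart deploy/metrics-nsx-config. Reading from a saved file rather than a live pipe avoids cutting the script short when the match is found.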