When upgrading Opsman from 3.0.35 to 3.0.36 or 3.0.37, the upgrade fails while importing the existing installation.zip configuration.
The error suggests that Opsman cannot decrypt the installation file.
To confirm the problem, SSH to the newly deployed Opsman (the one that is failing to import) and review the log at /var/log/opsmanager/production.log.
Error messages like the following can be seen:
I, [2025-01-22T11:26:10.085579 #1221] INFO -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] Migrating to AddExcludeConstraintToInstallsTable (20241205220441)
W, [2025-01-22T11:26:10.094438 #1221] WARN -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] Caught error while importing Rails database: #<StandardError:"An error has occurred, this and all later migrations canceled:\n\nPG::ExclusionViolation: ERROR: could not create exclusion constraint \"excl_install_status\"\nDETAIL: Key (status)=(running) conflicts with key (status)=(running).\n">
W, [2025-01-22T11:26:10.094483 #1221] WARN -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] Attempting to restore back to previous state. This may fail.
I, [2025-01-22T11:26:10.095224 #1221] INFO -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] DatabaseExporterImporter - stdout: ===== 2025-01-22 11:26:10 UTC Running "pg_restore --clean --dbname=tempest_production /tmp/prod_backup20250122-1221-pbuk9"
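Instead of reading the whole log, the error can be searched for directly. The following is a minimal sketch, assuming the log is at the path given above. The function name has_exclusion_violation is a hypothetical helper written for this article and is not part of Ops Manager.

```shell
# Hypothetical helper (not part of Ops Manager): exits 0 if the given log
# contains the exclusion-constraint error shown above, non-zero otherwise.
# Defaults to the production log path mentioned in this article.
has_exclusion_violation() {
  log_file="${1:-/var/log/opsmanager/production.log}"
  grep -q 'PG::ExclusionViolation' "$log_file"
}

# Example usage on the Ops Manager VM (prints matching lines with line numbers):
# has_exclusion_violation && grep -n 'excl_install_status' /var/log/opsmanager/production.log
```

If the function returns non-zero, the import failure probably has a different cause and the steps in this article may not apply.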
Opsman 3.0.36+
This situation can occur when the Ops Manager database was in an invalid state at the time 'Export Installation Settings' was run.
Confirmation steps:
To confirm the status in the database, follow the steps below:
sudo -i
cd /home/tempest-web/tempest/web
INFRASTRUCTURE=vsphere
RAILS_ENV='production' TEMPEST_INFRASTRUCTURE=$INFRASTRUCTURE TEMPEST_WEB_DIR='/home/tempest-web' DATA_ROOT='/var/tempest' LOG_DIR='/var/log/opsmanager' SECRET_KEY_BASE='secret' su tempest-web --command 'bundle exec rails dbconsole'
select id,created_at,updated_at,finished_at,finished,status FROM installs;
select id,created_at,updated_at,finished_at,finished,status FROM installs where status = 'running';
DO NOT PROCEED if the result above does not confirm the issue, that is, if the query returns no rows with status 'running'.
------------------------------------------------------------------------------------
From the Postgres console opened in the steps above, run:
update installs set status='failed' where status = 'running';
Here is a sample of the completed steps:
Confirmation step:
select id,created_at,updated_at,finished_at,finished,status FROM installs where status = 'running';
id | created_at | updated_at | finished_at | finished | status
----+----------------------------+----------------------------+----------------------------+----------+---------
4 | 2025-01-22 11:33:03.377972 | 2025-01-22 11:40:16.901384 | 2025-01-22 11:40:16.900902 | t | running
(1 row)
Update the status field from running to failed:
update installs set status='failed' where status = 'running';
UPDATE 1
Verify the result by re-running the database check:
select id,created_at,updated_at,finished_at,finished,status FROM installs where status = 'running';
id | created_at | updated_at | finished_at | finished | status
----+------------+------------+-------------+----------+--------
(0 rows)
Alternative method:
1. If you prefer to use the Rails console instead of Postgres, run step 3 from https://knowledge.broadcom.com/external/article?articleNumber=293866 as is.
2. From the Rails console, run:
Tempest::Install.where(Tempest::Install.arel_table[:status].eq("running")).update_all(:status => "failed")
This is expected behaviour in the latest versions of Opsman, intended to prevent deployments from ending up in an undesired state during the upgrade process.
From Opsman 3.0.38 onward, this issue should be completely fixed.