Opsman Upgrade failing to decrypt installation.zip

Article ID: 387760

Products

Operations Manager

Issue/Introduction

When upgrading Opsman from 3.0.35 to 3.0.36 or 3.0.37, the upgrade fails while importing the existing installation.zip configuration.

The failure looks as though Opsman cannot decrypt the installation file.

To confirm the problem, SSH into the newly deployed Opsman (the one that is failing the import) and review the log at /var/log/opsmanager/production.log.

Error messages like the following can be seen:

I, [2025-01-22T11:26:10.085579 #1221]  INFO -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] Migrating to AddExcludeConstraintToInstallsTable (20241205220441)
W, [2025-01-22T11:26:10.094438 #1221]  WARN -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] Caught error while importing Rails database: #<StandardError:"An error has occurred, this and all later migrations canceled:\n\nPG::ExclusionViolation: ERROR:  could not create exclusion constraint \"excl_install_status\"\nDETAIL:  Key (status)=(running) conflicts with key (status)=(running).\n">
W, [2025-01-22T11:26:10.094483 #1221]  WARN -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] Attempting to restore back to previous state. This may fail.
I, [2025-01-22T11:26:10.095224 #1221]  INFO -- : [021b381e-7584-4a4f-a2c6-4df470867ca0] DatabaseExporterImporter - stdout: ===== 2025-01-22 11:26:10 UTC Running "pg_restore --clean --dbname=tempest_production /tmp/prod_backup20250122-1221-pbuk9"
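The log check can also be scripted. The following is a minimal sketch, assuming the log path and the error text shown in the excerpt above; the function name find_constraint_errors is hypothetical and not part of Ops Manager.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Ops Manager): print the log lines that
# mention the failed exclusion constraint, prefixed with their line numbers.
# The default path is the one named in this article; pass another file to override it.
find_constraint_errors() {
    log_file="${1:-/var/log/opsmanager/production.log}"
    grep -n -E 'PG::ExclusionViolation|excl_install_status' "$log_file"
}
```

Call find_constraint_errors with no argument on the Opsman VM, or pass the path of a copied log file. grep exits with status 1 when nothing matches, which suggests the import failed for a different reason.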

Environment

Opsman 3.0.36+

 

Cause

This situation may occur when the Ops Manager database was in an invalid state at the time 'Export Installation Settings' was performed:

  1. The database had recorded more than one 'Apply Changes' operation as running when 'Export Installation Settings' was attempted. This is an invalid state caused by a known bug, which was addressed in Opsman 3.0.36 by adding a database constraint that does not allow multiple 'Apply Changes' operations. The database probably reached this state through prior actions: (a) concurrent 'Apply Changes' operations were attempted (the known bug), and/or (b) 'Export Installation Settings' was done while an 'Apply Changes' was actively running.
  2. When installation.zip was imported during the Opsman upgrade to 3.0.36, the database migration tried to add this constraint and failed because of the conflicting records.

Confirmation steps:

To confirm the status in the database, follow the steps below:

  1. SSH into the Operations Manager VM and then sudo to root:
    sudo -i
  2. Change to the following directory:
    cd /home/tempest-web/tempest/web
  3. Set INFRASTRUCTURE to the provider in use (vsphere, aws, azure, gcp, openstack) and open the database console:
    INFRASTRUCTURE=vsphere
    RAILS_ENV='production' TEMPEST_INFRASTRUCTURE=$INFRASTRUCTURE TEMPEST_WEB_DIR='/home/tempest-web' DATA_ROOT='/var/tempest' LOG_DIR='/var/log/opsmanager' SECRET_KEY_BASE='secret' su tempest-web --command 'bundle exec rails dbconsole'
  4. In the database console, list all recorded installs and then only those still marked as running:
    select id,created_at,updated_at,finished_at,finished,status FROM installs;
    select id,created_at,updated_at,finished_at,finished,status FROM installs where status = 'running';
  5. Confirm whether any operations are in the running state.
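
In step 3, the INFRASTRUCTURE value is expected to be one of the listed provider names. As a minimal sketch, a small guard can catch a typo before the database console is opened. The function name is_supported_infrastructure is hypothetical, and the accepted values are taken only from the list in step 3.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument is one of the provider
# names listed in step 3 of this article.
is_supported_infrastructure() {
    case "$1" in
        vsphere|aws|azure|gcp|openstack) return 0 ;;
        *) return 1 ;;
    esac
}

INFRASTRUCTURE=vsphere
if is_supported_infrastructure "$INFRASTRUCTURE"; then
    echo "INFRASTRUCTURE=$INFRASTRUCTURE looks valid"
else
    echo "Unexpected INFRASTRUCTURE value: $INFRASTRUCTURE" >&2
fi
```

Run the guard in the same shell session as step 3 so that the checked variable is the one passed to the dbconsole command.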

 

Resolution

DO NOT PROCEED if the result above does not confirm the issue.

------------------------------------------------------------------------------------

From the Postgres console opened in the steps above, run:

update installs set status='failed' where status = 'running';

Here is a sample of the completed steps:

Confirmation step:

select id,created_at,updated_at,finished_at,finished,status FROM installs where status = 'running';
 id |         created_at         |         updated_at         |        finished_at         | finished | status
----+----------------------------+----------------------------+----------------------------+----------+---------
  4 | 2025-01-22 11:33:03.377972 | 2025-01-22 11:40:16.901384 | 2025-01-22 11:40:16.900902 | t        | running
(1 row)

Update the status from running to failed:

update installs set status='failed' where status = 'running';
UPDATE 1

Verify the result by running the check again:

select id,created_at,updated_at,finished_at,finished,status FROM installs where status = 'running';
 id | created_at | updated_at | finished_at | finished | status
----+------------+------------+-------------+----------+--------
(0 rows)

Alternative method:

  1. If you prefer to use the Rails console instead of Postgres, run step 3 from https://knowledge.broadcom.com/external/article?articleNumber=293866 as is.
  2. From the Rails console, run:
    Tempest::Install.where(Tempest::Install.arel_table[:status].eq("running")).update_all(:status => "failed")

 

Additional Information

This behaviour is expected in the latest versions of Opsman. It prevents deployments from entering an undesired state during the upgrade process.

From Opsman 3.0.38 onward, this issue should be completely fixed.