Failed to download support bundle using Ops Manager UI

Article ID: 372616

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

In the Ops Manager web console, fetching the support bundle hangs, and the download fails after the Ops Manager UI is refreshed.

The following log messages can be seen in the Ops Manager log file (/var/log/opsmanager/production.log):

ERROR -- : 2024-07-08T15:20:32+0000: [Worker(delayed_job host:example.com pid:940)] Job CreateSupportBundleJob [45e767ac-8f77-78f1-aba8-eb3ac4a7b88d] from DelayedJob(default) with arguments: [16] (id=8921) (queue=default) FAILED (4 prior attempts) with Errno::ECONNREFUSED: Connection refused - connect(2) for /var/tempest/run/unlock.sock
I, [2024-07-08T15:20:32 #940]  INFO -- : 2024-07-08T15:20:32+0000: [Worker(delayed_job host:example.com pid:940)] 1 jobs processed at 44.1636 j/s, 1 failed
I, [2024-07-08T15:20:34 #345]  INFO -- : [45e767ac-8f77-78f1-aba8-eb3ac4a7b88d] Started GET "/api/v0/support_bundle/16?check_only=true&_=3456899887" for 10.10.105.158 at 2024-07-08T15:20:32
I, [2024-07-08T15:20:34 #345]  INFO -- : [45e767ac-8f77-78f1-aba8-eb3ac4a7b88d] Processing by Api::V0::SupportBundleController#show as JSON
I, [2024-07-08T15:20:34 #345]  INFO -- : [45e767ac-8f77-78f1-aba8-eb3ac4a7b88d]   Parameters: {"check_only"=>"true", "_"=>"3456899887", "id"=>"15"}
I, [2024-07-08T15:20:34 #345]  INFO -- : [45e767ac-8f77-78f1-aba8-eb3ac4a7b88d] Valid UAA token
I, [2024-07-08T15:20:35 #345]  INFO -- : [45e767ac-8f77-78f1-aba8-eb3ac4a7b88d] Completed 200 OK in 257ms (Views: 0.1ms | ActiveRecord: 0.8ms | Allocations: 510007)
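
If you suspect this issue, the production log can be searched for the failed support bundle job. For example, assuming the default log location shown above:

grep "CreateSupportBundleJob" /var/log/opsmanager/production.log | grep "ECONNREFUSED"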

Environment

Product Version: 3.0.25

Cause

This happens when the socket used to pass the decryption passphrase between the different Ops Manager processes is left behind after an Ops Manager crash.
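
To confirm that a stale socket was left behind, you can check on the Ops Manager VM whether the socket file still exists and whether any process is holding it open. This is only a quick diagnostic, not a required step; lsof may need to be run as root, and it prints nothing if no process is using the socket:

ls -l /var/tempest/run/unlock.sock
lsof /var/tempest/run/unlock.sock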

Resolution

We recommend upgrading to Ops Manager v3.0.29+LTS-T or later to resolve this issue.

If an upgrade is not possible, apply the workaround below.

Remove /var/tempest/run/unlock.sock and restart the Ops Manager "tempest-web" service. 

Stop the "tempest-web" service before removing the socket.

service tempest-web stop 

rm /var/tempest/run/unlock.sock 

service tempest-web start
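
After the restart, you can optionally confirm that the service came back up before retrying the support bundle download from the UI (the exact status output varies by Ops Manager version):

service tempest-web status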