Upgrading standalone BOSH director on Azure failed at the post-installation step due to status code 403

Article ID: 293849


Products

Operations Manager

Issue/Introduction

The standalone BOSH director was deployed on Azure directly, without Ops Manager. During an upgrade of this standalone BOSH director, the new BOSH director VM was created successfully with all jobs up and running. However, the upgrade failed at the last step, the cleanup that deletes unused stemcells, because the CPI tried to access a storage account that the operator was not authorized to access. The error is similar to the following:
Finished deploying (01:03:06)
 
Deleting unused stemcell 'bosh-stemcell-xxxx'... Failed (00:04:24)
Stopping registry... Finished (00:00:00)
Cleaning up rendered CPI jobs... Finished (00:00:00)
 
Deleting stemcell from cloud:
  CPI 'delete_stemcell' method responded with error: 
CmdError{"type":"Bosh::Clouds::CloudError","message":"get_blob_properties: #\u003cAzure::Core::Http::HTTPError:46923758951300 
@http_response: #\u003cAzure::Core::Http::HttpResponse:0x0000555a93f7b8c0 @http_response=#\u003cFaraday::Response:0x0000555a93f7bac8 
@on_complete_callbacks=[], 
@env=#\u003cFaraday::Env 
@method=:head 
@body=\"\" 
@url=#\u003cURI::HTTPS https://passaeasops.blob.core.windows.net/stemcell/bosh-stemcell-xxxx.vhd\u003e 
@request=#\u003cFaraday::RequestOptions open_timeout=60\u003e 
@request_headers={\"User-Agent\"=\u003e\"BOSH-AZURE-CPI; Azure-Storage/0.12.3-preview (Ruby 2.4.4-p296; Linux linux-gnu)\", 
\"x-ms-date\"=\u003e\"Fri, 30 Apr 2021 14:56:26 GMT\", \"x-ms-version\"=\u003e\"2015-04-05\", 
\"DataServiceVersion\"=\u003e\"1.0;NetFx\",\"MaxDataServiceVersion\"=\u003e\"3.0;NetFx\", \"Content-Type\"=\u003e\"application/atom+xml; charset=utf-8\", 
\"x-ms-client-request-id\"=\u003e\"xxxx\", \"Content-Length\"=\u003e\"0\", \"Authorization\"=\u003e\"SharedKey passaeasops:xxxx"} 
@ssl=#\u003cFaraday::SSLOptions verify=true\u003e 
@response=#\u003cFaraday::Response:0x0000555a93f7bac8 ...\u003e 
@response_headers={\"transfer-encoding\"=\u003e\"chunked\", \"server\"=\u003e\"Microsoft-HTTPAPI/2.0\", \"x-ms-request-id\"=\u003e\"dacd7009-301e-0064-50d1-3d62b9000000\", \"date\"=\u003e\"Fri, 30 Apr 2021 14:56:31 GMT\", \"connection\"=\u003e\"close\"} @status=403 @reason_phrase=\"This request is not authorized to perform this operation.\"\u003e\u003e, @uri=#\u003cURI::HTTPS https://passaeasops.blob.core.windows.net/stemcell/bosh-stemcell-xxxx.vhd\u003e\u003e, 
@uri: #\u003cURI::HTTPS https://passaeasops.blob.core.windows.net/stemcell/bosh-stemcell-xxxx.vhd\u003e, @status_code: 403, 
@type: \"Unknown\", 
@description: \"This request is not authorized to perform this operation.\"\u003e\n/root/.bosh/installations/xxxx/packages/bosh_azure_cpi/vendor/bundle/ruby/2.4.0/gems/azure-core-0.1.14/lib/azure/core/http/retry_policy.rb:58:in `call'\n/root/.bosh/installations/c8c6251a-e7ee-4f77-49eb-xxxx/packages/bosh_azure_cpi/vendor/bundle/ruby/2.4.0/gems/azure-core-0.1.14/lib/azure/core/http/http_request.rb:110:in `block in with_filter'\n/root/.bosh/installations/xxxx/packages/bosh_azure_cpi/vendor/bundle/ruby/2.4.0/gems/azure-core-0.1.14/lib/azure/core/http/signer_filter.rb:28:in `call'\n/root/.bosh/installations/xxxx/packages/bosh_azure_cpi/vendor/bundle/ruby/2.4.0/gems/azure-core-0.1.14/lib/azure/core/http/signer_filter.rb:28:in `call'\n/root/.bosh/installations/xxxx/packages/bosh_azure_cpi/vendor/bundle/ruby/2.4.0/gems/azure-core-0.1.14/l 

The error is caused by a known issue in bosh-azure-cpi-release: when deleting stemcells, the CPI scans all storage accounts. The operator does not always have sufficient privileges to access every storage account in its organization on Azure, so the request to an unauthorized account returns status code 403.

Environment

Product Version: 2.10

Resolution

The issue is fixed in bosh-azure-cpi-release v37.0.0 and later, which introduces an option that restricts stemcell cleanup to the default storage account only. The option is called use_default_account_for_cleaning and can be set in the manifest file used to deploy the BOSH director.
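
As a rough illustration, the flag could be enabled with a small ops file like the one below. This is a sketch under assumptions, not a verified configuration: it assumes the director manifest follows the bosh-deployment layout, where the Azure CPI properties sit under both cloud_provider and the bosh instance group, and the file name is hypothetical. Check the property paths against your own manifest and the CPI release job spec before applying it.

```yaml
# Hypothetical ops file, e.g. use-default-account-for-cleaning.yml.
# Assumption: Azure CPI properties live under the two paths below,
# as in the bosh-deployment manifests. Adjust the paths if yours differ.

# CPI used locally by the bosh CLI during create-env (stemcell cleanup runs here).
- type: replace
  path: /cloud_provider/properties/azure/use_default_account_for_cleaning?
  value: true

# CPI used by the director itself once it is running.
- type: replace
  path: /instance_groups/name=bosh/properties/azure/use_default_account_for_cleaning?
  value: true
```

The ops file would then be passed with the -o option alongside the existing ops files and variables of the bosh create-env command used to deploy the director. The CPI release referenced by the manifest must be v37.0.0 or later for the flag to take effect.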