During a bosh deploy, the update of an instance completes immediately and appears to have done nothing at all.
When remediating a cluster deployment by running a bosh deploy of the manifest, the update of an instance can complete within a second, as indicated by a completion time of "00:00:00". The instance does not appear to have been updated at all, and the deploy proceeds to the next instance right away.
Example of a bosh deploy command that was run:
bosh -d $deployment deploy manifest-file.yaml --fix --recreate
The bosh task logs show that the update of the master (instance 0) completed immediately, which suggests it was not actually updated:
Task 129 | 22:41:38 | Deprecation: Global 'properties' are deprecated. Please define 'properties' at the job level.
Task 129 | 22:41:39 | Preparing deployment: Preparing deployment
Task 129 | 22:41:40 | Warning: DNS address not available for the link provider instance: pivotal-container-service/c1ef28ba-7476-46a1-zzzz-xxx
Task 129 | 22:41:40 | Warning: DNS address not available for the link provider instance: pivotal-container-service/c1ef28ba-7476-46a1-zzzz-xxx
Task 129 | 22:41:40 | Warning: DNS address not available for the link provider instance: pivotal-container-service/c1ef28ba-7476-46a1-zzzz-xxx
Task 129 | 22:41:51 | Preparing deployment: Preparing deployment (00:00:12)
Task 129 | 22:41:51 | Preparing deployment: Rendering templates (00:00:04)
Task 129 | 22:41:55 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 129 | 22:43:15 | Updating instance master: master/55f79999-43ec-4a2e-zzzz-xxx (0) (canary) (00:00:00)
Task 129 | 22:43:15 | Updating instance master: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:35 | L executing pre-stop: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:35 | L executing drain: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:36 | L stopping jobs: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:37 | L executing post-stop: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:44:16 | L installing packages: master/65f79999-43ec-4a2e-yyyy-www (1)
...
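If more detail is needed, the event and debug logs of the task can be retrieved with the BOSH CLI, for example for task 129 shown above:
bosh -d $deployment task 129 --event
bosh -d $deployment task 129 --debug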
The bosh vms output shows a "-" as the process state of the problematic instance:
Instance Process State AZ IPs VM CID VM Type Active Stemcell
master/55f79999-43ec-4a2e-zzzz-xxx - az1 10.xx.yy.2 vm-xxxxxxx-a041-4d03-98e9-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
master/65f79999-43ec-4a2e-yyyy-www running az1 10.xx.yy.3 vm-xxxxxxx-80bc-402c-90b0-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
master/75f79999-43ec-4a2e-ssss-ttt running az1 10.xx.yy.4 vm-xxxxxxx-2c91-4d53-a246-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
worker/85f79999-43ec-4a2e-ssss-ttt running az1 10.xx.yy.5 vm-xxxxxxx-5804-4a98-89d5-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
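For reference, the output above comes from the standard bosh vms command, for example:
bosh -d $deployment vms
Per-process detail for each instance can also be listed with:
bosh -d $deployment instances --ps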
When logging into the problematic instance via "bosh ssh" and running "monit summary", the output does not list any jobs at all, because no packages have been installed yet.
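For example, the same check can be run non-interactively with bosh ssh, using the problematic master from the example above (monit requires sudo on the instance):
bosh -d $deployment ssh master/55f79999-43ec-4a2e-zzzz-xxx -c 'sudo /var/vcap/bosh/bin/monit summary'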
This issue applies to VMware Tanzu Kubernetes Grid Integrated Edition (TKGI).
As a workaround, run the following commands to recover the problematic instance:
bosh -d $deployment ignore master/65f79999-43ec-4a2e-yyyy-www
bosh -d $deployment recreate master/55f79999-43ec-4a2e-zzzz-xxx --fix --no-converge
bosh -d $deployment unignore master/65f79999-43ec-4a2e-yyyy-www
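After the recreate completes, the process state of the instance can be verified before running any further deploys, for example:
bosh -d $deployment vms
The problematic master should now report "running" instead of "-".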