During a bosh deploy, the update of an instance completes immediately and appears to have done nothing at all.
When remediating a cluster deployment by running a bosh deploy of the manifest, the update of an instance can complete within a second, as indicated by a completion time of "00:00:00". The instance does not appear to have been updated at all, and the deploy proceeds to the next instance right away.
Example of a bosh deploy command that was run:
bosh -d $deployment deploy manifest-file.yaml --fix --recreate
The bosh task logs show that the update of the master (instance 0) completed immediately, which suggests it was not actually updated:
Task 129 | 22:41:38 | Deprecation: Global 'properties' are deprecated. Please define 'properties' at the job level.
Task 129 | 22:41:39 | Preparing deployment: Preparing deployment
Task 129 | 22:41:40 | Warning: DNS address not available for the link provider instance: pivotal-container-service/c1ef28ba-7476-46a1-zzzz-xxx
Task 129 | 22:41:40 | Warning: DNS address not available for the link provider instance: pivotal-container-service/c1ef28ba-7476-46a1-zzzz-xxx
Task 129 | 22:41:40 | Warning: DNS address not available for the link provider instance: pivotal-container-service/c1ef28ba-7476-46a1-zzzz-xxx
Task 129 | 22:41:51 | Preparing deployment: Preparing deployment (00:00:12)
Task 129 | 22:41:51 | Preparing deployment: Rendering templates (00:00:04)
Task 129 | 22:41:55 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 129 | 22:43:15 | Updating instance master: master/55f79999-43ec-4a2e-zzzz-xxx (0) (canary) (00:00:00)
Task 129 | 22:43:15 | Updating instance master: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:35 | L executing pre-stop: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:35 | L executing drain: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:36 | L stopping jobs: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:43:37 | L executing post-stop: master/65f79999-43ec-4a2e-yyyy-www (1)
Task 129 | 22:44:16 | L installing packages: master/65f79999-43ec-4a2e-yyyy-www (1)
...
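If more detail is needed, the event and debug logs of the task can be retrieved with the BOSH CLI, for example for task 129 shown above:
bosh -d $deployment task 129 --event
bosh -d $deployment task 129 --debug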
The bosh vms output shows a "-" as the process state of the problematic instance:
Instance Process State AZ IPs VM CID VM Type Active Stemcell
master/55f79999-43ec-4a2e-zzzz-xxx - az1 10.xx.yy.2 vm-xxxxxxx-a041-4d03-98e9-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
master/65f79999-43ec-4a2e-yyyy-www running az1 10.xx.yy.3 vm-xxxxxxx-80bc-402c-90b0-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
master/75f79999-43ec-4a2e-ssss-ttt running az1 10.xx.yy.4 vm-xxxxxxx-2c91-4d53-a246-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
worker/85f79999-43ec-4a2e-ssss-ttt running az1 10.xx.yy.5 vm-xxxxxxx-5804-4a98-89d5-xxxxxxx medium.disk true bosh-vsphere-esxi-ubuntu-jammy-go_agent/1.506
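For reference, the output above comes from the standard bosh vms command, for example:
bosh -d $deployment vms
Per-process detail for each instance can also be listed with:
bosh -d $deployment instances --ps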
When logging into the problematic instance via "bosh ssh" and running "monit summary", the output does not list any jobs at all, because no packages have been installed yet.
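For example, the same check can be run non-interactively with bosh ssh, using the problematic master from the example above (monit requires sudo on the instance):
bosh -d $deployment ssh master/55f79999-43ec-4a2e-zzzz-xxx -c 'sudo /var/vcap/bosh/bin/monit summary'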
This issue applies to VMware Tanzu Kubernetes Grid Integrated Edition (TKGI).
As a workaround, run the following commands to recover the problematic instance:
bosh -d $deployment ignore master/65f79999-43ec-4a2e-yyyy-www
bosh -d $deployment recreate master/55f79999-43ec-4a2e-zzzz-xxx --fix --no-converge
bosh -d $deployment unignore master/65f79999-43ec-4a2e-yyyy-www
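After the recreate completes, the process state of the instance can be verified before running any further deploys, for example:
bosh -d $deployment vms
The problematic master should now report "running" instead of "-".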