After network maintenance, many worker VMs in TKGI on vSphere with VMware NSX show the 'unresponsive agent' state.
The worker VMs cannot be recreated because of a 'Read-only file system' error, as shown:
$ bosh -d service-instance_########## recreate worker/########## --fix --no-converge
Using environment '172.##.##.##' as client 'ops_manager'
Using deployment 'service-instance_##########'
Continue? [yN]: y
Task 11219963
Task 11219963 | 00:32:24 | Deprecation: Global 'properties' are deprecated. Please define 'properties' at the job level.
Task 11219963 | 00:34:40 | Updating instance worker/########## (0): Stopping instance (00:00:00)
Task 11219963 | 00:34:40 | Updating instance worker/########## (0): Deleting VM (00:00:01)
L Error: Unknown CPI error 'Unknown' with message 'Read-only file system @ rb_sysopen - /var/vcap/data/director/tmp/vsphere-cpi-nsxt-ca######-######-######' in 'delete_vm' CPI method (CPI request ID: 'cpi-######')
Task 11219963 | 00:34:41 | Error: Unknown CPI error 'Unknown' with message 'Read-only file system @ rb_sysopen - /var/vcap/data/director/tmp/vsphere-cpi-nsxt-ca######-######-######' in 'delete_vm' CPI method (CPI request ID: 'cpi-######')
Task 11219963 Started Mon Aug 11 00:32:24 UTC 2025
Task 11219963 Finished Mon Aug 11 00:34:41 UTC 2025
Task 11219963 Duration 00:02:17
Task 11219963 error
Non-converging action failed:
Expected task '11219963' to succeed but state is 'error'
Exit code 1
The file /var/vcap/data/director/tmp/vsphere-cpi-nsxt-ca######-######-###### is located on the BOSH Director VM.
Additionally, even the root user cannot create or modify files on the BOSH Director VM:
bosh/0:/home/vcap# touch /var/vcap/data/testfile
touch: cannot touch '/var/vcap/data/testfile': Read-only file system
Tanzu Kubernetes Grid Integrated Edition
Due to the maintenance, the storage was switched to read-only mode, which put the file systems on the VMs using that storage into a read-only state.
After the maintenance is complete, rebooting an affected VM usually restores its file system from read-only mode to its normal state.
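To see which file systems on a VM are still mounted read-only, the kernel mount table can be inspected. The following is a minimal sketch, not an official procedure: the helper name list_ro_mounts is hypothetical, and it assumes a Linux VM where /proc/mounts is available.

```shell
# Hypothetical helper: print the mount point of every file system whose
# mount options include "ro". Reads /proc/mounts unless another file in
# the same format is passed as the first argument.
list_ro_mounts() {
  awk '{
    # Field 4 holds the comma-separated mount options, field 2 the mount point.
    n = split($4, opts, ",")
    for (i = 1; i <= n; i++) {
      if (opts[i] == "ro") { print $2; break }
    }
  }' "${1:-/proc/mounts}"
}

# Example call on the affected VM (run as root):
#   list_ro_mounts
```

If /var/vcap/data appears in the output, the condition described above is still present. Some pseudo file systems may legitimately be mounted read-only, so other entries in the list are not necessarily a problem.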
To resolve the error, first reboot the BOSH Director VM to clear the read-only file system condition, then recreate the affected worker VMs with a command such as "bosh recreate".
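The order of these steps matters: the recreate will fail again if the Director's disk is still read-only. As a rough sketch, assuming shell access to the BOSH Director VM after the reboot, a write probe like the one below can confirm the file system is writable before retrying. The helper name can_write is hypothetical; the recreate command in the comments is the same one shown in the error output above.

```shell
# Hypothetical helper: succeeds only if a file can be created in the
# given directory, then removes the probe file again.
can_write() {
  probe="$1/.rw-probe.$$"
  if touch "$probe" 2>/dev/null; then
    rm -f "$probe"
    return 0
  fi
  return 1
}

# On the BOSH Director VM, after the reboot (run as root):
#   can_write /var/vcap/data && echo "file system is writable"
#
# Then retry the recreate for each affected worker:
#   bosh -d service-instance_########## recreate worker/########## --fix --no-converge
```

If the probe still fails after the reboot, the storage itself may not yet have returned to read-write mode, and recreating workers should wait until it has.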