Issue: The CSI VolumeAttachment reports the following error while attaching a Persistent Volume to a Kubernetes node.
Status:
Attach Error:
Message: rpc error: code = Internal desc = failed to attach disk: "********-****-****-****-*******" with node: "********-****-****-****-*******" err failed to attach cns volume: "********-****-****-****-*******" to node vm: "VirtualMachine:vm-**** [VirtualCenterHost: *********, UUID:********-****-****-****-*******, Datacenter: Datacenter [Datacenter: Datacenter:datacenter-****, VirtualCenterHost: *************]]". fault: "(*types.LocalizedMethodFault)(0xc00000000)({\n DynamicData: (types.DynamicData) {\n },\n Fault: (types.CnsFault) {\n BaseMethodFault: (types.BaseMethodFault) <nil>,\n Reason: (string) (len=88) \"The input volume********-****-****-****-******* is not registered as a CNS volume.\"\n },\n LocalizedMessage: (string) (len=35) \"fault.CnsNotRegisteredFault.summary\"\n})\n". opId: "xxxx"
Time: YYYY:MM:DD
Attached: false
Detach Error:
Message: rpc error: code = Internal desc = volumeID "********-****-****-****-*******" not found in QueryVolume
In addition, the PV is not listed in the vCenter UI under "Container Volumes".
This can be caused by discrepancies between the datastore and the CNS database.
NOTES:
find /vmfs/volumes/<datastore>/catalog/journal -name "*.xaction"
If the command returns any files, do not rebuild the catalog.
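The check in the note above can be wrapped in a small guard so that it is applied the same way to every datastore. This is a minimal sketch, not a VMware-provided tool: the function names has_pending_xactions and check_rebuild_allowed are hypothetical, and treating a missing journal directory as "nothing pending" is an assumption.

```shell
#!/bin/sh
# Sketch: refuse a catalog rebuild when the catalog journal still holds
# *.xaction files. Both function names are hypothetical helpers.

# has_pending_xactions <journal-dir>
# Succeeds (exit 0) when at least one *.xaction file is found.
has_pending_xactions() {
    journal_dir="$1"
    # Assumption: a missing journal directory means nothing is pending.
    [ -d "$journal_dir" ] || return 1
    # Same search as the find command in the note above.
    [ -n "$(find "$journal_dir" -name '*.xaction' 2>/dev/null)" ]
}

# check_rebuild_allowed <datastore> [volumes-root]
# volumes-root defaults to /vmfs/volumes; it is a parameter only so the
# function can be tried against a scratch directory.
check_rebuild_allowed() {
    ds="$1"
    root="${2:-/vmfs/volumes}"
    if has_pending_xactions "$root/$ds/catalog/journal"; then
        echo "$ds: .xaction files found, do not rebuild the catalog"
        return 1
    fi
    echo "$ds: no .xaction files found"
    return 0
}
```

For example, check_rebuild_allowed <datastore> prints one line and returns non-zero when the rebuild should not be attempted.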
Steps to Rebuild Catalog:
a. Stop "hostd"
/etc/init.d/hostd stop
b. For each identified datastore, move the contents of the catalog folder (everything except "journal") to a backup folder:
mkdir /tmp/<datastore>-catalog-bkp   # or any other temporary directory on another datastore
cd /vmfs/volumes/<datastore>/catalog
mv $(ls | grep -v journal) /tmp/<datastore>-catalog-bkp
ls /tmp/<datastore>-catalog-bkp   # make sure all files were moved
c. Start "hostd"
/etc/init.d/hostd start
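The move in step b can be sketched as a function that makes the "everything except journal" rule explicit and lists the result for inspection. The name backup_catalog is a hypothetical helper, not part of ESXi. Unlike the ls | grep -v journal pipeline, this sketch skips only the entry named exactly "journal"; that difference is a deliberate assumption about the intent.

```shell
#!/bin/sh
# Sketch of step b: move every catalog entry except "journal" into a
# backup directory. backup_catalog is a hypothetical helper name.
# Intended to run while hostd is stopped (steps a and c).

# backup_catalog <catalog-dir> <backup-dir>
backup_catalog() {
    catalog_dir="$1"
    backup_dir="$2"
    if [ ! -d "$catalog_dir" ]; then
        echo "no catalog at $catalog_dir" >&2
        return 1
    fi
    mkdir -p "$backup_dir" || return 1
    for entry in "$catalog_dir"/*; do
        # An empty catalog leaves the glob unexpanded; skip that case.
        [ -e "$entry" ] || continue
        # Keep the journal in place, as in the original mv command.
        if [ "$(basename "$entry")" = "journal" ]; then
            continue
        fi
        mv "$entry" "$backup_dir"/ || return 1
    done
    # List the backup so the moved entries can be checked by eye.
    ls "$backup_dir"
}
```

A call would look like backup_catalog /vmfs/volumes/<datastore>/catalog /tmp/<datastore>-catalog-bkp, issued between the hostd stop and start commands.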
4. SSH to each of the other hosts on which the datastore is mounted and run one of the following commands.
Restart "hostd"
/etc/init.d/hostd restart
Or run the following to avoid "hostd" restart
/usr/lib/vmware/hostd/bin/notifyDatastore.py -t PreUnmount -d <dsName>
5.1 For vCenter Server version 9.0 and above
a. The vCenter MOB must be enabled before it can be accessed.
Go to VC MOB > content > vStorageObjectManager > ReconcileDatastoreInventoryEx_Task
The MOB URL will be similar to:
https://<vcsa-fqdn>/mob/?moid=VStorageObjectManager&method=reconcileDatastoreInventoryEx
b. Replace the existing spec with the following -
<spec> <datastore type="Datastore">datastore-MOID</datastore></spec>
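When several datastores need to be reconciled, the spec text can be generated rather than edited by hand. The sketch below only fills a datastore MOID into the spec shown above. The name make_reconcile_spec is a hypothetical helper, and the check that the MOID looks like "datastore-" followed by digits is an assumption based on the "datastore-##" form used in this article.

```shell
#!/bin/sh
# Sketch: print the reconcile spec for one datastore MOID.
# make_reconcile_spec is a hypothetical helper name.

# make_reconcile_spec <datastore-MOID>
make_reconcile_spec() {
    moid="$1"
    case "$moid" in
        # Assumption: a MOID is "datastore-" followed by digits only.
        datastore-|datastore-*[!0-9]*)
            echo "unexpected MOID: $moid" >&2
            return 1
            ;;
        datastore-*)
            ;;
        *)
            echo "unexpected MOID: $moid" >&2
            return 1
            ;;
    esac
    # Same layout as the spec line above, with the MOID substituted.
    printf '<spec> <datastore type="Datastore">%s</datastore></spec>\n' "$moid"
}
```

The printed line is what gets pasted into the spec field of the MOB page.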
The datastore MOID ("datastore-##") can be found at VC MOB > content > rootFolder > datacenter > datastore > moid.
5.2 For older releases
a. The ESXi MOB must be enabled before it can be accessed. See "Enable host MOB".
b. Go to host MOB > ha-vstorage-object-manager > HostReconcileDatastoreInventory_Task
MOB URL will be similar to
https://<ESXI_host_fqdn/IP>/mob/?moid=ha-vstorage-object-manager&method=reconcileDatastoreInventory
c. Provide the <datastore-UUID> in datastore URL form.
Example: ds:///vmfs/volumes/<datastore-UUID>/ (Found on the summary page of the datastore in the vCenter UI)
d. Run the reconcile task and wait till it succeeds.
e. Repeat this for each identified datastore.
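Because steps b to d are repeated per datastore, it can help to print the MOB URL and the datastore argument for each one up front. This is only a convenience sketch: print_reconcile_inputs is a hypothetical helper, it does not call the MOB, and it simply formats the URL and the ds:/// value shown in the steps above.

```shell
#!/bin/sh
# Sketch: list the ESXi MOB URL and datastore URL for each datastore UUID.
# print_reconcile_inputs is a hypothetical helper; it performs no requests.

# print_reconcile_inputs <esxi-host-fqdn-or-ip> <datastore-UUID>...
print_reconcile_inputs() {
    host="$1"
    shift
    for uuid in "$@"; do
        echo "MOB URL:   https://$host/mob/?moid=ha-vstorage-object-manager&method=reconcileDatastoreInventory"
        echo "datastore: ds:///vmfs/volumes/$uuid/"
    done
}
```

Each pair of lines corresponds to one run of the reconcile task in the host MOB.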
6. SSH into the host used for the steps above. The catalog folder should be present again for every datastore identified in Step 1.
a. ls /vmfs/volumes/<datastore-path>/catalog
b. Verify that vclock is created in the format of "vclock-" for all the datastores:
ls /vmfs/volumes/<datastore-path>/catalog
c. Verify that the tidy file is created for all datastores. The command below should return a file named "1.dat":
ls /vmfs/volumes/<datastore-path>/catalog
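The three checks in step 6 can be combined into one function that reports what is missing. This is a sketch under assumptions: verify_catalog is a hypothetical helper, and because the exact subfolders are not spelled out here, it searches the whole catalog folder recursively for an entry whose name starts with "vclock-" and for a file named "1.dat".

```shell
#!/bin/sh
# Sketch: confirm that a rebuilt catalog contains a vclock-* entry and 1.dat.
# verify_catalog is a hypothetical helper name.

# verify_catalog <catalog-dir>
# Returns 0 only when the folder exists and both entries are found.
verify_catalog() {
    catalog_dir="$1"
    if [ ! -d "$catalog_dir" ]; then
        echo "missing: $catalog_dir"
        return 1
    fi
    rc=0
    # Assumption: search recursively, since the subfolder is not stated.
    if [ -z "$(find "$catalog_dir" -name 'vclock-*' 2>/dev/null)" ]; then
        echo "no vclock-* entry under $catalog_dir"
        rc=1
    fi
    if [ -z "$(find "$catalog_dir" -name '1.dat' 2>/dev/null)" ]; then
        echo "no 1.dat under $catalog_dir"
        rc=1
    fi
    return $rc
}
```

Running verify_catalog /vmfs/volumes/<datastore-path>/catalog for each datastore prints nothing when all three checks pass.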
7.1 For ESXi hosts version 9.0 and above
a. Trigger a CNS full sync (for VC version 8.0 and above) by following the instructions below:
b. SSH to VC and execute the following command
psql -U postgres -d VCDB
c. Get the required datastore URL from vpx_ds_info or cns.vpx_storage_datastore_info:
select name, url from vpx_ds_info;
select * from cns.vpx_storage_datastore_info;
update cns.vpx_storage_datastore_info set vclock=-1 where datastore_url='<>';
delete from cns.volume_info where datastore='<>';
d. Restart vsan-health service
vmon-cli --restart vsan-health
e. Wait for Full Sync to complete. Following log lines can be seen in the CNS logs (/var/log/vmware/vsan-health/vsanvcmgmtd.log)
2025-01-28T10:40:31.675Z info vsanvcmgmtd[219935] [vSAN@6876 sub=CnsSync] Sync all datastores
...
2025-01-28T10:40:34.019Z info vsanvcmgmtd[219935] [vSAN@6876 sub=CnsSync] Sync ds:///vmfs/volumes/<ds-uuid>: startVClock = 0, fullSync = true
...
2025-01-28T10:40:42.975Z info vsanvcmgmtd[219935] [vSAN@6876 sub=CnsSync] Synced all datastores
f. Confirm that the CNS database is updated with the correct vclock values:
psql -U postgres -d VCDB
select * from cns.vpx_storage_datastore_info;
select * from cns.volume_info where datastore='<>';
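The database reset in step c and the log check in step e can be sketched as two small helpers. Both names, cns_fullsync_sql and full_sync_done, are hypothetical. The first only prints the update and delete statements from step c for one datastore URL so they can be reviewed before being run; the second looks for the "Synced all datastores" line from step e. Rejecting URLs that contain a single quote is an added safety assumption.

```shell
#!/bin/sh
# Sketch: helpers for the CNS full sync steps. Names are hypothetical.

# cns_fullsync_sql <datastore-url>
# Prints the two statements from step c with the URL filled in.
cns_fullsync_sql() {
    ds_url="$1"
    case "$ds_url" in
        ""|*\'*)
            # Assumption: refuse empty input or anything with a quote in it.
            echo "invalid datastore URL" >&2
            return 1
            ;;
    esac
    printf "update cns.vpx_storage_datastore_info set vclock=-1 where datastore_url='%s';\n" "$ds_url"
    printf "delete from cns.volume_info where datastore='%s';\n" "$ds_url"
}

# full_sync_done [log-file]
# Succeeds when the log contains the completion line shown in step e.
full_sync_done() {
    log_file="${1:-/var/log/vmware/vsan-health/vsanvcmgmtd.log}"
    # -F matches the text literally, so "]" needs no escaping.
    grep -qF 'sub=CnsSync] Synced all datastores' "$log_file"
}
```

After reviewing the printed statements, they can be piped to psql -U postgres -d VCDB. Note that full_sync_done also matches completion lines from earlier syncs, so the timestamp of the matching line should still be compared with the time of the vsan-health restart.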
7.2 For older ESXi hosts
a. If the database content is not getting updated even after some time, and the database is showing old content, it may require triggering a sync for "StorageLifecycleManager".
https://<VCIP>/vslm/mob/?moid=StorageLifecycleManager&method=VslmSyncDatastore
The datastore URL will be like "ds:///vmfs/volumes/##### - ###### - ######". (Found on the summary page of the datastore in the vCenter UI)
Set "fullSync=true"
"fcd Id" can be blank
b. Once this is done, allow some time for the data to sync between SPS and CNS. After the sync, the Persistent Volume appears in the vCenter CNS UI.