Symptom:
-------------
1) vSphere HA on the host is stuck in the "HA Agent Unreachable" state.
2) Uninstalling fdm vib fails with "Cannot open volume".

[[email protected]:~] esxcli software vib remove -n vmware-fdm
 [InstallationError]
 Failed to query file system stats: Errors:
 Error getting data for filesystem on '/vmfs/volumes/63######-########-####-########4f60': Cannot open volume: /vmfs/volumes/63######-########-####-########4f60, skipping.
 Error getting data for filesystem on '/vmfs/volumes/66######-########-####-########9670': Cannot open volume: /vmfs/volumes/66######-########-####-########9670, skipping.
 cause = Errors:
 Error getting data for filesystem on '/vmfs/volumes/63######-########-####-########4f60': Cannot open volume: /vmfs/volumes/63######-########-####-########4f60, skipping.
 Error getting data for filesystem on '/vmfs/volumes/66######-########-####-########9670': Cannot open volume: /vmfs/volumes/66######-########-####-########9670, skipping.
 Please refer to the log file for more details.
[[email protected]:~]

fdm.log:
--------
YYYY-MM-DD Er(163) Fdm[31077111]: [Originator@6876 sub=Cluster opID=WorkQueue-2c####35] Failed to open file: /vmfs/volumes/62######-########-####-########de3e/.vSphere-HA/FDM-78######-####-####-####-########-##x-####632-VM1/protectedlist
YYYY-MM-DD Er(163) Fdm[31077111]: [Originator@6876 sub=Cluster opID=WorkQueue-2c####35] open(/vmfs/volumes/62######-########-####-########de3e/.vSphere-HA/FDM-78######-####-####-####-########-##x-####632-VM1/protectedlist) failed: Device or resource busy
YYYY-MM-DD In(166) Fdm[31077111]: [Originator@6876 sub=Invt opID=WorkQueue-2c####35] Notify datastore (/vmfs/volumes/62######-########-####-########de3e) locally
YYYY-MM-DD Db(167) Fdm[31077111]: [Originator@6876 sub=Cluster opID=WorkQueue-2c####35] IO error at __localhost__; path: /vmfs/volumes/62######-########-####-########de3e (err: 16)
YYYY-MM-DD Wa(164) Fdm[31077111]: [Originator@6876 sub=VpxProfiler opID=WorkQueue-2c####35] WorkQueue [TotalTime] took 4039 ms
YYYY-MM-DD Er(163) Fdm[31076792]: [Originator@6876 sub=Cluster opID=WorkQueue-26####b9a] Failed to open file: /vmfs/volumes/63######-########-####-########xec0/.vSphere-HA/FDM-78######-####-####-####-########-##x-####632-VM1/protectedlist
YYYY-MM-DD Er(163) Fdm[31076792]: [Originator@6876 sub=Cluster opID=WorkQueue-26####b9a] open(/vmfs/volumes/63######-########-####-########xec0/.vSphere-HA/FDM-78######-####-####-####-########-##x-####632-VM1/protectedlist) failed: Device or resource busy
YYYY-MM-DD In(166) Fdm[31076792]: [Originator@6876 sub=Invt opID=WorkQueue-26####b9a] Notify datastore (/vmfs/volumes/63######-########-####-########xec0) locally
YYYY-MM-DD Db(167) Fdm[31076792]: [Originator@6876 sub=Cluster opID=WorkQueue-26####b9a] IO error at __localhost__; path: /vmfs/volumes/63######-########-####-########xec0 (err: 16)
YYYY-MM-DD Wa(164) Fdm[31076792]: [Originator@6876 sub=VpxProfiler opID=WorkQueue-26####b9a] WorkQueue [TotalTime] took 4040 ms
YYYY-MM-DD Db(167) Fdm[31076789]: [Originator@6876 sub=Cluster opID=clusterManager.cpp:980-38739276] Updating inventory manager with 6 datastores
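
The "IO error ... (err: 16)" entries above name the datastore paths that FDM could not open. As a minimal sketch, the function below pulls the unique paths out of a log file in that format. The function name fdm_busy_datastores is hypothetical, and the default location /var/log/fdm.log is an assumption about the host; pass the actual log path as the first argument if it differs.

```shell
# Hypothetical helper: list the unique datastore paths that an FDM log
# reports with "IO error at ...; path: <path> (err: 16)".
# $1 = log file to read (assumed default: /var/log/fdm.log).
fdm_busy_datastores() {
    log_file="${1:-/var/log/fdm.log}"
    # Keep only the IO error lines, extract the path field, de-duplicate.
    grep 'IO error at' "$log_file" \
        | sed -n 's/.*path: \([^ ]*\) (err: 16).*/\1/p' \
        | sort -u
}
```

Running fdm_busy_datastores against the excerpt above would print one line per affected /vmfs/volumes path, which can then be compared with the datastores selected for HA heartbeating.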

[[email protected]:~] esxcfg-scsidevs -m
VmFileSystem: Slow refresh failed: Cannot open volume: /vmfs/volumes/63######-########-####-########4f60
VmFileSystem: Slow refresh failed: Cannot open volume: /vmfs/volumes/66######-########-####-########9670

[[email protected]:/vmfs/volumes] ls -l
ls: ./63######-########-####-########4f60: Read-only file system
ls: ./66######-########-####-########9670: Read-only file system
total 59904
drwxr-xr-x 1 root root MM DD HR:MIN 18######-########-####-########96a3
drwxr-xr-x 1 root root MM DD HR:MIN 5b######-########-####-########76a9
:
:
lrwxr-xr-x 1 root root MM DD HR:MIN Example-Datastore -> 5f######-########-####-########de3e
lrwxr-xr-x 1 root root MM DD HR:MIN OPManager-Datastore -> 65######-########-####-########cd50
lrwxr-xr-x 1 root root MM DD HR:MIN OSDATA-68######-########-####-########24e0 -> 68######-########-####-########24e0
lrwxr-xr-x 1 root root MM DD HR:MIN TEST1-DS -> 63######-########-####-########4f60 =============>>>>>>>>>> Inaccessible device (broken symbolic link)
lrwxr-xr-x 1 root root MM DD HR:MIN TEST2-DS2 -> 66######-########-####-########9670 =============>>>>>>>>>> Inaccessible device (broken symbolic link)
lrwxr-xr-x 1 root root MM DD HR:MIN TEST2-DS3 -> 5f######-########-####-########de3e
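
The listing above is read by eye to spot datastore labels whose targets are gone. A rough sketch of the same check in script form follows. The function name list_unreachable_datastores is hypothetical, and it assumes that a target which cannot be opened also fails the shell's -e test, which should be confirmed on the affected host.

```shell
# Hypothetical helper: print "label -> target" for every symlink in a
# volumes directory whose target cannot be accessed.
# $1 = directory to scan (assumed default: /vmfs/volumes).
list_unreachable_datastores() {
    volumes_dir="${1:-/vmfs/volumes}"
    for entry in "$volumes_dir"/*; do
        # -L: the entry itself is a symlink; ! -e: its target is not reachable.
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s -> %s\n' "${entry##*/}" "$(readlink "$entry")"
        fi
    done
}
```

Called with no argument on the host, it would be expected to report entries such as TEST1-DS and TEST2-DS2 from the listing above.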

Environment:
-------------
VMware vCenter Server 8.x
VMware vCenter 9.0.0

Cause:
-------------
The datastores previously designated for HA heartbeating have become inaccessible. This typically occurs during infrastructure decommissioning or storage maintenance where datastores are unmounted or removed from the environment without being deselected in the HA configuration.

Resolution:
-------------
1) Disable vSphere HA on the cluster.
2) Unmount the inaccessible datastore(s).
3) Rescan storage at the cluster level.
4) Re-enable vSphere HA on the cluster.
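
Steps 1 and 4 are performed against the cluster in vCenter. For steps 2 and 3, the sketch below only prints candidate per-host commands for review and executes nothing. The function name print_cleanup_commands is hypothetical. It also assumes that a host-level unmount by volume UUID and an adapter rescan are acceptable stand-ins for the cluster-level actions, which may not hold in every environment, and the unmount may still fail for a volume that is already inaccessible.

```shell
# Hypothetical dry-run helper: prints (does not run) host-side commands
# for steps 2 and 3. Each argument is a datastore UUID to unmount.
print_cleanup_commands() {
    for uuid in "$@"; do
        # Assumed equivalent of step 2, run on each affected host.
        printf 'esxcli storage filesystem unmount -u %s\n' "$uuid"
    done
    # Assumed host-level equivalent of the cluster-level rescan in step 3.
    printf 'esxcli storage core adapter rescan --all\n'
}
```

Review the printed commands before running any of them, and run them only after HA has been disabled on the cluster.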