vmware-updatemgr/vmware-vum-server.log:
yyyy-mm-ddThh:mm:ss info vmware-vum-server[2569116] [Originator@6876 sub=vmomi.soapStub[12] opID=1c941e96-7a7b-4f48-9c90-a6e7f8eac458] SOAP request returned HTTP failure; <SSL(<io_obj p:0x00007fa2dc026c60, h:49, <TCP '127.0.0.1 : 54400'>, <TCP '127.0.0.1 : 443'>>), /vsanHealth>, method: runLifecycleCheck; code: 500(Internal Server Error); fault: (vim.fault.VsanFault) {
--> faultCause = (vmodl.MethodFault) null,
--> faultMessage = (vmodl.LocalizableMessage) [
--> (vmodl.LocalizableMessage) {
--> key = "com.vmware.vsan.clustermgmt.lifecycle.msg.internalerror",
--> arg = <unset>,
--> message = "Lifecycle management operation failed due to unexpected error."
--> }
--> ]
--> msg = "Received SOAP response fault from [<SSL(<io_obj p:0x00007fa2dc026c60, h:49, <TCP '127.0.0.1 : 54400'>, <TCP '127.0.0.1 : 443'>>), /vsanHealth>]: runLifecycleCheck
--> General vSAN error."
--> }
yyyy-mm-ddThh:mm:ss error vmware-vum-server[2569116] [Originator@6876 sub=VumVapi::Lib::Utils opID=1c941e96-7a7b-4f48-9c90-a6e7f8eac458] [Utils 1105] VsanFault exception hit while running LifecycleCheck: Fault cause: vim.fault.VsanFault
The fault cause is vim.fault.VsanFault. The vSAN health service log shows the following error:
/var/log/vmware/vsan-health/vmware-vsan-health-service.log:
yyyy-mm-ddThh:mm:ss INFO vsan-mgmt[36593] [VsanEventUtil::_collectClustersEventsFromCache opID=noOpId] skip cluster 'vim.ClusterComputeResource:domain-cxxxx' without updated timestamp
:
yyyy-mm-ddThh:mm:ss INFO vsan-mgmt[36072] [VsanVcClusterConfigSystemImpl::RunLifecycleCheck opID=184c9593] Running lifecycle check on cluster 'vim.ClusterComputeResource:domain-cxxxx' with spec: (vim.cluster.VsanVcLifecycleCheckSpec) {
   operation = 'noChecks'
}
yyyy-mm-ddThh:mm:ss ERROR vsan-mgmt[36072] [VsanVcClusterConfigSystemImpl::RunLifecycleCheck opID=184c9593] Lifecycle checks internal error (vmodl.fault.ManagedObjectNotFound) {
   msg = "Received SOAP response fault from [<<cs p:00007fc5ec278c40, TCP:localhost:8085>, /sdk>]: GetHardware\nThe object 'vim.HostSystem:host-xxx' has already been deleted or has not been completely created",
   obj = 'vim.HostSystem:host-xxx'
}
Traceback (most recent call last):
  File "bora/vsan/clusterconfig/vpxd/pyMoVsan/VsanVcClusterConfigSystemImpl.py", line 11222, in RunLifecycleCheck
  File "bora/vsan/clusterconfig/vpxd/pyMoVsan/VsanVcClusterConfigSystemImpl.py", line 11164, in IsWitnessVirtualAppliance
  File "/usr/lib/vmware/site-packages/pyVmomi/VmomiSupport.py", line 612, in __call__
  File "/usr/lib/vmware/site-packages/pyVmomi/VmomiSupport.py", line 400, in _InvokeAccessor
PyCppVmomi.vmodl.fault.ManagedObjectNotFound: (vmodl.fault.ManagedObjectNotFound) {
   msg = "Received SOAP response fault from: GetHardware\nThe object 'vim.HostSystem:host-xxx' has already been deleted or has not been completely created",
   obj = 'vim.HostSystem:host-xxx'
}
VMware vCenter Server 8.x
A stale entry for a vSAN witness host exists on the vCenter Server.
The compliance check fails in the vSAN health service because it tries to look up host host-xxx, which vCenter reports as already deleted or not completely created.
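To identify which host ID the health service is failing on, the log can be searched for the "has already been deleted or has not been completely created" message. The following is a minimal sketch, assuming GNU grep is available on the appliance; the function name stale_host_ids is hypothetical and not part of any VMware tooling.

```shell
# Hypothetical helper (not a VMware tool): print each distinct host ID
# that the log reports as "already been deleted".
# $1: path to the vSAN health service log
stale_host_ids() {
    # Match only the fault message so unrelated host references are ignored,
    # then reduce each match to the bare host-<number> identifier.
    grep -oE "The object 'vim\.HostSystem:host-[0-9]+' has already been deleted" "$1" \
        | grep -oE 'host-[0-9]+' \
        | sort -u
}
```

For example, `stale_host_ids /var/log/vmware/vsan-health/vmware-vsan-health-service.log` would list each host ID that appears in such a fault, one per line.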
Follow the steps below to manually remove the stale vSAN witness host entry from vCenter Server using the vSAN MOB page.
Take a snapshot of the vCenter server before making the changes.
1. Open an SSH session to the vCenter appliance.
2. Log in to RVC with "rvc localhost" and navigate to the affected vSAN cluster in your environment.
3. Enable the vSAN MOB, disabled by default, by running the command:
vsan.debug.mob --start /localhost
4. Access the URL link from the output. Log in with administrator@vsphere.local (or the configured domain name) and password. The link will be similar to:
https://vcenter_fqdn/vsan/mob
5. Click on (more...) then select vsan-stretched-cluster-system.
After this step, a new window is launched for the vsan-stretched-cluster-system.
6. Click on VSANVcGetWitnessHosts.
The nodes are listed in VimClusterVSANWitnessHostInfo[]. Find the host whose ID matches the one referenced in the logs, host-xxx (The object 'vim.HostSystem:host-xxx' has already been deleted or has not been completely created). This is the host entry to remove.
7. Before removing the stale host, validate the currently configured vSAN witness node from the vSphere UI.
Navigate to Cluster > Configure > vSAN > Fault Domain.
8. Validate the host entry is not present in the vCenter Database.
psql -d VCDB -U postgres -c "select id, dns_name, ip_address from vpx_host where id='xxx';"
9. Go back to the vsan-stretched-cluster-system page and click on VSANVcRemoveWitnessHost. Enter the <domain-c> and <host-> values from the steps above into their respective fields and click Invoke Method.
10. To confirm the host entry was successfully removed, navigate back to VSANVcGetWitnessHosts (step 6). Once confirmed, go back to the vSphere UI, refresh the browser, and retry the compliance check.
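The database check in step 8 takes only the numeric part of the host ID. The following is a minimal sketch of building that query from a full identifier such as host-1234. It assumes, as the xxx placeholders in the steps suggest, that the vpx_host id equals the number after "host-". The function name host_lookup_sql is hypothetical.

```shell
# Hypothetical helper: build the step 8 query from a host identifier.
# $1: host identifier in the form host-<number>
host_lookup_sql() {
    _id="${1#host-}"    # strip the leading "host-" prefix
    case "$_id" in
        ''|*[!0-9]*)
            # Reject anything that is not purely numeric after the prefix.
            echo "expected host-<number>, got: $1" >&2
            return 1
            ;;
    esac
    printf "select id, dns_name, ip_address from vpx_host where id='%s';\n" "$_id"
}
```

The result could then be passed to the command from step 8, for example `psql -d VCDB -U postgres -c "$(host_lookup_sql host-1234)"`. An empty result set would indicate that the host is no longer in the vCenter database.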
These steps are adapted from the KB "Out of inventory witness alerts seen in vSAN cluster Fault Domains" (steps 2-7 and step 11). Note: Step 1 of that KB, "Place the witness appliance into maintenance mode", can be skipped because this scenario removes a stale entry.