ESXi host shows as disconnected from vCenter and VMs are not responding
Article ID: 370127
Updated On:
Products
VMware vSphere ESXi
Issue/Introduction
Symptoms:
The ESXi host is marked as disconnected in vCenter.
There is no heartbeat between vCenter and the host.
The ESXi DCUI console shows the following error: IssueCommand:ERROR Tag 1 SActive already set: SACI:3E CI:3E activeTags:0 reissue_flag:0
The vpxd logs show the following error: error vpxd cannot contact the specified host (xxxxxx)
On the ESXi host, the /vmfs/volumes partition cannot be viewed.
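When collected console or log output has to be searched for the DCUI signature above, a small parser can help. The following is a minimal sketch in Python: the pattern is built only from the single sample line quoted in this article, the function names are hypothetical, and reading the SACI, CI and activeTags values as hexadecimal is an assumption.

```python
import re

# Pattern derived from the one sample line in this article. Other driver
# versions may format the message differently (assumption).
ISSUE_COMMAND_RE = re.compile(
    r"IssueCommand:ERROR Tag (?P<tag>\d+) SActive already set: "
    r"SACI:(?P<saci>[0-9A-Fa-f]+) CI:(?P<ci>[0-9A-Fa-f]+) "
    r"activeTags:(?P<active_tags>[0-9A-Fa-f]+) "
    r"reissue_flag:(?P<reissue_flag>\d+)"
)


def parse_issue_command_error(line):
    """Return the fields of an IssueCommand error line, or None if absent."""
    match = ISSUE_COMMAND_RE.search(line)
    if match is None:
        return None
    return {
        "tag": int(match.group("tag")),
        "saci": int(match.group("saci"), 16),                # assumed hex
        "ci": int(match.group("ci"), 16),                    # assumed hex
        "active_tags": int(match.group("active_tags"), 16),  # assumed hex
        "reissue_flag": int(match.group("reissue_flag")),
    }


def find_issue_command_errors(lines):
    """Parse every line and keep only those carrying the signature."""
    parsed = (parse_issue_command_error(line) for line in lines)
    return [fields for fields in parsed if fields is not None]


if __name__ == "__main__":
    sample = (
        "IssueCommand:ERROR Tag 1 SActive already set: "
        "SACI:3E CI:3E activeTags:0 reissue_flag:0"
    )
    print(parse_issue_command_error(sample))
```

A match only indicates that the signature is present in the text supplied; it does not confirm the root cause by itself.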
Environment
vSphere ESXi 7.0 Update 3i
Cause
This is a known issue with controllers based on the Marvell 88SE9230 AHCI chipset, including the DELL BOSS S1 adapter and the Lenovo ThinkSystem_M.2 adapter.
Resolution
Engage your hardware vendor.
In addition to working with hardware vendor support, consider the following recommendations:
1) Replace both SSDs with new SSDs of the same model.
2) Do not run VMs or heavy I/O on the ThinkSystem_M.2 device in RAID mode.
3) If possible, configure the controller in JBOD mode. JBOD mode is sufficient for a boot device that does not host VMs.
Additional Information
The issue occurs in RAID (VD) mode only. In RAID mode, the adapter combines two SSDs and exposes them as a single emulated virtual SSD. The problem arises when the adapter emulates a port reset for this virtual device.
Occasionally the adapter fails to complete an I/O command for many seconds, or an I/O is genuinely stuck on the device. This causes an I/O timeout, and a task management (taskmgmt) abort request is sent to the driver to cancel the pending I/O command. The vmw_ahci driver then performs an AHCI port reset to clear the outstanding commands on the device.
The failure occurs at this point. The adapter does not follow the AHCI specification and does not set some status registers as the driver expects, which causes the driver to misbehave and leaves it unable to recover.
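To illustrate the mismatch between driver state and adapter state, the values from the DCUI error can be read as bitmasks. The sketch below is Python and rests on assumptions that this article does not confirm: that SACI and CI are hexadecimal snapshots of the port's SActive and Command Issue registers, that activeTags is the driver's own bitmask of commands it believes are outstanding, and that bit n corresponds to tag n. The helper names are hypothetical.

```python
def set_bits(mask):
    """Return the bit positions that are set in a 32-bit register value."""
    return [bit for bit in range(32) if mask & (1 << bit)]


def tags_unknown_to_driver(hw_mask, driver_mask):
    """Tags flagged as busy in the register but not tracked by the driver."""
    return set_bits(hw_mask & ~driver_mask)


if __name__ == "__main__":
    # Values taken from the error line quoted in this article.
    tag = 1
    saci = 0x3E         # assumed: SActive register snapshot
    active_tags = 0x0   # assumed: driver-side bitmask of outstanding tags

    print(set_bits(saci))                               # [1, 2, 3, 4, 5]
    print(tag in set_bits(saci))                        # True
    print(tags_unknown_to_driver(saci, active_tags))    # [1, 2, 3, 4, 5]
```

Under these assumptions, the register still reports tags 1 through 5 as busy while the driver tracks no outstanding commands, which would be consistent with status registers not being cleared as the driver expects after the port reset.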