After applying the workaround in the KB article "Temporary/transient storage path loss on ESXi 8.0 could result in paths not coming back when using Cisco UCS and NFNIC", the host is still dropping paths to storage. Checking the FPIN heap shows that it is still out of space even with the workaround in place. You can check the available FPIN heap with the following command:
esxcfg-info -a |grep -A3 storageFPINHeap|grep "Max Available"
Example:
Host-1 has run out of FPIN heap:
|----Max Available...................................416 bytes
Host-2 has not run out of heap:
|----Max Available...................................3219872 bytes
A healthy host will show around 5246448 bytes available, but an impacted host will show significantly less free space, sometimes 16k bytes or less. In this case, FPIN has already been disabled, yet the heap is still exhausted (416 bytes).
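The check above can be scripted when several hosts need to be compared. The following is a minimal sketch, not a supported tool: the function names `parse_max_available` and `classify_fpin_heap` are hypothetical, and the 16384-byte threshold is an assumption based on the "16k bytes or less" figure mentioned above. It only parses a "Max Available" line and reports whether the value looks low.

```shell
#!/bin/sh
# Assumed threshold, taken from the "16k bytes or less" figure in this article.
THRESHOLD_BYTES=16384

# Extract the byte count from a line such as
# "|----Max Available.......416 bytes". Only the first match is used.
parse_max_available() {
    printf '%s\n' "$1" \
        | sed -n 's/.*Max Available[^0-9]*\([0-9][0-9]*\) bytes.*/\1/p' \
        | head -n 1
}

# Print LOW or OK for a given "Max Available" line.
# Prints UNKNOWN and returns 2 if no byte count can be found.
classify_fpin_heap() {
    bytes=$(parse_max_available "$1")
    if [ -z "$bytes" ]; then
        echo "UNKNOWN"
        return 2
    fi
    if [ "$bytes" -le "$THRESHOLD_BYTES" ]; then
        echo "LOW ($bytes bytes)"
    else
        echo "OK ($bytes bytes)"
    fi
}

# On an ESXi host, pass in the output of the command shown earlier:
#   classify_fpin_heap "$(esxcfg-info -a | grep -A3 storageFPINHeap | grep 'Max Available')"
```

With the two example hosts above, Host-1 (416 bytes) would be reported as LOW and Host-2 (3219872 bytes) as OK. Adjust the threshold if your healthy baseline differs.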
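To resolve this situation, make sure FPIN is disabled and then reboot the host. The commands to disable FPIN are:
esxcli storage fpin info set -e false
vsish -e set /storage/fpin/info 0
Verify the current setting with:
esxcli storage fpin info get
Once FPIN is disabled, a host reboot is the only remaining step.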
Rebooting clears the heap, which allows the Cisco NFNIC driver to create paths to storage again.
For more information on this bug, see the KB article "Temporary/transient storage path loss on ESXi 8.0 could result in paths not coming back when using Cisco UCS and NFNIC".
There are two permanent fixes for this issue: