
Hosts are showing as not logged in to NetApp cluster


Article ID: 403642


Updated On:

Products

VMware vSphere ESXi

Issue/Introduction

  • Hosts with Emulex HBAs connected to two NetApp clusters show as logged in to one cluster but not logged in to the second cluster

  • After a reboot, the login status toggles: the cluster that previously showed "not logged in" now shows "logged in," and vice versa for the other NetApp cluster

  • In the /var/log/vmkernel.log file, you see Slab allocation failure warnings:

YYYY-MM-DDTHH:MM:SSZ cpu143:2098850)WARNING: lpfc: _lpfc_SlabAlloc:1883: 0:3371 Alloc failure from Slab 2 Caller <lpfc_setup_disc_node:5783> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu143:2098850)WARNING: lpfc: _lpfc_SlabAlloc:1883: 0:3371 Alloc failure from Slab 2 Caller <lpfc_setup_disc_node:5783> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu143:2098850)WARNING: lpfc: _lpfc_SlabAlloc:1883: 0:3371 Alloc failure from Slab 2 Caller <lpfc_setup_disc_node:5783> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu143:2098850)WARNING: lpfc: _lpfc_SlabAlloc:1883: 0:3371 Alloc failure from Slab 2 Caller <lpfc_setup_disc_node:5783> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu182:2099011)WARNING: lpfc: _lpfc_SlabAlloc:1883: 1:3371 Alloc failure from Slab 2 Caller <lpfc_els_unsol_buffer:9627> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu182:2099011)WARNING: lpfc: _lpfc_SlabAlloc:1883: 1:3371 Alloc failure from Slab 2 Caller <lpfc_sli4_seq_abort_rsp:17406> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu130:2098850)WARNING: lpfc: _lpfc_SlabAlloc:1883: 0:3371 Alloc failure from Slab 2 Caller <lpfc_els_unsol_buffer:9627> cur cnt 128, Max 128
YYYY-MM-DDTHH:MM:SSZ cpu130:2098850)WARNING: lpfc: _lpfc_SlabAlloc:1883: 0:3371 Alloc failure from Slab 2 Caller <lpfc_sli4_seq_abort_rsp:17406> cur cnt 128, Max 128



Note: The preceding log excerpts are examples only. Dates, times, and environment-specific values will vary in your environment.
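
To check whether a host is currently hitting this condition, you can search the live vmkernel log for these warnings (a minimal check, assuming shell or SSH access to the ESXi host):

#grep "Alloc failure from Slab" /var/log/vmkernel.log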

Environment

VMware vSphere ESXi 7.x

Cause

This issue may occur when one of the following conditions is present:

  • The fabric zone has 64 or more members, with some or all of the targets having higher FCID assignments.
  • Each target port rotates through several FCIDs and the driver logs into six or more targets.
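
To gauge whether the driver is tracking a large number of remote ports, you can count the unique Fibre Channel target identifiers the host sees (a rough check, assuming ESXi shell access; the "Target Identifier" field name is taken from typical esxcli path output and may vary by release):

#esxcli storage core path list | grep "Target Identifier" | sort -u | wc -l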

Resolution

  • For zones whose membership exceeds 50 members, split the fabric zone into zones with smaller node counts, or consult the best practices published by your target, switch, and operating-system vendors.

  • If splitting the zone setup is not an option, resize the driver's node table with a module parameter. Discuss this change with your Emulex vendor before applying it.
    • To make the node table larger (for example, 128 nodes):
      #esxcli system module parameters set -m lpfc -p lpfc_nlp_slab_cnt=128

    • Reboot the server for the change to take effect.
      #reboot

  • Run this command on each impacted server.
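
  • To verify the new value after the reboot, list the lpfc module parameters and confirm that lpfc_nlp_slab_cnt shows the configured value (a quick check using standard esxcli):
      #esxcli system module parameters list -m lpfc | grep lpfc_nlp_slab_cnt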