VMs experience intermittent connectivity loss after a new NSX-prepared ESXi host is added to the cluster.
In /var/run/log/nsx-syslog.log on the newly added ESXi host, the nsx-controller component (cfgAgent) reports that ioctl calls from the vsipfw module fail and that the kernel addrset and ruleset counts cannot be retrieved because the switch port is not found:
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E8228700" level="error"] vsipfw: VsipFWCmd.cpp:execute():220 ioctl failed because switch port is not found
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E8228700" level="error" errorCode="LCP01107"] dfw: Failed to get kernel addrset count: switch port not found
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E8228700" level="error"] vsipfw: VsipFWCmd.cpp:execute():220 ioctl failed because switch port is not found
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E8228700" level="error" errorCode="LCP01161"] dfw: Failed to get kernel ruleset count: switch port not found
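To check quickly whether a host shows the same signatures, the log can be scanned with a short script. The following is a minimal sketch, not a supported tool: the function names `scan_lines` and `scan_file` are illustrative, the match strings are copied from the excerpts above, and it assumes a Python 3 interpreter is available wherever the log file is read (for example, on a workstation holding a copy of the log).

```python
import re
from collections import Counter

# Message fragments copied verbatim from the log excerpts above.
SIGNATURES = (
    "ioctl failed because switch port is not found",
    "Failed to get kernel addrset count: switch port not found",
    "Failed to get kernel ruleset count: switch port not found",
)

# errorCode="..." appears only on some of the entries.
ERROR_CODE = re.compile(r'errorCode="([^"]+)"')


def scan_lines(lines):
    """Count known signatures and error codes in an iterable of log lines."""
    hits = Counter()
    codes = Counter()
    for line in lines:
        for signature in SIGNATURES:
            if signature in line:
                hits[signature] += 1
                match = ERROR_CODE.search(line)
                if match:
                    codes[match.group(1)] += 1
                break  # one signature per line is enough
    return hits, codes


def scan_file(path="/var/run/log/nsx-syslog.log"):
    """Scan a log file; the default path is the one named in this article."""
    with open(path, errors="replace") as handle:
        return scan_lines(handle)
```

Calling `scan_file()` returns two counters: one keyed by message fragment and one keyed by error code (LCP01107 and LCP01161 in the excerpts above). Non-zero counts only indicate that the log contains the same messages; they do not by themselves confirm the cause.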
Later entries in nsx-syslog.log show that the virtual interface (VIF) attached to the VM was removed because the switch port was absent:
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" s2comp="nsx-hyperbus" tid="E87B3700" level="info"] IP manager: Delete VIF [########-####-####-####-##########0d (host switch ## ## ## ## ## ## ## ##-## ## ## ## ## ## ## 43) LIP after receiving VIF disconnect message
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E87B3700" level="info"] ConfigApp: write to ConfigCache on VIF update (delta update)
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E7FA3700" level="error"] vsipfw: VsipFWCmd.cpp:execute():220 ioctl failed because switch port is not found
YYYY-MM-DDTHH:MM:SS.SSSZ cfgAgent[2134266]: NSX 2134266 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="E7FA3700" level="warn"] dfw: Failed to set mac address of vif [########-####-####-####-##########0d : switch port not found
During the incident, the VIF port of each impacted VM is detached:
YYYY-MM-DDTHH:MM:SS.SSSZ nsx-opsagent[2134410]: NSX 2134410 - [nsx@6876 comp="nsx-esx" subcomp="opsagent" s2comp="nsxa" tid="2135012" level="INFO"] [DoVifPortOperation] request=[opId:[6] op:[HOSTD_DETACH_PORT(2)] vif:[########-####-####-####-##########0d] ls:[########-####-####-####-##########b1] vmx:[/vmfs/volumes/vsan:################-##############ab/########-####-####-####-############36/<VM Name>.vmx] lp:[]]
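To list which VIFs were detached, the opsagent entries can be parsed in the same way. This is a sketch under stated assumptions: the regular expression is derived only from the single `DoVifPortOperation` entry shown above, `HOSTD_DETACH_PORT` is the only operation name taken from the source, and the helper names `parse_vif_operation` and `detached_vifs` are illustrative.

```python
import re

# Field layout inferred from the one DoVifPortOperation entry shown above.
VIF_OP = re.compile(
    r"\[DoVifPortOperation\] request=\[opId:\[(?P<op_id>[^\]]*)\] "
    r"op:\[(?P<op>[^\]]*)\] vif:\[(?P<vif>[^\]]*)\] ls:\[(?P<ls>[^\]]*)\]"
)


def parse_vif_operation(line):
    """Return the opId, op, vif and ls fields as a dict, or None if no match."""
    match = VIF_OP.search(line)
    return match.groupdict() if match else None


def detached_vifs(lines):
    """Collect the parsed records whose operation is a port detach."""
    records = []
    for line in lines:
        record = parse_vif_operation(line)
        if record and record["op"].startswith("HOSTD_DETACH_PORT"):
            records.append(record)
    return records
```

Comparing the `vif` values returned here with the VIF IDs in the cfgAgent entries helps confirm that the detach events and the "switch port not found" errors refer to the same interfaces.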
VMware NSX-T Data Center
VMware NSX
The issue may occur because the NSX controller component has exhausted its memory, which prevents it from communicating effectively with the newly added ESXi host to synchronise the configuration.
To verify the NSX Manager cluster status:
Check Cluster Status:
Address Issues: