Symptoms:
Two symptoms have been observed:
1. The PSOD screen trace shows one of these error strings:
"#PF Exception 14 in world xxxxxxxx:nsx-cfgagent IP 0x42002c992ea8 addr 0x500ffcf PTEs:0x1573943b027;0x8cb2003007;0x0;"
or
"PANIC bora/vmkernel/main/dlmalloc.c:4944 - Usage error in dlmalloc"
or
"pfr_kif_ktbp_head_RB_FIND"
Conditions and configuration under which the PSOD is observed:
VMs are vMotioned from a host running NSX before 4.2.0 to a host running NSX 4.2.0 or later.
Log
/var/run/log/LogEFI.log
2025-02-24T15:44:14.310Z cpu112:2101665)VMware ESXi 7.0.3 [Releasebuild-22348816 x86_64]
#PF Exception 14 in world xxxxxxx:nsx-cfgagent IP 0x4200204fcb6a addr 0x336336c3
PTEs:0x100c50dd027;0x816580c007;0x0;
2025-02-24T15:44:14.311Z cpu112:2101665)cr0=0x80010031 cr2=0x336336c3 cr3=0x8190fd2000 cr4=0x142768
2025-02-24T15:44:14.311Z cpu112:2101665)FMS=06/6a/6 uCode=0xd0003d1
2025-02-24T15:44:14.311Z cpu112:2101665)frame=0x453a9c89aa10 ip=0x4200204fcb6a err=0x0 rflags=0x10206
2025-02-24T15:44:14.311Z cpu112:2101665)rax=0x33633633 rbx=0x0 rcx=0x4325df2c12b0
2025-02-24T15:44:14.311Z cpu112:2101665)rdx=0x0 rbp=0x4325df2c1440 rsi=0x453a9c89ab30
2025-02-24T15:44:14.311Z cpu112:2101665)rdi=0x33633633 r8=0x1010 r9=0x4325df2cff30
2025-02-24T15:44:14.312Z cpu112:2101665)r10=0x4325df200f10 r11=0x4325df200e58 r12=0x4200204fcb68
2025-02-24T15:44:14.312Z cpu112:2101665)r13=0x453a9c89ab30 r14=0x1 r15=0x4323e84a4e48
*PCPU112:2101665/nsx-cfgagent
PCPU 0: VVUVUVSVVSVVVVVVVVVUUVSVVVVVVVVVVVUUVVVVVVSVVSSVUSVSUVVVVVSVVVVV
PCPU 64: VSVVVVUVVVVVSVVVVVSVVVVUVSVVVVVUVVVVVSVVUUUVVVVVUUVVVVVVVUVVVVUV
2025-02-24T15:44:14.312Z cpu112:2101665)Code start: 0x42001ea00000 VMK uptime: 5:12:59:55.010
2025-02-24T15:44:14.313Z cpu112:2101665)0x453a9c89aad8:[0x4200204fcb6a][email protected]#1.0.7.0.24476730+0x2 stack: 0x4323e84cbd98
2025-02-24T15:44:14.313Z cpu112:2101665)0x453a9c89aae0:[0x420020590ed2][email protected]#1.0.7.0.24476730+0x83 stack: 0x4323e8499300
2025-02-24T15:44:14.317Z cpu112:2101665)base fs=0x0 gs=0x42005c000000 Kgs=0x0
2. No DFW rules are programmed on VMs
For example, vsipioctl getrules -f <name> shows no rules.
Log
/var/run/log/nsx-syslog.log
nsx-syslog.log:2025-02-21T23:39:16.553Z cfgAgent[2100550]: NSX 2100550 - [nsx@6876 comp="nsx-controller" subcomp="cfgAgent" tid="74266700" level="error" errorCode="LCP01155"] dfw: Failed to process request
NSX 4.2.0.x to 4.2.1.x VIBs are installed on the ESXi hosts, and DFW rules that use dynamic groups are applied to VMs.
If you run into this PSOD, please open a support case with Broadcom so that the issue can be validated.
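The log signature for symptom 2 can be checked with a short shell sketch. The function name count_dfw_failures is hypothetical and not part of NSX; it only counts lines that carry the LCP01155 error code together with the "dfw: Failed to process request" message shown above.

```shell
# Hypothetical helper (not an NSX tool): count cfgAgent DFW failures
# in a syslog file such as /var/run/log/nsx-syslog.log.
count_dfw_failures() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'errorCode="LCP01155".*dfw: Failed to process request' "$1"
}
```

For example, count_dfw_failures /var/run/log/nsx-syslog.log prints the number of matching entries. A non-zero count is only a hint and should be read together with the other symptoms.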
Resolution:
Upgrade to NSX 4.2.2.1
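To decide whether a host falls in the affected range, its NSX version string can be compared against 4.2.0.x and 4.2.1.x. The sketch below assumes you already have the dotted version (for instance, read from the installed NSX VIBs); the function name is_affected_nsx_version is hypothetical, and the range is taken only from the statement in the Symptoms section.

```shell
# Hypothetical helper: succeed (exit 0) when the given dotted NSX version
# is in the 4.2.0.x to 4.2.1.x range named in this article.
is_affected_nsx_version() {
    case "$1" in
        4.2.0|4.2.0.*|4.2.1|4.2.1.*) return 0 ;;  # affected range
        *) return 1 ;;                            # anything else
    esac
}
```

For example, is_affected_nsx_version 4.2.1.0 && echo "affected" prints "affected", while 4.2.2.1 does not match.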
Workaround:
Use the following steps as a workaround:
1. SSH to an NSX Manager. In a shell, store the admin password in a variable (replace <admin password> with the actual admin password):
nsxpass='<admin password>'
2. Use the following curl command to collect the security global configuration:
root@nsx:~# curl -ku admin:$nsxpass https://127.0.0.1/api/v1/infra/settings/firewall/security > d.json
root@nsx:~# more d.json | grep global_addrset_mode_enabled
"global_addrset_mode_enabled" : true,
3. Use vi to change the value of global_addrset_mode_enabled in d.json to false:
root@nsx:~# more d.json | grep global_addrset_mode_enabled
"global_addrset_mode_enabled" : false,
4. Use the following curl command to apply the new value of global_addrset_mode_enabled:
root@nsx:~# curl -ku admin:$nsxpass https://127.0.0.1/api/v1/infra/settings/firewall/security -d @d.json -X PUT -H "Content-Type: application/json"
5. Repeat steps 2 to 4, this time setting the value of global_addrset_mode_enabled back to true.
6. SSH to a host and run "/opt/vmware/nsx-nestdb/bin/nestdb-cli --beautify --json --cmd get vmware.nsx.nestdb.GlobalConfigMsg | grep global_addrset_mode_enabled" to confirm that the value is true:
[root@ESX:~] /opt/vmware/nsx-nestdb/bin/nestdb-cli --beautify --json --cmd get vmware.nsx.nestdb.GlobalConfigMsg | grep global_addrset_mode_enabled
"global_addrset_mode_enabled" : true,