A PSOD can occur when traffic hits an NSX DFW rule whose context profile includes FQDN attributes and the DNS server returns a CNAME record in its response.
A PSOD can also occur during vMotion of a VM that is covered by an NSX DFW rule whose context profile includes FQDN attributes, when the DNS server returns a CNAME record in its response.
Stack trace observed during vMotion:
<DATE>T<TIME>Z cpu6:2248766)@BlueScreen: #PF Exception 14 in world 2248766:NetWorld-VM- IP 0x420010e4a31e addr 0x12
PTEs:0x175fa0027;0x1e571c007;0x0;
<DATE>T<TIME>Z cpu6:2248766)Code start: 0x42000f40xxxx VMK uptime: 1:01:58:02.637
<DATE>T<TIME>Z cpu6:2248766)0x453951a9xxxx:[0x420010e4xxxx]pf_fqdn_uuid_tree_RB_NEXT@ com.vmware.vsip#1.0.7.0.21376387+0xe stack: 0x453951a999b8
<DATE>T<TIME>Z cpu6:2248766)base fs=0x0 gs=0x420041800000 Kgs=0x0
<DATE>T<TIME>Z cpu1:2101580)Failed to backup ConfigStore.
<DATE>T<TIME>Z cpu13:2097556)Jumpstart plugin petronas-wipe-partitions activation failed.
<DATE>T<TIME>Z cpu6:2248766)CPU model name: Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, FMS: 06/4f/1, uCodeRev: b000040
Stack trace observed without vMotion:
Screen: Spin count exceeded - possible deadlock
<DATE>T<TIME>Z cpu0:66983194)Code start: 0x420030800000 VMK uptime: 41:03:22:04.411
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99ad70:[0x420030910c0d]PanicvPanicInt@vmkernel#nover+0x1f9 stack: 0x10
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99ae20:[0x420030911274]Panic_NoSave@vmkernel#nover+0x4d stack: 0x453a5e99ae80
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99ae80:[0x4200308240e4]Lock_CheckSpinCount@vmkernel#nover+0x269 stack: 0x420040000000
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99aed0:[0x420030916500]MCSLockSpin@vmkernel#nover+0x71 stack: 0x4323d820dd18
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99af00:[0x4200309166d4]MCSLockRWContended@vmkernel#nover+0x1c1 stack: 0x0
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99af50:[0x420030916e59]MCS_DoAcqReadLockWithRA@vmkernel#nover+0x82 stack: 0x453a5e99b228
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99af60:[0x420030835041]vmk_SpinlockReadLock@vmkernel#nover+0x16 stack: 0x800000002
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99af70:[0x420032246001]pf_test@ com.vmware.vsip#1.0.7.0.20682517+0x34d2 stack: 0x45bcc2aa83ba
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99b190:[0x4200322cc22f]PFFilterPacket@ com.vmware.vsip#1.0.7.0.20682517+0x50c stack: 0x0
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99b4a0:[0x4200321ec6ff]VSIPDVFProcessPacketsInt@ com.vmware.vsip#1.0.7.0.20682517+0x4c8 stack: 0x0
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bb70:[0x42003150f0e0]DVFilterInputOutputIOChainCB@ com.vmware.vmkapi#v2_10_0_0+0x89 stack: 0x43064204e108
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bbb0:[0x420030a53593]IOChain_Resume@vmkernel#nover+0x258 stack: 0x430600000001
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bc50:[0x420030a972be]Port_InputResume@vmkernel#nover+0x93 stack: 0x4306d4a06e00
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bca0:[0x420030a9b537]PortClient_InputCommitted@vmkernel#nover+0x34 stack: 0x4306d4a068c0
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bcf0:[0x420030a4c18d]E1000DevAsyncTx@vmkernel#nover+0x53e stack: 0x4306f0e03e00
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bf50:[0x420030a818e1]NetWorldPerVMCB@vmkernel#nover+0x19e stack: 0x430113e9b750
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99bfe0:[0x420030c14c52]CpuSched_StartWorld@vmkernel#nover+0x7b stack: 0x0
<DATE>T<TIME>Z cpu0:66983194)0x453a5e99c000:[0x4200308d408f]Debug_IsInitialized@vmkernel#nover+0xc stack: 0x0
Stack trace observed without vMotion:
<DATE>T<TIME>Z cpu62:2113087)@BlueScreen: #PF Exception 14 in world 2113087:NetWorld-VM- IP 0x42002466957f addr 0x1a
PTEs:0x0;
<DATE>T<TIME>Z cpu62:2113087)Code start: 0x420022c00000 VMK uptime: 0:02:54:14.349
<DATE>T<TIME>Z cpu62:2113087)0x453a5a018b68:[0x42002466957f]<fqdn>+0xf stack: 0x453a5a0199b8
<DATE>T<TIME>Z cpu62:2113087)base fs=0x0 gs=0x42004f800000 Kgs=0x0
<DATE>T<TIME>Z cpu62:2113087)CPU model name: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz, FMS: 06/55/7, uCodeRev: 5003302
<DATE>T<TIME>Z cpu62:2113087)PRODUCTNAME:Amazon EC2 i3en.metal-2tb, VENDORNAME:Amazon EC2, SERIAL_NUMBER:i-0aba838bce54f68b7, SERVER_UUID:<UUID>, VERSION:, SKU:, FAMILY:
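The backtraces above share a few recognizable markers: a page fault in a NetWorld-VM world, the pf_fqdn_uuid_tree_RB_NEXT frame, and a spin-count panic with com.vmware.vsip frames on the stack. The following Python sketch shows one way a saved backtrace could be compared against those markers. The function name and the returned labels are hypothetical and chosen only for illustration; a match is a hint for further review, not a confirmation of this issue.

```python
import re

# Markers copied from the backtraces in this article.
FQDN_TREE_FRAME = "pf_fqdn_uuid_tree_RB_NEXT"
SPIN_PANIC = "Spin count exceeded"
VSIP_FRAME = re.compile(
    r"\b(pf_test|PFFilterPacket|VSIPDVFProcessPacketsInt)@\s*com\.vmware\.vsip"
)
PAGE_FAULT = re.compile(r"#PF Exception 14 in world \d+:NetWorld-VM")


def classify_backtrace(text: str) -> str:
    """Return a coarse, illustrative label for a PSOD backtrace."""
    if PAGE_FAULT.search(text) and FQDN_TREE_FRAME in text:
        # Matches the trace observed during vMotion.
        return "page-fault-in-fqdn-tree"
    if SPIN_PANIC in text and VSIP_FRAME.search(text):
        # Matches the spin-count panic observed without vMotion.
        return "spin-count-panic-in-vsip"
    if PAGE_FAULT.search(text):
        # Page fault in a NetWorld-VM world whose faulting symbol is not
        # one of the markers above; needs manual review.
        return "page-fault-networld-unclassified"
    return "no-match"
```

The checks are ordered from most to least specific so that a trace containing the FQDN tree frame is never reported under the generic page-fault label.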
The NSX DFW context profile is configured with an FQDN attribute, and the DNS server returns a CNAME record in its response. When traffic hits a rule that uses this profile, or when a VM associated with the rule is migrated with vMotion, memory corruption occurs in the DFW on the host, which leads to a PSOD.
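Because the trigger involves a CNAME record in the DNS response, it can help to know whether a given response carries one. The following Python sketch parses the raw bytes of a DNS response (for example, the UDP payload exported from a packet capture) and reports whether the answer section contains a CNAME record. It relies only on the RFC 1035 wire format; the function names are hypothetical, it does not use any NSX API, and it assumes a well-formed, untruncated message.

```python
import struct

CNAME_TYPE = 5  # DNS resource record type for CNAME (RFC 1035)


def _skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:  # compression pointer: 2 bytes, ends the name
            return offset + 2
        if length == 0:  # root label terminates the name
            return offset + 1
        offset += 1 + length


def answer_has_cname(msg: bytes) -> bool:
    """True if any record in the answer section of a DNS response is a CNAME."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!6H", msg[:12])
    offset = 12
    for _ in range(qdcount):
        offset = _skip_name(msg, offset) + 4  # skip QTYPE and QCLASS
    for _ in range(ancount):
        offset = _skip_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", msg[offset:offset + 10]
        )
        if rtype == CNAME_TYPE:
            return True
        offset += 10 + rdlength  # fixed RR fields plus RDATA
    return False
```

Only the answer section is inspected, since that is where a CNAME returned for the queried name appears.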
Here is a sample configuration for an L7 context profile with an FQDN attribute.