External connectivity flaps are observed for a VNF application that carries traffic to the external network through NSX Edge VMs. The VNF loses external connectivity for a few seconds and then recovers.
No alerts, errors, or warnings were seen for the NSX Edge VMs in vCenter or in the NSX Manager GUI, and resource utilization for the Edge VMs was normal.
3.2.2
Intermittent Geneve tunnel flaps are seen on the NSX Edge nodes toward a remote vTEP (an ESXi host or another Edge VM), as shown in the log entries below. These flaps cause the external connectivity issues for the application traffic.
xxxx-xx-xxTxx:xx:xx.xxxZ <Hostname of EdgeVM> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="tunnel" level="INFO"] Tunnel <Edge vTEP IP>:<Remote vTEP IP>(geneve) state updated from up to down
xxxx-xx-xxTxx:xx:xx.xxxZ <Hostname of EdgeVM> NSX 1 FABRIC [nsx@6876 comp="nsx-edge" subcomp="nsxa" s2comp="tunnel" level="INFO"] Tunnel <Edge vTEP IP>:<Remote vTEP IP>(geneve) state updated from down to up
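To gauge how often each tunnel flaps, the Edge syslog can be scanned for the up-to-down transitions shown above. The following is a minimal Python sketch, not a VMware-provided tool. The function name and the regular expression are assumptions modeled only on the sample lines above, and the pattern assumes IPv4 vTEP addresses. The exact log layout may differ between releases.

```python
import re
from collections import Counter

# Pattern modeled on the sample log lines above (assumption: IPv4 vTEP addresses,
# message text unchanged between releases).
TUNNEL_RE = re.compile(
    r"Tunnel (?P<local>[^\s:]+):(?P<remote>[^\s(]+)\(geneve\) "
    r"state updated from (?P<old>\w+) to (?P<new>\w+)"
)

def count_tunnel_flaps(lines):
    """Count up -> down transitions per (local vTEP, remote vTEP) pair.

    Hypothetical helper for illustration; `lines` is any iterable of
    syslog lines, for example an open file object.
    """
    flaps = Counter()
    for line in lines:
        match = TUNNEL_RE.search(line)
        if match and match.group("old") == "up" and match.group("new") == "down":
            flaps[(match.group("local"), match.group("remote"))] += 1
    return flaps
```

Passing an open syslog file to `count_tunnel_flaps` returns a counter keyed by vTEP pair, which helps show whether the flaps are concentrated on one remote vTEP or spread across many.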
The ESXi host vmkernel logs show p2m buffer overflow messages such as the following, which occasionally cause the vNIC interfaces of the Edge VM to flap:
xxxx-xx-xxTxx:xx:xx.xxxZ cpu0:#######)VmMemCow: 1772: p2m update: cannot reserve - cur 0 0 rsvd 927 req 1029 avail 1279
xxxx-xx-xxTxx:xx:xx.xxxZ cpu54:#######)vswitch: L2Sec_EnforcePortCompliance:214: [nsx@6876 comp="nsx-esx" subcomp="vswitch"]client Edge_VM.eth2 requested promiscuous mode on port 0x#######, disallowed by vswitch policy
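The p2m messages shown above can be located in a vmkernel log with a short script. This is a minimal Python sketch under the assumption that the message text matches the sample line. The function name is hypothetical, and the numeric fields are extracted under the labels printed in the log without interpreting their meaning.

```python
import re

# Pattern modeled on the sample vmkernel line above (assumption: the message
# format is the same on the ESXi build being examined).
P2M_RE = re.compile(
    r"p2m update: cannot reserve - cur (\d+) (\d+) rsvd (\d+) req (\d+) avail (\d+)"
)

def find_p2m_reserve_failures(lines):
    """Return one dict per 'p2m update: cannot reserve' message.

    Hypothetical helper for illustration; the keys mirror the labels in the
    log message and are not interpreted further.
    """
    hits = []
    for line in lines:
        match = P2M_RE.search(line)
        if match:
            cur_a, cur_b, rsvd, req, avail = map(int, match.groups())
            hits.append(
                {"cur": (cur_a, cur_b), "rsvd": rsvd, "req": req, "avail": avail}
            )
    return hits
```

Comparing the timestamps of the matching lines with the tunnel state changes on the Edge node can help confirm whether the two events coincide.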
The current value of "ShareCOSBufSize" on the ESXi host was set to 5.
Increase "ShareCOSBufSize" to 32 on the ESXi hosts where the Edge VMs are deployed. For the procedure and more information, refer to KB 76387.