Network latency or disconnects observed in an NSX-T environment with Service Insertion


Article ID: 324252


Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  •  East-West network security with third-party service chaining, also known as Service Insertion (SI), is configured in the environment
  •  A collapsed design is used, with NSX-T Edge VMs running on ESXi hosts that are prepared for NSX
  •  The Edge VM network interfaces connect to VLAN Segments
  •  On the ESXi host, the output of summarize-dvfilter shows a slot 12 filter on an Edge VM interface
   #summarize-dvfilter

   world 57668912 vmm0:Edge01 vcUuid:'50 20 ef 8c 85 39 18 57-13 d6 79 82 ec 05 c2 1b'
   port 50331712 Edge01.eth2
   vNic slot 12
   name: nic-57668912-eth2-vmware-si.12

   agentName: vmware-si
   state: IOChain Attached
   vmState: Detached
   failurePolicy: failOpen
   serviceVMID: none
   filter source: Dynamic Filter Creation
  •  The slot 12 dvfilter has at least one rule, in this case a default allow rule
  #vsipioctl getrules -f nic-57668912-eth2-vmware-si.12
  ruleset mainrs {
  # generation number: 0
  # realization time : 2020-11-05T11:27:45
  rule 1024 at 1 inout protocol any from any to any pbr pass-through;
  }
 
  •  Packets are dropped by the Service Insertion filter
 On the ESXi host, identify the switch port of the Edge VM interface, then check the pktsDropped counter in that port's statistics
 
   #net-stats -l | grep "Edge01.eth2"
   50331712            5       9 DvsPortset-1     00:50:56:a0:ec:f9  Edge01.eth2

   #vsish -e get /net/portsets/DvsPortset-1/ports/50331712

        NETX_GVM_INPUT_PRE <netx-pre-gvm2s:0x431701a5e2a8>
                pktsStarted:81892462
                pktsPassed:81771908
                pktsDropped:120554      <<<
                pktsFiltered:0
                pktsQueued:0
                pktsFaulted:0
                pktsInjected:0
                pktErrors:0
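
To judge how significant the drops are, the counters above can be turned into a drop rate. The following is a minimal Python sketch, assuming the vsish output has been saved as text. The helper names parse_counters and drop_rate are illustrative and are not part of any VMware tool.

```python
import re

def parse_counters(text):
    """Collect 'name:value' integer counters, one per line, from saved vsish output."""
    pattern = re.compile(r"^[ \t]*(\w+):(\d+)\b", re.MULTILINE)
    return {name: int(value) for name, value in pattern.findall(text)}

def drop_rate(counters):
    """Fraction of started packets that the filter dropped."""
    started = counters["pktsStarted"]
    return counters["pktsDropped"] / started if started else 0.0

# Counters copied from the NETX_GVM_INPUT_PRE section shown above.
sample = """
NETX_GVM_INPUT_PRE <netx-pre-gvm2s:0x431701a5e2a8>
        pktsStarted:81892462
        pktsPassed:81771908
        pktsDropped:120554      <<<
        pktsFiltered:0
"""

counters = parse_counters(sample)
print(f"dropped {counters['pktsDropped']} of {counters['pktsStarted']} packets "
      f"({drop_rate(counters):.3%})")
```

In this sample, pktsStarted minus pktsPassed equals pktsDropped, so every packet that did not pass the filter was counted as a drop. Running the same calculation on two captures taken a few minutes apart shows whether the counter is still increasing.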


Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center
VMware NSX-T Data Center 2.5.x

Cause

The Edge VM is a system VM and should not have a Service Insertion IO chain filter attached to its network interfaces.
Even if this filter only has a default allow rule, it can still drop packets due to stateful TCP connection tracking.
This behaviour can result in packet retransmission and TCP connections being reset due to timeout.

Resolution

This is a known issue affecting NSX-T Data Center. There is currently no resolution.

Workaround:
To remove the Service Insertion (SI) dvfilter from the Edge vNIC, place the Edge VM on the SI exclusion list.

The SI exclusion list consists of NSGroups.

Therefore, the Edge VM must be a member of an NSGroup before it can be added to the SI exclusion list.

Versions prior to 2.5.2:
Edge VMs are not part of a default NSGroup in these versions, so they must first be added to an NSGroup.

1) In the UI, create a new NSGroup and note its UUID

2) In the UI, add the Edge VM to the NSGroup using the logical ports of the VM

3) Via API, add the new NSGroup to the SI exclusion list as below

     POST https://<Manager-ip>/api/v1/serviceinsertion/excludelist?action=add_member

     Body:
     {
       "target_id" : "<UUID found from step 1>",
       "target_type" : "NSGroup"
     }

4) Confirm at the ESXi level that slot 12 has been removed from the VM by running the command below. The Edge VM interface should no longer be listed in the output
   #summarize-dvfilter
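
The check in the last step can also be scripted. The following is a minimal Python sketch, assuming the summarize-dvfilter output has been saved as text and follows the layout shown in the Symptoms section. The function name si_filters is illustrative only.

```python
def si_filters(dvfilter_output, vm_name):
    """Return the names of vmware-si filters attached to ports of the given VM."""
    found = []
    on_vm_port = False
    for raw in dvfilter_output.splitlines():
        line = raw.strip()
        if line.startswith("port "):
            # Port lines look like: "port 50331712 Edge01.eth2"
            parts = line.split()
            on_vm_port = len(parts) >= 3 and parts[2].startswith(vm_name + ".")
        elif line.startswith("name:") and on_vm_port:
            name = line.split(":", 1)[1].strip()
            if "vmware-si" in name:
                found.append(name)
    return found

# Output captured before the workaround, copied from the Symptoms section.
before = """
world 57668912 vmm0:Edge01 vcUuid:'50 20 ef 8c 85 39 18 57-13 d6 79 82 ec 05 c2 1b'
port 50331712 Edge01.eth2
vNic slot 12
name: nic-57668912-eth2-vmware-si.12
agentName: vmware-si
state: IOChain Attached
"""

print(si_filters(before, "Edge01"))
```

An empty list for the Edge VM after the workaround indicates that no Service Insertion filter remains on its interfaces.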


NSX-T 2.5.2/3.0.0 and above:
Edge VMs are already part of a default system NSGroup called Edge_NSGroup.

1) Via API, identify the UUID of the Edge_NSGroup
   GET https://<Manager-ip>/api/v1/ns-groups

2) Via API, add Edge_NSGroup to the SI exclusion list as below

     POST https://<Manager-ip>/api/v1/serviceinsertion/excludelist?action=add_member

     Body:
     {
       "target_id" : "<UUID found from step 1>",
       "target_type" : "NSGroup"
     }

3) Confirm at the ESXi level that slot 12 has been removed from the VM by running the command below. The Edge VM interface should no longer be listed in the output
   #summarize-dvfilter
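
The two API steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not a supported script. It assumes the GET /api/v1/ns-groups response carries a "results" list whose entries have "id" and "display_name" fields. The sample response, its placeholder IDs and the manager host name are hypothetical. The request is only built, not sent, and authentication is omitted.

```python
import json
import urllib.request

def find_group_id(ns_groups_response, display_name="Edge_NSGroup"):
    """Return the id of the NSGroup with the given display name, or None."""
    for group in ns_groups_response.get("results", []):
        if group.get("display_name") == display_name:
            return group.get("id")
    return None

def build_exclude_request(manager, group_id):
    """Build (but do not send) the add_member POST for the SI exclusion list."""
    url = f"https://{manager}/api/v1/serviceinsertion/excludelist?action=add_member"
    body = json.dumps({"target_id": group_id, "target_type": "NSGroup"}).encode()
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical, trimmed response; the ids are placeholders, not real UUIDs.
sample = {"results": [
    {"id": "00000000-0000-0000-0000-000000000001", "display_name": "Web_Servers"},
    {"id": "00000000-0000-0000-0000-000000000002", "display_name": "Edge_NSGroup"},
]}

group_id = find_group_id(sample)
req = build_exclude_request("nsx-manager.example.com", group_id)
print(req.get_method(), req.full_url)
print(req.data.decode())
```

Looking the group up by display name avoids copying the UUID by hand. If find_group_id returns None, the Edge_NSGroup is not present and the procedure for versions prior to 2.5.2 applies instead.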