Container does not allow traffic until 10 seconds or more after creation.

Article ID: 322078

Products

VMware NSX

Issue/Introduction

Symptoms:
-    A newly deployed container-based application is not accessible for 10 seconds or more, even though container creation succeeds. If the user pings an outside address from the container, the ping succeeds only after 10+ seconds.
-    The environment has an NSGroup with more than 5K effective members. The NSGroup is used in DFW sections/rules and is required for the container to communicate.
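
To confirm the first symptom, the delay can be measured from inside a freshly created container. The following is a minimal sketch, not a supported tool: it substitutes a TCP connection attempt for ping (ICMP usually needs raw-socket privileges), and the function names tcp_probe and seconds_until_reachable are hypothetical helpers introduced here for illustration.

```python
import socket
import time


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def seconds_until_reachable(probe, max_wait=60.0, interval=0.5,
                            clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True.

    Returns the elapsed seconds at the first success, or None if
    max_wait seconds pass without a successful probe.
    """
    start = clock()
    while True:
        if probe():
            return clock() - start
        if clock() - start >= max_wait:
            return None
        sleep(interval)
```

For example, `seconds_until_reachable(lambda: tcp_probe("192.0.2.10", 443))` would report how long a new container waits before it can reach an outside service; the address and port are placeholders. A result of roughly 10 seconds or more, together with a large NSGroup, matches the symptoms described here.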

Environment

VMware NSX-T

Cause

When a container logical-port is added, based on dynamic criteria, to an NSGroup that already has more than 5K effective members, the evaluation and the subsequent addition of the logical-port to the NSGroup take up the bulk of the additional time before the container is able to communicate. The DFW rules that allow traffic are applied to the large NSGroup, so they reach the container only after this evaluation completes (approx. 10 seconds or more), thereby making the application inaccessible immediately after deployment.

This is mostly seen in PAS/TAS scale deployments, but it can apply to any container-based deployment that has more than 5K effective members in an NSGroup and FW sections whose “AppliedTo” points to that large NSGroup.
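
To check whether an environment matches this condition, the FW sections can be compared against the effective member counts of the NSGroups they are applied to. The sketch below works on data that has already been collected from NSX. It assumes each section is a dictionary with an applied_tos list of references carrying target_type and target_id, and that effective_counts maps an NSGroup ID to its effective member count. Both shapes are assumptions made for illustration, as is the function name.

```python
LARGE_GROUP_THRESHOLD = 5000  # "more than 5K effective members" per this article


def sections_applied_to_large_groups(sections, effective_counts,
                                     threshold=LARGE_GROUP_THRESHOLD):
    """Return display names of sections whose AppliedTo references a large NSGroup.

    sections: list of dicts (assumed shape) with "display_name" and "applied_tos".
    effective_counts: dict mapping NSGroup ID to its effective member count.
    """
    flagged = []
    for section in sections:
        for ref in section.get("applied_tos", []):
            is_group = ref.get("target_type") == "NSGroup"
            count = effective_counts.get(ref.get("target_id"), 0)
            if is_group and count > threshold:
                flagged.append(section.get("display_name"))
                break  # one large group is enough to flag the section
    return flagged
```

Sections returned by this check are the ones a new container has to wait on, and are the candidates for the workaround in this article.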
 
 

Resolution

NSX 2.5.2 and later releases include optimizations that make logical-port evaluation and addition to an NSGroup faster. Also, for PAS/TAS deployments, starting with NCP 3.1.1, NCP creates FW sections/rules with an “IPSet” as the source and “AppliedTo” set to DFW wherever applicable, instead of “AppliedTo” pointing to a large NSGroup.

Workaround:
  1. Create an IPSet containing the container network IP block.
  2. For each global running ASG, find its corresponding FW section.
  3. Make a copy of each FW section (including its rules) found in Step 2, leaving AppliedTo as DFW at the FW section level. Alternatively, create one FW section and copy into it all rules from the FW sections found in Step 2.
  4. For each rule in the copied FW section, set the source to the ID of the IPSet created in Step 1.
  5. Unbind all global running ASGs by using the cf command:
    1. cf unbind-running-security-group <asg-name>
 
The above steps avoid the delay caused by container logical-port evaluation and addition to the NSGroup, so the rules are applied to the container filter almost immediately after container creation.
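
Steps 1, 3 and 4 of the workaround can be sketched as plain data transformations. This is an illustration under stated assumptions, not NCP's or NSX's own implementation: the helper names are hypothetical, the dictionary field names (display_name, ip_addresses, applied_tos, sources, target_type, target_id, id, _revision) are assumed to match the objects exported from NSX, and omitting applied_tos on the copied section is assumed to leave it at its default scope (DFW). Sending the resulting objects to NSX is left out.

```python
import copy


def build_ipset(display_name, cidr_blocks):
    """Step 1: describe an IPSet holding the container network IP block(s)."""
    return {"display_name": display_name, "ip_addresses": list(cidr_blocks)}


def copy_section_with_ipset_source(section, rules, ipset_id, suffix="-ipset"):
    """Steps 3 and 4: copy a section and its rules, sourcing rules from the IPSet.

    Assumption: removing applied_tos from the copied section leaves it
    applied to DFW rather than to the large NSGroup.
    """
    new_section = copy.deepcopy(section)
    for key in ("id", "_revision", "applied_tos"):
        new_section.pop(key, None)  # drop identity and the NSGroup scope
    new_section["display_name"] = section["display_name"] + suffix

    new_rules = []
    for rule in rules:
        new_rule = copy.deepcopy(rule)
        for key in ("id", "_revision"):
            new_rule.pop(key, None)  # copies must not reuse the original identity
        # Step 4: the source becomes the IPSet created in Step 1.
        new_rule["sources"] = [{"target_type": "IPSet", "target_id": ipset_id}]
        new_rules.append(new_rule)
    return new_section, new_rules
```

The originals are deep-copied and left unmodified, so the existing ASG sections stay in place until the global ASGs are unbound in Step 5.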