Network connectivity issue after Upgrade to 4.1.2.5

Article ID: 376943

Updated On:

Products

VMware NSX

Issue/Introduction

In NSX-T 3.2.2, a Gateway Firewall is enabled on the tier-1 router with a default_rule under Policy_Default_Infra-tier1_### that allows ANY-ANY traffic. The rules are configured as 'Stateful: NO'. In 3.2.2, despite this configuration, the rule was functioning as stateful in MP (Manager). After upgrading to 4.1.2.5, the rule began functioning as stateless, which is the expected behavior for this configuration. As a result, some communication between subnets started to fail after the upgrade.

Environment

  • VMware NSX-T Data Center 3.x
  • VMware NSX-T Data Center 4.1.x

Cause

Default firewall sections are stateless in Policy mode:

3.2.2 configuration:

In Manager view:


The stateful flag is kept at the FirewallSection level, so a GET /firewall/sections/<section-uuid> request returns the stateful setting of the section.


        "autoplumbed": false,
        "category": "Default",
        "comments": "Default section unlock comment",
        "description": "default.Policy_Default_Infra-tier1",
        "display_name": "Policy_Default_Infra-tier1",
        "enforced_on": "LOGICALROUTER",
        "id": "2f2620c5-####-####-####-b145a8c27157",
        "is_default": false,
        "lock_modified_by": "nsx_policy",
        "lock_modified_time": 1626299713556,
        "locked": false,
        "priority": 90999999,
        "resource_type": "FirewallSection",
        "rule_count": 1,
        "section_type": "LAYER3",
        "stateful": true,                   <-------------------- stateful
        "tags": [
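
The check described above can be scripted. The following is a minimal sketch, not an official procedure: it assumes the Manager API is reachable under the /api/v1 prefix with basic authentication, and the helper names (section_url, fetch_section, is_stateful) are hypothetical. Only the relative path /firewall/sections/<section-uuid> and the "stateful" field come from this article.

```python
import base64
import json
import urllib.request


def section_url(manager: str, section_id: str) -> str:
    """Build the Manager API URL for one firewall section.

    Assumption: the API is served under /api/v1; the article only
    gives the relative path /firewall/sections/<section-uuid>.
    """
    return f"https://{manager}/api/v1/firewall/sections/{section_id}"


def fetch_section(manager, section_id, user, password, context=None):
    """GET the section and return the parsed JSON body as a dict.

    Assumption: basic authentication is accepted. Pass an
    ssl.SSLContext as `context` if the certificate needs special handling.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        section_url(manager, section_id),
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, context=context) as response:
        return json.load(response)


def is_stateful(section: dict) -> bool:
    """Return the section-level stateful flag.

    Assumption: a missing "stateful" key is treated as not stateful.
    """
    return bool(section.get("stateful", False))
```

For example, is_stateful(fetch_section("<manager-fqdn>", "<section-uuid>", "<user>", "<password>")) would return True for the 3.2.2 section shown above.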

Resolution

According to the configuration guide:

If a tier-1 gateway has both SNAT and gateway firewall (GWFW) configured and if the GWFW is not configured to be stateful, you must configure NO SNAT for the tier-1 gateway's advertised subnets. Otherwise, traffic to IP addresses in these subnets will fail.

Additional Information

The 3.2.2 behavior was incorrect because the Policy and MP (Manager) stateful values were different. This happens when the sync from Policy to MP did not happen or failed. The upgrade to 4.1.2 triggered a sync between Policy and MP, which corrected the stateful flag on MP, so 4.1.2 correctly represents the intent and realizes it on the edge nodes.
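
A mismatch of this kind could be spotted before an upgrade by comparing the stateful value of each gateway policy in Policy with the matching section in MP. The following is a minimal sketch under stated assumptions: both object lists have already been retrieved and parsed into dicts, each object carries "display_name" and "stateful" fields, and objects are matched by display_name. The matching key and the function name stateful_mismatches are hypothetical and not taken from this article.

```python
def stateful_mismatches(policy_objects, mp_sections):
    """Return (display_name, policy_value, mp_value) for every pair
    whose stateful flags differ.

    Assumption: Policy objects and MP sections can be paired by
    display_name; objects with no counterpart in MP are skipped.
    """
    mp_by_name = {section["display_name"]: section for section in mp_sections}
    mismatches = []
    for policy in policy_objects:
        section = mp_by_name.get(policy["display_name"])
        if section is None:
            continue  # nothing to compare against in MP
        if policy.get("stateful") != section.get("stateful"):
            mismatches.append(
                (policy["display_name"], policy.get("stateful"), section.get("stateful"))
            )
    return mismatches


# Sample data mirroring the situation in this article:
# Policy says stateless, MP says stateful.
policy_side = [{"display_name": "Policy_Default_Infra-tier1", "stateful": False}]
mp_side = [{"display_name": "Policy_Default_Infra-tier1", "stateful": True}]
result = stateful_mismatches(policy_side, mp_side)
```

An empty result means the two views agree for every matched object; any entry marks a rule whose effective behavior may change once the sync runs.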

If the 3.2.2 setup was previously upgraded via 3.1.2 and already had gateway policies at that time, some of those gateway policies can be problematic.