NSX-T NAT rule(s) not working after Edge dataplane restart or Maintenance Mode exit

Article ID: 322632

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • You are using NSX-T 3.1.2.
  • You are using SNAT/DNAT on an NSX-T Gateway.
  • Traffic disruption may occur for flows that use gateways configured for SNAT/DNAT after either of the following:
      • A dataplane service restart on the NSX-T edge node where the gateway resides.
      • The NSX-T edge node entering and then exiting maintenance mode, which causes a dataplane service restart.
  • Application traffic is impacted or disrupted as NAT rules stop working.
  • The firewall settings of the DR downlink interface on a gateway configured for SNAT/DNAT can be checked with the following command:
get logical-router interface <logical-router-interface-UUID> | json
  • In the output below, '"enable-firewall": false' indicates that you are encountering this issue:
 {
  "access_vlan": "untagged",
  "admin": "up",
  "arp_proxy_table": [],
  "connect-to-service-plane-ew": false,
  "connect-to-service-plane-ns": false,
  "dad-mode": "LOOSE",
  "dad-profile": "(1 sec, 3 rtr)",
  "enable-firewall": false,
  "enable-firewall-ike": false,
  "enable-firewall-pbr": false,
  "enable-firewall-rule": false,
 (...)
 }
  • If you are not encountering this issue, the setting will be 'true'.
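
The check described above can be scripted once the JSON output of the command has been saved. The following is a minimal sketch, not part of any NSX tooling: the function name `nat_firewall_disabled` and the trimmed sample are hypothetical, and treating a missing key as "not affected" is an assumption.

```python
import json


def nat_firewall_disabled(interface_json: str) -> bool:
    """Return True when the interface reports "enable-firewall": false."""
    interface = json.loads(interface_json)
    # Assumption: a missing "enable-firewall" key is treated as "not affected".
    return interface.get("enable-firewall") is False


# Trimmed, hypothetical sample of the command output (the full output has more keys).
sample = """
{
  "admin": "up",
  "enable-firewall": false,
  "enable-firewall-rule": false
}
"""

print(nat_firewall_disabled(sample))  # prints True
```

A result of True corresponds to the affected state shown in the output above.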


Environment

VMware NSX-T Data Center 3.x
VMware NSX-T
VMware NSX-T Data Center

Cause

This is caused by a race condition and does not always occur. The issue can manifest when the DR instance is realized before the SR instance on the edge node.
This is specific to NSX-T 3.1.2; earlier versions handle this ordering and do not encounter the race condition.

Resolution

This issue is resolved in NSX-T 3.2.x and later, available at VMware Customer Connect.

Workaround:
You can use one of the options below to work around this issue:
  1. Reboot the edge node.
  2. Push a fresh configuration from the NSX Manager: in the GUI, change the description field of the Edge node (or another editable field of the gateway) and save the change.
  3. Run an API call to push a fresh configuration to the Edge node.
First, retrieve the edge node configuration:
GET /api/v1/transport-nodes/<Edge-Node-UUID>
Then copy the unedited result of the GET call into the body of the following PUT call:
PUT /api/v1/transport-nodes/<Edge-Node-UUID>
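
The GET-then-PUT sequence in option 3 could be scripted roughly as follows. This is a sketch under stated assumptions rather than a supported tool: the helper names `make_sender` and `repush_edge_config` are hypothetical, HTTP basic authentication against the NSX Manager is assumed, and only the endpoint path comes from the steps above.

```python
import base64
import json
import ssl
import urllib.request
from typing import Callable, Optional

# A sender takes (method, url, body) and returns the decoded JSON response.
Sender = Callable[[str, str, Optional[dict]], dict]


def make_sender(user: str, password: str, verify_tls: bool = True) -> Sender:
    """Build a sender using HTTP basic authentication (an assumption here)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    context = ssl.create_default_context()
    if not verify_tls:
        # Only for lab managers with self-signed certificates.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE

    def send(method: str, url: str, body: Optional[dict] = None) -> dict:
        data = json.dumps(body).encode() if body is not None else None
        request = urllib.request.Request(url, data=data, method=method)
        request.add_header("Authorization", f"Basic {token}")
        request.add_header("Content-Type", "application/json")
        with urllib.request.urlopen(request, context=context) as response:
            return json.loads(response.read().decode())

    return send


def repush_edge_config(manager_url: str, edge_node_uuid: str, send: Sender) -> dict:
    """GET the transport node, then PUT the same body back without edits."""
    url = f"{manager_url.rstrip('/')}/api/v1/transport-nodes/{edge_node_uuid}"
    current = send("GET", url, None)
    # The body is sent back unchanged, as the workaround requires.
    return send("PUT", url, current)
```

A call might look like `repush_edge_config("https://<nsx-manager>", "<Edge-Node-UUID>", make_sender("admin", "<password>"))`, with the placeholders replaced by values from your environment. Passing the sender in as a parameter keeps the GET/PUT logic separate from the transport, so it can be exercised without a live manager.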