NAT stops working on NSX-T T0 or T1 Gateway


Article ID: 317788


Updated On:

Products

VMware NSX

Issue/Introduction

Symptoms:

NAT rules stop working on NSX-T T0 or T1 gateways.

NAT rules appear to be missing from the T0/T1 interface when checked via the CLI.

This issue affects all versions up to 3.1.x.


Relevant log location:
In /var/log/proton/nsxapi.log, the following message indicates that an NPE (null pointer exception) is generated when the StateSync Sharding leader syncs the NAT configuration with the other two managers:

2021-10-26T05:09:14.690Z ERROR FullSyncMsgLoader AbstractFullStateSyncDataBuilder - - [nsx@6876 comp="nsx-manager" errorCode="MP4717" level="ERROR" subcomp="manager"] Error happened when provider com.vmware.nsx.management.firewall.sync.ccp.NatSectionAndRuleSyncMessageProvider@26357244 is converting messages from id LogicalRouter/969815bd-d958-409d-93e1-850c2011bd6a, skip it.
java.lang.NullPointerException: null
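
The log check above can be sketched as a small shell function. This is a minimal sketch, not a supported tool: the function name `find_nat_sync_npe` is hypothetical, and it assumes the NullPointerException line immediately follows the NatSectionAndRuleSyncMessageProvider error line, as in the excerpt above. It prints each affected LogicalRouter ID once.

```shell
# Hypothetical helper: report the logical routers whose NAT full sync hit an NPE.
# Assumes the NullPointerException line directly follows the provider error line.
find_nat_sync_npe() {
    # Default to the log location named in this article.
    log_file="${1:-/var/log/proton/nsxapi.log}"
    awk '
        /java\.lang\.NullPointerException/ && prev ~ /NatSectionAndRuleSyncMessageProvider/ {
            # Extract "LogicalRouter/<uuid>" from the preceding error line.
            if (match(prev, /LogicalRouter\/[0-9a-f-]+/))
                print substr(prev, RSTART, RLENGTH)
            else
                print prev
        }
        { prev = $0 }
    ' "$log_file" | sort -u
}
```

Run `find_nat_sync_npe` with no argument to scan the default log, or pass the path to a copy of the log file.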


Check if there are any NAT rules with logging=null in Corfu.

In a shell on a manager, run the following command to write all NAT rules to a file:
/opt/vmware/bin/corfu_tool_runner.py -r nsx-manager -t NatRule > /tmp/natrule.txt

Then, use the following command to find any rules with logging=null:
less /tmp/natrule.txt | grep "ruleId\|logging"
  ruleId=17422,
  logging=<null>,
  ruleId=2231,
  logging=<null>,
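
On a system with many NAT rules, the paired grep output can be long. As a minimal sketch, the function below prints only the ruleId values whose logging field is `<null>`. The function name `list_null_logging_rules` is hypothetical, and the parsing assumes the dump format shown in the sample output above: one field per line, with each rule's ruleId line appearing before its logging line.

```shell
# Hypothetical helper: list ruleId values of NAT rules with logging=<null>.
# Assumes each rule's "ruleId=" line precedes its "logging=" line in the dump.
list_null_logging_rules() {
    # Default to the dump file written by the corfu_tool_runner.py command.
    dump_file="${1:-/tmp/natrule.txt}"
    awk '
        /ruleId=/ {
            # Keep only the numeric ID between "ruleId=" and the trailing comma.
            id = $0
            sub(/.*ruleId=/, "", id)
            sub(/,.*/, "", id)
        }
        /logging=<null>/ { if (id != "") print id }
    ' "$dump_file"
}
```

With the sample output above, `list_null_logging_rules /tmp/natrule.txt` would print 17422 and 2231, one per line.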

 


Environment

VMware NSX-T Data Center

Cause

During a full sync, the StateSync Sharding leader cannot handle NAT rules with logging=null. It generates an NPE (null pointer exception) and stops sending to the other managers any NAT rules that come after the rules with logging=null. As a result, the other two managers do not have the full NAT configuration and, in turn, do not send all NAT rules to the edges. This is why a NAT rule with logging=null may not take effect.

Resolution

Because the problem is specific to StateSync, and StateSync is deprecated in 3.2, upgrade to NSX-T 3.2 or later to resolve this issue.


Workaround:
1. SSH to all NSX Managers as root
2. Use the following two commands to find any NAT rules with logging=null:
/opt/vmware/bin/corfu_tool_runner.py -r nsx-manager -t NatRule > /tmp/natrule.txt
less /tmp/natrule.txt | grep "ruleId\|logging"

3. Change the value of logging to false on all the rules listed in the Step 2 output.

Additional Information

Impact/Risks:
North-south (N-S) traffic is impacted.