VMware NSX
VMware Tanzu
A PowerCLI automation script was intended to create a single NAT rule for a specific organization (ORG) in a Tanzu Kubernetes Grid (TKG) / NSX-T NCP-integrated environment. Instead, running the script resulted in:
The new NAT rule being created successfully
All existing NAT rules for other ORGs being deleted or overwritten
This caused widespread networking impact across multiple Kubernetes namespaces and ORG environments due to lost NAT flows and broken routing.
The resolution is to re-create all deleted NAT rules manually to restore production traffic.
Best Practices:
Follow the recommendations below before running any script in a production environment:
1. Always Retrieve the Existing NAT Rule Set Before Adding a New Rule
2. Validate Script Logic for Declarative NSX-T Policy API
3. Avoid PUT for Single Rule Creation Where Possible
4. Test All Automation in a Non-Production Environment
5. Add Safety Checks to Automation
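As an illustration of recommendation 5, the following is a minimal sketch of a pre-flight check that refuses to apply a payload if it would drop any rule that currently exists. The function name, the rule IDs, and the dictionary shape of a rule are hypothetical and are not taken from the NSX-T API; a real script would feed it the rule list it retrieved from its own environment.

```python
from typing import Dict, List

Rule = Dict[str, str]


def check_no_rules_lost(existing: List[Rule], proposed: List[Rule]) -> None:
    """Raise ValueError if the proposed rule set drops any existing rule ID."""
    existing_ids = {rule["id"] for rule in existing}
    proposed_ids = {rule["id"] for rule in proposed}
    missing = existing_ids - proposed_ids
    if missing:
        raise ValueError(
            f"Refusing to apply: rules would be removed: {sorted(missing)}"
        )


# Hypothetical rule sets for illustration only.
existing = [
    {"id": "org-a-snat", "action": "SNAT"},
    {"id": "org-b-snat", "action": "SNAT"},
]
# Safe: the payload carries every existing rule plus the new one.
safe_payload = existing + [{"id": "org-c-snat", "action": "SNAT"}]
# Unsafe: the payload carries only the new rule.
unsafe_payload = [{"id": "org-c-snat", "action": "SNAT"}]

check_no_rules_lost(existing, safe_payload)  # passes silently

try:
    check_no_rules_lost(existing, unsafe_payload)
except ValueError as err:
    print(err)
```

Running the check before any write call means the script stops with an error instead of sending a payload that is missing other ORGs' rules.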
The following are potential causes that can result in all NAT rules being removed.
1a. NAT Rules API Used by the Script Was Performing a Full Object Replace
A PUT call replaces the entire NAT rule list, not just the single rule being added. NSX-T treats the submitted payload as a statement that:
"This is the complete and only desired state of NAT rules for this Tier-1/Tier-0."
As a result:
All existing NAT rules were wiped
Only the newly submitted rule remained
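The difference between the two outcomes can be sketched with an in-memory model. This is not NSX-T code: the store is a plain dictionary, and the function names and rule IDs are hypothetical. It only illustrates why a payload containing one rule leaves one rule behind under replace semantics.

```python
from typing import Dict

RuleSet = Dict[str, dict]


def replace_all(store: RuleSet, payload: RuleSet) -> RuleSet:
    """Full replace: the payload becomes the entire rule set."""
    return dict(payload)


def merge(store: RuleSet, payload: RuleSet) -> RuleSet:
    """Merge: existing rules are kept; payload rules are added or updated."""
    merged = dict(store)
    merged.update(payload)
    return merged


# Hypothetical existing rules for two ORGs.
store = {
    "org-a-snat": {"action": "SNAT"},
    "org-b-snat": {"action": "SNAT"},
}
# The script submits only the rule it wants to add.
payload = {"org-c-snat": {"action": "SNAT"}}

after_put = replace_all(store, payload)
after_merge = merge(store, payload)

print("after replace:", sorted(after_put))
print("after merge:  ", sorted(after_merge))
```

Under replace semantics only `org-c-snat` survives; under merge semantics all three rules are present.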
1b. Improper Use of PowerCLI / REST Calls
NSX-T Policy API operates declaratively: the submitted JSON structure is treated as the desired state, so a payload that contains only one NAT rule is not read as an addition to the existing rules.
Many destructive automation issues come from misunderstanding the difference between PUT (replace) and PATCH (modify).
NSX-T Manager does not maintain implicit rule history; overwritten NAT rules are unrecoverable without API or Manager backups.
VMware strongly recommends a "GET → Modify → PUT" workflow for any Policy API automation.
When integrating Tanzu/NCP, NAT rules often map to Kubernetes namespaces, pods, load balancer IPs, and ORG contexts, so an overwrite can cause widespread cluster outages.
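Because overwritten NAT rules cannot be recovered without a backup, a script can save a snapshot of the retrieved rules before it changes anything. The following is a minimal sketch of that idea. The file naming scheme, the function names, and the rule contents are assumptions for illustration, and the snapshot is written to a temporary directory here only so the example is self-contained.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def snapshot_rules(rules, directory):
    """Write the rule list to a timestamped JSON file and return its path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = Path(directory) / f"nat-rules-{stamp}.json"
    path.write_text(json.dumps(rules, indent=2, sort_keys=True))
    return path


def load_snapshot(path):
    """Read a previously saved rule list back from disk."""
    return json.loads(Path(path).read_text())


# Hypothetical rules, standing in for the result of a GET call.
rules = [
    {"id": "org-a-snat", "action": "SNAT"},
    {"id": "org-b-snat", "action": "SNAT"},
]

with tempfile.TemporaryDirectory() as tmp:
    saved = snapshot_rules(rules, tmp)
    restored = load_snapshot(saved)
    print(saved.name, "contains", len(restored), "rules")
```

In practice the snapshot would go to a durable location so that the deleted rules can be re-created from it rather than from memory.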
2. The Script Did Not Retrieve and Re-Submit Existing NAT Rules
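A minimal sketch of the retrieve-and-re-submit pattern follows. It does not call NSX-T: `get_rules` and `put_rules` are hypothetical callables that a real script would back with its own API client, and `FakeBackend` is an in-memory stand-in used only to make the example runnable.

```python
import copy
from typing import Callable, Dict, List

Rule = Dict[str, str]


def add_nat_rule(
    get_rules: Callable[[], List[Rule]],
    put_rules: Callable[[List[Rule]], None],
    new_rule: Rule,
) -> List[Rule]:
    """Add one rule while re-submitting every rule that already exists."""
    current = copy.deepcopy(get_rules())      # GET the full existing set
    by_id = {rule["id"]: rule for rule in current}
    by_id[new_rule["id"]] = new_rule          # Modify: add or update one rule
    updated = list(by_id.values())
    put_rules(updated)                        # PUT the complete set back
    return updated


class FakeBackend:
    """In-memory stand-in for the rule store; PUT replaces everything."""

    def __init__(self, rules: List[Rule]) -> None:
        self._rules = list(rules)

    def get(self) -> List[Rule]:
        return list(self._rules)

    def put(self, rules: List[Rule]) -> None:
        self._rules = list(rules)


# Hypothetical existing rules for two ORGs.
backend = FakeBackend([
    {"id": "org-a-snat", "action": "SNAT"},
    {"id": "org-b-snat", "action": "SNAT"},
])

add_nat_rule(backend.get, backend.put, {"id": "org-c-snat", "action": "SNAT"})
print([rule["id"] for rule in backend.get()])
```

Because the PUT payload is built from the retrieved set, the replace semantics of the backend no longer remove the other ORGs' rules.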
Please see the following reference documentation:
Install NCP in a Tanzu Application Service Environment
Deploying Elastic Application Runtime with NSX-T Networking