SNAT port usage high alarm triggered even though active sessions are not high


Article ID: 327343



Products

VMware NSX Networking

Issue/Introduction

A critical alarm is triggered for SNAT port usage on a logical router for a specific SNAT IP. However, the active flow count for that SNAT IP is well below the maximum SNAT port limit per IP, which is close to 60,000:

 

nsxedge-01(tier0_sr[15])> get firewall connection state | count 10.10.10.10

Mon Apr 17 2023 UTC 17:44:48.822

Number of lines that match pattern '10.10.10.10': 15079
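To put that count in context, the following rough sketch compares the observed flow count with the per-IP limit. It assumes a limit of exactly 60,000 (the article only says "close to 60,000") and uses the 80% high threshold reported in the alarm message; the function and constant names are hypothetical, and the number of matching lines is treated as an approximation of ports in use.

```python
# Approximate per-IP SNAT port limit quoted in this article (assumed value).
SNAT_PORT_LIMIT = 60_000
# High-usage threshold reported in the alarm message, in percent.
HIGH_THRESHOLD_PCT = 80


def snat_usage_pct(active_flows: int, limit: int = SNAT_PORT_LIMIT) -> float:
    """Return active flows as a percentage of the per-IP port limit."""
    return 100.0 * active_flows / limit


# Flow count taken from the CLI example above.
usage = snat_usage_pct(15079)
print(f"usage: {usage:.1f}% (alarm threshold: {HIGH_THRESHOLD_PCT}%)")
print("above threshold" if usage >= HIGH_THRESHOLD_PCT else "below threshold")
```

With the example count, usage comes out at roughly a quarter of the limit, far from the level at which the alarm would be expected.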

 

If the SNAT port usage limit were actually reached, the 'Failed NAT translation' counter would be incrementing. In this case it is not:

 

nsxedge-01(tier0_sr[15])> get firewall interface stats | find Failed.NAT.trans

Tue May 02 2023 UTC 10:19:42.495

Failed NAT translation                 : 0
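Checking whether the counter is incrementing means comparing two readings taken some time apart. A minimal sketch of that comparison follows, assuming the CLI output has been captured as text; the function name is hypothetical, and the same example line is used for both readings only to exercise the code.

```python
import re


def failed_nat_translations(cli_output: str) -> int:
    """Extract the 'Failed NAT translation' counter from captured CLI output."""
    match = re.search(r"Failed NAT translation\s*:\s*(\d+)", cli_output)
    if match is None:
        raise ValueError("counter not found in output")
    return int(match.group(1))


# Example line from the output above; in practice, capture two separate readings.
sample = "Failed NAT translation                 : 0"
earlier = failed_nat_translations(sample)
later = failed_nat_translations(sample)
print("incrementing" if later > earlier else "not incrementing")
```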

 


Symptoms:

The critical alarm "SNAT Port Usage On Gateway Is High" is raised continuously for an SNAT IP, even though there are few active connections.

 

NSX alarm in syslog:

 

2023-04-14T18:05:10.037Z nsxmgr-03 NSX 5281 MONITORING [nsx@6876 alarmId="927cab4a-e760-42b2-8faa-36f3a55a14b5" alarmState="OPEN" comp="nsx-manager" entId="62a03bb6-fb1e-4e20-983a-a7279c6a0ca6" errorCode="MP701099" eventFeatureName="nat" eventSev="CRITICAL" eventState="On" eventType="snat_port_usage_on_gateway_is_high" level="FATAL" nodeId="62a03bb6-fb1e-4e20-983a-a7279c6a0ca6" subcomp="monitoring"] SNAT ports usage on logical router 42ecb79b-5ad0-470e-bc6d-c3e599c41862 for SNAT IP 10.10.10.10 has reached the high threshold value of 80%. New flows will not be SNATed when usage reaches the maximum limit.

 

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
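When searching collected syslog for this alarm, the event type and the SNAT IP can be pulled out of each line. The sketch below is one possible way to do that, assuming the message format matches the example above; the function name is hypothetical and the sample line is a shortened copy of that example.

```python
import re

# Shortened copy of the example alarm line above.
line = (
    '2023-04-14T18:05:10.037Z nsxmgr-03 NSX 5281 MONITORING [nsx@6876 '
    'alarmState="OPEN" eventFeatureName="nat" eventSev="CRITICAL" '
    'eventType="snat_port_usage_on_gateway_is_high"] SNAT ports usage on '
    'logical router 42ecb79b-5ad0-470e-bc6d-c3e599c41862 for SNAT IP '
    '10.10.10.10 has reached the high threshold value of 80%.'
)


def parse_snat_alarm(text: str):
    """Return alarm state, SNAT IP and threshold, or None if not this alarm."""
    # Collect key="value" pairs from the structured-data part of the line.
    fields = dict(re.findall(r'(\w+)="([^"]*)"', text))
    if fields.get("eventType") != "snat_port_usage_on_gateway_is_high":
        return None
    # Pull the SNAT IP and threshold out of the free-text message.
    match = re.search(
        r"for SNAT IP (\S+) has reached the high threshold value of (\d+)%", text
    )
    if match is None:
        return None
    return {
        "state": fields.get("alarmState"),
        "snat_ip": match.group(1),
        "threshold_pct": int(match.group(2)),
    }


alarm = parse_snat_alarm(line)
print(alarm)
```

The extracted SNAT IP can then be used as the pattern for the connection-state count shown earlier.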


Environment

VMware NSX-T Data Center

Cause

The alarm is triggered by a software bug.

Resolution

This issue is resolved in NSX-T version 4.1.1.


Workaround:

Disable the alarm under "Alarm Definitions" to stop it from appearing. This is safe because the alarm is caused by the bug, not by SNAT ports actually running out.


Additional Information

Impact/Risks:

There is no impact to production; the alarm is a false positive.