SNAT Port Usage On Gateway is High alarm



Article ID: 374440



Products

VMware NSX

Issue/Introduction

  • Title: Alarm for snat_port_usage_on_gateway_is_high
  • Event ID: nat.snat_port_usage_on_gateway_is_high

  • Alarm Description: SNAT port usage on the Gateway is high.
    • Purpose: The alarm indicates high port usage on an SNAT IP address, which can lead to new flows being dropped. It is raised when port usage exceeds 80% of the total available range.
    • Reason: Any TCP/UDP flow that matches a NAT rule with the SNAT action also undergoes source port translation (typically referred to as PAT). The range of ports available for source port translation is limited per SNAT translation IP address, because TCP and UDP port numbers are 16 bits long and part of that 16-bit range is reserved as well-known ports. This leaves only a subset of ports available for PAT. Therefore, at any time only a fixed number of simultaneous flows can undergo SNAT translation for an IP address used as the translation IP across SNAT rules. When the number of simultaneous flows exceeds a system-defined threshold of the overall available port range, this alarm is generated.
    • Impact: Once the port range is 100% utilized, new TCP/UDP flows cannot allocate a port for translation and are dropped. In this condition, high datapath CPU utilization may also be observed.
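
The port arithmetic behind the 80% threshold can be illustrated with a minimal sketch. It assumes that only the IANA well-known range (ports 0-1023) is excluded from PAT; the exact range NSX reserves per SNAT IP is not stated in this article, so the constants and function names below are illustrative rather than the product's actual values.

```python
# Minimal sketch of the SNAT port budget for one translation IP.
# ASSUMPTION: only the well-known ports 0-1023 are reserved. The real
# NSX reservation is not documented here and may differ.
TOTAL_PORTS = 2 ** 16        # a 16-bit port field gives 65536 values
RESERVED_PORTS = 1024        # assumed reserved pool (well-known ports)
ALARM_THRESHOLD = 0.80       # alarm threshold stated in the Purpose above


def usable_ports(total=TOTAL_PORTS, reserved=RESERVED_PORTS):
    """Ports left for PAT after removing the reserved pool."""
    return total - reserved


def port_usage(active_flows, usable=None):
    """Fraction of the usable port range consumed by simultaneous flows."""
    if usable is None:
        usable = usable_ports()
    return active_flows / usable


def alarm_raised(active_flows):
    """True when usage exceeds the alarm threshold."""
    return port_usage(active_flows) > ALARM_THRESHOLD
```

Under these assumptions a single SNAT IP has 64512 usable ports, so the alarm would fire at roughly 51,600 simultaneous flows and new flows would be dropped at 64512.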

Environment

VMware NSX-T Data Center

VMware NSX

Resolution

For 3.2.0 and above

  • Steps to Resolve
    • Check the usage of the SNAT IP by inspecting the UDP/TCP flows on the NSX Edge node where the SNAT IP is used, as follows:
      • Log in as the admin user on the Edge node and run the NSX CLI command `get firewall <LR_INT_UUID> connection state`. LR_INT_UUID is the UUID of the interface to which the SNAT rule is applied. If the SNAT rule is not applied to a specific interface, use the UUID of any Uplink interface on the logical router.
      • Review the UDP/TCP flows listed in the output.
      • Check the flows for signs of a denial-of-service attack or an anomalous burst.
      • For any denial-of-service attack, consider limiting use of the NAT rule by the source of the attack (for example, by applying appropriate firewall rules).
    • If the traffic appears to be within normal load but the alarm threshold is still reached, consider adding more SNAT IP addresses to distribute the load, or route new traffic to another Edge node.
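
The triage steps above can be sketched in a few lines of Python. This is only an illustration: the format of the connection state output is not reproduced in this article, so the sketch assumes the flows have already been extracted into hypothetical `(protocol, source_ip, snat_ip)` tuples. The function names and the 64512-port budget (which assumes only ports 0-1023 are reserved) are likewise assumptions, not NSX values.

```python
from collections import Counter
from math import ceil


def top_sources(flows, n=3):
    """Count TCP/UDP flows per source IP and return the n busiest sources.

    `flows` is an iterable of hypothetical (protocol, source_ip, snat_ip)
    tuples; extracting them from the CLI output is left to the operator.
    """
    counts = Counter(
        src for proto, src, _snat in flows if proto.lower() in ("tcp", "udp")
    )
    return counts.most_common(n)


def snat_ips_needed(peak_flows, usable_per_ip=64512, target=0.80):
    """Estimate how many SNAT IPs keep peak usage at or below `target`.

    ASSUMPTION: 64512 usable ports per IP; the real budget may differ.
    """
    return ceil(peak_flows / (usable_per_ip * target))
```

A single source that dominates the tally is a candidate for the firewall-based limiting described above, while an evenly spread load that still approaches the threshold points toward adding SNAT IPs. For instance, under the stated assumptions a peak of 100,000 simultaneous flows would call for two SNAT IPs.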

Maintenance window required for remediation: No