Unable to delete ghosted (stale) NAT rule. Status unknown

Article ID: 403370

Products

VMware NSX

Issue/Introduction

In rare conditions, a NAT rule cannot be deleted after it has been created.

  • The NAT rule is:
    • Still visible
    • Not editable

  • The GUI may show the rule with a status of Unknown.

  • You may see recurring log entries similar to the following (timestamps will vary):
    • NAT Object Created:
      • [2025-01-23T01:23:45.678Z] <SRC NAT IP> <DST NAT IP> "PUT" "/global-manager/api/v1/global-infra/tier-0s/Stretched-Tier0/nat/USER/nat-rules/<NAT RULE NAME>" "HTTP/1.1" 200 - 305 568 259 256 "<SRC NAT IP>" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:137.0) Gecko/20100101 Firefox/137.0" "########-####-####-####-############" "<NSX_MANAGER_NAME>" "127.0.0.1:64440"
      • 2025-04-23T01:23:45.688Z INFO http-nio-127.0.0.1-64440-exec-1214 PolicyServiceImpl 6155 POLICY [nsx@6876 comp="global-manager" level="INFO" reqId="########-####-####-####-############" subcomp="global-manager" username="<USER_NAME>"] Entity /global-infra/tier-0s/Stretched-Tier0/nat/USER/nat-rules/<NAT RULE NAME> does not exist, creating
  • The following log entries may repeat due to multiple delete attempts:
    • NAT Object Exists:
      • /var/log/gmanager/gmanager.1.log:2025-01-23T11:23:45.675Z INFO http-nio-127.0.0.1-64440-exec-1236 AllSiteConsolidatedStatusStrategy 6155 POLICY [nsx@6876 comp="global-manager" level="INFO" reqId="########-####-####-####-############" subcomp="global-manager" username="<USER>"] SGS: Using strategy type: ALL_SITE_BASED for intent: /global-infra/tier-0s/Stretched-Tier0/nat/USER/nat-rules/<NAT RULE NAME> with gmObjectVersion: 2
    • NAT Object Doesn't Exist:
      • /var/log/gmanager/gmanager.log:2025-01-23T21:23:45.678Z WARN ScatterGatherService-1-31 PolicyRealizationFacadeHelper 6155 POLICY [nsx@6876 comp="global-manager" level="WARNING" subcomp="global-manager"] NsxTRestException while executing request Policy object path=[/global-infra/tier-0s/Stretched-Tier0/nat/USER/nat-rules/<NAT RULE NAME>] does not exist.
      • /var/log/gmanager/gmanager.log:Caused by: org.springframework.web.client.HttpClientErrorException$NotFound: 404 Not Found: "{<EOL> "httpStatus" : "NOT_FOUND",<EOL> "error_code" : 500090,<EOL> "module_name" : "Policy",<EOL> "error_message" : "Policy object path=[/global-infra/tier-0s/Stretched-Tier0/nat/USER/nat-rules/<NAT RULE NAME>] does not exist."<EOL>}"
      • {...}
    • NAT Object Exists:
      • /var/log/gmanager/gmanager.1.log:2025-01-23T11:23:45.675Z INFO http-nio-127.0.0.1-64440-exec-1236 AllSiteConsolidatedStatusStrategy 6155 POLICY [nsx@6876 comp="global-manager" level="INFO" reqId="########-####-####-####-############" subcomp="global-manager" username="<USER>"] SGS: Using strategy type: ALL_SITE_BASED for intent: /global-infra/tier-0s/Stretched-Tier0/nat/USER/nat-rules/<NAT RULE NAME> with gmObjectVersion: 2
      • {...}
    • {...}
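
The alternating "exists" / "does not exist" entries above can be spotted with a small script. The following is a minimal sketch, not a supported tool: the regular expressions are derived only from the sample lines in this article, the function names `summarize` and `suspected_stale` are hypothetical, and it assumes rule IDs contain no whitespace. Actual log wording may differ between releases.

```python
import re
from collections import Counter

# NAT rule policy path as it appears in the sample log lines.
# Assumption: the rule ID has no whitespace and no closing bracket.
NAT_PATH = r"(?P<path>/global-infra/tier-0s/[^/\s]+/nat/[^/\s]+/nat-rules/[^\s\]]+)"

# One pattern per kind of log entry shown in this article.
PATTERNS = {
    "created": re.compile(r"Entity " + NAT_PATH + r" does not exist, creating"),
    "status_lookup": re.compile(r"for intent: " + NAT_PATH + r" with gmObjectVersion"),
    "not_found": re.compile(
        r"NsxTRestException .*path=\[" + NAT_PATH + r"\] does not exist"
    ),
}


def summarize(lines):
    """Count each kind of log entry per NAT rule path."""
    counts = {}
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts.setdefault(match.group("path"), Counter())[kind] += 1
                break
    return counts


def suspected_stale(counts):
    """Return paths that were looked up as existing and also reported missing."""
    return sorted(
        path
        for path, kinds in counts.items()
        if kinds["status_lookup"] and kinds["not_found"]
    )
```

To use it, pass any iterable of log lines, for example an open file handle for /var/log/gmanager/gmanager.log, to `summarize` and then pass the result to `suspected_stale`. A path in the output is only a hint that the rule may be affected; it does not confirm the issue.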



Environment

  • NSX 4.1.x and later

Cause

  • A rare race condition caused queue issues in RoaringBitmap, an open source library, resulting in bitmap cardinality corruption.
  • This corruption caused an illegal state exception.

Resolution

  • This issue will be resolved in a future release.

Additional Information

  • The following workaround may be used:
    • Perform a rolling reboot of all 3 NSX Manager nodes, one at a time:

        1) Reboot the first NSX Manager.
        2) SSH to a Manager as the admin user and check cluster health: get cluster status
        3) When all services report up on all 3 NSX Manager nodes, reboot the next Manager.
        4) Repeat steps 2 and 3 for the third Manager.
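
The ordering of the rolling reboot above can be sketched as a small loop. This is an illustration of the sequencing only, under stated assumptions: `rolling_reboot`, `reboot`, and `cluster_healthy` are hypothetical names, and the two callables stand in for whatever you use to reboot a node and to interpret the output of get cluster status. The sketch does not call any NSX API. It also checks health after the last reboot, which goes slightly beyond the numbered steps.

```python
import time
from typing import Callable, Sequence


def rolling_reboot(
    nodes: Sequence[str],
    reboot: Callable[[str], None],        # hypothetical: reboots one Manager node
    cluster_healthy: Callable[[], bool],  # hypothetical: True when all services report up
    poll_seconds: float = 30.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Reboot nodes one at a time, waiting for a healthy cluster in between."""
    for node in nodes:
        reboot(node)
        # Poll until the cluster is healthy before touching the next node.
        for _ in range(max_polls):
            if cluster_healthy():
                break
            sleep(poll_seconds)
        else:
            # Never continue to the next node while the cluster is degraded.
            raise TimeoutError(
                f"cluster not healthy after rebooting {node}; stopping"
            )
```

The loop stops with an error instead of rebooting the next node if the cluster does not recover, so that at most one Manager is down at any time.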