Intermittent HTTP 503 error response when authenticating NSX-T manager via LDAPS

Article ID: 375741

Products

VMware NSX
VMware NSX-T Data Center

Issue/Introduction

  • LDAPS is configured in NSX for authentication.
  • The NSX Manager UI intermittently becomes unreachable.
  • NSX Manager returns HTTP response code 503 when connecting via the API or the admin UI page.
  • Rebooting the NSX Manager temporarily resolves the issue.
  • The NSX version is lower than 4.2.
  • In the NSX Manager log /var/log/proxy/proxy-tomcat-wrapper.log, a significant number of threads in state java.lang.Thread.State: WAITING are observed with identical stack traces:

stackTrace:
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000######b3bde8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
at org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(AbstractConnPool.java:379)
at org.apache.http.pool.AbstractConnPool.access$200(AbstractConnPool.java:69)
at org.apache.http.pool.AbstractConnPool$2.get(AbstractConnPool.java:245)
- locked <0x0000######25510> (a org.apache.http.pool.AbstractConnPool$2)

  • In the NSX Manager log /var/log/proxy/envoy_access_log, HTTP response code 503 'service unavailable' is seen:

1#.##.##.#4 1#.###.##.#8 "GET" "/api/v1/node" "HTTP/1.1" 503 UAEX 0 0 60003 - "1#.##.##.#4" "vAPI/2.14.0 Java/11.0.22 (Linux; 5.10.216-1.ph4; amd64)" "9#####-####-####-####-########ca7f" "1#.###.##.#8" "-"
1#.##.##.#4  1#.###.##.#8 "GET" "/api/v1/node" "HTTP/1.1" 503 UAEX 0 0 60000 - "1#.##.##.#4" "vAPI/2.14.0 Java/11.0.22 (Linux; 5.10.216-1.ph4; amd64)" "d#####-####-####-####-########b78c" "1#.###.##.#8" "-"
1#.##.##.#4  1#.###.##.#8 "GET" "/api/v1/node" "HTTP/1.1" 503 UAEX 0 0 60001 - "1#.##.##.#4" "vAPI/2.14.0 Java/11.0.22 (Linux; 5.10.216-1.ph4; amd64)" "a#####-####-####-####-########bb67" "1#.###.##.#8" "-"

Note: In the lines above, UAEX means that the external authentication RPC call failed. The value of roughly 60000 is the timeout in milliseconds (60 seconds), and the "-" at the end of the log line indicates that the message was not delivered.
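To get a quick tally of affected requests, the access log can be filtered for entries that combine status 503 with the UAEX flag. The Python sketch below is illustrative only. It assumes the field layout seen in the sample lines above (status code, response flag, two numeric counters, then the duration in milliseconds), and the function name find_auth_timeouts is hypothetical, not part of any NSX tooling.

```python
import re

# Field layout inferred from the sample log lines in this article (an assumption):
# "METHOD" "PATH" "PROTOCOL" STATUS FLAGS <number> <number> DURATION_MS ...
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+)"\s+"(?P<path>[^"]*)"\s+"(?P<protocol>[^"]*)"\s+'
    r'(?P<status>\d{3})\s+(?P<flags>\S+)\s+\d+\s+\d+\s+(?P<duration_ms>\d+)'
)


def find_auth_timeouts(lines):
    """Return (path, duration_ms) for every entry with status 503 and flag UAEX."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # line does not follow the assumed layout
        if match["status"] == "503" and match["flags"] == "UAEX":
            hits.append((match["path"], int(match["duration_ms"])))
    return hits


# Sample entry shortened from the article (trailing fields omitted).
sample = ['1#.##.##.#4 1#.###.##.#8 "GET" "/api/v1/node" "HTTP/1.1" 503 UAEX 0 0 60003 -']
print(find_auth_timeouts(sample))  # prints [('/api/v1/node', 60003)]
```

On a copy of the log, the function could be called with an open file handle, since it accepts any iterable of lines. Many hits with durations close to 60000 ms would be consistent with the timeout described in the note above.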

Environment

VMware NSX-T Data Center
VMware NSX

Cause

The issue occurs when the auth (authentication) service on the NSX Manager is unable to establish a connection with the LDAPS server. LDAPS doesn't support a timeout setting. Consequently, any threads that were created but not serviced can remain open indefinitely, potentially leading to the LDAP server connection hanging.
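To gauge how many threads are stuck in this state, the stack traces in /var/log/proxy/proxy-tomcat-wrapper.log can be counted. The Python sketch below is a rough illustration, not a supported diagnostic. It assumes each trace begins with a "java.lang.Thread.State:" line as in the excerpt in the Issue/Introduction section, and the function name count_thread_states is hypothetical.

```python
from collections import Counter

THREAD_STATE_MARKER = "java.lang.Thread.State:"
# Frame taken from the stack trace shown in this article.
POOL_WAIT_FRAME = "org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking"


def count_thread_states(dump_text):
    """Count stack traces per thread state and those waiting on the connection pool."""
    states = Counter()
    pool_waiters = 0
    # Everything before the first marker is not part of a stack trace, so skip it.
    for block in dump_text.split(THREAD_STATE_MARKER)[1:]:
        words = block.split()
        state = words[0] if words else "UNKNOWN"  # e.g. WAITING, RUNNABLE
        states[state] += 1
        if POOL_WAIT_FRAME in block:
            pool_waiters += 1
    return states, pool_waiters
```

Passing the contents of the log file to count_thread_states returns the per-state totals and the number of traces that contain the getPoolEntryBlocking frame. A count that keeps growing between samples would match the symptom of threads that are never serviced.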

Resolution

This issue is resolved in VMware NSX 4.2.0, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

To work around the issue, reboot the NSX Manager and/or restart the auth service on the NSX Manager. However, the issue can recur until the environment has been upgraded to a fixed version.

To restart the auth service, log in to the NSX Manager as the admin user and run:

restart service auth

Additional Information

If this KB did not help resolve your issue, you can review the following KB for further troubleshooting steps: Troubleshooting NSX API Calls