Users are unable to log in to the NSX Manager UI and receive an HTTP 503 error. The affected NSX Manager is integrated with VMware Identity Manager (vIDM) for authentication, with CRL checking enabled.
In the NSX Manager log /var/log/proxy/envoy_access_log, HTTP response code 503 with UAEX can be seen for these API calls:
<Source IP address> <NSX Manager IP address> "GET" "/api/v1/node/version" "HTTP/1.1" 503 UAEX 0 0 59997 - "<Source IP address>" "vAPI/2.52.0 Java/17.0.10 (Linux; 5.10.214-1.ph4; amd64)" "6db5cf7e-####-####-####-9226406b2876" "a#######01nsx01.#####.com:443" "-"
<Source IP address> <NSX Manager IP address> "GET" "/api/v1/node" "HTTP/1.1" 503 UAEX 0 0 59996 - "<Source IP address>" "" "db640706-####-####-####-26c47e611fd1" "a#######01nsx01.#####.com" "-"
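To gauge how many requests were affected, the access log can be scanned for this pattern. The following is a minimal sketch, assuming the field layout seen in the sample entries above (quoted method, quoted path, quoted protocol, status code, response flag). The function name and regular expression are illustrative and not part of any NSX tooling.

```python
import re
from collections import Counter

# Field layout inferred from the sample entries only:
# "<METHOD>" "<PATH>" "HTTP/<ver>" <STATUS> <FLAG> ...
ENTRY = re.compile(
    r'"(?P<method>[A-Z]+)" "(?P<path>[^"]*)" "HTTP/[\d.]+" '
    r'(?P<status>\d{3}) (?P<flag>\S+)'
)


def count_503_uaex(text):
    """Count entries with status 503 and response flag UAEX, keyed by request path."""
    hits = Counter()
    for match in ENTRY.finditer(text):
        if match.group("status") == "503" and match.group("flag") == "UAEX":
            hits[match.group("path")] += 1
    return hits
```

Passing the contents of /var/log/proxy/envoy_access_log to `count_503_uaex` returns a per-path tally, which helps show whether the failures are limited to a few endpoints or affect every API call.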
The NSX Manager file located at ./config/vidm/vidm.properties confirms that vIDM integration is enabled and configured.
lb_enable=False
vidm.host_name=<vidm host name>
vidm.thumbprint=###########################################31D
vidm.vidm_enable=True
vidm.client_id.admin=nsxt-manager
vidm.client_secret.admin=<obfuscated_vidm.client_secret.admin>
node.host_name=<nsx manager hostname>
vidm.client_id.user=<nsx manager oath client id>
vidm.client_secret.user=####################fj
Verification via the following Policy API request confirms that CRL checking is enabled:
GET https://<NSX-Manager>/policy/api/v1/infra/security-global-config
{
"crl_checking_enabled": true,
...
}
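The same check can be scripted. This is a minimal sketch using only the Python standard library. It assumes HTTP Basic authentication with a local NSX account and a manager certificate trusted by the client, neither of which is stated in this article. The helper names are hypothetical.

```python
import base64
import json
import urllib.request

CONFIG_PATH = "/policy/api/v1/infra/security-global-config"


def build_request(manager, username, password):
    """Build the GET request; Basic authentication is an assumption of this sketch."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"https://{manager}{CONFIG_PATH}",
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )


def crl_checking_enabled(body):
    """Return the crl_checking_enabled flag from a JSON response body (str or bytes)."""
    return bool(json.loads(body).get("crl_checking_enabled", False))


def fetch_crl_setting(manager, username, password, timeout=10):
    """Query the manager and report whether CRL checking is enabled."""
    request = build_request(manager, username, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return crl_checking_enabled(response.read())
```

Because the manager may itself be returning HTTP 503 while the issue is active, the request carries an explicit timeout and may need to be run after the authentication service has recovered.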
The /var/log/proxy/reverse-proxy logs confirm multiple Certificate Revocation List (CRL) retrieval attempts during the time of the incident.
reverse-proxy.log:YYYY-MM-DDT00:01:03.825Z INFO Processing request ########-####-####-####-##########a3 CrlWebDirectFetcher 147691 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="http"] Fetching CRL from http://<CRL Server>/##########################CA1-4.crl
reverse-proxy.log:YYYY-MM-DDT00:01:04.374Z INFO Processing request ########-####-####-####-##########a3 CrlWebDirectFetcher 147691 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="http"] Fetching CRL from http://<CRL Server>/######GlobalRootCA.crl
reverse-proxy.log:YYYY-MM-DDT00:01:04.926Z INFO Processing request ########-####-####-####-##########a3 CrlWebDirectFetcher 147691 SYSTEM [nsx@6876 comp="nsx-manager" level="INFO" subcomp="http"] Fetching CRL from http://<CRL Server>/##########################CA1-4.crl
The proxy-tomcat-wrapper logs located in /var/log/proxy contain a thread dump showing that the CRL fetch thread was stuck in a socket read while holding a lock on the input stream.
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | "Processing request ########-####-####-####-##########a3" #97445 daemon prio=5 os_prio=0 cpu=1959727.13ms elapsed=1075747.75s tid=0x00007936d52a8320 nid=0x18ded8 runnable [0x000079365f818000]
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | java.lang.Thread.State: RUNNABLE
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.nio.ch.SocketDispatcher.read0(java.base@<version>/Native Method)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.nio.ch.SocketDispatcher.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.nio.ch.NioSocketImpl.tryRead(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.nio.ch.NioSocketImpl.implRead(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.nio.ch.NioSocketImpl.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.nio.ch.NioSocketImpl$1.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at java.net.Socket$SocketInputStream.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at java.io.BufferedInputStream.fill(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at java.io.BufferedInputStream.read1(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at java.io.BufferedInputStream.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | - locked <0x00007936af1534c8> (a java.io.BufferedInputStream)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.net.www.MeteredStream.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at java.io.FilterInputStream.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at org.bouncycastle.util.io.Streams.pipeAll(Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at org.bouncycastle.util.io.Streams.readAll(Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at org.bouncycastle.jcajce.provider.CertificateFactory.readCrl(Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at org.bouncycastle.jcajce.provider.CertificateFactory.engineGenerateCRL(Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at java.security.cert.CertificateFactory.generateCRL(java.base@<version>/Unknown Source)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CrlFetcher.readCrlFromStream(CrlFetcher.java:41)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CrlWebDirectFetcher.downloadCrlFromWeb(CrlWebDirectFetcher.java:148)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CrlWebDirectFetcher.downloadCrl(CrlWebDirectFetcher.java:44)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CrlWebDirectFetcher.fetch(CrlWebDirectFetcher.java:35)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CrlWebFetcher.fetch(CrlWebFetcher.java:50)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CdpCrlChecker.checkRevocation(CdpCrlChecker.java:109)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.CdpCrlChecker.checkRevocation(CdpCrlChecker.java:79)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.NsxTrustManager.checkCertificateValid(NsxTrustManager.java:371)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.NsxTrustManager._checkServerTrusted(NsxTrustManager.java:330)
INFO | jvm 1 | YYYY/MM/DD HH:MM:SS | at com.vmware.nsx.management.security.NsxTrustManager.checkServerTrusted(NsxTrustManager.java:297)
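The `elapsed=1075747.75s` value in the thread header corresponds to roughly 12.5 days, which is the strongest hint that the fetch never returned. As a rough aid for finding such threads in a large wrapper log, the sketch below extracts thread names and elapsed times from dump headers. It assumes the header format shown in the excerpt above. The function name and threshold are illustrative.

```python
import re

# Thread header as it appears in the excerpt:
# "<name>" #<id> ... elapsed=<seconds>s ...
HEADER = re.compile(
    r'"(?P<name>[^"]+)" #\d+ [^\n]*?elapsed=(?P<elapsed>\d+(?:\.\d+)?)s'
)


def long_running_threads(text, min_seconds=3600.0):
    """Return (thread name, elapsed seconds) for threads alive at least min_seconds."""
    found = []
    for match in HEADER.finditer(text):
        elapsed = float(match.group("elapsed"))
        if elapsed >= min_seconds:
            found.append((match.group("name"), elapsed))
    return found
```

A long elapsed time alone does not prove a hang, since many JVM threads live for the whole process lifetime. The stack frames under each hit still need to be checked for the CrlWebDirectFetcher call path.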
VMware NSX
This issue occurs because Certificate Revocation List (CRL) retrieval requests from NSX Manager have no defined timeout, so a request never times out when the CRL endpoint fails to respond. As repeated CRL fetch attempts accumulate, they consume all available worker threads. Once these threads are exhausted, authentication-related operations cannot be processed, which causes NSX Manager to return HTTP 503 errors.
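The difference a timeout makes can be illustrated outside NSX. The sketch below is not the NSX implementation or the planned fix. It is a generic Python example, with a hypothetical function name and default, showing a CRL download that gives up when the endpoint stops responding instead of holding its thread indefinitely.

```python
import urllib.request


def fetch_crl(url, timeout_seconds=5.0):
    """Download a CRL, or return None if the endpoint is unreachable or stalls.

    The timeout applies to each blocking socket operation (connect, read),
    not to the download as a whole.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return None
```

With a bound like this, a silent CRL endpoint costs the calling thread a few seconds per attempt. Without it, each attempt can block forever, which matches the worker thread exhaustion described above.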
This is a known issue and will be addressed in a future NSX release.
Workarounds:
Option 1:
Disable CRL checking. Having CRL checking disabled is the common default configuration. Refer to KB 396503 for detailed instructions.
Option 2:
Restart the authentication service when the issue occurs. Log in to the NSX Manager CLI as admin and run:
restart service auth