Compute Manager 'Connection Status' goes Down after Upgrade from 3.X or 4.0.0.1 to 4.1 and later

Article ID: 331583

Updated On:

Products

VMware NSX

Issue/Introduction

  • In the NSX UI under System->Fabric->Compute Managers, "Connection Status" shows Down.
  • The "Last Inventory Update" timestamp matches the date when NSX was upgraded.
  • Attempts to reconnect the compute manager fail with the error: "Compute manager server ###.###.###.### could not be connected, server might be un-reachable or connection details might be invalid.  Please check if compute manager certificate is valid and not revoked.  If the issue persists please check whether the https and http ports of compute manager are open in the firewall on all NSX nodes. (Error code: 7058)"
  • The log /var/log/cm-inventory/cm-inventory.log contains entries like the following:
    2023-05-05T20:57:33.609Z  WARN InventoryFetcher-5c9f6ff1-71cf-####-####-########b72 CrlWebFetcher 4518 SYSTEM [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="cm-inventory"] Couldn't get LDAP context from URI ldap:///CN=srv-root-CA(1),CN=######,CN=###,CN=Public%20Key%20Services,CN=Services,CN=Configuration,DC=example,DC=com?certificateRevocationList?base?objectClass=cRLDistributionPoint
    javax.naming.CommunicationException: ldapserver.example.com.:389
            at com.sun.jndi.ldap.Connection.<init>(Connection.java:243) ~[?:1.8.0_352]
            at com.sun.jndi.ldap.LdapClient.<init>(LdapClient.java:137) ~[?:1.8.0_352]
            at com.sun.jndi.ldap.LdapClient.getInstance(LdapClient.java:1615) ~[?:1.8.0_352]
            at com.sun.jndi.ldap.LdapCtx.connect(LdapCtx.java:2849) ~[?:1.8.0_352]
            at com.sun.jndi.ldap.LdapCtx.<init>(LdapCtx.java:347) ~[?:1.8.0_352]
    TRUNCATED….
    Caused by: java.net.ConnectException: Connection timed out (Connection timed out)
            at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_352]
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_352]
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_352]
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_352]
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_352]
            at java.net.Socket.connect(Socket.java:607) ~[?:1.8.0_352]
            at java.net.Socket.connect(Socket.java:556) ~[?:1.8.0_352]
            at java.net.Socket.<init>(Socket.java:452) ~[?:1.8.0_352]
            at java.net.Socket.<init>(Socket.java:229) ~[?:1.8.0_352]
            at com.sun.jndi.ldap.Connection.createSocket(Connection.java:380) ~[?:1.8.0_352]
            at com.sun.jndi.ldap.Connection.<init>(Connection.java:220) ~[?:1.8.0_352]
            ... 47 more
    		
    
  • The following API call to the NSX Manager shows that CRL checking is enabled:
    GET https://<manager>/api/v1/global-configs/SecurityGlobalConfig
    Result:
    ...
    "crl_checking_enabled": true,
    ...
    


Note: The preceding log excerpts are only examples. Date, time, and other variables may vary depending on your environment.

 

Environment

  • VMware NSX-T Data Center
  • VMware NSX 4.0.0.1

Cause

Starting with VMware NSX 4.1, CRL (Certificate Revocation List) checking is enabled for certificates.

CRLs are published at CRL Distribution Points (CDPs). Each CDP is listed in the certificate as a URI that indicates where the CRL can be retrieved.

A typical unsupported CDP URI resembles the following example.  The placeholder <MISSING_HOST> marks where the host is absent, between the second and third '/' characters:

ldap://<MISSING_HOST>/CN=srv-root-CA(1),CN=######,CN=###,CN=Public%20Key%20Services,CN=Services,CN=Configuration,DC=example,DC=com?certificateRevocationList?base?objectClass=cRLDistributionPoint

In a URI, the host is expected after the scheme (e.g. ldap://) and before the path.  Without the host, the URI cannot be resolved and the Certificate Revocation List check fails.
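
The host rule above can be illustrated with a small shell check. This is a minimal sketch, not something NSX ships: the function name cdp_has_host is hypothetical, and it only tests whether anything appears between "://" and the next "/". It does not test whether that host is reachable.

```shell
# Hypothetical helper (not part of NSX): succeeds only when the URI names a host.
cdp_has_host() {
  # Strip the scheme and "://", leaving "host/path" (or "/path" when the host is absent).
  cdp_rest="${1#*://}"
  # If nothing was stripped, the string contained no "://" and is not a usable URI.
  [ "$cdp_rest" != "$1" ] || return 1
  # Keep only the part before the first "/", which is where the host belongs.
  cdp_host="${cdp_rest%%/*}"
  [ -n "$cdp_host" ]
}

# Usage: report the unsupported form described above.
if cdp_has_host 'ldap:///CN=srv-root-CA(1),DC=example,DC=com?certificateRevocationList'; then
  echo "host present"
else
  echo "host missing"
fi
```

For the ldap:/// form the text between "://" and the next "/" is empty, so the check fails, which matches the unsupported case described in this section.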

Resolution

No resolution is available.  LDAP CRL Distribution Points (CDPs) with no host specified in the URI are the most common cause of CRL verification problems.

 

To list any CDPs present in the vCenter certificate, log in to an NSX Manager as the root user and run:

echo | openssl s_client -connect <vCenter IP/FQDN>:443 2>/dev/null | openssl x509 -text -noout | grep "CRL Distribution Points" -A 2

Example output if CDPs are found:

X509v3 CRL Distribution Points: 
Full Name:
URI:ldap:///CN=srv-root-CA(1),CN=######,CN=###,CN=Public%20Key%20Services,CN=Services,CN=Configuration,DC=example,DC=com?certificateRevocationList?base?objectClass=cRLDistributionPoint

If the above command prints CDP URIs, NSX will verify them.  To check whether a URI is valid and reachable from the NSX Manager, log in to the manager as the root user and run:

wget <cdp-link>
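
The two checks above can be combined into a short pipeline that extracts each CDP URI and labels its form. This is a sketch under assumptions: the function names extract_cdp_uris and report_cdp_uris and the VCENTER variable are hypothetical, the grep window (-A 2) is the same one used in the command above, and the labels only describe the URI scheme, not whether NSX can actually fetch the CRL.

```shell
# Hypothetical helper: read `openssl x509 -text` output on stdin and
# print each CDP URI found near the "CRL Distribution Points" header.
extract_cdp_uris() {
  grep -A 2 "CRL Distribution Points" | grep -o 'URI:[^[:space:]]*' | sed 's/^URI://'
}

# Hypothetical helper: label each URI read from stdin by its form.
report_cdp_uris() {
  while IFS= read -r cdp_uri; do
    case "$cdp_uri" in
      http://*|https://*) echo "http(s): $cdp_uri" ;;
      ldap:///*)          echo "ldap-without-host: $cdp_uri" ;;
      *)                  echo "other: $cdp_uri" ;;
    esac
  done
}

# Usage (VCENTER is a placeholder for the vCenter IP or FQDN):
# echo | openssl s_client -connect "$VCENTER":443 2>/dev/null \
#   | openssl x509 -text -noout | extract_cdp_uris | report_cdp_uris
```

A line labelled ldap-without-host corresponds to the unsupported URI form described under Cause.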

 

Workaround options:

  1. Disable CRL checking.  NSX will no longer validate CDPs if they are present in certificates.

    Capture the current configuration via API:
    GET https://{{ip}}/policy/api/v1/infra/security-global-config
    
    This will provide a JSON response like the following:
    {
    "crl_checking_enabled": true,
    "ca_signed_only": false,
    "eku_checking_enabled": true,
    "id": "########-####-####-####-###########",
    "_create_time": 1679339007871,
    "_create_user": "system",
    "_last_modified_time": 1679339007871,
    "_last_modified_user": "system",
    "_protection": "NOT_PROTECTED",
    "_revision": 0
    }
    Modify the JSON response, changing the value of crl_checking_enabled to false, e.g.:
    {
    "crl_checking_enabled": false
    ...
    }
    Use the API to submit the modified JSON response:
    PUT https://{{ip}}/policy/api/v1/infra/security-global-config
  2. Deploy a new vCenter certificate without the CDP extension (X509v3 CRL Distribution Points).  In this case, NSX will have no CRL URIs to verify.
  3. Deploy a new vCenter certificate that uses HTTP(S) addresses as the CDPs, for example:
    crlDistributionPoints=URI:http://example.com/crl.pem

    From a certificate compliance perspective, this is the most appropriate workaround.
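
Workaround 1 (GET, modify, PUT) can be sketched as three shell functions around curl. This is an illustrative sketch, not an official procedure. The function names, the NSX_HOST and NSX_USER variables, and the file names are hypothetical. It assumes basic authentication with curl prompting for the password, and it uses -k to skip TLS verification, which is only appropriate where the manager certificate is not trusted by the client. The endpoint is the one given in workaround 1.

```shell
# Placeholders (assumptions): set NSX_HOST and NSX_USER for your environment.
NSX_URL="https://${NSX_HOST:-nsx-manager.example.com}/policy/api/v1/infra/security-global-config"

# Step 1: capture the current configuration (prints the JSON response).
get_security_config() {
  curl -k -u "${NSX_USER:-admin}" "$NSX_URL"
}

# Step 2: read JSON on stdin and change crl_checking_enabled from true to false.
# Every other field, including _revision, is passed through unchanged.
disable_crl_flag() {
  sed 's/"crl_checking_enabled"[[:space:]]*:[[:space:]]*true/"crl_checking_enabled": false/'
}

# Step 3: submit the modified JSON file given as the first argument.
put_security_config() {
  curl -k -u "${NSX_USER:-admin}" -X PUT \
    -H "Content-Type: application/json" \
    --data-binary @"$1" "$NSX_URL"
}

# Usage:
# get_security_config > current.json
# disable_crl_flag < current.json > modified.json
# put_security_config modified.json
```

Keeping current.json also gives you a copy of the original settings in case CRL checking needs to be re-enabled later.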