Symptoms
- NSX Intelligence may be stuck in a degraded state, or the deployment itself may fail.
- Logical switch realization fails but works after all three UAs are restarted.
- A host may show an unknown status, and the manager GUI may show controller connectivity as unavailable.
Environment
VMware NSX-T Data Center
Cause
- If the NSX Manager certificate contains a carriage return character, the deployment of the NSX Intelligence appliance might fail, or the appliance might get stuck in a degraded state after it has been deployed.
- If an NSX Manager or NSX Intelligence certificate that contains a carriage return character is updated after the NSX Intelligence appliance has been deployed, and multiple manager services are then restarted simultaneously, some of the services might fail to initialize.
The issue is seen only when NSX Intelligence is installed in the environment and a certificate includes a carriage return.
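A certificate file can be checked for carriage return characters before it is applied. The following is a minimal sketch, not a documented NSX procedure: the function name count_cr and the file name manager-cert.pem are hypothetical, and it assumes the certificate is available as a PEM file on a system with standard POSIX tools.

```shell
#!/bin/sh
# Hypothetical helper: print the number of carriage return (CR, 0x0D)
# bytes found in the file given as the first argument.
count_cr() {
    # tr -cd '\r' deletes every byte except CR; wc -c counts what is left.
    # The final tr strips the padding some wc implementations add.
    tr -cd '\r' < "$1" | wc -c | tr -d ' '
}

# Example invocation (manager-cert.pem is an assumed file name):
# count_cr manager-cert.pem
```

A result of 0 means the file contains no carriage returns. Any other value suggests the file was saved with Windows-style line endings.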
Resolution
Logs to look for and their location
----------------------------------
++ /var/log/syslog
$ grep "unable to create new native thread" syslog* | wc -l
386
++ /var/log/nsxapi.log
$ grep "java.lang.OutOfMemoryError: unable to create new native thread" nsxapi.* | wc -l
71766
++ desired_state_manager.json
Search for MGMT_CLUSTER and check whether the certificate ends with "\r\n" (a carriage return followed by a line feed), as shown below.
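The search can also be scripted. The sketch below assumes that desired_state_manager.json stores the certificate as a JSON string, so a carriage return and line feed appear as the four literal characters \r\n rather than as raw bytes. The function name has_escaped_crlf is hypothetical, and the file path is left relative because its location is not given here.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the file given as the
# first argument contains the escaped sequence \r\n, i.e. a literal
# backslash-r backslash-n as JSON would encode CRLF inside a string.
has_escaped_crlf() {
    # -F: match the pattern as a fixed string, not a regular expression
    # -q: print nothing, report the result through the exit status only
    grep -qF '\r\n' "$1"
}

# Example invocation (adjust the path to where the file is located):
# has_escaped_crlf desired_state_manager.json && echo "escaped CRLF found"
# grep -n 'MGMT_CLUSTER' desired_state_manager.json   # locate the entry
```

A match only indicates that the sequence exists somewhere in the file. Confirm that it belongs to the certificate under the MGMT_CLUSTER entry before drawing a conclusion.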