This article provides the steps to bring up the UI on one NSX Manager node. Once the UI is accessible and the file permissions are fixed, a new certificate can be generated via the UI, and the expired certificate can be replaced via the Apply Certificate API.
Symptoms:
Certificate Name and Service Type Mapping
NSX Manager Certificate Documentation For Reference: https://docs.vmware.com/en/VMware-NSX/4.1/administration/GUID-3DD19193-770C-47F3-A0F3-7B7703F274C8.html
Certificate Name | Service Type |
---|---|
API-Corfu Client | service_type=CBM_API |
AR-Corfu Client | service_type=CBM_AR |
CCP-Corfu Client | service_type=CBM_CCP |
Cluster Manager-Corfu | service_type=CBM_CLUSTER_MANAGER |
CM Inventory-Corfu Client | service_type=CBM_CM_INVENTORY |
Corfu Server | service_type=CBM_CORFU |
IDPS reporting-Corfu Client | service_type=CBM_IDPS_REPORTING |
Messaging Manager-Corfu Client | service_type=CBM_MESSAGING_MANAGER |
Monitoring-Corfu Client | service_type=CBM_MONITORING |
MP-Corfu Client | service_type=CBM_MP |
Site Manager-Corfu Client | service_type=CBM_SITE_MANAGER |
Upgrade Coordinator-Corfu Client | service_type=CBM_UPGRADE_COORDINATOR |
GM-Corfu Client | service_type=CBM_GM |
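
The service types in the table above are the values passed to the Apply Certificate API. The sketch below shows how such a request URL could be assembled. The endpoint path and the `action`, `service_type`, and `node_id` query parameters are assumptions based on the public NSX API reference and should be checked against the API guide for your release. The function name, host name, and IDs are hypothetical.

```python
from urllib.parse import quote, urlencode


def apply_certificate_url(manager: str, cert_id: str, service_type: str, node_id: str) -> str:
    """Build the URL for a POST request that applies a certificate to one service.

    Assumption: the endpoint is
    /api/v1/trust-management/certificates/<cert-id>?action=apply_certificate
    as described in the NSX API reference; verify it for your NSX version.
    """
    # Only the Corfu-related service types from the table start with CBM_.
    if not service_type.startswith("CBM_"):
        raise ValueError(f"expected a CBM_* service type, got {service_type!r}")
    query = urlencode(
        {
            "action": "apply_certificate",
            "service_type": service_type,
            "node_id": node_id,
        }
    )
    return (
        f"https://{manager}/api/v1/trust-management/certificates/"
        f"{quote(cert_id, safe='')}?{query}"
    )


if __name__ == "__main__":
    # Hypothetical manager FQDN, certificate ID, and node ID.
    print(apply_certificate_url("nsx-mgr.example.com", "abc-123", "CBM_MP", "node-1"))
```

The URL would be sent as an authenticated POST. The sketch only builds the string, so it can be reviewed before anything is changed on the manager.
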
Cluster status

Group Type: MANAGER
Group Status: UNAVAILABLE
Members:
    UUID          FQDN          IP          IPv6    STATUS
    <UUID_MGR1>   <FQDN_MGR1>   <IP_MGR1>   -       DOWN
    <UUID_MGR2>   <FQDN_MGR2>   <IP_MGR2>   -       DOWN
    <UUID_MGR3>   <FQDN_MGR3>   <IP_MGR3>   -       DOWN

Group Type: HTTPS
Group Status: UNAVAILABLE
Members:
    UUID          FQDN          IP          IPv6    STATUS
    <UUID_MGR1>   <FQDN_MGR1>   <IP_MGR1>   -       DOWN
    <UUID_MGR2>   <FQDN_MGR2>   <IP_MGR2>   -       DOWN
    <UUID_MGR3>   <FQDN_MGR3>   <IP_MGR3>   -       DOWN
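
When the cluster status output has been saved to a file, the group states can be pulled out with a short script. This is a minimal sketch, assuming the output keeps the `Group Type:` and `Group Status:` labels shown above. The function name is hypothetical and the script is not part of NSX.

```python
import re

# Matches each "Group Type: X ... Group Status: Y" pair, whether the two
# labels appear on one line or on consecutive lines.
GROUP_RE = re.compile(r"Group Type:\s*(\S+)\s+Group Status:\s*(\S+)")


def group_statuses(status_text: str) -> dict:
    """Return a mapping of group type to group status from cluster status text."""
    return {gtype: gstatus for gtype, gstatus in GROUP_RE.findall(status_text)}


if __name__ == "__main__":
    sample = (
        "Group Type: MANAGER\nGroup Status: UNAVAILABLE\n"
        "Group Type: HTTPS\nGroup Status: UNAVAILABLE\n"
    )
    for gtype, gstatus in group_statuses(sample).items():
        print(f"{gtype}: {gstatus}")
```
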
2023-09-16T19:10:58.975Z WARN pool-18-thread-4 Step 83803 - [nsx@6876 comp="nsx-manager" level="WARNING" subcomp="cbm"] javax.net.ssl.SSLException: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
at com.vmware.nsx.cbm.cert.CertUtils.readFromFile(CertUtils.java:73)
at com.vmware.nsx.cbm.cert.impl.SelfSignedTrustArtifactory.replaceCertificatesOnDisk(SelfSignedTrustArtifactory.java:180)
at com.vmware.nsx.cbm.tasks.impl.ReplaceCertificatesTask$ReplaceCertificatesOnDisk.executeStep(ReplaceCertificatesTask.java:261)
at com.vmware.nsx.cbm.tasks.Task.executeTask(Task.java:329)
at com.vmware.nsx.cbm.tasks.Task.executeTaskWithCheck(Task.java:300)
at com.vmware.nsx.cbm.tasks.Task.call(Task.java:280)
at com.vmware.nsx.cbm.tasks.Task.call(Task.java:46)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:2368)
at org.apache.commons.io.FileUtils.readFileToString(FileUtils.java:2486)
at com.vmware.nsx.cbm.cert.CertUtils.readFromFile(CertUtils.java:71)
... 12 more
2023-09-16T19:10:58.975Z ERROR pool-18-thread-4 Task 83803 - [nsx@6876 comp="nsx-manager" errorCode="CBM41" level="ERROR" subcomp="cbm"] Step ReplaceCertificatesOnDisk (6/7) failed for Task com.vmware.nsx.cbm.tasks.impl.ReplaceCertificatesTask: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
2023-09-16T19:10:58.975Z ERROR pool-18-thread-4 Task 83803 - [nsx@6876 comp="nsx-manager" errorCode="CBM411" level="ERROR" subcomp="cbm"] [CBM411] Error occurred while replacing certificates in private keyStores.
javax.net.ssl.SSLException: java.io.FileNotFoundException: File '/config/cluster-manager/mp/private/keystore.password' does not exist
2023-09-16T19:10:59.074Z ERROR CertificateStreamListener-1-1 CertificateStreamListener 83803 - [nsx@6876 comp="nsx-manager" errorCode="CBM100" level="ERROR" subcomp="cbm"] ReplaceCertificatesTask error: Optional[[CBM411] Error occurred while replacing certificates in private keyStores.], task status: FAILED.
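
To confirm that a node is hitting this issue, it can help to scan a saved copy of the log for the error codes and the keystore file named in the exception. The following is a minimal sketch, assuming the log lines keep the `errorCode="CBM..."` and `File '...' does not exist` forms shown above. The function name is hypothetical.

```python
import re

ERROR_CODE_RE = re.compile(r'errorCode="(CBM\d+)"')
MISSING_FILE_RE = re.compile(r"File '([^']+)' does not exist")


def summarize_cbm_errors(lines):
    """Return (error_codes, missing_files), each unique and in first-seen order."""
    codes, missing = [], []
    for line in lines:
        for code in ERROR_CODE_RE.findall(line):
            if code not in codes:
                codes.append(code)
        for path in MISSING_FILE_RE.findall(line):
            if path not in missing:
                missing.append(path)
    return codes, missing


if __name__ == "__main__":
    sample = [
        '[nsx@6876 comp="nsx-manager" errorCode="CBM411" level="ERROR" subcomp="cbm"] '
        "[CBM411] Error occurred while replacing certificates in private keyStores.",
        "javax.net.ssl.SSLException: java.io.FileNotFoundException: "
        "File '/config/cluster-manager/mp/private/keystore.password' does not exist",
    ]
    codes, missing = summarize_cbm_errors(sample)
    print("error codes:", codes)
    print("files reported missing:", missing)
```
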
Listing the files under /config/cluster-manager/<service>/private/ shows that the permissions are not set to 770 (-rwxrwx---) as needed.
# ls -l /config/cluster-manager/*/private/
/config/cluster-manager/ar/private/:
total 8
-rw------- 1 nsx-replicator nsx-replicator 2051 May 4 2021 keystore.jks
-rw------- 1 nsx-replicator nsx-replicator 44 May 4 2021 keystore.password
/config/cluster-manager/ccp/private/:
total 8
-rw------- 1 nsx nsx 2050 May 4 2021 keystore.jks
-rw------- 1 nsx nsx 44 May 4 2021 keystore.password
/config/cluster-manager/cluster-manager/private/:
total 8
-rw------- 1 nsx-cbm nsx-cbm 2076 May 4 2021 keystore.jks
-rw------- 1 nsx-cbm nsx-cbm 44 May 4 2021 keystore.password
/config/cluster-manager/cm-inventory/private/:
total 8
-rw------- 1 ucminv ucminv 2071 Jul 27 2022 keystore.jks
-rw------- 1 ucminv ucminv 44 Jul 27 2022 keystore.password
/config/cluster-manager/idps-reporting/private/:
total 8
-rw------- 1 nsx-idps nsx-idps 2077 May 4 2021 keystore.jks
-rw------- 1 nsx-idps nsx-idps 44 May 4 2021 keystore.password
/config/cluster-manager/messaging-manager/private/:
total 8
-rw------- 1 nsx-messaging nsx-messaging 2079 Jul 27 2022 keystore.jks
-rw------- 1 nsx-messaging nsx-messaging 44 Jul 27 2022 keystore.password
/config/cluster-manager/monitoring/private/:
total 8
-rw------- 1 uphc uphc 2067 May 4 2021 keystore.jks
-rw------- 1 uphc uphc 44 May 4 2021 keystore.password
/config/cluster-manager/mp/private/:
total 8
-rw------- 1 uproton uproton 2052 May 4 2021 keystore.jks
-rw------- 1 uproton uproton 44 May 4 2021 keystore.password
/config/cluster-manager/site-manager/private/:
total 8
-rwxrwx--- 1 nsx-sm nsx-sm 2073 Sep 16 15:20 keystore.jks
-rwxrwx--- 1 nsx-sm nsx-sm 44 Sep 16 15:20 keystore.password
/config/cluster-manager/upgrade-coordinator/private/:
total 8
-rw------- 1 uuc uuc 2085 Jul 27 2022 keystore.jks
-rw------- 1 uuc uuc 44 Jul 27 2022 keystore.password
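
The same check can be scripted instead of reading the listing by eye. The sketch below is read-only and reports every file under `*/private/` whose mode differs from 770. It assumes the directory layout shown in the listing above and would need to run as a user allowed to read those directories (typically root). The function name is hypothetical.

```python
import stat
from pathlib import Path

EXPECTED_MODE = 0o770  # -rwxrwx---, the mode the article states is needed


def find_wrong_permissions(root="/config/cluster-manager", expected=EXPECTED_MODE):
    """Return (path, mode) pairs for files under <root>/*/private/ whose
    permission bits differ from the expected mode. Nothing is modified."""
    wrong = []
    for path in sorted(Path(root).glob("*/private/*")):
        if not path.is_file():
            continue
        # S_IMODE strips the file-type bits and keeps only the permission bits.
        mode = stat.S_IMODE(path.stat().st_mode)
        if mode != expected:
            wrong.append((str(path), oct(mode)))
    return wrong


if __name__ == "__main__":
    for path, mode in find_wrong_permissions():
        print(f"{mode}  {path}")
```

In the listing above, this would report every keystore file except those under site-manager, which already show -rwxrwx---.
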
Resolved in NSX release 4.1.2 and above.
Impact/Risks:
A CBM_<service> whose certificate has been replaced may be unable to connect to CorfuDB. The impact varies by service. If the CBM_MP certificate was replaced before the permissions were fixed, the UI/API may become inaccessible.
This issue has been found in environments upgraded to 4.1.1 from 3.2.x.
Greenfield environments deployed with NSX 4.1.1 or brownfield 4.0.x environments upgraded to 4.1.1 should not be impacted.
Found In: NSX 4.1.1