Kubernetes Master nodes failing with "Process 'ncp' Does not exist" following NSX-T upgrade to 2.4.1 or later

Article ID: 298592

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

Following an NSX-T upgrade from NSX-T 2.3 to 2.4.1 or 2.4.2, the master nodes of every Kubernetes (K8s) cluster are in a 'failing' state.


Additional Symptoms

  • From the K8s master node, monit summary reports:
"Process 'ncp' Does not exist"
  • The /var/vcap/sys/log/ncp/pre-start.stderr.log on the Kubernetes Master VM shows:
vmware_nsxlib.v3.exceptions.ServiceClusterUnavailable: Service cluster: 'https://<YOUR NSX MANAGER ADDRESS>' is unavailable. Please, check NSX setup and/or configuration
  • Checking the configuration in Ops Manager returns a certificate error: Ops Manager > BOSH Director tile > vCenter Config > Save fails.
  • The following command, run from a machine on the same network as the PKS control plane (such as the Ops Manager VM), shows the certificate "CN" set to a hostname rather than the FQDN or IP address of the new VIP of your NSX Management Cluster:
openssl s_client -connect <NSX-MANAGER-FQDN>:443 -showcerts
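
The CN comparison in the last symptom can be scripted. The following is a minimal sketch, assuming a POSIX shell with openssl and sed available; the function names extract_cn and served_cert_cn are illustrative and not part of any VMware tooling.

```shell
# Print the CN from an openssl "subject=" line read on stdin.
# Handles both the older "/C=US/.../CN=name" layout and the
# newer "C = US, ..., CN = name" layout.
extract_cn() {
    sed -n 's|.*CN *= *\([^,/]*\).*|\1|p'
}

# Hypothetical helper: fetch the certificate served by the given host
# on port 443 and print its CN.
served_cert_cn() {
    host="$1"
    openssl s_client -connect "${host}:443" </dev/null 2>/dev/null \
        | openssl x509 -noout -subject \
        | extract_cn
}

# Example (requires network access to NSX Manager):
#   served_cert_cn "<NSX-MANAGER-FQDN>"
```

If the printed value is the hostname of a single NSX Manager node rather than the FQDN or IP address of the VIP, the environment matches this symptom.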


Environment

Product Version: 1.4

Resolution

RCA

Upgrading from NSX-T 2.3 to 2.4.1 or 2.4.2 introduces a new architecture: a 3-node NSX Management Cluster behind a VIP. Even after the VIP has been created, NSX Manager by default generates a self-signed API certificate with its own hostname as the subject and issuer. This is the certificate returned by:
openssl s_client -connect <NSX-MANAGER-FQDN>:443 -showcerts
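
Because a self-signed certificate carries the same value in its subject and issuer fields, the default certificate described above can be recognised from the openssl output. The following is a quick heuristic sketch, assuming the subject= and issuer= lines come from openssl x509 -noout -subject -issuer; is_self_signed is a hypothetical helper name, and the check compares names only, without verifying the signature.

```shell
# Succeed when the subject and issuer distinguished names are identical,
# which is what a self-signed certificate looks like.
# Input on stdin: the "subject=" and "issuer=" lines printed by openssl x509.
is_self_signed() {
    awk '
        /^subject=/ { sub(/^subject= */, ""); subject = $0 }
        /^issuer=/  { sub(/^issuer= */, "");  issuer = $0 }
        END { exit !(subject != "" && subject == issuer) }
    '
}

# Example (requires network access to NSX Manager):
#   openssl s_client -connect "<NSX-MANAGER-FQDN>:443" </dev/null 2>/dev/null \
#     | openssl x509 -noout -subject -issuer \
#     | is_self_signed && echo "served certificate is self-signed"
```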

Solution

1. Identify the NSX CA Cert that you were already using for BOSH.

2. In the Ops Manager UI, go to the vCenter Config tab and find the NSX CA Cert field.

3. Copy the entire NSX CA Cert from that field into a text file named bosh_nsx_ca.cert

4. Use the openssl CLI tool to print the certificate details needed to locate the correct certificate UUID in NSX Manager:
openssl x509 -in bosh_nsx_ca.cert -noout -serial -subject -issuer
Example output:
serial=<Your Cert Serial Number>
subject= /C=US/ST=California/L=CA/O=NSX/CN=<YOUR-CN-ISSUED-TO>
issuer= /C=US/ST=California/L=CA/O=NSX/CN=<YOUR-CN-ISSUED-BY>

5. Log in to your NSX Manager UI with your admin credentials.

6. Select the "System" tab.

7. Select "Certificates" from the sidebar.

8. Locate the certificate whose "Issued To" column shows <YOUR-CN-ISSUED-TO> and whose "Issued By" column shows <YOUR-CN-ISSUED-BY> from the output above.

9. Confirm the serial number matches by clicking the Certificate.

10. Copy the ID for that Certificate to use below.

11. Identify the VIP used by your NSX Management Cluster:
  • Log in to your NSX Manager UI with your admin credentials
  • Select the "System" tab
  • Copy the VIP address shown under "Virtual IP"

12. Register the original self-signed certificate with the cluster certificate API, using the IP address of the VIP, as shown below:
export NSX_MANAGER_IP_ADDRESS=<NSX Manager VIP address>
export CERTIFICATE_ID="Certificate UUID retrieved above"
export NSX_PASSWORD="your nsx manager password"

curl --insecure -u admin:$NSX_PASSWORD -X POST "https://$NSX_MANAGER_IP_ADDRESS/api/v1/cluster/api-certificate?action=set_cluster_certificate&certificate_id=$CERTIFICATE_ID"
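
An unset variable in the command above would produce a malformed request, so it can help to validate the inputs first. The sketch below assumes the same environment variables as step 12; build_cluster_cert_url and its UUID pattern check are illustrative additions, not part of the NSX-T API.

```shell
# Build the request URL used in step 12, failing early if an input is missing.
build_cluster_cert_url() {
    vip="$1"
    cert_id="$2"
    if [ -z "$vip" ] || [ -z "$cert_id" ]; then
        echo "usage: build_cluster_cert_url <vip> <certificate-id>" >&2
        return 1
    fi
    # Assumption: certificate IDs are UUIDs (8-4-4-4-12 hex digits).
    if ! printf '%s\n' "$cert_id" \
        | grep -Eq '^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$'; then
        echo "certificate id does not look like a UUID: $cert_id" >&2
        return 1
    fi
    printf 'https://%s/api/v1/cluster/api-certificate?action=set_cluster_certificate&certificate_id=%s\n' \
        "$vip" "$cert_id"
}

# Example:
#   url=$(build_cluster_cert_url "$NSX_MANAGER_IP_ADDRESS" "$CERTIFICATE_ID") \
#     && curl --insecure -u "admin:$NSX_PASSWORD" -X POST "$url"
```

After the request succeeds, re-running the openssl s_client command from the symptoms section is one way to confirm which certificate the VIP now serves.
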
For additional details on the NSX Manager Certificate Authority certificate for NSX-T 2.4.1 and later, refer to:
https://docs.pivotal.io/pks/1-5/nsxt-generate-ca-cert.html