pks-nsx-t-osb-proxy job fails while installing PKS

Article ID: 298559

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

Symptoms:
While installing PKS, Apply Changes fails with the following error message:
Task 2034 | 20:13:40 | Preparing deployment: Preparing deployment (00:00:06)
Task 2034 | 20:13:51 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 2034 | 20:13:51 | Updating instance pivotal-container-service: pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 (0) (canary) (00:05:50)
                    L Error: 'pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 (0)' is not running after update. Review logs for failed jobs: pks-nsx-t-osb-proxy
Task 2034 | 20:19:41 | Error: 'pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 (0)' is not running after update. Review logs for failed jobs: pks-nsx-t-osb-proxy

Task 2034 Started  Mon Jul 16 20:13:40 UTC 2018
Task 2034 Finished Mon Jul 16 20:19:41 UTC 2018
Task 2034 Duration 00:06:01
Task 2034 error
The output of bosh vms shows the pivotal-container-service instance in a failing state:
bosh vms
Using environment '172.18.0.11' as user 'director' (bosh.*.read, openid, bosh.*.admin, bosh.read, bosh.admin)

Task 2036

Task 2036 done

Deployment 'pivotal-container-service-e260fba0d7899a68594c'

Instance                                                        Process State  AZ   IPs          VM CID                                   VM Type  Active  
pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524  failing        az1  172.18.0.12  vm-c94e18eb-a669-4c3d-9c36-2638881e20c3  large    -  

1 vms
After you SSH into the pivotal-container-service VM, monit summary shows that the pks-nsx-t-osb-proxy process is not running (it is reported as not monitored):
pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524:~# monit summary
The Monit daemon 5.2.5 uptime: 1h 4m 

Process 'pks-api'                   running
Process 'broker'                    running
Process 'pks-nsx-t-osb-proxy'       not monitored
Process 'mariadb_ctrl'              running
Process 'galera-healthcheck'        running
Process 'gra-log-purger-executable' running
Process 'cluster_health_logger'     running
Process 'uaa'                       running
Process 'telemetry'                 running
Process 'event-emitter'             running
Process 'bosh-dns'                  running
System 'system_localhost'           running
The following errors are reported in /var/vcap/sys/log/pks-nsx-t-osb-proxy/pks-nsx-t-osb-proxy.stderr.log:
time="2018-07-16T21:18:23Z" level=error msg="Failed to extract edge cluster ID from router 020124fa-184c-4570-babb-2ca4fb855102" pks-networking=networkManager
2018/07/16 21:18:23 Error initializing a NSX-T client: Error getting network manager for cluster Get https://10.193.53.20/api/v1/logical-routers/020124fa-184c-4570-babb-2ca4fb855102: remote error: tls: unknown certificate
time="2018-07-16T21:19:04Z" level=error msg="Failed to extract edge cluster ID from router 020124fa-184c-4570-babb-2ca4fb855102" pks-networking=networkManager
2018/07/16 21:19:04 Error initializing a NSX-T client: Error getting network manager for cluster Get https://10.193.53.20/api/v1/logical-routers/020124fa-184c-4570-babb-2ca4fb855102: remote error: tls: unknown certificate
time="2018-07-16T21:19:43Z" level=error msg="Failed to extract edge cluster ID from router 020124fa-184c-4570-babb-2ca4fb855102" pks-networking=networkManager
2018/07/16 21:19:43 Error initializing a NSX-T client: Error getting network manager for cluster Get https://10.193.53.20/api/v1/logical-routers/020124fa-184c-4570-babb-2ca4fb855102: remote error: tls: unknown certificate
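
To confirm that the failures are all TLS certificate errors against the same NSX Manager endpoint, the log can be summarized with standard text tools. The following is a minimal sketch; the function name nsx_tls_failures is hypothetical and is not part of PKS or BOSH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list each NSX Manager endpoint that produced a
# "tls: unknown certificate" error, followed by the number of occurrences.
nsx_tls_failures() {
  local log_file="$1"
  grep 'tls: unknown certificate' "$log_file" \
    | sed -n 's|.*Get https://\([^/]*\)/.*|\1|p' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Example, run on the pivotal-container-service VM:
# nsx_tls_failures /var/vcap/sys/log/pks-nsx-t-osb-proxy/pks-nsx-t-osb-proxy.stderr.log
```

A single endpoint with a steadily growing count suggests a certificate configuration problem rather than an intermittent network fault.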

Environment


Cause

This issue can have one or more of the following causes:
  • The NSX CA Cert is misconfigured under BOSH Director tile -> vCenter Config -> NSX Networking -> NSX-T
  • The NSX Manager CA Cert is misconfigured or missing under PKS tile -> Networking
  • The NSX Manager Super User Principal Identity Certificate is misconfigured under PKS tile -> Networking
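
One way to test the CA certificate causes listed above is to check whether the certificate that NSX Manager serves verifies against the CA file pasted into the tiles. The sketch below uses openssl s_client for this; the function names and the host and CA file arguments are placeholders chosen for illustration, and it assumes the machine running it can reach NSX Manager on port 443.

```shell
#!/usr/bin/env bash
# Extract the numeric verify result from `openssl s_client` output on stdin.
verify_code() {
  sed -n 's/^ *Verify return code: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: report whether the certificate served by NSX Manager
# chains to the given CA file. Both arguments are supplied by the operator.
check_nsx_ca() {
  local host="$1" ca_file="$2" code
  code=$(openssl s_client -connect "${host}:443" -CAfile "$ca_file" \
           </dev/null 2>/dev/null | verify_code)
  if [ "$code" = "0" ]; then
    echo "OK: NSX Manager certificate verifies against ${ca_file}"
  else
    echo "FAIL: verify return code ${code:-unknown}"
    return 1
  fi
}

# Example (placeholder file name):
# check_nsx_ca 10.193.53.20 nsx-manager-ca.pem
```

A nonzero verify return code points at a wrong or incomplete CA certificate; a result of 0 suggests looking at the Principal Identity certificate instead.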

Resolution

Make sure that the NSX Manager CA Cert and the NSX Manager Super User Principal Identity Certificate are entered in their respective fields. Both certificates must also be registered with NSX-T; skipping the certificate registration steps while configuring the tiles results in these failures.
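
To check whether the Super User Principal Identity certificate is registered, one option is to call the NSX-T principal identities API using that certificate and key for authentication. This is a sketch under assumptions: the function names are hypothetical, the certificate and key file names are placeholders, and --insecure is used only to isolate the client certificate check from server certificate validation.

```shell
#!/usr/bin/env bash
# Build the NSX-T API URL that lists registered principal identities.
principal_identities_url() {
  printf 'https://%s/api/v1/trust-management/principal-identities' "$1"
}

# Hypothetical helper: call the API with certificate authentication.
# A JSON listing indicates that NSX-T accepted the certificate; an
# authentication or TLS error indicates it is not registered or is invalid.
check_principal_identity() {
  local host="$1" cert="$2" key="$3"
  curl --silent --show-error --fail --insecure \
    --cert "$cert" --key "$key" \
    "$(principal_identities_url "$host")"
}

# Example (placeholder file names):
# check_principal_identity 10.193.53.20 pks-nsx-t-superuser.crt pks-nsx-t-superuser.key
```

After correcting the certificates, the same call should return the list of principal identities without an error.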