Symptoms:
While installing PKS, Apply changes fails with below error message:
Task 2034 | 20:13:40 | Preparing deployment: Preparing deployment (00:00:06)
Task 2034 | 20:13:51 | Preparing package compilation: Finding packages to compile (00:00:00)
Task 2034 | 20:13:51 | Updating instance pivotal-container-service: pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 (0) (canary) (00:05:50)
L Error: 'pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 (0)' is not running after update. Review logs for failed jobs: pks-nsx-t-osb-proxy
Task 2034 | 20:19:41 | Error: 'pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 (0)' is not running after update. Review logs for failed jobs: pks-nsx-t-osb-proxy
Task 2034 Started Mon Jul 16 20:13:40 UTC 2018
Task 2034 Finished Mon Jul 16 20:19:41 UTC 2018
Task 2034 Duration 00:06:01
Task 2034 error
Output of BOSH VMS shows pivotal-container-service in failing state
bosh vms
Using environment '172.18.0.11' as user 'director' (bosh.*.read, openid, bosh.*.admin, bosh.read, bosh.admin)
Task 2036
Task 2036 done
Deployment 'pivotal-container-service-e260fba0d7899a68594c'
Instance Process State AZ IPs VM CID VM Type Active
pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524 failing az1 172.18.0.12 vm-c94e18eb-a669-4c3d-9c36-2638881e20c3 large -
1 vms
After SSH into pivotal-container-service vm monit summary shows pks-nsx-t-osb-proxy failing
pivotal-container-service/1a0f6b7c-5fdc-4a11-9a8a-38aa0df35524:~# monit summary
The Monit daemon 5.2.5 uptime: 1h 4m
Process 'pks-api' running
Process 'broker' running
Process 'pks-nsx-t-osb-proxy' not monitored
Process 'mariadb_ctrl' running
Process 'galera-healthcheck' running
Process 'gra-log-purger-executable' running
Process 'cluster_health_logger' running
Process 'uaa' running
Process 'telemetry' running
Process 'event-emitter' running
Process 'bosh-dns' running
System 'system_localhost' running
In
/var/vcap/sys/log/pks-nsx-t-osb-proxy/pks-nsx-t-osb-proxy.stderr.log following errors are reported
time="2018-07-16T21:18:23Z" level=error msg="Failed to extract edge cluster ID from router 020124fa-184c-4570-babb-2ca4fb855102" pks-networking=networkManager
2018/07/16 21:18:23 Error initializing a NSX-T client: Error getting network manager for cluster Get https://10.193.53.20/api/v1/logical-routers/020124fa-184c-4570-babb-2ca4fb855102: remote error: tls: unknown certificate
time="2018-07-16T21:19:04Z" level=error msg="Failed to extract edge cluster ID from router 020124fa-184c-4570-babb-2ca4fb855102" pks-networking=networkManager
2018/07/16 21:19:04 Error initializing a NSX-T client: Error getting network manager for cluster Get https://10.193.53.20/api/v1/logical-routers/020124fa-184c-4570-babb-2ca4fb855102: remote error: tls: unknown certificate
time="2018-07-16T21:19:43Z" level=error msg="Failed to extract edge cluster ID from router 020124fa-184c-4570-babb-2ca4fb855102" pks-networking=networkManager
2018/07/16 21:19:43 Error initializing a NSX-T client: Error getting network manager for cluster Get https://10.193.53.20/api/v1/logical-routers/020124fa-184c-4570-babb-2ca4fb855102: remote error: tls: unknown certificate