This article helps operators identify the root cause of an NCP (NSX-T Container Plugin) job failure when NCP reports only that the NSX Manager service is unavailable.
If PKS cluster creation fails at the NCP job, additional logging may need to be enabled in NCP to show the details of the failure.
In this scenario:
-- The "pks create-cluster ... " command failed.
-- The cluster VMs were successfully created in BOSH, but the Kubernetes Master node failed with an error during the ncp pre-start script.
-- The BOSH task showed that the ncp job failed while the other jobs succeeded:
Example task output:
L Error: Action Failed get_task: Task 65cbac71-8109-4a68-6a2a-79552eeaab28 result: 1 of 8 pre-start scripts failed. Failed Jobs: ncp. Successful Jobs: pks-nsx-t-ncp, etcd, bpm, bosh-dns-enable, syslog_forwarder, bosh-dns, pks-nsx-t-prepare-master-vm.
-- The /var/vcap/sys/log/ncp/pre-start.stderr.log file on the Kubernetes Master VM shows:
No handlers could be found for logger "vmware_nsxlib.v3.cluster"
Traceback (most recent call last):
  File "/usr/local/bin/configure_nsx_resources", line 285, in <module>
    if not args.func(args):
  File "/usr/local/bin/configure_nsx_resources", line 40, in wrapper
    raise e
vmware_nsxlib.v3.exceptions.ServiceClusterUnavailable: Service cluster: https://<NSX-Manager-addr-redacted> is unavailable. Please, check NSX setup and/or configuration
-- NCP reports NSX Manager as "unavailable", but NSX Manager may in fact be reachable, and the underlying reason for the failure may not be clear.
Because NCP is a process managed by BOSH, the operator can take the following steps to obtain data from NCP (e.g., exceptions and errors) and identify the root cause more quickly:
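The symptoms listed above can be pulled out of saved output with a couple of small shell helpers. This is only a sketch: failed_jobs and last_exception are hypothetical helper names, not part of BOSH or NCP. It assumes the task output uses the "Failed Jobs: ... ." wording shown in the example, and that the log file on disk retains Python's usual traceback indentation.

```shell
# Print the jobs that BOSH reports as failed, one per line.
# Reads task output from the file named in $1, or from stdin if omitted.
# Assumes the "Failed Jobs: a, b." wording from the example task output.
failed_jobs() {
  grep -o 'Failed Jobs: [^.]*' "${1:--}" \
    | sed -e 's/^Failed Jobs: //' \
    | tr ',' '\n' \
    | sed -e 's/^ *//' -e 's/ *$//' \
    | sort -u
}

# Print the final exception line of the last Python traceback in file $1.
# Assumes frame lines are indented, so the first non-indented line after
# the "Traceback" header is the exception itself.
# Returns non-zero if the file contains no traceback.
last_exception() {
  awk '
    /^Traceback \(most recent call last\):/ { in_tb = 1; next }
    in_tb && /^[^ \t]/                      { last = $0; in_tb = 0 }
    END { if (last != "") print last; else exit 1 }
  ' "$1"
}
```

For example, running last_exception /var/vcap/sys/log/ncp/pre-start.stderr.log on the Kubernetes Master VM would print only the ServiceClusterUnavailable line from the excerpt above.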
- bosh ssh <Kubernetes Master node>
- sudo -i
- Edit /var/vcap/jobs/ncp/config/ncp.ini
- Depending on what you are debugging, add or set the relevant parameter:
# for NCP operations
loglevel=DEBUG
# or for NSX API client operations
nsxlib_loglevel=DEBUG
- Then restart ncp:
monit restart ncp
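The edit in the steps above can also be scripted. The following is a minimal sketch, not an official tool: set_ini_key is a hypothetical helper name. It assumes the parameter is already present in ncp.ini, possibly commented out, and it deliberately fails rather than guess which section a missing parameter belongs in. Every line that matches the key is rewritten, so check the result if the key appears more than once.

```shell
# Set KEY = VALUE in an INI-style file, uncommenting the key if needed.
# Usage: set_ini_key FILE KEY VALUE
# Fails (non-zero) if KEY does not already appear in FILE.
set_ini_key() {
  ini_file=$1
  ini_key=$2
  ini_value=$3
  # Match "key = ..." at the start of a line, optionally preceded by
  # "#" or ";" comment markers and spaces.
  ini_pattern="^[#; ]*${ini_key}[ ]*=.*\$"
  if ! grep -Eq "$ini_pattern" "$ini_file"; then
    echo "key '${ini_key}' not found in ${ini_file}; add it manually" >&2
    return 1
  fi
  # Keep a backup, then rewrite the original file in place.
  cp -p "$ini_file" "${ini_file}.bak"
  sed -E "s|${ini_pattern}|${ini_key} = ${ini_value}|" "${ini_file}.bak" > "$ini_file"
}
```

As root on the Kubernetes Master VM, this could be used as: set_ini_key /var/vcap/jobs/ncp/config/ncp.ini nsxlib_loglevel DEBUG && monit restart ncp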
Then review the NCP events written to the /var/vcap/sys/log/ncp/ directory, for example in ncp.stdout.log.
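With DEBUG logging enabled, the log files can grow quickly, so a simple filter helps to find the relevant entries. The sketch below uses a hypothetical helper name, scan_ncp_log. It assumes that NCP log lines contain the standard Python logging level names (ERROR, CRITICAL) and that exceptions appear as ordinary Python tracebacks; adjust the pattern if your logs differ.

```shell
# Print, with line numbers, the lines of log file $1 that look like
# errors or the start of a Python traceback.
# Returns non-zero if nothing matches (standard grep behavior).
scan_ncp_log() {
  grep -nE 'ERROR|CRITICAL|Exception|Traceback \(most recent call last\)' "$1"
}
```

To check every log in the directory: for f in /var/vcap/sys/log/ncp/*.log; do echo "== $f"; scan_ncp_log "$f"; done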
Refer to the NCP documentation for more information on additional ncp.ini settings.