Symptoms:
- When you create a PKS cluster from the command line, cluster creation fails with a cluster status similar to the following:
pks cluster my-cluster
PKS Version: 1.6.0-build.17
Name: my-cluster
K8s Version: 1.15.5
Plan Name: small
UUID: 4956a7db-dc0a-44af-9b5c-98e20ab7b30f
Last Action: CREATE
Last Action State: failed
Last Action Description: Instance provisioning failed: There was a problem completing your request. Please contact your operations team providing the following information: service: p.pks, service-instance-guid: 4956a7db-dc0a-44af-9b5c-98e20ab7b30f, broker-request-id: 0d65886f-34bf-42e5-88f7-593f1f5cca08, task-id: 237, operation: create, error-message: 0 succeeded, 1 errored, 0 canceled
Kubernetes Master Host: my-cluster
Kubernetes Master Port: 8443
Worker Nodes: 1
Kubernetes Master IP(s): In Progress
Network Profile Name:
- You see messages similar to the following in the debug output for the related bosh task:
bosh task 237 --debug
{"time":1576772663,"stage":"Fetching logs for apply-addons/42ad20a4-d091-4277-aa03-cd2d5de05725 (0)","tags":[],"total":1,"task":"Finding and packing log files","index":1,"state":"finished","progress":100}
', "result_output" = '{"instance":{"group":"apply-addons","id":"42ad20a4-d091-4277-aa03-cd2d5de05725"},"errand_name":"apply-addons","exit_code":1,"stdout":"Deploying /var/vcap/jobs/apply-specs/specs/coredns.yml\nserviceaccount/coredns created\nclusterrole.rbac.authorization.k8s.io/system:coredns created\nclusterrolebinding.rbac.authorization.k8s.io/system:coredns created\nconfigmap/coredns created\ndeployment.extensions/coredns created\nservice/kube-dns created\nWaiting for deployment \"coredns\" rollout to finish: 0 out of 3 new replicas have been updated...\nWaiting for deployment \"coredns\" rollout to finish: 0 of 3 updated replicas are available...\nfailed to start all system specs after 1200 with exit code 124\n","stderr":"","logs":{"blobstore_id":"2ac79808-5fce-4b4f-7ceb-380032bca801","sha1":"d66b8ba869ba484ea9fe3eed3d754dfc1e381350"}}
', "context_id" = '7c9b826b-8945-4325-be8e-cc5b26d05678' WHERE ("id" = 237)
- You see messages similar to the following repeatedly in the ncp.stderr.log file:
Note: To get the NCP logs, use bosh to download a log bundle from the service instance deployment that matches the failed cluster UUID, for example "bosh -d service-instance_4956a7db-dc0a-44af-9b5c-98e20ab7b30f logs". Then unpack the bundle and review "master.*/ncp/ncp/stderr.log".
Traceback (most recent call last):
File "/usr/local/bin/ncp", line 10, in <module>
sys.exit(main())
File "/usr/local/lib/python2.7/dist-packages/nsx_ujo/cmd/ncp.py", line 16, in main
ncp_main.start_ncp(coe)
File "/usr/local/lib/python2.7/dist-packages/nsx_ujo/ncp/main.py", line 160, in start_ncp
nsx_errors = common_utils.validate_nsx_config()
File "/usr/local/lib/python2.7/dist-packages/nsx_ujo/common/utils.py", line 796, in validate_nsx_config
ipnetwork_errors = _validate_mgr_ip_network()
File "/usr/local/lib/python2.7/dist-packages/nsx_ujo/common/utils.py", line 675, in _validate_mgr_ip_network
return _validate_ip_network(all_blocks, all_pools, external_ip_space_ids)
File "/usr/local/lib/python2.7/dist-packages/nsx_ujo/common/utils.py", line 733, in _validate_ip_network
if is_overlapped(ip_block, ip_space):
File "/usr/local/lib/python2.7/dist-packages/nsx_ujo/common/utils.py", line 757, in is_overlapped
ip_network1 = ipaddress.ip_network(obj1['cidr'])
File "/usr/lib/python2.7/dist-packages/ipaddress.py", line 186, in ip_network
return IPv4Network(address, strict)
File "/usr/lib/python2.7/dist-packages/ipaddress.py", line 1656, in __init__
raise ValueError('%s has host bits set' % self)
ValueError: 172.26.0.1/16 has host bits set
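The final frames of the traceback show NCP passing a CIDR string to Python's ipaddress.ip_network(), which by default rejects any value whose address part is not the network address. The following is a minimal sketch, assuming Python 3, that reproduces the check against candidate CIDR values and reports the network address that would pass. The helper name check_cidr and the sample values are illustrative only and are not part of NCP.

```python
import ipaddress


def check_cidr(cidr):
    """Return (is_valid, network) for a CIDR string.

    Hypothetical helper for illustration; not part of NCP.
    is_valid is False when the address has host bits set.
    """
    try:
        # strict=True is the default, the same mode seen in the traceback.
        ipaddress.ip_network(cidr)
        return True, cidr
    except ValueError:
        # strict=False masks off the host bits and yields the network
        # address. A completely malformed string still raises ValueError.
        network = ipaddress.ip_network(cidr, strict=False)
        return False, str(network)


if __name__ == "__main__":
    # Sample values: the first is the CIDR from the error message above.
    for cidr in ["172.26.0.1/16", "172.26.0.0/16"]:
        ok, network = check_cidr(cidr)
        if ok:
            print(cidr, "is a valid network address")
        else:
            print(cidr, "has host bits set; network address is", network)
```

Running the sketch against the value in the error message flags 172.26.0.1/16 and reports 172.26.0.0/16 as the corresponding network address.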