Here are the error details.

# pks create cluster failed
[root@ovrjbxlandc1prd ~]# pks clusters

PKS Version    Name   k8s Version  Plan Name  UUID                                  Status  Action
1.5.1-build.8  test1  1.14.6       small      a22b58ed-0223-4006-8066-61a51d41bef0  failed  UPGRADE

# pks create cluster failed details
[root@ovrjbxlandc1prd ~]# pks cluster test1

PKS Version: 1.5.1-build.8
Name: test1
K8s Version: 1.14.6
Plan Name: small
UUID: a22b58ed-0223-4006-8066-61a51d41bef0
Last Action: UPGRADE
Last Action State: failed
Last Action Description: Failed for bosh task: 115957, error-message: 0 succeeded, 1 errored, 0 canceled
Kubernetes Master Host: test1.local
Kubernetes Master Port: 8443
Worker Nodes: 3
Kubernetes Master IP(s): 10.52.12.25
Network Profile Name:

# failed task 115957 debug
bb task 115957 --debug

{"time":1571729153,"stage":"Running errand","tags":[],"total":1,"task":"apply-addons/799f5517-6f4c-445a-a33d-28beaa73d439 (0)","index":1,"state":"finished","progress":100}
{"time":1571729153,"stage":"Fetching logs for apply-addons/799f5517-6f4c-445a-a33d-28beaa73d439 (0)","tags":[],"total":1,"task":"Finding and packing log files","index":1,"state":"started","progress":0}
{"time":1571729154,"stage":"Fetching logs for apply-addons/799f5517-6f4c-445a-a33d-28beaa73d439 (0)","tags":[],"total":1,"task":"Finding and packing log files","index":1,"state":"finished","progress":100}
', "result_output" = '{"instance":{"group":"apply-addons","id":"799f5517-6f4c-445a-a33d-28beaa73d439"},"errand_name":"apply-addons","exit_code":1,"stdout":"Deploying /var/vcap/jobs/apply-specs/specs/coredns.yml\n failed to start all system specs after 1200 with exit code 1\n","stderr":"unable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\nunable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\nunable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\nunable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\nunable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\nunable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\n","logs":{"blobstore_id":"3f640dd6-1b4f-4c74-6da8-33289cfeaac0","sha1":"6fcbae029454fe59705422c781011f689f88f054"}}
', "context_id" = '8a8c3158-2399-4d9f-93fe-5352941eb2fe' WHERE ("id" = 115957)
D, [2019-10-22T07:26:09.986885 #6032] [task:115957] DEBUG -- DirectorJobRunner: (0.000588s) (conn: 47322485367800) COMMIT
I, [2019-10-22T07:26:09.987040 #6032] [] INFO -- DirectorJobRunner: Task took 1 minute 57.806051812999996 seconds to process.

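The errand result in the debug output is a single JSON document, so the repeated stderr lines can be collapsed to see the underlying failure at a glance. The following is a minimal sketch rather than part of any BOSH tooling: the helper names `summarize_errand` and `unresolved_hosts` are hypothetical, and the sample is an abbreviated copy of the result above (two of the six repeated stderr messages, with the "logs" key omitted).

```ruby
require 'json'

# Abbreviated copy of the errand result for task 115957 (assumption: trimmed
# to two of the repeated stderr messages; the "logs" key is left out).
RESULT_OUTPUT = <<~'JSON'
  {"instance":{"group":"apply-addons","id":"799f5517-6f4c-445a-a33d-28beaa73d439"},"errand_name":"apply-addons","exit_code":1,"stdout":"Deploying /var/vcap/jobs/apply-specs/specs/coredns.yml\n failed to start all system specs after 1200 with exit code 1\n","stderr":"unable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\nunable to recognize \"/var/vcap/jobs/apply-specs/specs/coredns.yml\": Get https://master.k8s.internal:8443/api?timeout=32s: dial tcp: lookup master.k8s.internal on 169.254.0.2:53: no such host\n"}
JSON

# Hypothetical helper: report the errand name, its exit code, and each
# distinct stderr line together with how often it was repeated.
def summarize_errand(result_json)
  result = JSON.parse(result_json)
  counts = Hash.new(0)
  result.fetch('stderr', '').each_line do |line|
    message = line.strip
    counts[message] += 1 unless message.empty?
  end
  { errand: result['errand_name'], exit_code: result['exit_code'], errors: counts }
end

# Hypothetical helper: pull host names out of
# "lookup <host> on <server>: no such host" messages.
def unresolved_hosts(messages)
  messages.map { |m| m[/lookup (\S+) on \S+: no such host/, 1] }.compact.uniq
end

summary = summarize_errand(RESULT_OUTPUT)
puts "#{summary[:errand]} exited with code #{summary[:exit_code]}"
summary[:errors].each { |message, count| puts "#{count}x #{message}" }
puts "Unresolved hosts: #{unresolved_hosts(summary[:errors].keys).join(', ')}"
```

For this task the only distinct error is the failed lookup of master.k8s.internal against 169.254.0.2:53, which points at DNS record distribution rather than at the coredns.yml spec itself.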
The sync_dns job is crashing when BOSH creates a new VM. In sync_dns.stdout.log, the following output is seen:
ERROR -- Director: Shutting down bosh-director-sync-dns: Thread terminated
If you tail the log, this message appears a couple of times within about a minute when the errand VM is created.
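To confirm how often the job crashed, the occurrences of this message can be counted. This is a small sketch only: `count_sync_dns_crashes` is a hypothetical helper, and the sample lines reuse the message quoted above plus one made-up filler line, since the exact prefix of real log lines is not shown here.

```ruby
# Message text as quoted from sync_dns.stdout.log.
CRASH_MARKER = 'Shutting down bosh-director-sync-dns: Thread terminated'

# Hypothetical helper: count the lines that contain the crash message.
# Accepts any enumerable of lines (an Array, or File.foreach(path)).
def count_sync_dns_crashes(lines)
  lines.count { |line| line.include?(CRASH_MARKER) }
end

# Sample input: the ERROR line is from the article, the filler line is invented.
sample_lines = [
  "ERROR -- Director: Shutting down bosh-director-sync-dns: Thread terminated\n",
  "(unrelated log line)\n",
  "ERROR -- Director: Shutting down bosh-director-sync-dns: Thread terminated\n"
]

puts "sync_dns crash messages: #{count_sync_dns_crashes(sample_lines)}"
```

Against a real log, the same helper could be called as `count_sync_dns_crashes(File.foreach(path))`, where `path` is the location of sync_dns.stdout.log on the Director.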
monit summary also shows that the process goes from "running" to "not monitored" and back a couple of times. Run monit summary from the BOSH Director console and check the status in the output.
Bosh::Director::Models::LocalDnsBlob.latest
<Bosh::Director::Models::LocalDnsBlob @values={:id=>5418, :blob_id=>nil, :version=>nil, :created_at=>nil, :records_version=>0, :aliases_version=>0}>
Check the output and compare it to the following:
Bosh::Director::Models::LocalDnsBlob.last
<Bosh::Director::Models::LocalDnsBlob @values={:id=>6042, :blob_id=>6041, :version=>6042, :created_at=>2019-10-22 20:22:12 UTC, :records_version=>6110, :aliases_version=>0}>
Bosh::Director::Models::LocalDnsBlob.latest
<Bosh::Director::Models::LocalDnsBlob @values={:id=>5418, :blob_id=>nil, :version=>nil, :created_at=>nil, :records_version=>0, :aliases_version=>0}>
To fix the problem, update the record returned by latest with the following command, which sets the version field to match the id:
Bosh::Director::Models::LocalDnsBlob.find(id: 5418).update(version: 5418)
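The comparison and the repair can be sketched in plain Ruby. This is an illustration under stated assumptions, not Director code: `DnsBlobRow` is a stand-in Struct rather than the real Sequel model, and `stale_latest?` and `repair_attributes` are hypothetical helpers. The field values are copied from the console output above.

```ruby
# Stand-in for a local DNS blob record (assumption: NOT the real model class).
DnsBlobRow = Struct.new(:id, :blob_id, :version, :records_version)

# Values copied from the LocalDnsBlob.latest and LocalDnsBlob.last output.
latest_row = DnsBlobRow.new(5418, nil, nil, 0)
last_row   = DnsBlobRow.new(6042, 6041, 6042, 6110)

# Hypothetical check: the symptom is a "latest" record that is not the
# newest record and has no version set.
def stale_latest?(latest, last)
  latest.id != last.id && latest.version.nil?
end

# Hypothetical helper: the described fix sets version to the record's own id.
def repair_attributes(row)
  { version: row.id }
end

if stale_latest?(latest_row, last_row)
  attrs = repair_attributes(latest_row)
  puts "Record #{latest_row.id} has no version; the fix would set version to #{attrs[:version]}"
else
  puts 'latest and last agree; no repair needed'
end
```

With the values above the check reports record 5418, which matches the `find(id: 5418).update(version: 5418)` command; on another Director the id would be whatever LocalDnsBlob.latest returns there.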