TKGI cluster creation fails due to missing or incorrect SNAT rule

Article ID: 298496

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

Symptoms:

Cluster creation fails with the following message in the BOSH task logs:

Task 332 | 18:58:48 | Updating instance worker: worker/########-cf83-####-a9cc-############ (0) (canary) (00:02:23)
                    L Error: Action Failed get_task: Task f2b11c6c-ab26-4dbc-7832-dff69df0a84d result: 2 of 6 pre-start scripts failed. Failed Jobs: ncp, kubelet. Successful Jobs: bosh-dns-enable, syslog_forwarder, bosh-dns, nsx-pod-networking.
Task 332 | 19:01:11 | Error: Action Failed get_task: Task f2b11c6c-ab26-4dbc-7832-dff69df0a84d result: 2 of 6 pre-start scripts failed. Failed Jobs: ncp, kubelet. Successful Jobs: bosh-dns-enable, syslog_forwarder, bosh-dns, nsx-pod-networking.

Task 332 Started  Fri Apr  6 18:54:29 UTC 2018
Task 332 Finished Fri Apr  6 19:01:11 UTC 2018
Task 332 Duration 00:06:42
Task 332 error

Capturing task '332' output:
  Expected task '332' to succeed but state is 'error'

Exit code 1

The TKGI ncp job fails with the following error message in /var/vcap/sys/log/ncp/pre-start.stderr.log:

cat pre-start.stderr.log 
curl: (6) Could not resolve host: test-mgr.domain.com

Environment


Cause

When the T1 logical router for the TKGI service VMs is configured, SNAT rules must be created as described in the TKGI documentation. These SNAT rules allow the Kubernetes VMs to communicate with NSX-T Manager and with other infrastructure services such as DNS and NTP. In this failure scenario the SNAT rules were missing or incorrect, so the Kubernetes VMs could not reach the DNS server. As a result, lookups of the NSX-T Manager host name failed, which is the "Could not resolve host" error reported by curl in the ncp pre-start log.
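
To tell a DNS failure apart from other connectivity problems on the affected worker VM, the curl exit code can be inspected. The following is a minimal sketch, not part of the product: the function name classify_curl_exit and its output labels are hypothetical, while the exit codes (6, 7, 28) are curl's documented values. The host name is the one from the log above.

```shell
# Hypothetical helper: map a curl exit code to a likely network cause.
# 6, 7 and 28 are curl's documented exit codes; the labels are illustrative.
classify_curl_exit() {
  case "$1" in
    0)  echo "ok" ;;
    6)  echo "dns-resolution-failed" ;;   # matches "curl: (6) Could not resolve host"
    7)  echo "connection-failed" ;;       # host resolved, but no TCP connection
    28) echo "timeout" ;;                 # request timed out
    *)  echo "other" ;;
  esac
}

# Example, run on the worker VM (assumes the manager is reached over HTTPS):
#   curl -sk -o /dev/null --connect-timeout 5 https://test-mgr.domain.com
#   classify_curl_exit "$?"
```

A result of dns-resolution-failed is consistent with the missing SNAT rule described above. A result of connection-failed or timeout would suggest that DNS works but the path to NSX-T Manager itself is blocked.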

Resolution

Configure the T1 logical router for the TKGI service VMs with the SNAT rules described in the TKGI documentation.
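
After the SNAT rules are in place, name resolution can be re-checked from a Kubernetes VM before retrying. This is a rough sketch under the assumption that getent is available on the VM; the function name check_resolves is hypothetical, and the host name is the one from the log above.

```shell
# Hypothetical helper: succeed only if every given host name resolves.
# Uses getent, which consults the system resolver (including /etc/hosts).
check_resolves() {
  rc=0
  for host in "$@"; do
    if getent hosts "$host" >/dev/null 2>&1; then
      echo "resolved: $host"
    else
      echo "NOT resolved: $host"
      rc=1
    fi
  done
  return "$rc"
}

# Example, run on the worker VM:
#   check_resolves test-mgr.domain.com
```

If the NSX-T Manager host name still does not resolve, the SNAT rule or the DNS server address configured for the VMs is worth checking again.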