TCA-2.3/3.0 TKG nodes are unable to resolve hostnames with the ".local" domain suffix

Article ID: 345726

Products

VMware Telco Cloud Automation

Issue/Introduction

Hostname resolution fails on Tanzu Kubernetes Grid (TKG) nodes for hostnames whose domain suffix ends in ".local".

Environment

VMware Telco Cloud Automation 2.3

Cause

This is a known Tanzu Kubernetes Grid (TKG) issue, documented in the VMware KB article "Tanzu Kubernetes Grid nodes are unable to resolve hostnames with the ".local" domain suffix".

Resolution

This is a known issue affecting Tanzu Kubernetes Grid. There is currently no resolution in TCA 2.3 or TCA 3.0.

Workaround:
To deploy a TKG Management Cluster with a .local domain from the TCA 2.3 or TCA 3.0 UI:

Step 1:

A. For TCA 2.3: SSH to TCA-CP with the admin credentials, switch to the root user, and then proceed with Step C.

B. For TCA 3.0: SSH to TCA-CP with the admin credentials, run the following commands to open a shell in the kbs-tkg230 pod, and then proceed with Step C:

admin@tca [ ~ ]$ kubectl get pods -A|grep tkg230
tca-cp-cn                  kbs-tkg230-55795ccb89-jlrdz                         1/1     Running     2 (8d ago)    28d

admin@tca [ ~ ]$ kubectl exec -n tca-cp-cn kbs-tkg230-55795ccb89-jlrdz -it -- bash
Defaulted container "kbs-tkg230-operator" out of: kbs-tkg230-operator, init-bootstrapper (init)
 

C. In the directory “/root/.config/tanzu/tkg/providers/”, update the file “config_default.yaml” so that the parameters below are set to the correct .local domain:
 

#! Specifies the service domain for the cluster
SERVICE_DOMAIN:
#! Node SearchDomains
CONTROL_PLANE_NODE_SEARCH_DOMAINS: vmware.local
WORKER_NODE_SEARCH_DOMAINS: vmware.local
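
The edit in Step C can also be scripted. The following is a minimal sketch, assuming GNU sed is available on the appliance and that each parameter starts at the beginning of its line, as in the listing above. The function name set_config_value is hypothetical and not part of TCA or TKG, and "vmware.local" is the article's sample domain.

```shell
# Hypothetical helper: set KEY to VALUE in a TKG config file, replacing
# whatever value (if any) is already on that line. Running it again with
# the same arguments leaves the file unchanged.
# Assumes GNU sed (-i) and that VALUE contains no "|" character.
set_config_value() {
    file=$1 key=$2 value=$3
    # Match the key at the start of a line and rewrite the whole line.
    sed -i "s|^${key}:.*|${key}: ${value}|" "$file"
}

# Example (path taken from Step C):
#   cfg=/root/.config/tanzu/tkg/providers/config_default.yaml
#   set_config_value "$cfg" CONTROL_PLANE_NODE_SEARCH_DOMAINS vmware.local
#   set_config_value "$cfg" WORKER_NODE_SEARCH_DOMAINS vmware.local
```

Because the whole line is rewritten, a second run does not append the domain twice.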


D. In the directory “/root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/”, create the two files vsphere-overlay-dns-control-plane.yaml and vsphere-overlay-dns-workers.yaml with the content shown below. Replace the nameserver address and search domain with the values for your environment.

vi vsphere-overlay-dns-control-plane.yaml
 

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")
#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate", "metadata": {"name": data.values.CLUSTER_NAME+"-control-plane"}})
---
spec:
  template:
    spec:
      network:
        devices:
        #@overlay/match by=overlay.all, expects="1+"
        -
          #@overlay/match missing_ok=True
          nameservers: ["172.22.22.43"]
          #@overlay/match missing_ok=True
          searchDomains: ["vmware.local"]


vi vsphere-overlay-dns-workers.yaml
 

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")
#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate", "metadata": {"name": data.values.CLUSTER_NAME+"-worker"}})
---
spec:
  template:
    spec:
      network:
        devices:
        #@overlay/match by=overlay.all, expects="1+"
        -
          #@overlay/match missing_ok=True
          nameservers: ["172.22.22.43"]
          #@overlay/match missing_ok=True
          searchDomains: ["vmware.local"]
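
The two overlay files differ only in the machine template suffix, so they can be generated from one template instead of being typed twice. This is a sketch under that assumption. The function name write_dns_overlay and its arguments are hypothetical, and the nameserver and domain in the example are the article's sample values.

```shell
# Hypothetical helper: write one DNS overlay file.
#   $1 = output directory
#   $2 = role: "control-plane" or "workers" (used in the file name)
#   $3 = nameserver IP address
#   $4 = search domain
write_dns_overlay() {
    dir=$1 role=$2 dns=$3 domain=$4
    # The worker machine template is named "<cluster>-worker" (no "s"),
    # while the file name uses "workers", as in the listings above.
    case $role in
        control-plane) suffix=-control-plane ;;
        workers)       suffix=-worker ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    cat > "$dir/vsphere-overlay-dns-$role.yaml" <<EOF
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")
#@overlay/match by=overlay.subset({"kind":"VSphereMachineTemplate", "metadata": {"name": data.values.CLUSTER_NAME+"$suffix"}})
---
spec:
  template:
    spec:
      network:
        devices:
        #@overlay/match by=overlay.all, expects="1+"
        -
          #@overlay/match missing_ok=True
          nameservers: ["$dns"]
          #@overlay/match missing_ok=True
          searchDomains: ["$domain"]
EOF
}

# Example (directory taken from Step D):
#   ytt_dir=/root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt
#   write_dns_overlay "$ytt_dir" control-plane 172.22.22.43 vmware.local
#   write_dns_overlay "$ytt_dir" workers 172.22.22.43 vmware.local
```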


Step 2: Deploy the TKG Management cluster from the TCA UI.

Step 3: After the TKG Management cluster has been created, complete the steps below before deploying a v2 TKG Workload Cluster.

3a: SSH to the TKG Management cluster as the capv user.


3b: Identify the "tca-kubecluster-operator" pod:
 

capv@tcpran30-mgmt-9fqh5-k8fkt [ ~ ]$ kubectl get pods -n tca-system
NAME                                        READY   STATUS    RESTARTS       AGE
nodeconfig-daemon-97csj                     1/1     Running   0              121m
nodeconfig-daemon-control-plane-4lhtw       1/1     Running   0              121m
nodeconfig-daemon-control-plane-h9s5c       1/1     Running   0              121m
nodeconfig-daemon-control-plane-s8w6b       1/1     Running   0              121m
nodeconfig-daemon-fcjpj                     1/1     Running   0              121m
nodeconfig-operator-7fcd9b44b8-qtvff        2/2     Running   0              121m
tca-diagnosis-operator-548d99df66-8276c     1/1     Running   0              120m
tca-kubecluster-operator-6879b6bd48-fv697   1/1     Running   0              120m
test-controller-65d68cf67-9lqbn             1/1     Running   0              121m
vmconfig-operator-bfd9c589-2qrf7            2/2     Running   2 (119m ago)   119m
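
The pod name suffix changes with every deployment, so it can help to capture the name in a variable rather than copying it by hand. A minimal sketch, assuming the listing format shown above; the function name first_pod_with_prefix is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# the name of the first pod whose name starts with the given prefix.
first_pod_with_prefix() {
    awk -v prefix="$1" 'index($1, prefix) == 1 { print $1; exit }'
}

# Example, run on the management cluster node:
#   pod=$(kubectl get pods -n tca-system --no-headers \
#         | first_pod_with_prefix tca-kubecluster-operator)
#   echo "$pod"
```

The variable can then stand in for the literal pod name in the kubectl cp and kubectl exec commands of the following steps.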


3c: On the management cluster control plane (master) node, create the same two files with the same content as in Step 1-D, then copy them into the tca-kubecluster-operator pod (tca-kubecluster-operator-6879b6bd48-fv697 in this example) under “/root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/”.

Example: 
 

capv@tcpran30-mgmt-9fqh5-k8fkt [ ~ ]$ kubectl -n tca-system cp vsphere-overlay-dns-control-plane.yaml tca-kubecluster-operator-6879b6bd48-fv697:/root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/

 

capv@tcpran30-mgmt-9fqh5-k8fkt [ ~ ]$ kubectl -n tca-system cp vsphere-overlay-dns-workers.yaml  tca-kubecluster-operator-6879b6bd48-fv697:/root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/


3d: Log in to the pod and verify that the files are present:
 

capv@tcpran30-mgmt-9fqh5-k8fkt [ ~ ]$ kubectl exec tca-kubecluster-operator-6879b6bd48-fv697 -n tca-system -it -- bash

root [ / ]# ls -al /root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt
total 32
drwxr-xr-x 1 root root  4096 2023-05-08 10:01 .
drwxr-xr-x 1 root root  4096 2023-03-09 09:50 ..
-rw-r----- 1 1000 users  491 2023-05-08 10:00 vsphere-overlay-dns-control-plane.yaml
-rw------- 1 root root  8392 2023-03-09 09:50 vsphere-overlay.yaml
-rw-r----- 1 1000 users  484 2023-05-08 10:01 vsphere-overlay-dns-workers.yaml 
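
The same check can be made without reading the listing by eye. The sketch below is meant to run inside the pod and tests only that both files exist and are non-empty; it does not validate their content. The function name check_dns_overlays is hypothetical.

```shell
# Hypothetical helper: succeed only if both DNS overlay files exist and
# are non-empty in the given directory; name any file that fails.
check_dns_overlays() {
    dir=$1 rc=0
    for f in vsphere-overlay-dns-control-plane.yaml vsphere-overlay-dns-workers.yaml; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            rc=1
        fi
    done
    return $rc
}

# Example (directory taken from the listing above):
#   check_dns_overlays /root/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt \
#       && echo "overlay files present"
```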

3e: Inside the pod, update the file “config_default.yaml” in the directory “/root/.config/tanzu/tkg/providers/” so that the parameters below are set to the correct .local domain:

# cd /root/.config/tanzu/tkg/providers/

root [ ~/.config/tanzu/tkg/providers ]# cat config_default.yaml | grep NODE_SEARCH_DOMAINS
CONTROL_PLANE_NODE_SEARCH_DOMAINS: 
WORKER_NODE_SEARCH_DOMAINS:

root [ ~/.config/tanzu/tkg/providers ]# sed -i "s/^CONTROL_PLANE_NODE_SEARCH_DOMAINS:.*/CONTROL_PLANE_NODE_SEARCH_DOMAINS: vmware.local/" config_default.yaml

root [ ~/.config/tanzu/tkg/providers ]# sed -i "s/^WORKER_NODE_SEARCH_DOMAINS:.*/WORKER_NODE_SEARCH_DOMAINS: vmware.local/" config_default.yaml

root [ ~/.config/tanzu/tkg/providers ]# cat config_default.yaml | grep NODE_SEARCH_DOMAINS
CONTROL_PLANE_NODE_SEARCH_DOMAINS: vmware.local
WORKER_NODE_SEARCH_DOMAINS: vmware.local


Step 4: Now proceed with the v2 TKG workload cluster deployment.

Additional Information

Impact/Risks:
Deployment of TKG Management or Workload clusters from the Telco Cloud Automation (TCA) UI fails when hostnames use the ".local" domain suffix.