DHCP provided NTP servers do not get reflected in TKG nodes prior to TKG 1.4.1

Article ID: 327444

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

Symptoms:

Affected versions – all vSphere OVA versions prior to TKG 1.4.1

DHCP-provided NTP servers are not reflected in vSphere TKG nodes. As a result, time drift is likely to occur in environments where public internet NTP pools are unreachable and static NTP servers have not been configured.
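To confirm that a node is affected, the following check (a sketch; capv is the standard TKG node user used in the examples below) lists the active chrony sources and the configured server/pool lines. On affected nodes, only the default public pool entries appear:

ssh capv@<Node IP> 'chronyc sources; grep -E "^(server|pool)" /etc/chrony.conf'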

Starting with TKG 1.4.1, TKG OVAs use systemd-networkd to add NTP servers provided via DHCP Option 42 to the chrony configuration. For all TKG versions prior to 1.4.1, it is recommended to set static NTP addresses via a ytt overlay before deploying management or workload clusters, as described in the Tanzu Kubernetes Grid documentation.

Clusters that are already deployed must be patched in place: update the configuration on the deployed nodes, then patch the kubeadm configuration so that future nodes pick up the change.


Resolution

This issue is fixed in TKG 1.4.1.

Workaround:
To patch nodes that are already deployed, a script is executed over SSH on each node to inject the NTP server IP address and restart chrony.

Step #1
- Save the following script with your NTP server details (for example as ntp-fix.sh, the name used in the loop below); it edits the chrony.conf file and restarts the chronyd service.

#!/bin/sh
set -x
# Append the static NTP server, remove the default public pool entries,
# then restart chronyd to apply the new configuration.
echo 'server <NTP Server IP> iburst prefer' | sudo tee -a /etc/chrony.conf
sudo sed -i '/^pool/d' /etc/chrony.conf
sudo systemctl restart chronyd

- If using kubectx to manage Kubernetes contexts, write the list of contexts to a file named clusters: kubectx > clusters

for name in $(cat clusters); do
  kubectx "$name"
  # Collect the external IP of every node in the current context.
  NODEIPS=$(kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}')
  # Run the fix script on each node, then print each node's time to verify.
  for i in $NODEIPS; do echo -n "$i "; ssh capv@$i 'bash -s' < ntp-fix.sh; done
  sleep 5
  for i in $NODEIPS; do echo -n "$i "; ssh capv@$i date; done
done

- Execute the loop above on the nodes of all clusters to update chrony.conf and restart the chronyd service.
- Validate the time on all nodes of all clusters by comparing it with the UTC time on the DHCP server.
- Validate that the output of chronyc sources shows the correct NTP sources.
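As a sketch of the last two checks for a single cluster context (reusing the NODEIPS variable from the loop above), the following prints each node's chrony sources and UTC time:

for i in $NODEIPS; do
  echo "== $i =="
  ssh capv@$i 'chronyc sources; date -u'
done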

Step #2
- Persist the changes across the nodes of all clusters (control plane & worker nodes).
  There are two situations:
  
  Situation 1 - Persisting the changes when creating new clusters
  - Create an overlay file (any file name ending in .yaml, for example ntp_overlay.yaml) under "~/.tanzu/tkg/providers/ytt/"
#@ load("@ytt:overlay", "overlay")
#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #! Add NTP to all k8s nodes (repeat the append below once per NTP server)
    #@overlay/append
    - echo 'server <NTP-Server-IP> iburst prefer' >> /etc/chrony.conf
    #! Remove old pool
    #@overlay/append
    - sed -i '/^pool/d' /etc/chrony.conf
    #! Restart chrony
    #@overlay/append
    - systemctl restart chronyd

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"})
---
spec:
  template:
    spec:
      preKubeadmCommands:
        #! Add NTP to all k8s nodes (repeat the append below once per NTP server)
        #@overlay/append
        - echo 'server <NTP-Server-IP> iburst prefer' >> /etc/chrony.conf
        #! Remove old pool
        #@overlay/append
        - sed -i '/^pool/d' /etc/chrony.conf
        #! Restart chrony
        #@overlay/append
        - systemctl restart chronyd
  - Create the new cluster.    
tanzu cluster create -f ~/.tanzu/tkg/clusterconfigs/workload-test.yaml
  - [For reference] Generic details about patching TKGm nodes with ytt overlays: https://docs.vmware.com/en/VMware-Tanzu-Kubernetes-Grid/1.3/vmware-tanzu-kubernetes-grid-13/GUID-tanzu-k8s-clusters-config-plans.html#ytt-overlays-4
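To confirm that the overlay takes effect before actually creating a cluster, one option (a sketch; the output file name is arbitrary) is to render the manifests with the tanzu CLI's --dry-run option and inspect the injected commands:

tanzu cluster create -f ~/.tanzu/tkg/clusterconfigs/workload-test.yaml --dry-run > workload-test-manifest.yaml
grep -A6 'preKubeadmCommands' workload-test-manifest.yaml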

  Situation 2 - Persist the changes on the existing cluster nodes. On each cluster:
  - Start by patching the control plane object (repeat the echo entry once per NTP server):
kubectl -n <namespace-where-cluster-resides> patch KubeadmControlPlane <kubeadm-control-plane-object> --type='json' -p='[
{"op": "add", "path": "/spec/kubeadmConfigSpec/preKubeadmCommands/-", "value": "echo \"server <NTP-Server-IP> iburst prefer\" >> /etc/chrony.conf"},
{"op": "add", "path": "/spec/kubeadmConfigSpec/preKubeadmCommands/-", "value": "sed -i \"/^pool/d\" /etc/chrony.conf"},
{"op": "add", "path": "/spec/kubeadmConfigSpec/preKubeadmCommands/-", "value": "systemctl restart chronyd"}]'

  - Wait for the control plane nodes to be recreated
  - Start patching the worker nodes

kubectl -n <namespace-where-cluster-resides> patch KubeadmConfigTemplate <kubeadm-config-template-object> --type='json' -p='[
{"op": "add", "path": "/spec/template/spec/preKubeadmCommands/-", "value": "echo \"server <NTP-Server-IP> iburst prefer\" >> /etc/chrony.conf"},
{"op": "add", "path": "/spec/template/spec/preKubeadmCommands/-", "value": "sed -i \"/^pool/d\" /etc/chrony.conf"},
{"op": "add", "path": "/spec/template/spec/preKubeadmCommands/-", "value": "systemctl restart chronyd"}]'


Note - Patching the KubeadmConfigTemplate does not trigger recreation of existing worker nodes; the patched commands apply only to worker nodes created afterwards (existing workers were already fixed over SSH in Step #1).
Validate that the output of chronyc sources shows the correct NTP sources.
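If a rollout of the existing worker nodes is nevertheless desired, one common Cluster API approach (a sketch, not part of this article's procedure; the annotation key and value are arbitrary) is to change the MachineDeployment's machine template metadata, which creates a new MachineSet and replaces the workers:

kubectl -n <namespace-where-cluster-resides> patch machinedeployment <machine-deployment-object> --type merge -p '{"spec":{"template":{"metadata":{"annotations":{"ntp-fix":"applied"}}}}}'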