After SSH'ing to your TKGm cluster's control plane node(s), which are based on VMC on AWS, you notice that the 'date' command reports the time in UTC but the time has drifted. In this scenario the time drift was between 5 and 10 minutes.
You also notice from your jumpbox server that, when performing kubectl queries or upgrades against your TKGm clusters on VMC on AWS, the AGE column is not displayed correctly; you see <invalid> displayed instead.
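One quick way to confirm the drift is to compare the node's clock against a host whose time is known to be good, such as the jumpbox, and to ask chrony for its measured offset on the node itself. The node address below is a placeholder, and the capv SSH user is the usual user on TKGm node images:
$ date -u                                    # on the jumpbox
$ ssh capv@<control-plane-node-ip> date -u   # on the control plane node
$ chronyc tracking                           # run on the node; shows the current offset and reference source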
TKGm 2.5.2
VMC on AWS
The chronyd.service's /etc/chrony/chrony.conf file has been configured with pool.ntp.org, which is incorrect for VMC on AWS.
When viewing the journalctl logs for chronyd, you see messages like the following:
May 30 15:11:04 #####-control-plane-##### chronyd[583]: Selected source 212.71.253.212 (pool.ntp.org)
May 30 15:12:10 #####-control-plane-##### chronyd[583]: Source 217.154.60.177 replaced with 109.74.206.120 (pool.ntp.org)
May 30 15:15:22 #####-control-plane-##### chronyd[583]: Can't synchronise: no majority
May 30 15:21:15 #####-control-plane-##### chronyd[583]: Selected source 109.74.206.120 (pool.ntp.org)
When SSH'd to the control plane, after installing netcat for instance, 'nc -vzu <pool.ntp.org_IP> 123' times out, whereas the same test against the AWS NTP IP 169.254.169.123 does not.
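For example, substituting one of the pool.ntp.org addresses seen in the chronyd journal for the first test:
$ nc -vzu 212.71.253.212 123     # pool.ntp.org address - times out
$ nc -vzu 169.254.169.123 123    # Amazon Time Sync Service - does not time out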
Note: the steps below are found in KB 329764.
Pre-Req: Add a Firewall rule in your Compute Gateway which allows NTP traffic to 169.254.169.123.
Sample Rule:
Source: Compute Workload VM/Segment
Destination: 169.254.169.123
Services: NTP (UDP:123)
Applied To: Internet Interface or Direct Connect Interface (i.e. the interface where the default route is pointing; if the default route is not advertised over a Direct Connect, it will be the Internet Interface). In this example, we do not have a DX connection to the SDDC, so the rule is applied to the Internet Interface.
Note: If you have a default route advertised over a VPN, then you will not be able to use the native Amazon Time Sync Service.
Be sure that a firewall rule exists in your Compute Gateway that allows NTP traffic to 169.254.169.123.
Follow the procedure below to correct the time sync on the problem cluster and its nodes.
For management and workload legacy clusters, using a ytt overlay:
$ cat > ~/.tanzu/tkg/providers/ytt/03_customizations/add_ntp.yaml <<EOF
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    #@overlay/match missing_ok=True
    ntp:
      enabled: true
      servers:
        - 169.254.169.123

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"}),expects="1+"
---
spec:
  template:
    spec:
      #@overlay/match missing_ok=True
      ntp:
        enabled: true
        servers:
          - 169.254.169.123
EOF
$ tanzu cluster create dryrun-cluster --dry-run --file cluster-config.yaml > dryrun-cluster.yaml
$ cat dryrun-cluster.yaml | yq e 'select(.kind == "KubeadmControlPlane") | .spec.kubeadmConfigSpec.ntp' -
enabled: true
servers:
  - 169.254.169.123
$ kubectl get secrets mgmt-control-plane-98zlp -o json | jq '.data.value' -r | base64 -d | grep ntp -A 4
ntp:
  enabled: true
  servers:
  - 169.254.169.123
$ cat /etc/chrony/chrony.conf | grep server
# Use servers from the NTP Pool Project. Approved by Ubuntu Technical Board
# servers
server 169.254.169.123 iburst
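To confirm that chronyd on a node is actually synchronising against the Amazon Time Sync Service, chronyc can be queried on the node (assuming chronyc is available, as it is on standard TKGm node images):
$ chronyc sources -v    # 169.254.169.123 should be listed and selected (marked with *)
$ chronyc tracking      # the reference should be 169.254.169.123 and the offset should converge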
For management and workload ClusterClass clusters:
1. When creating the cluster, add the NTP server variable to your cluster configuration file (see Configuration File Variables: https://techdocs.broadcom.com/us/en/vmware-tanzu/standalone-components/tanzu-kubernetes-grid/2-5/tkg/config-ref.html#vsphere): NTP_SERVERS: "169.254.169.123"
2. After the cluster is created, the generated class-based object structure will contain the variable, like the following (a verification example follows step 3):
kind: Cluster
spec:
  topology:
    variables:
      - name: ntpServers
        value:
          - "169.254.169.123"
3. On the nodes of the cluster, the NTP server setting will be present in the chrony service configuration, like the following:
$ cat /etc/chrony/chrony.conf | grep server
# Use servers from the NTP Pool Project. Approved by Ubuntu Technical Board
# servers
server 169.254.169.123 iburst
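You can also confirm the variable on a live cluster from the jumpbox; the cluster name my-cluster below is a placeholder:
$ kubectl get cluster my-cluster -o yaml | yq e '.spec.topology.variables[] | select(.name == "ntpServers")' -
This should return the ntpServers variable with the value 169.254.169.123.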
Workaround:
If the TKG clusters have already been created and you want to modify the NTP parameters without downtime, you may edit /etc/chrony/chrony.conf with the NTP server and restart chronyd.service. Be aware that this workaround is not persistent if the VM gets recreated.
$ vim /etc/chrony/chrony.conf
# Use servers from the NTP Pool Project. Approved by Ubuntu Technical Board
# servers
server 169.254.169.123 iburst
$ systemctl restart chronyd.service
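Because the clock may already be several minutes off, chronyd can take some time to slew it back gradually. If an immediate correction is acceptable on the node you are remediating, chronyc can step the clock right away, and you can then confirm the remaining offset:
$ chronyc makestep
$ chronyc tracking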