NTP synchronization issue on the Supervisor Control Plane VM



Article ID: 411463


Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

Time synchronization between the NTP server and the Supervisor control plane VM fails. The time synchronization service (systemd-timesyncd) on the Supervisor control plane VM restarts in a loop and never synchronizes with the NTP server.

When multiple NTP servers are configured and the first NTP server is down, inaccessible, or otherwise impaired, the Supervisor control plane VM does not fall back to the next working NTP server, because time synchronization is stuck in a restart loop.

 

Journal Logs:
Command : journalctl -u systemd-timesyncd

Month DD TT:MM:SS changeme systemd[1]: Started Network Time Synchronization.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[463]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[463]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd[1]: Stopping Network Time Synchronization ...
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd[1]: systemd-timesyncd.service: Succeeded.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd[1]: Stopped Network Time Synchronization.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd[1]: Starting Network Time Synchronization ...
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd[1]: Started Network Time Synchronization.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924] :
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Initial synchronization to time server NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Network configuration changed, trying to establish connection.
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd-timesyncd[6924]: Timed out waiting for reply from NTP IP (NTP IP).
Month DD TT:MM:SS 42162a59125e1fc869007d0b040df042 systemd[1]: Stopping Network Time Synchronization ...
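
The restart loop can be confirmed by counting the relevant journal messages rather than reading the log by eye. The following is a minimal sketch and is not part of the original procedure; the helper name count_timesyncd_events and the one-hour window in the usage line are illustrative assumptions. It matches only the message strings shown in the log excerpt above.

```shell
#!/bin/sh
# Count restart-loop indicators in systemd-timesyncd journal text read from stdin.
# count_timesyncd_events is a hypothetical helper name, not a VMware-provided tool.
count_timesyncd_events() {
    awk '
        /Stopping Network Time Synchronization/  { stops++ }
        /Timed out waiting for reply/            { timeouts++ }
        /Initial synchronization to time server/ { syncs++ }
        END { printf "stops=%d timeouts=%d syncs=%d\n", stops, timeouts, syncs }
    '
}

# Usage on a Supervisor control plane VM (the time window is an arbitrary choice):
# journalctl -u systemd-timesyncd --since "1 hour ago" --no-pager | count_timesyncd_events
```

Several stops and many timeouts with few or no successful synchronizations within a short window would be consistent with the loop described in this article.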

Environment

vSphere with Tanzu

VMware vSphere Kubernetes Service 

Cause

This is a known issue in the SSO domain change check (sync_sso_domain_change) performed by the wcp-sync service on the Supervisor control plane node.

Resolution

Permanent Fix:

This issue is resolved in vSphere 8.0U3e or higher.


Workaround:

1. Restore the failed or unreachable NTP server first.

2. If the NTP server cannot be restored and the next working NTP server in the list must be used, follow the steps below to break the time synchronization restart loop.

# Log in to each Supervisor control plane VM directly, one at a time

ssh root@supervisor_ipaddr 


# Back up the original file

cp /usr/lib/vmware-wcp/update-controller/sync.py /root/syncbak.py


# Comment out line 516 (the sync_sso_domain_change call)

chmod 600 /usr/lib/vmware-wcp/update-controller/sync.py
vi /usr/lib/vmware-wcp/update-controller/sync.py
:
# Sync if domain changes
# retry = self.sync_sso_domain_change(messages) or retry <<<=== Comment out or delete this line.
:


# Restart wcp-sync

systemctl restart wcp-sync
systemctl status wcp-sync # The output should show active (running)
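
Because the same edit must be repeated on every control plane VM, it can also be scripted. The sketch below is an illustration under stated assumptions, not a VMware-provided procedure: the function name comment_out_sso_sync is hypothetical, and it matches the statement text shown above instead of relying on line 516, on the assumption that the statement appears in the file exactly as written in this article. The backup to /root from the steps above should still be taken first.

```shell
#!/bin/sh
# comment_out_sso_sync FILE
# Prefix the sync_sso_domain_change call with "# ", preserving its indentation.
# Hypothetical helper; assumes the statement text matches this article exactly.
comment_out_sso_sync() {
    file=$1
    # Refuse to modify the file if the active (uncommented) statement is absent,
    # for example because it was already commented out or the build differs.
    if ! grep -q '^[[:space:]]*retry = self\.sync_sso_domain_change(messages) or retry' "$file"; then
        echo "no active sync_sso_domain_change call found in $file" >&2
        return 1
    fi
    # -i.orig edits in place and keeps the unmodified file as FILE.orig.
    sed -i.orig 's/^\([[:space:]]*\)\(retry = self\.sync_sso_domain_change(messages) or retry\)/\1# \2/' "$file"
}

# Usage (path taken from the steps above):
# comment_out_sso_sync /usr/lib/vmware-wcp/update-controller/sync.py && systemctl restart wcp-sync
```

The function returns a non-zero status without changing anything when the statement is not found, so running it a second time on an already patched VM is harmless.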