Time Synchronization Issues After Upgrading VMware Aria Operations

Article ID: 342839

Updated On:

Products

VMware Aria Suite

Issue/Introduction

After upgrading VMware Aria Operations, you may encounter a time difference across the Analytics nodes within the VMware Aria Operations cluster. This issue is often due to the ntpd service being down on one or more nodes. The time discrepancy can lead to several problems, including:

  • Cluster failing to come online
  • Upgrade failures
  • Data collection issues

These problems are critical because they can affect the operation and reliability of the entire VMware Aria Operations cluster.

The following error messages can be found in the /storage/log/vcops/log/analytics-wrapper.log file, indicating a time synchronization issue:

2024/08/07 05:23:23 | INFO   | jvm 1    | WARNING: Please consider reporting this to the maintainers of com.vmware.vcops.casarest.client.HttpRequesterURLConnectionImpl
2024/08/07 05:23:23 | INFO   | jvm 1    | WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
2024/08/07 05:23:23 | INFO   | jvm 1    | WARNING: All illegal access operations will be denied in a future release
2024/08/07 05:23:24 | INFO   | jvm 1    | >>> AnalyticsMain.run failed with error: IllegalStateException: Time difference between servers is:125402 ms. It is greater than 30000 ms. Unable to operate, terminating...
2024/08/07 05:23:24 | INFO   | jvm 1    | WrapperManager Debug: WrapperManager.stop(-1) called by thread: SystemExitThread
2024/08/07 05:23:24 | INFO   | jvm 1    | WrapperManager Debug: Send a packet STOP : -1
2024/08/07 05:23:24 | INFO   | jvm 1    | WrapperManager Debug: Pausing for 1,000ms to allow a clean shutdown...
2024/08/07 05:23:24 | INFO   | jvm 1    | WrapperManager Debug: Stopped checking for control events.
2024/08/07 05:23:24 | DEBUG  | wrapperp | read a packet STOP : -1
2024/08/07 05:23:24 | DEBUG  | wrapper  | JVM requested a shutdown. (-1)
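
To check whether a given node has logged this failure, the log can be searched for the exception text. The following is a minimal sketch rather than an official procedure: the function name find_time_sync_errors is hypothetical, and the default path is the log location given above.

```shell
#!/bin/sh
# Hypothetical helper: print the log lines that report the time-difference failure.
# The default path is the analytics-wrapper.log location named in this article;
# pass a different path as the first argument to search another file.
find_time_sync_errors() {
    log_file="${1:-/storage/log/vcops/log/analytics-wrapper.log}"
    # -F matches the phrase literally rather than as a regular expression.
    grep -F "Time difference between servers" "$log_file"
}
```

Running find_time_sync_errors with no arguments on an Analytics node prints any matching lines. No output means the message is not present in that file.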

Additionally, the status of the ntpd service on an affected node can be checked with the following command:

# systemctl status ntpd
ntpd.service - Network Time Service
Loaded: loaded (/usr/lib/systemd/system/ntpd.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Tue 2023-08-01 11:33:24 UTC; 1s ago
Docs: man:ntpd
Process: 862 ExecStart=/usr/bin/ntpd -g -u ntp:ntp (code=exited, status=0/SUCCESS)
Main PID: 874 (code=exited, status=0/SUCCESS)
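
When several nodes need checking, a one-line status can be easier to compare than the full output above. The sketch below assumes only the standard systemctl is-active subcommand; the function name check_ntpd is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether the ntpd unit is running on this node.
# `systemctl is-active --quiet` prints nothing and exits 0 only when the unit is active.
check_ntpd() {
    if systemctl is-active --quiet ntpd; then
        echo "ntpd is active"
    else
        echo "ntpd is NOT active"
        return 1
    fi
}
```

The non-zero return status on an inactive service lets the function be used in a larger script, for example check_ntpd || echo "needs attention".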

Environment

VMware Aria Operations 8.x

Cause

The ntpd service is inactive (dead) on one or more nodes in the cluster. This service is responsible for maintaining time synchronization across the nodes. When it is not running, the node clocks can drift apart, leading to errors and potential cluster instability.

Resolution

To resolve the issue, stop and disable the systemd-timesyncd service, then start the ntpd service.

Complete the following steps on all Analytics nodes (Primary, Replica (if present), and Data nodes) simultaneously.

  1. Log in to the Analytics node as root via SSH or Console.
  2. Run the following command to stop and disable the systemd-timesyncd service:
     systemctl stop systemd-timesyncd && systemctl disable systemd-timesyncd
  3. Run the following command to start the ntpd service:
     systemctl start ntpd
  4. Run the following command on each node:
     date

Cross-reference the output between nodes to verify that the times are now synced.
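
Comparing date output by eye can be replaced with a small calculation. The sketch below is illustrative only: max_skew_ms is a hypothetical helper, and the hostnames in the commented example are placeholders. It takes one epoch-seconds reading per node (as produced by date -u +%s) and prints the largest difference in milliseconds, which can then be compared with the 30000 ms limit reported in the log message above.

```shell
#!/bin/sh
# Hypothetical helper: given one epoch-seconds reading per node, print the
# largest difference between any two readings, in milliseconds.
max_skew_ms() {
    min=$1
    max=$1
    for t in "$@"; do
        if [ "$t" -lt "$min" ]; then min=$t; fi
        if [ "$t" -gt "$max" ]; then max=$t; fi
    done
    echo $(( (max - min) * 1000 ))
}

# Example with placeholder hostnames (replace with the cluster's own nodes):
#   max_skew_ms $(for n in node1.example.com node2.example.com; do
#       ssh "root@$n" date -u +%s; done)
```

For example, max_skew_ms 100 130 prints 30000. Because the readings are collected one node at a time, a difference of a second or two can come from the collection itself rather than from clock drift.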