VCF Operations cluster first initialization hangs

Article ID: 428606

Updated On:

Products

VCF Operations

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

The initialization/configuration of a VCF Operations cluster can hang during the NTP configuration phase. The issue can occur during the very first initialization/configuration of the cluster.

The configuration can be performed through either:

  • Admin UI
  • REST API endpoint https://<Primary-IP>/casa/cluster, with a request body such as:
    {
      "master": {
        "name": "###",
        "address": "###.###.###.###",
        "thumbprint": "##:##:##"
      },
      "admin_password": "#####",
      "ntp_servers": ["##.##.##"],
      "init": true,
      "dry-run": false,
      "enable-ca": false
    }

To verify whether the NTP process has hung, SSH to the VCF Operations primary node as root and run:

ps aux | grep ntp

If the output shows the following processes, the NTP synchronization has hung:

root <PID_#1> /usr/bin/sudo -n /usr/lib/vmware-python-3/bin/python /usr/lib/vmware-casa/bin/ntp_update.py
root <PID_#2> /usr/lib/vmware-python-3/bin/python /usr/lib/vmware-casa/bin/ntp_update.py
root <PID_#3> /usr/bin/ntpd -nNgq
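
The check above can be scripted. The following is a minimal sketch, assuming a POSIX shell and awk are available on the primary node. The function name ntp_looks_hung is hypothetical and not part of the product. It reads ps output on stdin and succeeds only when both ntp_update.py and the one-shot /usr/bin/ntpd -nNgq process are listed.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the product).
# Reads `ps` output on stdin and returns 0 only when both the
# ntp_update.py wrapper and the one-shot "ntpd -nNgq" are present.
ntp_looks_hung() {
    awk '
        /ntp_update\.py/         { updater = 1 }
        /\/usr\/bin\/ntpd -nNgq/ { ntpd = 1 }
        END { exit !(updater && ntpd) }
    '
}

# Usage on the primary node:
#   ps aux | ntp_looks_hung && echo "ntpd one-shot sync is still running"
```

Because ntpd started with -q normally exits once the clock has been set, these processes can also appear briefly during a healthy run. Repeating the check after a short wait helps distinguish a hang from a synchronization that is still in progress.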

Environment

VCF Operations 9.x

Aria Operations 8.18.x

Cause

The time synchronization process (/usr/bin/ntpd -nNgq) hangs indefinitely when it attempts to synchronize with an external NTP server that does not respond.

Resolution

  1. SSH into the VCF Operations primary node as root.
  2. Run ps aux | grep ntp
  3. Identify the PID of the ntpd process in the output:
    root <PID_#3> /usr/bin/ntpd -nNgq
  4. Run kill -9 <PID_#3>
  5. The initialization should now proceed past the NTP configuration phase.
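
Steps 2 to 4 can be combined into a small script. This is a sketch under stated assumptions rather than a supported tool. The function names hung_ntpd_pids and kill_hung_ntpd are hypothetical, and the script assumes the hung process command line is exactly /usr/bin/ntpd -nNgq, as shown in the ps output above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with the product).

# Print the PID of every process whose command line is exactly
# "/usr/bin/ntpd -nNgq". Expects "PID ARGS..." lines on stdin,
# as produced by: ps -e -o pid= -o args=
hung_ntpd_pids() {
    awk '{
        pid = $1
        $1 = ""             # drop the PID column, keep the command line
        sub(/^ +/, "")      # trim the separator left behind
        if ($0 == "/usr/bin/ntpd -nNgq") print pid
    }'
}

# Kill each matching process with SIGKILL, as in step 4.
# Defined only; nothing is killed until the function is called.
kill_hung_ntpd() {
    ps -e -o pid= -o args= | hung_ntpd_pids | while read -r pid; do
        echo "Killing hung ntpd one-shot sync, PID $pid"
        kill -9 "$pid"
    done
}

# Usage on the primary node, as root:
#   kill_hung_ntpd
```

Matching the full command line instead of the substring "ntp" keeps the script from selecting the ntp_update.py wrapper processes or the grep command itself.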