Skyline Health Cluster Status Degraded Due to vSphere Cluster Services (vCLS) Failure Caused by Trailing Dot in ESXi Hostname

Article ID: 421342

Products

VMware vCenter Server

Issue/Introduction

Skyline Health reports that a cluster's status has degraded.

The issue is observed in the vSphere Client where:

  • The top-level inventory shows the alarm: "Skyline Health has detected issues in your vSphere environment".
  • Navigating to Monitor > Skyline Health shows a "vSphere Cluster Services health status" warning, which notes that vSphere DRS depends on vCLS health.
  • Clicking on Cluster Health shows the status as Unhealthy with the error: "# load balancing runs skipped."
  • An error is found in the ESXi host's /var/run/log/infravisor.log indicating a hostname validation failure:
    ...ValidatePodCreate failed: [spec.nodeName: Invalid value: \"<Node_name>\": a lowercase RFC 1123 subdomain must consist of lower case alphanumeric characters, '-' or '.', and must start and end with an alphanumeric character (e.g. 'example.com', regex used for validation is '[a-z0-9]([-a-z0-9]*[a-z0-9])?(\\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*')]" 
  • The affected cluster has both HA and DRS turned off, and attempts to force a vCLS regeneration by toggling HA/DRS or setting the vCLS status to Retreat do not resolve the issue.
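
The log check described above can be sketched as a small shell helper. This is a minimal sketch, not an official diagnostic: the function name find_vcls_hostname_errors is hypothetical, and the two search strings are taken from the log excerpt shown above, on the assumption that they appear verbatim in the affected host's log.

```shell
#!/bin/sh
# Hypothetical helper: print infravisor log lines that record the
# RFC 1123 hostname validation failure described in this article.
# The default path is the log location named above; pass another
# path as the first argument to search a copied log file instead.
find_vcls_hostname_errors() {
    log_file="${1:-/var/run/log/infravisor.log}"
    # Both fixed strings come from the log excerpt in this article.
    grep -F 'ValidatePodCreate failed' "$log_file" | grep -F 'RFC 1123'
}

# Example (run on the ESXi host):
#   find_vcls_hostname_errors
```

If the function prints nothing, the log does not contain this specific validation failure, and the degraded status may have a different cause.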

Cause

The root cause is an improperly formatted ESXi hostname.

The hostname, when checked with the hostname or uname -a command on the ESXi host, ends with a trailing dot (.).

The trailing dot violates the RFC 1123 naming standard required by the vSphere Cluster Services (vCLS). vCLS uses Kubernetes-based components to establish quorum, and the system rejects hostnames that fail the RFC 1123 validation, preventing the vCLS Pods from being created and deployed successfully.
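
To illustrate why a trailing dot fails validation, the sketch below tests a name against the regular expression quoted in the infravisor.log error message. The function name is_rfc1123_subdomain is hypothetical, and the check covers only the character pattern from the log. Any additional limits the validator may apply, such as a maximum length, are not modeled here.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if the name matches the lowercase
# RFC 1123 subdomain pattern quoted in the log message, 1 otherwise.
# The pattern is anchored with ^ and $ so the whole name must match.
is_rfc1123_subdomain() {
    printf '%s\n' "$1" | grep -Eq \
        '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}

# Example:
#   is_rfc1123_subdomain "$(hostname)" && echo valid || echo invalid
```

Every dot in the pattern must be followed by at least one alphanumeric character, so a name such as esx01.example.com. (a placeholder name) cannot match, while esx01.example.com can.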

Resolution

To resolve the degraded cluster status, the trailing dot must be removed from the ESXi hostname:

  1. Access the ESXi Shell via SSH or console.
  2. Set the hostname, ensuring the trailing dot is removed. Replace <hostname without dot> with the correct name:
    esxcli system hostname set --host=<hostname without dot>
  3. Verify the hostname change:
    esxcli system hostname get
  4. After the hostname is set correctly, the vSphere Cluster Services will automatically begin to redeploy and initialize the necessary components. The cluster health status in Skyline Health returns to Healthy within approximately 10 minutes.
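
As a convenience, the corrected name for step 2 can be derived from the current one. The following is a minimal sketch under stated assumptions: the helper names strip_trailing_dot and suggest_hostname_fix are hypothetical, and the helper only prints the esxcli command from step 2 so that it can be reviewed before being run manually. It removes a single trailing dot and does not otherwise validate the name.

```shell
#!/bin/sh
# Hypothetical helper: remove one trailing dot from a name, if present.
strip_trailing_dot() {
    printf '%s\n' "${1%.}"
}

# Hypothetical helper: print the command from step 2 for the given
# name, or report on stderr that no change is needed.
suggest_hostname_fix() {
    current="$1"
    fixed="$(strip_trailing_dot "$current")"
    if [ "$current" != "$fixed" ]; then
        printf 'esxcli system hostname set --host=%s\n' "$fixed"
    else
        printf 'No trailing dot in "%s"; no change needed.\n' "$current" >&2
    fi
}

# Example (run on the ESXi host, then review the printed command):
#   suggest_hostname_fix "$(hostname)"
```

Printing the command instead of executing it keeps the hostname change a deliberate, reviewed step.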