All KMS servers appear to be retrieving for a few minutes if one of the KMS servers is not available

Article ID: 403042

Updated On:

Products

VMware vCenter Server

Issue/Introduction

  • When one of the configured KMS (Key Management Server) nodes becomes unreachable or goes offline, the vCenter Server UI behaves as follows:
    • All KMS servers, including healthy ones, display a Connection Status of:

      • "Retrieving" or

      • "Initializing"

  • This status persists for several minutes, even for healthy KMS nodes.

  • The issue occurs because:

    • vCenter attempts to validate the status of the entire KMS cluster, not just individual KMS servers.

    • The UI does not immediately distinguish between a single point of failure and a full cluster-wide issue.

Environment

vCenter Server 7.x, 8.x, 9.0

Cause

  • When validating the KMS cluster, vCenter makes up to three connection attempts per KMS server, retrying whenever an attempt fails.

    • Each attempt is subject to a 60-second timeout.

  • Therefore, if a KMS server is unreachable:

    • vCenter can spend up to 3 minutes (3 attempts × 60 seconds) trying to connect before reporting a final status.

  • During this time:

    • The vCenter UI shows "Retrieving" (or "Initializing") while it awaits responses from the affected KMS node(s).

    • Even healthy KMS nodes show this status, as the validation API waits for the entire cluster response.
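
The timing described above can be sketched as a small model. This is an illustration only, not vCenter code: the `KmsServer` class, the server names, and the 0.5-second healthy response time are hypothetical, and the model assumes the servers are checked concurrently (if they were checked one after another, the delays would add up instead).

```python
from dataclasses import dataclass

ATTEMPTS = 3          # up to three connection attempts per KMS server (from this article)
TIMEOUT_SECONDS = 60  # each attempt times out after 60 seconds (from this article)


@dataclass
class KmsServer:
    name: str                      # hypothetical label for the example
    reachable: bool
    response_seconds: float = 0.5  # assumed response time of a healthy server


def time_to_final_status(server: KmsServer) -> float:
    """Seconds until a final status is known for a single server."""
    if server.reachable:
        return server.response_seconds
    # An unreachable server uses every attempt, each running to its timeout.
    return ATTEMPTS * TIMEOUT_SECONDS


def cluster_status_delay(servers: list[KmsServer]) -> float:
    """Seconds until the cluster-wide status call returns.

    Assumes concurrent checks, so the slowest server determines the delay.
    """
    return max(time_to_final_status(s) for s in servers)


cluster = [KmsServer("kms-a", reachable=True), KmsServer("kms-b", reachable=False)]
print(cluster_status_delay(cluster))  # 180
```

In this model, a single unreachable node holds the result for the whole cluster for 180 seconds, even though the healthy node answered in under a second.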

Resolution

  • No changes or actions are required in vCenter.
    This is expected behavior and the system is functioning as designed.
  • The UI behavior is based on how the retrieveKmipServersStatus API operates:

    • This API retrieves the status of all KMS servers in the specified cluster.

    • If even one KMS server encounters a problem (e.g., network issue, service down), the API:

      • Waits until a timeout occurs before proceeding.

      • Holds up the UI status for all KMS servers during this time.

  • Once the API completes (after timeout or success):

    • The vCenter UI updates the "Connection Status" accordingly.

    • If all KMS servers are healthy, the status is updated almost instantly to "Healthy".

  • This behavior does not indicate a bug or misconfiguration. It reflects vCenter’s conservative approach of validating the health of all KMS servers before updating the UI.

  • To avoid long UI delays:

    • Ensure all KMS nodes are available and responsive.

    • Remove or update stale or unreachable KMS entries in the cluster if they are no longer in use.
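
One way to spot unresponsive nodes before they cause UI delays is a basic TCP reachability check run from a host on the same network path as vCenter. The sketch below is a minimal example, not a VMware tool: the function names are hypothetical, and port 5696 is the IANA-registered KMIP port, which your KMS may not use. A successful TCP connection only shows that the port is open; it does not validate TLS certificates or KMIP service health.

```python
import socket

KMIP_DEFAULT_PORT = 5696  # IANA-registered KMIP port; adjust if your KMS listens elsewhere


def is_reachable(host: str, port: int = KMIP_DEFAULT_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


def unreachable_servers(servers: list[tuple[str, int]], timeout: float = 5.0) -> list[tuple[str, int]]:
    """Return the (host, port) pairs that did not accept a TCP connection."""
    return [(host, port) for host, port in servers if not is_reachable(host, port, timeout)]
```

For example, `unreachable_servers([("kms-a.example.com", 5696), ("kms-b.example.com", 5696)])` (placeholder hostnames) returns the entries that could not be reached, which are candidates for repair or removal from the cluster configuration.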