A workload domain fails to populate certificate and password information within the Fleet Management UI

Article ID: 440595

Products

VCF Operations

Issue/Introduction

VCF has multiple workload domains, but one of the workload domains fails to populate certificate and password information within the Fleet Management UI.

Running the following curl commands from SDDC Manager shows how long the API takes to retrieve the certificates. See also knowledge base article 425493.

Command to get the bearer token:
TOKEN=$(curl -d '{"username" : "<SSO-USERNAME>", "password" : "<PASSWORD>"}' -H "Content-Type: application/json" -X POST https://<SDDC-MANAGER-FQDN>/v1/tokens -k | jq -r '.accessToken')

Command to time the certificate retrieval:
time curl -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" https://<SDDC-MANAGER-FQDN>/v1/domains/{id}/resource-certificates -k | jq

Note: Replace {id} with the ID of the workload domain (WLD).

Note: The response time varies with the state of the DNS servers. Run this command several times, at different times of day, to get an accurate certificate collection time.
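
The repeated measurement described in the note can be sketched as a small helper. This is an illustrative sketch, not part of the documented procedure: the function name time_cert_fetch and its arguments are hypothetical, and it assumes TOKEN has already been set with the bearer token command above. It uses curl's --write-out variable time_total to print only the elapsed time of each request.

```shell
#!/usr/bin/env bash
# time_cert_fetch SDDC_MANAGER_FQDN DOMAIN_ID [RUNS] [PAUSE_SECONDS]
# Hypothetical helper: repeats the certificate request and prints the time
# each run took. Assumes TOKEN already holds a valid bearer token.
time_cert_fetch() {
  local fqdn="$1" domain_id="$2" runs="${3:-5}" pause="${4:-60}" i secs
  for (( i = 1; i <= runs; i++ )); do
    # -s/-o discard the response body; -w prints the total request time.
    secs=$(curl -k -s -o /dev/null -w '%{time_total}' \
      -H "Authorization: Bearer $TOKEN" \
      -H "Content-Type: application/json" \
      "https://${fqdn}/v1/domains/${domain_id}/resource-certificates")
    echo "run ${i}: ${secs}s"
    # Pause between runs, but not after the last one.
    if (( i < runs )); then sleep "$pause"; fi
  done
}

# Example: five runs, one minute apart
# time_cert_fetch <SDDC-MANAGER-FQDN> <id> 5 60
```

Times that approach or exceed 60 seconds point to the timeout described in the Cause section.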

 

The operationsmanager.log on the SDDC Manager may also show delays in IP resolution like this:

YYYY-MM-DDT:HH:MM:24 INFO  [vcf_om,6a0226949cf93506a1a4824b20fa7317,3d10] [c.v.e.s.common.util.NetworkService,http-nio-127.0.0.1-7300-exec-18] Resolved FQDN <esxhosta>.<domain> to an IP <##.##.##.##>.

<--- delay here for exactly 5.005 seconds --->

YYYY-MM-DDT:HH:MM:29 INFO  [vcf_om,6a0226949cf93506a1a4824b20fa7317,3d10] [c.v.e.s.common.util.NetworkService,http-nio-127.0.0.1-7300-exec-18] Resolved FQDN <esxhostb>.<domain> to an IP <##.##.##.##>.

YYYY-MM-DDT:HH:MM:29 INFO  [vcf_om,6a0226949cf93506a1a4824b20fa7317,3d10] [c.v.e.s.common.util.NetworkService,http-nio-127.0.0.1-7300-exec-18] Resolved FQDN <esxhostc>.<domain> to an IP <##.##.##.##>.

<--- delay here for exactly 5.005 seconds --->

YYYY-MM-DDT:HH:MM:34 INFO  [vcf_om,6a0226949cf93506a1a4824b20fa7317,3d10] [c.v.e.s.common.util.NetworkService,http-nio-127.0.0.1-7300-exec-18] Resolved FQDN <esxhostd>.<domain> to an IP <##.##.##.##>.
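
Gaps like the ones marked above can be located with a short filter. The following is a sketch under stated assumptions: the function name find_dns_gaps is hypothetical, the log file path is supplied by the reader, and the first field of each line is assumed to be a timestamp with the time of day after a "T" (for example 2025-01-01T10:00:24.000Z). It compares consecutive "Resolved FQDN" lines only and does not handle a gap that spans midnight.

```shell
#!/usr/bin/env bash
# find_dns_gaps LOGFILE [MIN_GAP_SECONDS]
# Hypothetical helper: prints every "Resolved FQDN" line that follows the
# previous one by at least MIN_GAP_SECONDS (default 4).
find_dns_gaps() {
  local logfile="$1" min_gap="${2:-4}"
  awk -v min="$min_gap" '
    /Resolved FQDN/ {
      split($1, dt, "T")          # dt[2] holds the time of day (assumed format)
      split(dt[2], t, ":")        # t[1]=hours, t[2]=minutes, t[3]=seconds
      now = t[1] * 3600 + t[2] * 60 + (t[3] + 0)
      if (seen && now - prev >= min)
        printf "%.3f s gap before: %s\n", now - prev, $0
      prev = now
      seen = 1
    }' "$logfile"
}

# Example:
# find_dns_gaps /path/to/operationsmanager.log 4
```

Many gaps close to 5 seconds suggest that lookups are waiting for a resolver timeout rather than receiving slow answers.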

 

Environment

VCF Operations 9.0.x

Cause

The issue is caused by unresponsive or slow DNS servers within the environment. During an inventory fetch, the system performs a DNS resolution for each host.
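
To see the per-host resolution time directly, the lookups can be timed from the SDDC Manager shell. This is a minimal sketch, not a documented tool: the function name time_lookups and the example host names are hypothetical, and it assumes GNU date (for nanosecond timestamps) and getent are available. getent follows the system name service configuration, so a name listed in /etc/hosts returns immediately without contacting DNS.

```shell
#!/usr/bin/env bash
# time_lookups FQDN...
# Hypothetical helper: prints how long the system resolver takes for each
# name, in milliseconds, and whether the lookup succeeded.
time_lookups() {
  local host start end status
  for host in "$@"; do
    start=$(date +%s%N)            # GNU date: nanoseconds since the epoch
    if getent hosts "$host" > /dev/null; then status=ok; else status=failed; fi
    end=$(date +%s%N)
    printf '%s %d ms %s\n' "$host" $(( (end - start) / 1000000 )) "$status"
  done
}

# Example (hypothetical host names):
# time_lookups esx01.example.test esx02.example.test
```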

The issue directly correlates with the size of the workload domain. Certificates may populate for a smaller workload domain (fewer than 15 hosts) because the cumulative DNS delay remains under the 60-second UI limit. However, data may fail to display for larger workload domains (over 50 hosts), where the cumulative delay exceeds the timeout threshold.

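Because of the DNS latency, each affected lookup waits for the default OS resolver timeout. On Linux this is typically 5 seconds, which is consistent with the 5.005-second gaps in the log excerpt above.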
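
The relationship between domain size and the 60-second limit can be illustrated with rough arithmetic. The sketch below rests on an assumption that is not stated in this article: that about half of the lookups stall, as in the log excerpt where every other lookup is delayed. The function name estimate_fetch_delay is hypothetical, and the real stall rate depends on the DNS servers.

```shell
#!/usr/bin/env bash
# estimate_fetch_delay HOSTS STALLED_PERCENT TIMEOUT_SECONDS
# Hypothetical helper: prints the estimated cumulative DNS wait in whole
# seconds. STALLED_PERCENT is an assumption, not a measured value.
estimate_fetch_delay() {
  local hosts="$1" stalled_pct="$2" timeout="$3"
  echo $(( hosts * stalled_pct * timeout / 100 ))
}

ui_limit=60   # UI limit in seconds, as stated in this article
for hosts in 15 50; do
  delay=$(estimate_fetch_delay "$hosts" 50 5)   # 50% stalled, 5 s timeout
  if (( delay > ui_limit )); then verdict="exceeds"; else verdict="stays under"; fi
  echo "${hosts} hosts: ~${delay}s, ${verdict} the ${ui_limit}s limit"
done
```

Under the same assumption, a 1-second timeout would bring the 50-host estimate down to about 25 seconds, which is why reducing the resolver timeout works as a workaround.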

Resolution

  1. Troubleshoot configured DNS servers to resolve the delays in DNS resolution.
  2. Work around the problem by reducing the DNS timeout.
    1. Take an offline snapshot of the SDDC Manager.
    2. SSH to SDDC Manager as the vcf user and su to root.
    3. Back up the configuration file:
      cp -p /etc/resolv.conf /etc/resolv.conf.bak
    4. Edit /etc/resolv.conf and append the following line to force a faster timeout and failover:
      options timeout:1 attempts:1
    5. Restart the service to apply the changes:
      systemctl restart operationsmanager
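
Steps 3 and 4 of the workaround can be combined into a repeatable helper. This is a sketch under assumptions, not part of the documented procedure: the function name set_resolver_timeout is hypothetical, it keeps an existing .bak file instead of overwriting it, and it appends the options line only when that exact line is not already present. It assumes the file ends with a newline.

```shell
#!/usr/bin/env bash
# set_resolver_timeout [FILE]
# Hypothetical helper: backs up FILE (default /etc/resolv.conf) and appends
# the resolver options line from step 4 if it is missing.
set_resolver_timeout() {
  local conf="${1:-/etc/resolv.conf}"
  local line='options timeout:1 attempts:1'
  # Step 3: keep the first backup so a second run cannot overwrite it.
  if [ ! -e "${conf}.bak" ]; then
    cp -p "$conf" "${conf}.bak" || return 1
  fi
  # Step 4: append the options line only when it is not already there.
  if ! grep -qxF "$line" "$conf"; then
    printf '%s\n' "$line" >> "$conf"
  fi
}

# Example (run as root on the SDDC Manager, after taking the snapshot):
# set_resolver_timeout /etc/resolv.conf
# systemctl restart operationsmanager
```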

Additional Information

If the problem is not caused by the DNS server response, review the following articles:

VCF Operations UI does not show Password or Certificate information for the VCF Instance

Unable to update DNS servers from SDDC Manager

"Could not find the component certificate. Use Fleet Management --> Certificates to either add or replace the component certificate" error when adding nodes to VCF Operations

 

Use this article to determine the time it takes for the API to reply to a certificate GET request:

Certificate management for VCF Instances unavailable in Fleet management due to configured Microsoft CA not reachable from SDDC Manager