VCF Identity Broker (VIDB) deployment through Fleet Management gets stuck at stage 4
The VIDB virtual machine details in vCenter show that a MAC address is assigned to the node.
/var/log/vrlcm/vmsp_bootstrap_######.log in Fleet Management shows errors similar to:
2026/02/05 08:39:31 retrying after error: deployment failed
release: vmsp-platform/vmsp-global-config failed: StateError:Could not determine release state: unable to determine cluster state: [Certificate/vmsp-platform/seaweedfs-client-cert dry-run failed (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.vmsp-platform.svc:443/validate?timeout=30s": dial tcp ###.###.###.###:443: connect: connection refused, Certificate/vmsp-platform/seaweedfs-filer-cert dry-run failed (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.vmsp-platform.svc:443/validate?timeout=30s": dial tcp ###.###.###.###:443: connect: connection refused, Certificate/vmsp-platform/seaweedfs-master-cert dry-run failed (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.vmsp-platform.svc:443/validate?timeout=30s": dial tcp ###.###.###.###:443: connect: connection refused, Certificate/vmsp-platform/seaweedfs-volume-cert dry-run failed (InternalError): Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.vmsp-platform.svc:443/validate?timeout=30s": dial tcp ###.###.###.###:443: connect: connection refused]
2026/02/05 08:55:48 retrying after error: deployment timed out: context deadline exceeded
2026/02/05 08:55:48 /v1, Kind=Secret/vmsp-platform/vmsp.release.vmsp-platform-bootstrap.v2: created
Error: failed to apply the PackageDeployment: context deadline exceeded
2026/02/05 08:55:48 error: failed to apply the PackageDeployment: context deadline exceeded
In some instances, error code LCMVSPHERECONFIG1000095 is displayed in Fleet Management.
The /var/log/vrlcm/vmsp_bootstrap_######.log file in Fleet Management shows:
error: timed out waiting for the condition on clusters/<cluster-name>
ERR:INFRA0002 - Waiting for VCF services platform cluster nodes to become ready
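To check quickly whether a bootstrap log contains the signatures quoted above, a small helper can count them. The following is a minimal sketch, not a supported tool: the function name scan_bootstrap_log is hypothetical, and the pattern list is taken only from the log excerpts in this article.

```shell
# Hypothetical helper: count how many lines of a bootstrap log contain each
# error signature quoted in this article. Patterns are fixed strings (grep -F).
scan_bootstrap_log() {
  local log_file="$1"
  local pattern count

  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 1
  fi

  for pattern in \
    'webhook.cert-manager.io' \
    'connect: connection refused' \
    'context deadline exceeded' \
    'ERR:INFRA0002' \
    'timed out waiting for the condition'
  do
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps the loop going.
    count=$(grep -c -F -- "$pattern" "$log_file" || true)
    printf '%-40s %s\n' "$pattern" "$count"
  done
}
```

Pass the path of the actual bootstrap log as the only argument, for example scan_bootstrap_log followed by the log file name shown in Fleet Management. A non-zero count only indicates that the signature is present; it does not identify which of the causes below applies.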
CoreDNS pods on the VIDB nodes remain in the Pending state because of the node.cloudprovider.kubernetes.io/uninitialized taint and show the following warning:
Warning  FailedScheduling  55s (x4 over 16m)  default-scheduler  0/1 nodes are available: 1 node(s) had untolerated taint {node.cloudprovider.kubernetes.io/uninitialized: true}. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
This can be verified with the following commands:
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl get pods -A
kubectl describe pod coredns-<id> -n kube-system
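The taint can also be checked at the node level instead of per pod. The sketch below assumes the node list is produced with kubectl's custom-columns output (a name column followed by a taint-key column); the function name nodes_with_uninitialized_taint is hypothetical and only filters text passed on standard input.

```shell
# Hypothetical helper: read "NAME TAINTS" rows on stdin (header on line 1)
# and print the names of nodes that still carry the uninitialized taint.
nodes_with_uninitialized_taint() {
  awk -v taint='node.cloudprovider.kubernetes.io/uninitialized' \
    'NR > 1 && index($0, taint) > 0 { print $1 }'
}

# Intended use on a node with access to the cluster (not executed here):
#   export KUBECONFIG=/etc/kubernetes/admin.conf
#   kubectl get nodes \
#     -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key' \
#     | nodes_with_uninitialized_taint
```

Any node name printed by the helper is still waiting for initialization, which matches the scheduler warning shown above.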
VCF Operations 9.0.1.x
Fleet Management 9.0.1.x
VIDB 9.0.1.x
Network Policy: A network policy is blocking traffic on the VIDB IP address range, so the nodes cannot reach the DNS/NTP servers.
Network/Firewall: Inter-node communication (Ports 53, 443, 22) is blocked, preventing Kubernetes initialization.
DNS: The specified DNS server is unreachable from the VIDB management network.
Credentials: The VCF Operations admin account is used instead of the required vmware-system-user from the Locker.
Traffic: Allow traffic on the network for the IP addresses assigned to VIDB.
Ports: Open all required ports (53, 443, and 22) between the VIDB nodes and Fleet Management.
DNS: Verify bidirectional connectivity to DNS servers.
Locker: Create a unique password entry for vmware-system-user and select it during the VIDB deployment wizard.
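The port item above can be spot-checked from a shell on one of the endpoints. This is a minimal sketch under stated assumptions: the function names probe and check_ports are hypothetical, bash and the coreutils timeout command are assumed to be available, and only TCP is tested, so a DNS service that answers on UDP port 53 only will be reported as blocked or closed.

```shell
# Hypothetical helper: try to open one TCP connection. bash is invoked
# explicitly because /dev/tcp is a bash feature; timeout bounds the wait.
probe() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Check the three ports named in this article against one peer address.
# Returns non-zero if at least one port could not be reached.
check_ports() {
  local host="$1"
  local port failed=0

  for port in 53 443 22; do
    if probe "$host" "$port"; then
      echo "$host:$port open"
    else
      echo "$host:$port blocked or closed"
      failed=1
    fi
  done
  return "$failed"
}
```

Run check_ports with the address of the peer to test, for example a VIDB node address from the Fleet Management appliance, and repeat it in the other direction. A port reported as blocked or closed can mean either a firewall rule or that no service is listening yet, so the result should be read together with the firewall configuration.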