VKS cluster management service unhealthy after installing VCF 9.0.1 with error "tls: failed to verify certificate: x509: certificate ... Get "http://mgmt-image-proxy.kube-system.svc.cluster.local/v2/": dial tcp ############: connect: connection refused .
search cancel

VKS cluster management service unhealthy after installing VCF 9.0.1 with error "tls: failed to verify certificate: x509: certificate ... Get "http://mgmt-image-proxy.kube-system.svc.cluster.local/v2/": dial tcp ############: connect: connection refused .

book

Article ID: 425499

calendar_today

Updated On:

Products

VMware vSphere Kubernetes Service

Issue/Introduction

After deploying VCF 9.0.1 the VKS Cluster Management service on the Supervisor is showing an error:

 

GUI error message:

Reason: ReconcileFailed. Message: vendir: Error: Syncing directory '0': Syncing directory '.' with imgpkgBundle contents: Fetching image: Error while preparing a transport to talk with the registry: Unable to create round tripper: Get "https://mgmt-image-proxy.kube-system.svc.cluster.local/v2/": tls: failed to verify certificate: x509: certificate is valid for docker-registry.kube-system.svc, docker-registry.kube-system.svc.cluster.local, localhost, 127.0.0.1, ############, not mgmt-image-proxy.kube-system.svc.cluster.local; Get "http://mgmt-image-proxy.kube-system.svc.cluster.local/v2/": dial tcp ############: connect: connection refused .

 

The package svc-auto-attach.vksm.broadcom.com is not reconciling

 

kubectl get pkgi -n vmware-system-supervisor-services
NAME                                PACKAGE NAME                            PACKAGE VERSION           DESCRIPTION                                                            AGE
kube-state-metrics                  kube-state-metrics.vsphere.vmware.com   2.14.0-24825200           Reconcile succeeded                                                    28d
svc-auto-attach.vksm.broadcom.com   auto-attach.vksm.broadcom.com           0.1.0                     Reconcile failed: Error   22d
svc-tkg.vsphere.vmware.com          tkg.vsphere.vmware.com                  3.4.1-embedded+v1.33      Reconcile succeeded                                                    28d
svc-velero.vsphere.vmware.com       velero.vsphere.vmware.com               1.8.2-embedded+24925800   Reconcile succeeded                                                    28d

kubectl describe pkgi svc-auto-attach.vksm.broadcom.com -n vmware-system-supervisor-services
Name:         svc-auto-attach.vksm.broadcom.com
Namespace:    vmware-system-supervisor-services
Labels:       appplatform.vmware.com/serviceId=auto-attach
              appplatform.vmware.com/serviceVersion=0.1.0
              managedBy=vSphere-AppPlatform
Annotations:  ext.packaging.carvel.dev/ytt-paths-from-secret-name: carvel-services-overlay
              packaging.carvel.dev/ignore-kubernetes-version-selection: true
API Version:  packaging.carvel.dev/v1alpha1
Kind:         PackageInstall
Metadata:
  Creation Timestamp:  2025-12-17T15:39:42Z
  Finalizers:
    finalizers.packageinstall.packaging.carvel.dev/delete
  Generation:        4
  Resource Version:  29082324
  UID:               ############
Spec:
  Package Ref:
    Ref Name:  auto-attach.vksm.broadcom.com
    Version Selection:
      Constraints:       0.1.0
  Service Account Name:  default-carvel-install-sa
  Values:
    Secret Ref:
      Name:  auto-attach.vksm.broadcom.com-0.1.0-config-secret-u9k
Status:
  Conditions:
    Message:               Error (see .status.usefulErrorMessage for details)
    Status:                True
    Type:                  ReconcileFailed
  Friendly Description:    Reconcile failed: Error (see .status.usefulErrorMessage for details)
  Last Attempted Version:  0.1.0
  Observed Generation:     4
  Useful Error Message:    vendir: Error: Syncing directory '0':
  Syncing directory '.' with imgpkgBundle contents:
    Fetching image:
      Error while preparing a transport to talk with the registry:
        Unable to create round tripper:
          Get "https://mgmt-image-proxy.kube-system.svc.cluster.local/v2/": tls: failed to verify certificate: x509: certificate is valid for docker-registry.kube-system.svc, docker-registry.kube-system.svc.cluster.local, localhost, 127.0.0.1, ############, not mgmt-image-proxy.kube-system.svc.cluster.local; Get "http://mgmt-image-proxy.kube-system.svc.cluster.local/v2/": dial tcp ############: connect: connection refused

 

The configmap of the image-registry in namespace  vmware-system-mgmt-proxy does not have any certificates

 

k get cm -n vmware-system-mgmt-proxy  image-registry -oyaml
apiVersion: v1
data:
  auth_location: /oauth/provider/token
  host: vcf-automation.tsdc.local
  trusted_certificates: ""
kind: ConfigMap
metadata:
  annotations:
    addons.vcf.vmware.com/vcfa-owner: ############
    addons.vcf.vmware.com/vcfa-owner-last-update: "2026-01-09T14:02:28Z"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"auth_location":null,"host":null,"trusted_certificates":null},"kind":"ConfigMap","metadata":{"annotations":{},"name":"image-registry","namespace":"vmware-system-mgmt-proxy"}}
  creationTimestamp: "2025-12-12T09:51:06Z"
  name: image-registry
  namespace: vmware-system-mgmt-proxy
  resourceVersion: "29138371"
  uid: ############

 

 

Environment

VCF 9.0.1

Cause

On the Supervisor there are certificates missing in the configmap of the image-registry in namespace vmware-system-mgmt-proxy

 

Resolution

Engineering are aware of this issue and it will be resolved in a future release

 

Workaround:

 

ssh to the VCFA appliance with the vmware-system-user and get the missing certificates

 

kubectl get secrets vmsp-tls -n istio-ingress -o json | jq -r '.data | to_entries[] | "\(.key): \(.value | @base64d)"'

tls.crt: -----BEGIN CERTIFICATE-----
############
-----END CERTIFICATE-----

-----BEGIN CERTIFICATE-----
############
-----END CERTIFICATE-----

tls.key: -----BEGIN PRIVATE KEY-----
############
-----END PRIVATE KEY-----

 

 

ssh to the Supervisor and edit the configmap adding the missing certificates and changing the UUID

 

kubectl edit cm -n vmware-system-mgmt-proxy  image-registry

 

Add the 2 certificates from the VCFA and replace the UUID in "addons.vcf.vmware.com/vcfa-owner" with a new random auto-generated UUID (e.g. via https://www.guidgenerator.com/)

 

root@4225f439a9eb644ac1ee00b985752f3e [ ~ ]# k get cm -n vmware-system-mgmt-proxy  image-registry -oyaml
apiVersion: v1
data:
  auth_location: /oauth/provider/token
  host: vcf-automation.tsdc.local
  trusted_certificates: |
    -----BEGIN CERTIFICATE-----
    ############
    -----END CERTIFICATE-----
    -----BEGIN CERTIFICATE-----
    ############
    -----END CERTIFICATE-----
kind: ConfigMap
metadata:
  annotations:
    addons.vcf.vmware.com/vcfa-owner: ############
    addons.vcf.vmware.com/vcfa-owner-last-update: "2026-01-09T14:02:28Z"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"auth_location":null,"host":null,"trusted_certificates":null},"kind":"ConfigMap","metadata":{"annotations":{},"name":"image-registry","namespace":"vmware-system-mgmt-proxy"}}
  creationTimestamp: "2025-12-12T09:51:06Z"
  name: image-registry
  namespace: vmware-system-mgmt-proxy
  resourceVersion: "35370991"
  uid: ############

 

Check if the package svc-auto-attach.vksm.broadcom.com is able to reconcile now (It might take some time)


root@4225f439a9eb644ac1ee00b985752f3e [ ~ ]# k get pkgi -n vmware-system-supervisor-services
NAME                                PACKAGE NAME                            PACKAGE VERSION           DESCRIPTION           AGE
kube-state-metrics                  kube-state-metrics.vsphere.vmware.com   2.14.0-24825200           Reconcile succeeded   34d
svc-auto-attach.vksm.broadcom.com   auto-attach.vksm.broadcom.com           0.1.0                     Reconcile succeeded   28d
svc-tkg.vsphere.vmware.com          tkg.vsphere.vmware.com                  3.4.1-embedded+v1.33      Reconcile succeeded   34d
svc-velero.vsphere.vmware.com       velero.vsphere.vmware.com               1.8.2-embedded+24925800   Reconcile succeeded   34d

Additional Information

If the service remains unhealthy after applying the TLS fix above, check the kubectl-plugin-vsphere logs for a 405 Proxy Error.

If a 405 error is present, the Container Registry configuration may be missing. Please refer to VKS Cluster Management Service fails with 405 Proxy Error for the resolution steps to recreate the registry.