vSphere with Tanzu ContentLibrary sync failure leads to missing TKrs on Supervisor Cluster

Article ID: 319407

Products

VMware vSphere Kubernetes Service

Issue/Introduction

Symptoms:

  • In vCenter Server GUI:
    • "Operation failed!" messages appear in the top-right corner, showing "Task name: Sync Library" with "Status: RuntimeFault"

    • The Recent Tasks pane will show repeated tasks for "Fetch Content of a Library Item" for the TKrs in the vSphere with Tanzu Content Library

    • If Guest Clusters are attempting node rollouts, the Recent Tasks pane will show failures in "Deploy OVF Template" tasks

 

  • From vCenter Server SSH:

    • Content Library logging in /var/log/vmware/content-library/cls.log will show errors like:
2023-11-10T23:11:01.241Z | ERROR    | null             | type-adapter-4            | ManifestCertServiceImpl        | The manifest checksum from the provided certificate file ubuntu-ova.cert does not match the content of the manifest file ubuntu-ova.mf, Error: [CERTIFICATE_INVALID_SIGNATURE]

2023-11-10T23:11:01.241Z | ERROR    | null             | type-adapter-4            | ManifestCertServiceImpl        | The manifest checksum from the provided certificate file photon-ova.cert does not match the content of the manifest file photon-ova.mf, Error: [CERTIFICATE_INVALID_SIGNATURE]
 
  • From Supervisor SSH:

    • VM Operator (VMOP) logs will show errors like:
E1110 23:11:02.533733       1 controller.go:317] controller/pvcsi-controller "msg"="Reconciler error" "error"="failed to get controlPlane TKR for TKC '<NAMESPACE_NAME>/<GUEST_CLUSTER_NAME>': tanzukubernetesreleases.run.tanzu.vmware.com \"v1.23.8---vmware.3-tkg.1.ubuntu\" not found" "name"="<GUEST_CLUSTER_NAME>" "namespace"="<NAMESPACE_NAME>"
    • The "kubectl get tkr" and "kubectl get virtualmachineimage -A" commands may return no resources.
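
The symptom checks above can be sketched as a few small shell helpers. This is only an illustrative sketch, not a supported tool: the function names are hypothetical, the log helpers assume the cls.log line format shown above, and the kubectl helper assumes a session where kubectl already has access to the Supervisor Cluster.

```shell
#!/bin/sh
# Illustrative sketch only. Function names are hypothetical and are not
# part of any VMware tooling.

# Count log lines that report the manifest/certificate mismatch.
# $1: path to a Content Library log, e.g. on vCenter Server:
#     /var/log/vmware/content-library/cls.log
count_cert_signature_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the helper usable under "set -e".
    grep -c 'CERTIFICATE_INVALID_SIGNATURE' "$1" || true
}

# List the certificate files named in those errors, one per line, de-duplicated.
# Assumes the message wording shown in the log excerpt above.
list_affected_cert_files() {
    grep 'CERTIFICATE_INVALID_SIGNATURE' "$1" \
        | sed -n 's/.*provided certificate file \([^ ]*\) does not match.*/\1/p' \
        | sort -u
}

# Count the resources returned by "kubectl get"; prints 0 when none exist.
# $@: arguments for "kubectl get", e.g. "tkr" or "virtualmachineimage -A"
count_k8s_resources() {
    kubectl get "$@" --no-headers 2>/dev/null | grep -c . || true
}
```

For example, on vCenter Server, count_cert_signature_errors /var/log/vmware/content-library/cls.log returns the number of matching errors, and from a Supervisor session, count_k8s_resources tkr printing 0 would be consistent with the missing-TKr symptom.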

Environment

VMware vSphere 7.0 with Tanzu
VMware vSphere 8.0 with Tanzu

Resolution


Broadcom engineering teams are working to identify a resolution for this issue, which will be provided in a future release.

Workaround:

At present, the only workaround for this issue is to restart the Content Library service on vCenter Server. This can be done from the vCenter Server Appliance Management Interface (VAMI) on port 5480.

Alternatively, run the following command from the vCenter Server command line:

# service-control --restart content-library
 

Note: The Content Library may need to complete a sync operation before TKrs and VirtualMachineImages are repopulated in the Supervisor Cluster.
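
Because repopulation depends on a sync completing, it can help to poll rather than check once. The following is a minimal sketch under stated assumptions: the function name wait_for_tkrs and its default attempt count and interval are hypothetical choices, and it assumes a session where kubectl already has access to the Supervisor Cluster.

```shell
#!/bin/sh
# Illustrative sketch only. wait_for_tkrs is a hypothetical helper, and the
# default attempt count and interval are arbitrary assumptions.

# Poll until "kubectl get tkr" returns at least one resource, or give up.
# $1: maximum number of attempts (default 30)
# $2: seconds to wait between attempts (default 60)
wait_for_tkrs() {
    attempts=${1:-30}
    interval=${2:-60}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Count non-empty output lines; 0 means no TKrs are present yet.
        count=$(kubectl get tkr --no-headers 2>/dev/null | grep -c . || true)
        if [ "$count" -gt 0 ]; then
            echo "TKrs present: $count"
            return 0
        fi
        # Sleep only if another attempt remains.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    echo "No TKrs found after $attempts attempts" >&2
    return 1
}
```

Calling wait_for_tkrs 10 30 would check up to 10 times, 30 seconds apart, and return non-zero if no TKrs appear in that window.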



Additional Information

Impact/Risks:

These failures prevent the Content Library sync on vCenter from completing and may prevent TKrs from being populated in the vSphere with Tanzu Supervisor Cluster. Because TKr population in the Supervisor Cluster is a prerequisite for VirtualMachineImage reconciliation by the VMOP controller, the VMOP controller logs may show errors indicating a failure to get the TKr for a TKC. As a result, upgrade and rollout operations on Guest Cluster nodes fail, and Guest Clusters may show a Ready:False state.