When attempting to upgrade a vSphere Supervisor service, the service's configuration status reaches an error state with a message indicating failure to pull an image from the container registry listed in that service version's Package manifest.
This can occur when upgrading the VKS service, the TKG service, or any other available Supervisor Service.
The error message will resemble one of the following, where values in angle brackets <> vary by environment and by the Supervisor Service that failed:
Configured Core Supervisor Service
service: <supervisor service>. Reason: ReconcileFailed. Message: kapp: error waiting on reconcile packageinstall/<pkgi which will vary based on the supervisor service> (packaging.carvel.dev/v#alpha#)
namespace: svc-tkg-domain-c<id>: Finished unsuccessfully (Reconcile Failed: (message vendir: Error: Syncing directory '0': Syncing Directory '.' with imgpkgBundle contents: Fetching image: Get https://localhost:5000/v2/<container.registry.hostname>/manifest/sha256:<package information>: MANIFEST_UNKNOWN: manifest unknown; map[]
Reason: ReconcileFailed. Message: vendir: Error: Syncing directory '0': Syncing directory '.' with imgpkgBundle contents: Fetching image: GET https://<container.registry.hostname>/v#/vsphere/supervisor/packages/YYYY.MM.DD/vks-standard-packages/manifests/fake: MANIFEST_UNKNOWN: The named manifest is not known to the registry.; map[manifest:vsphere/supervisor/packages/YYYY.MM.DD/vks-standard-packages]
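When triaging these messages, it can help to extract the registry host and manifest path from the error text and compare them with the registry listed in the service version's Package manifest. The following is a minimal Python sketch, not an official tool. The function name failed_image_request, the sample hostname registry.example.com, and the sample date in the path are hypothetical, and the pattern only targets the "Fetching image: GET <url>:" shape shown in the messages above.

```python
import re
from urllib.parse import urlsplit

# Matches the URL that follows "Fetching image: GET" (verb case varies
# between messages), stopping at the ": " that precedes the registry error.
_FETCH_URL = re.compile(r"Fetching image:\s+get\s+(https?://\S+?):\s", re.IGNORECASE)


def failed_image_request(message):
    """Return (registry_host, request_path) from a fetch error, or None."""
    match = _FETCH_URL.search(message)
    if match is None:
        return None
    url = urlsplit(match.group(1))
    return url.netloc, url.path


# Hypothetical sample modeled on the second message above.
sample = (
    "Reason: ReconcileFailed. Message: vendir: Error: Syncing directory '0': "
    "Syncing directory '.' with imgpkgBundle contents: Fetching image: "
    "GET https://registry.example.com/v2/vsphere/supervisor/packages/"
    "2025.01.01/vks-standard-packages/manifests/fake: MANIFEST_UNKNOWN: "
    "The named manifest is not known to the registry.; map[]"
)

if __name__ == "__main__":
    print(failed_image_request(sample))
```

A host of localhost:5000, as in the first message, versus the registry's own hostname, as in the second, is visible directly in the first element of the returned tuple.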
Core Supervisor Services are: VKS service, TKG service or Velero Operator. The VKS service or TKG service is responsible for workload cluster management.
vCenter 8.0u3 and higher
vSphere Kubernetes Service (VKS) v3.0.0 and higher
Failure messages concerning image resolution and image fetch operations can appear in the service's configuration status for a number of reasons. Some possible reasons:
Note: By default, the Supervisor inherits the vCenter Server Appliance proxy settings and will attempt to use them unless configured otherwise.
See Configure the Supervisor to Use a Proxy for details.
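Because inherited proxy settings decide whether a registry request is sent through the proxy, it can be useful to reason about whether a given registry hostname falls under a proxy exemption list. The sketch below is illustrative only. The function name bypasses_proxy and the hostnames are hypothetical, and the exact-match or domain-suffix rule is an assumption about how exemption lists commonly behave, not a statement of how the Supervisor evaluates its own proxy configuration.

```python
def bypasses_proxy(host, no_proxy):
    """Return True if host matches an exemption entry exactly or as a
    domain suffix (assumed matching rule, see the note above)."""
    host = host.lower().rstrip(".")
    for entry in no_proxy:
        # Normalize entries such as "*.corp.example" or ".corp.example".
        entry = entry.strip().lower().lstrip("*").lstrip(".")
        if not entry:
            continue
        if host == entry or host.endswith("." + entry):
            return True
    return False


if __name__ == "__main__":
    exemptions = ["localhost", ".corp.example"]  # hypothetical list
    for name in ("registry.corp.example", "registry.example.com"):
        print(name, bypasses_proxy(name, exemptions))
```

If the registry hostname is not exempt, the image request would be expected to go through the proxy, which is one place to look when pulls fail.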
In some cases, the revert process will do more harm than good to the Supervisor service and add significant complexity to the overall repair process.
The Core Supervisor Services under the namespace vmware-system-supervisor-services are critical for the environment to function and manage workload clusters.
Deleting the TKG or VKS Supervisor service PackageInstall (pkgi) will lead to the deletion of all workload clusters in the environment.
Supervisor Services should be managed from Workload Management in the vSphere web UI, which does not allow Core Supervisor Services to be deleted.
vSphere Supervisor Services GitHub
Airgapped vSphere Supervisor Guide on GitHub
--------------------
lastAttemptedVersion: 1.9.3+vmware.0
observedGeneration: 3
usefulErrorMessage: |-
  Stopped installing matched version '1.7.4+vmware.0' since last attempted version '1.9.3+vmware.0' is higher.
  hint: Add annotation packaging.carvel.dev/downgradable: "" to PackageInstall to proceed with downgrade
version: 1.7.4+vmware.0
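The status excerpt above shows a requested version (1.7.4+vmware.0) that is lower than the last attempted version (1.9.3+vmware.0), which is why installation stopped. As a rough illustration, the comparison can be sketched in Python as follows. The function names parse_version and downgrade_blocked are hypothetical, and comparing only the numeric part before the "+" is a simplifying assumption. The controller's own version comparison may treat the build suffix differently.

```python
def parse_version(text):
    """Turn '1.9.3+vmware.0' into (1, 9, 3), ignoring the '+...' suffix."""
    core = text.split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def downgrade_blocked(status):
    """Return True when the requested version is lower than the last
    attempted one, as in the status excerpt above."""
    attempted = status.get("lastAttemptedVersion")
    requested = status.get("version")
    if not attempted or not requested:
        return False
    return parse_version(requested) < parse_version(attempted)


if __name__ == "__main__":
    # Field values copied from the status excerpt above.
    status = {
        "lastAttemptedVersion": "1.9.3+vmware.0",
        "version": "1.7.4+vmware.0",
    }
    print(downgrade_blocked(status))
```

Tuples of integers are compared element by element, so 1.10.0 correctly sorts above 1.9.3, which a plain string comparison would get wrong.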
This procedure is intended only to revert a failed Supervisor service upgrade. It should not be used as a general-purpose mechanism to downgrade services. Service downgrade is intentionally not permitted by vCenter and for some services it may render instances of their managed resources orphaned. This risk doesn't apply if the current (failed) version was never successfully upgraded and rolled out.