message : Component TkgUpgrade failed: Failed to run command: ['kubectl', 'apply', '-f', '/usr/lib/vmware-wcp/objects/PodVM-GuestCluster/11-tkgsconfig', '--record'] ret=1 out= err=Flag --record has been deprecated, --record will be removed in the future\nError from server: error when retrieving current configuration of:\nResource: \"run.tanzu.vmware.com/v1alpha2, Resource=tkgserviceconfigurations\", GroupVersionKind: \"run.tanzu.vmware.com/v1alpha2, Kind=TkgServiceConfiguration\"\nName: \"tkg-service-configuration\", Namespace: \"\"\nfrom server for: \"/usr/lib/vmware-wcp/objects/PodVM-GuestCluster/11-tkgsconfig/tkgserviceconfiguration.yaml\": conversion webhook for run.tanzu.vmware.com/v1alpha1, Kind=TkgServiceConfiguration failed: Post \"https://vmware-system-tkg-webhook-service.vmware-system-tkg.svc:443/convert?timeout=30s\": dial tcp 10.1.1.1:443: connect: connection refused\n
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
vmware-system-tkg vmware-system-tkg-webhook-service ClusterIP 10.1.1.1 <none> 443/TCP 87d
2022-11-12T16:42:05.466Z ERROR compupgrade: {"error": "Exception", "message": "Failed to run command: ['kubectl', 'apply', '-f', '/usr/lib/vmware-wcp/objects/PodVM-GuestCluster/11-tkgsconfig', '--record'] ret=1 out= err=Flag --record has been deprecated, --record will be removed in the future\nError from server: error when retrieving current configuration of:\nResource: \"run.tanzu.vmware.com/v1alpha2, Resource=tkgserviceconfigurations\", GroupVersionKind: \"run.tanzu.vmware.com/v1alpha2, Kind=TkgServiceConfiguration\"\nName: \"tkg-service-configuration\", Namespace: \"\"\nfrom server for: \"/usr/lib/vmware-wcp/objects/PodVM-GuestCluster/11-tkgsconfig/tkgserviceconfiguration.yaml\": conversion webhook for run.tanzu.vmware.com/v1alpha1, Kind=TkgServiceConfiguration failed: Post \"https://vmware-system-tkg-webhook-service.vmware-system-tkg.svc:443/convert?timeout=30s\": dial tcp 10.1.1.1:443: connect: connection refused\n", "backtrace": [" File \"/usr/lib/vmware-wcp/upgrade/compupgrade.py\", line 252, in do\n comp.doUpgrade(upCtx)\n", " File \"/usr/lib/vmware-wcp/objects/PodVM-GuestCluster/10-tkg/gc_component_upgrade.py\", line 82, in doUpgrade\n applyAppConfig(join(TKG_CONFIG, '11-tkgsconfig'))\n", " File \"/usr/lib/vmware-wcp/upgrade/comphelper.py\", line 236, in applyAppConfig\n run(cmd)\n", " File \"/usr/lib/vmware-wcp/upgrade/comphelper.py\", line 71, in run\n raise Exception(exMsg)\n"]}
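To help confirm that a log entry matches this exact issue, the entry can be checked programmatically. The following is a minimal sketch, not part of the VMware tooling. It assumes the entry is available as a single unwrapped line and that everything from the first `{` onward is a JSON object with `error`, `message` and `backtrace` keys, as in the entry above. The function name `parse_compupgrade_error` is hypothetical.

```python
import json
import re


def parse_compupgrade_error(line):
    """Return (error type, refused (host, port) or None, backtrace) for one log line."""
    # Assumption: the text from the first '{' to the end of the line is valid JSON.
    payload = json.loads(line[line.index("{"):])
    message = payload.get("message", "")
    # Look for the Go dial error seen in this issue, e.g.
    # "dial tcp 10.1.1.1:443: connect: connection refused".
    match = re.search(r"dial tcp (\S+?):(\d+): connect: connection refused", message)
    endpoint = (match.group(1), int(match.group(2))) if match else None
    return payload.get("error"), endpoint, payload.get("backtrace", [])


# Shortened, illustrative sample line (not a complete log entry).
sample = (
    '2022-11-12T16:42:05.466Z ERROR compupgrade: '
    '{"error": "Exception", "message": "conversion webhook failed: '
    'dial tcp 10.1.1.1:443: connect: connection refused\\n", "backtrace": []}'
)
print(parse_compupgrade_error(sample))
```

If the returned endpoint matches the ClusterIP of vmware-system-tkg-webhook-service, the upgrade is failing because the conversion webhook is not reachable, which is the condition this article addresses.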
/usr/lib/vmware-wcp/upgrade/upgrade-ctl.py get-status | jq '.progress | to_entries | .[] | "\(.value.status) - \(.key)"' | sort
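Where jq is not convenient, the same status summary can be approximated in Python. This is a rough sketch under stated assumptions: it assumes the JSON shape implied by the jq filter above (a top-level `progress` object keyed by component name, each value carrying a `status` field). The function name `pending_components`, the `DONE_STATES` set, and the `"failed"` status value in the sample are illustrative and not taken from the tool itself.

```python
import json

# Assumption: "upgraded" and "skipped" both mean the component needs no more work.
DONE_STATES = {"upgraded", "skipped"}


def pending_components(status_json):
    """Return sorted (component, status) pairs that are not in a finished state."""
    progress = json.loads(status_json).get("progress", {})
    return sorted(
        (name, info.get("status"))
        for name, info in progress.items()
        if info.get("status") not in DONE_STATES
    )


# Illustrative input only; the "failed" status string is a hypothetical value.
sample_status = json.dumps({
    "progress": {
        "CapvUpgrade": {"status": "upgraded"},
        "HarborUpgrade": {"status": "skipped"},
        "TkgUpgrade": {"status": "failed"},
    }
})
print(pending_components(sample_status))
```

To use it against a live system, the output of `upgrade-ctl.py get-status` could be passed to `pending_components` as a string, for example by reading it from standard input. An empty list would mean that every component is either upgraded or skipped.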
VMware vCenter Server 7.0.x
Version | Release Date | vCenter build | Supported K8 versions | Supervisor Cluster Version |
---|---|---|---|---|
vCenter Server 7.0 Update 3h (7.0.3.01000) | 2022-09-13 | 20395099 | 1.22 1.21 1.20 | 0.0.19 |
vCenter Server 7.0 Update 3g (7.0.3.00800) | 2022-07-23 | 20150588 | 1.22 1.21 1.20 | 0.0.17 |
vCenter Server 7.0 Update 3f (7.0.3.00700) | 2022-07-12 | 20051473 | 1.22 1.21 1.20 | 0.0.17 |
vCenter Server 7.0 Update 3e (7.0.3.00600) | 2022-05-12 | 19717403 | 1.22 1.21 1.20 | 0.0.15 |
This issue can be worked around. First, validate that this is the EXACT issue you are running into. If you are unsure, contact VMware Support. See the Troubleshooting Supervisor Control Plane VMs KB for exact instructions on SSHing into the Supervisor Control Plane VMs.
1. Identify which Supervisor Control Plane VM is running the component upgrade. This is the one that has /var/log/vmware/upgrade-ctl-compupgrade.log on it. You can check for the file with:
ls -l /var/log/vmware
2. From a shell on the Supervisor Control Plane VM identified in step 1, take a backup of the upgrade script:
cp /usr/lib/vmware-wcp/objects/PodVM-GuestCluster/10-tkg/gc_component_upgrade.py /root/gc_component_upgrade.py.backup
3. Open the following file in vi and enable line numbers with :set nu (make sure that you are in command mode and not insert mode):
/usr/lib/vmware-wcp/objects/PodVM-GuestCluster/10-tkg/gc_component_upgrade.py
4. Modify line 14
FROM
TKG_DEPLOYMENT_NAMES = ('vmware-system-tkg-controller-manager',)
TO
TKG_DEPLOYMENT_NAMES = ('vmware-system-tkg-controller-manager', 'vmware-system-tkg-webhook',)
5. Remove the following block of code (it should be at lines 94-98):
if not self.tkgServiceConfigurationExists():
    logger.info('Applying default TKGServiceConfiguration')
    applyAppConfig(join(TKG_CONFIG, '11-tkgsconfig'))
else:
    logger.info('Skipping apply of the default TKGServiceConfiguration as it already exists')
6. Wait ~10 minutes for the next upgrade loop to run, then check the components again to confirm that the TkgUpgrade component is now complete.
root@422e3622efef84f459d1713d7025acef [ ~ ]# /usr/lib/vmware-wcp/upgrade/upgrade-ctl.py get-status | jq '.progress | to_entries | .[] | "\(.value.status) - \(.key)"' | sort
"skipped - AKOUpgrade"
"skipped - HarborUpgrade"
"skipped - LoadBalancerApiUpgrade"
"skipped - TelegrafUpgrade"
"upgraded - AppPlatformOperatorUpgrade"
"upgraded - CapvUpgrade"
"upgraded - CapwUpgrade"
"upgraded - CertManagerUpgrade"
"upgraded - CsiControllerUpgrade"
"upgraded - ImageControllerUpgrade"
"upgraded - KappControllerUpgrade"
"upgraded - LicenseOperatorControllerUpgrade"
"upgraded - NamespaceOperatorControllerUpgrade"
"upgraded - NetOperatorUpgrade"
"upgraded - NSXNCPUpgrade"
"upgraded - PinnipedUpgrade"
"upgraded - PspOperatorUpgrade"
"upgraded - RegistryAgentUpgrade"
"upgraded - SchedextComponentUpgrade"
"upgraded - SphereletComponentUpgrade"
"upgraded - TkgUpgrade" <<<<<<<<<<<<<<< succeeded
"upgraded - TMCUpgrade"
"upgraded - UCSUpgrade"
"upgraded - UtkgClusterMigration"
"upgraded - UtkgControllersUpgrade"
"upgraded - VmOperatorUpgrade"
"upgraded - VMwareSystemLoggingUpgrade"
"upgraded - WCPClusterCapabilities"