Telco Cloud Automation (TCA) versions 2.3 and earlier stored the kubeconfig of the management cluster both on the TCA-CP filesystem and in the database. This kubeconfig is used for Lifecycle Management (LCM) of the workload clusters, as well as for network function instantiation on top of the workload clusters. If the kubeconfig stored in TCA-CP expires, you may see the following symptoms:
NF instantiation or workload cluster LCM operations fail with the error "Pre-interface script execution failed." Additionally, an "HTTP status 401 Unauthorized" error is reported. In the TCA-CP app-engine logs, the following errors can be seen:
[ClusterAutomationService_SvcThread-xxx, Ent: HybridityAdmin, Usr: xyz@abc , TxId: ####-####-####-####-###############] WARN CaaS.Flow- error quering flow status
io.kubernetes.client.openapi.ApiException:
at io.kubernetes.client.openapi.ApiClient.handleResponse(ApiClient.java:973)
at io.kubernetes.client.openapi.ApiClient.execute(ApiClient.java:885)
Based on the error above, the app-engine is encountering an issue while trying to access the management cluster using the k8s-bootstrapperd service on TCA-CP.
The Bootstrapper logs, located at /common/logs/k8s-bootstrapper/bootstrapperd.log, contain the following:
Mar 21 13:09:23 apiserverd[14919] : [Warning-controller] : Failed to reach to management cluster [#######-####-####-####-##########], err: Unauthorized
Mar 21 13:09:23 apiserverd[14919] : [Warning-controller] : Failed to reach to management cluster [#######-####-####-####-##########], err: Unauthorized
Mar 21 13:09:23 apiserverd[14919] : [Warning-controller] : Failed to reach to management cluster [#######-####-####-####-##########], err: Unauthorized
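To confirm the failure pattern, the bootstrapper log can be inspected directly; a simple check, assuming shell access to the TCA-CP appliance:
# tail -n 100 /common/logs/k8s-bootstrapper/bootstrapperd.log | grep -i "Failed to reach to management cluster"
Repeated Unauthorized entries against the same management cluster ID indicate that the credentials used by the bootstrapper are no longer accepted.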
The management cluster kubeconfig is usually valid for one year and can be renewed either automatically or manually by the user. Once renewed, the kubeconfig must be updated both on the file system and in the database. In TCA 2.3 and earlier releases, the automated poller updates the renewed kubeconfig only in the database, not on the file system. When the kubeconfig on the file system is out of sync with the endpoint, users encounter the symptoms described above.
The issue occurs because the kubeconfig used to access the management cluster (located at /opt/vmware/k8s-bootstrapper/<mgmt-cluster-id>/kubeconfig) has expired.
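To confirm the expiry on the file system, the client certificate embedded in that kubeconfig can be decoded, and the kubeconfig can be tested directly against the cluster; a minimal check, assuming the certificate is stored inline as client-certificate-data (the usual layout for TKG-generated kubeconfigs) and that kubectl is available on the appliance:
# grep 'client-certificate-data' /opt/vmware/k8s-bootstrapper/<mgmt-cluster-id>/kubeconfig | awk '{print $2}' | base64 -d | openssl x509 -noout -enddate
# kubectl --kubeconfig /opt/vmware/k8s-bootstrapper/<mgmt-cluster-id>/kubeconfig get nodes
If the notAfter date printed by openssl is in the past, or the kubectl call returns an Unauthorized error, the file-system copy of the kubeconfig has expired.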
The issue is resolved in 3.x and later versions.
The following workaround can be applied to TCA 2.3 or earlier versions. First, identify the management cluster ID by running the following command on the TCA-CP appliance:
# kbsctl show managementclusters
Count: 1
----------------------------------------
ID: 0######a-b7a2-####-####-a###########22
Name: tca1-mgmt-cluster1234
Status: unknown
TKG ID: #######-1##d-####-####-3###########a
The management cluster ID in this case is 0######a-b7a2-####-####-a###########22.
The kubeconfig is located at /opt/vmware/k8s-bootstrapper/0######a-b7a2-####-####-a###########22/kubeconfig. Replace the expired kubeconfig at this location with the renewed kubeconfig so that the file-system copy is valid again.
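A minimal sketch of the replacement, assuming the renewed kubeconfig has already been obtained and saved as /tmp/renewed-kubeconfig (the source path and the backup file name are illustrative):
# cp /opt/vmware/k8s-bootstrapper/0######a-b7a2-####-####-a###########22/kubeconfig /opt/vmware/k8s-bootstrapper/0######a-b7a2-####-####-a###########22/kubeconfig.bak
# cp /tmp/renewed-kubeconfig /opt/vmware/k8s-bootstrapper/0######a-b7a2-####-####-a###########22/kubeconfig
# kubectl --kubeconfig /opt/vmware/k8s-bootstrapper/0######a-b7a2-####-####-a###########22/kubeconfig get nodes
The first command backs up the expired file, the second overwrites it with the renewed kubeconfig, and the last confirms the management cluster is reachable again.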
Run the command again to confirm that the management cluster status has recovered:
# kbsctl show managementclusters
Count: 1
----------------------------------------
ID: 0######a-b7a2-####-####-a###########22
Name: tca1-mgmt-cluster1234
Status: Running
TKG ID: #######-1##d-####-####-3###########a
The management cluster status should return to Running.
Restart the app-engine on TCA-CP:
# systemctl restart app-engine
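To confirm the service restarted cleanly, its state can be checked with standard systemd tooling; a quick check, assuming the unit name app-engine used above:
# systemctl is-active app-engine
# systemctl status app-engine --no-pager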
After this, NF instantiation should work properly, and any LCM operations on the associated workload cluster should succeed without errors.