During the upgrade of a Supervisor Cluster from vSphere with Tanzu Kubernetes Grid (TKG) version 1.27 to 1.28, the upgrade fails with the error message "System error occurred on master node with identifier."
Additionally, the pod description for pinniped-concierge-kube-cert-agent shows the following error:
VMware vSphere with Tanzu
The issue occurs when a ControlPlane VM does not have the necessary issuer information configured in the OpenSSL configuration. As a result, the keyid cannot be retrieved from the CA certificate, and the upgrade fails.
Delete the Affected ControlPlane VM: Follow the steps outlined in the Broadcom Knowledge Base article "Troubleshooting vSphere with Tanzu TKGS".
Deploy a New ControlPlane VM: After the affected VM is deleted, a new ControlPlane VM is deployed automatically. Wait until an IP address is assigned to it.
SSH into the New ControlPlane VM: Use SSH to log into the newly deployed ControlPlane VM.
Update the OpenSSL Configuration: Edit the /etc/vmware/wcp/openssl.conf file on the newly deployed ControlPlane VM and append the following line under the relevant section:
Repeat for All Newly Deployed ControlPlane VMs: If the cluster has multiple ControlPlane VMs, repeat the SSH and OpenSSL configuration steps on each newly deployed VM to ensure the issuer is correctly configured.
This configuration change allows OpenSSL to fall back to using the issuer information if the keyid cannot be retrieved from the CA certificate, resolving the issue and allowing the upgrade to complete successfully.
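The exact line to append is not reproduced in this article, so the following is only a hedged sketch for checking a configuration after the edit. It assumes, without confirmation from the article, that the fallback described above corresponds to OpenSSL's standard authorityKeyIdentifier extension option, where a value such as keyid,issuer lets OpenSSL use the issuer when the keyid cannot be copied. The function names and sample text are hypothetical; verify the directive and section against the linked KB article before relying on it.

```python
import re

# ASSUMPTION: the article does not show the exact directive. This sketch
# assumes the fallback is OpenSSL's "authorityKeyIdentifier" option.
DIRECTIVE = "authorityKeyIdentifier"


def aki_entries(conf_text):
    """Return a list of (section, [options]) for each DIRECTIVE line."""
    section = None
    entries = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        header = re.fullmatch(r"\[\s*(\S+)\s*\]", line)
        if header:
            section = header.group(1)  # e.g. "[ v3_ca ]" gives "v3_ca"
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == DIRECTIVE:
            entries.append((section, [o.strip() for o in value.split(",")]))
    return entries


def has_issuer_fallback(conf_text):
    """True if every DIRECTIVE line present lists the 'issuer' option."""
    entries = aki_entries(conf_text)
    return bool(entries) and all(
        any(opt.split(":")[0] == "issuer" for opt in opts)
        for _, opts in entries
    )


# Hypothetical sample contents, not taken from a real ControlPlane VM.
SAMPLE_BEFORE = "[ v3_ca ]\nauthorityKeyIdentifier = keyid\n"
SAMPLE_AFTER = "[ v3_ca ]\nauthorityKeyIdentifier = keyid,issuer\n"
```

On a ControlPlane VM, the same check could be applied to the text read from /etc/vmware/wcp/openssl.conf, for example by passing the file contents to has_issuer_fallback.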
This issue has been identified and is fixed in the upcoming release of vCenter Server 8.0 U3. The release timeline has not yet been published.