How to set up new key-pair to SSH into nodes in Tanzu Kubernetes Grid Plan-based Cluster (Legacy Cluster)

Article ID: 327478


Updated On:

Products

VMware Tanzu Kubernetes Grid

Issue/Introduction

The purpose of this article is to walk through creating a new SSH key pair for TKGm nodes and applying the change to your TKG clusters.


Symptoms:

Unable to SSH into a control plane or worker node because the customer doesn't have the private key.

ssh capv@$IP
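
When the matching private key is not available, the attempt typically fails with an error similar to the following (exact wording can vary with the SSH client version):

capv@<node-ip>: Permission denied (publickey).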

Environment

VMware Tanzu Kubernetes Grid Plus 1.x, 2.x

Cause

There are times when customers don't have a way to SSH into the nodes, either by password or by SSH key.

Resolution

If a customer does not have the id_rsa and id_rsa.pub files inside the ~/.ssh/ directory, and no other directory contains the private key, support is not able to recover the key for the customer, since key management is part of the customer's internal administration.

Workaround

We can offer creating a new key pair, but keep in mind that the template changes must be applied to every object related to each cluster. For example, in an environment with 30 clusters, we would need to edit approximately 60 objects: the KubeadmControlPlane (KCP) and the KubeadmConfigTemplate for each cluster.
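
As a rough way to scope the work, the following commands list the objects that would need the new public key across all namespaces (a quick inventory only; object names vary per environment):

# KubeadmControlPlane objects (control plane nodes)
kubectl get kcp -A

# KubeadmConfigTemplate objects (worker nodes)
kubectl get kubeadmconfigtemplate -A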


GENERATE A PRIVATE KEY

1. On the machine on which you will run the Tanzu CLI, run the following ssh-keygen command.

ssh-keygen -t rsa -b 4096 -C "[email protected]"

Note: The -C flag adds a comment to the key; you can set it to anything you want.

2. At the prompt Enter file in which to save the key (/root/.ssh/id_rsa): press Enter to accept the default location.

3. Enter and repeat a passphrase for the key pair.
Note: You may leave the passphrase blank.

4. Add the private key to the SSH agent running on your machine, and enter the passphrase you created in the previous step.

ssh-add ~/.ssh/id_rsa


5. Check the ~/.ssh/ directory and confirm that both keys are present.
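
For example, the following is one way to confirm the key pair and the agent, assuming the default file names from step 2:

# Both key files should exist
ls -l ~/.ssh/id_rsa ~/.ssh/id_rsa.pub

# The new key should be listed by the SSH agent
ssh-add -l

# This is the public key you will add to the cluster templates in the next sections
cat ~/.ssh/id_rsa.pub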

 

Notes:

  • If the cluster is a Tanzu Kubernetes Grid Class-based cluster (Classy Cluster), please use KB How to set up new key-pair to SSH into nodes in Tanzu Kubernetes Grid Class-based cluster (Classy Cluster).

     

  • To identify whether the cluster is a Class-based or Plan-based cluster, run the following command. If you get an output of the form tkg-INFRASTRUCTURE-default-VERSION, for example tkg-vsphere-default-v1.0.0, then it is a Class-based cluster.

    # kubectl get cluster <cluster-name>  -n <namespace> -o jsonpath="{.spec.topology.class}"  | more

    Ex: 

    kubectl get cluster tkg-mgmt  -n tkg-system -o jsonpath="{.spec.topology.class}"  | more

    tkg-vsphere-default-v1.2.0

    NOTE: If the cluster is Plan-based, the command above will return nothing.

  • See the Tanzu Kubernetes Grid documentation for more information on TKG Cluster Types.
  • If the SSH private key is not missing and there is no need to change the SSH key pair, but SSH to the cluster nodes still fails, this could be due to a corrupt ssh-key configuration; see the KB How to troubleshoot a corrupt ssh-key configuration.

 


ADD KEY TO CONTROL PLANE NODES

In this step we will edit the KCP (KubeadmControlPlane) YAML and add the public key we just generated.

1. Switch context to the management cluster

kubectl config use-context $mgmt_context

2. Get KCP for the cluster

kubectl get kcp -A

3. Edit the yaml:

kubectl edit kcp <kcp-name> -n <namespace>

4. Look for the capv user in the editor and add the public key under sshAuthorizedKeys.

Example:

  - name: capv
    sshAuthorizedKeys:
    - ssh-rsa AAAA... <contents of your new id_rsa.pub>

 

5. After saving the change, the cluster will start provisioning new control plane nodes.


6. Once the provisioning is done, you can SSH into the control plane nodes using the new private key.
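
One way to watch the rollout from the management cluster and then test access, assuming the default key path from the steps above (node IPs differ per environment):

# Watch the old control plane machines being replaced by new ones
kubectl get machines -A -w

# Once the new nodes are Running, test SSH with the new private key
ssh -i ~/.ssh/id_rsa capv@<control-plane-node-ip>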

 

ADD KEY TO WORKER NODES

Using the same process, we will now make changes to the KubeadmConfigTemplate associated with our MachineDeployment.

The YAML of the MachineDeployment references the KubeadmConfigTemplate, which contains the public key for the capv user.

1. Get the KubeadmConfigTemplate and edit it.

kubectl get KubeadmConfigTemplate -A 

kubectl edit KubeadmConfigTemplate <template-name> -n <namespace>

2. Add the public key under capv user > sshAuthorizedKeys
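
If it helps to locate the section first, this is one way to find it in the template YAML (assuming the template name and namespace returned in step 1):

kubectl get KubeadmConfigTemplate <template-name> -n <namespace> -o yaml | grep -B 2 -A 4 sshAuthorizedKeys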

Here is where the process changes: editing the KubeadmConfigTemplate won't trigger a rollout of the MachineDeployment, unlike the KCP, which rolls out new control plane nodes after a successful edit.

For workload clusters, we need to run a patch command on the MachineDeployment object to trigger a rollout:

        #For TKG 2.2 and older (ClusterAPI v1.3 and older)

        kubectl patch machinedeployment MACHINE-DEPLOYMENT --type merge -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"

        #For TKG 2.3 and newer (ClusterAPI v1.4 and newer)

        kubectl patch machinedeployment MACHINE-DEPLOYMENT --type merge -p "{\"spec\":{\"rolloutAfter\":\"$(date +'%Y-%m-%dT%TZ')\"}}"

The machine deployment should start rolling and you will get new worker nodes with the new public key.

3. Once the rollout is finished, you can SSH into the new worker nodes.
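
As with the control plane, you can check progress and then test access with the new key (the MachineDeployment name and node IPs will vary per environment):

# Check the MachineDeployment rollout status
kubectl get machinedeployment -A

# Test SSH into one of the new worker nodes
ssh -i ~/.ssh/id_rsa capv@<worker-node-ip>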
 

NOTE: Keep in mind that you would need to do this for every cluster's KCP and KubeadmConfigTemplate/MachineDeployment objects in order to push the new key to all nodes.

 

Additional Information

Impact/Risks:

Not being able to SSH into the nodes limits troubleshooting, as logs cannot be gathered either with crashd or manually.