capv user password expiry in Tanzu Kubernetes Grid

Article ID: 319310


Products

VMware Tanzu Kubernetes Grid, VMware Tanzu Kubernetes Grid 1.x, VMware Tanzu Kubernetes Grid Plus, VMware Tanzu Kubernetes Grid Plus 1.x

Issue/Introduction

The capv user password is set to expire after 60 days on Ubuntu OS and after 90 days on Photon OS as part of STIG hardening.

While this is implemented as part of security hardening, it impacts SSH login to the nodes once the password has expired.

Errors include:

# ssh capv@<node-ip>
Your account has expired; please contact your system administrator

# ssh capv@<node-ip>
You are required to change your password immediately (password expired)
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for capv.
Current password:
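
To check the current password ageing settings for the capv user before the password expires, you can inspect them directly on a node (this assumes key-based SSH access as capv still works; the node IP is a placeholder):

# ssh capv@<node-ip> "sudo chage -l capv"

The "Password expires" field in the output shows when the current password will expire.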

Environment

Tanzu Kubernetes Grid up to 2.5.0

Resolution

It is recommended to upgrade to TKG 2.5.1 or later, as the capv user password is set to never expire starting with that release.

If the upgrade is not yet possible, follow the instructions below.

 

Changing the capv user password expiry on Existing Clusters:

The method to follow depends on whether your cluster is a legacy one or a classy (ClusterClass-based) one.
To check which type your cluster is, run the following from the management cluster context:
# kubectl get cluster <cluster-name> -n <namespace> -o yaml

Check whether .spec.topology.class is set for the cluster.

If it is set, the cluster is of the classy type.
If it is not set, the cluster is of the legacy type.
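
A quick way to check just that field (a convenience, not part of the original procedure) is a JSONPath query; if the output is empty, the cluster is a legacy one, otherwise it prints the cluster class name:
# kubectl get cluster <cluster-name> -n <namespace> -o jsonpath='{.spec.topology.class}{"\n"}'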

 

Method 1 (Recommended for the legacy type, valid for both expired and non-expired passwords):

This requires editing the KCP (KubeadmControlPlane) and KubeadmConfigTemplate objects for each cluster from the management cluster context.

Note:

  • This method can be used both before and after the password expiry.
  • Please remember to reserve the IP addresses of these nodes after recreation, as described in the TKG documentation.

This method rolls out all the nodes, but it is persistent: if a node is recreated, the change persists.

  • Set the context to the management cluster context.
  • Get the list of KCP objects across all namespaces:
kubectl get kcp -A
  • For each workload cluster there is a corresponding KCP object named CLUSTER_NAME-control-plane. Edit this object using the command below (if the cluster is in a different namespace, make sure to add the namespace using the -n flag):
kubectl edit kcp CLUSTER_NAME-control-plane
  • Add the line below under .spec.kubeadmConfigSpec.preKubeadmCommands to disable the password expiry (-I -1 disables the inactivity lockout, -m 0 removes the minimum password age, -M 99999 sets the maximum password age to 99999 days, and -E -1 removes the account expiration date):
- chage -I -1 -m 0 -M 99999 -E -1 capv
  • After the edit, the section should look similar to the snippet below:
    preKubeadmCommands:
    - chage -I -1 -m 0 -M 99999 -E -1 capv
    - hostname "{{ ds.meta_data.hostname }}"
    - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
    - echo "127.0.0.1   localhost" >>/etc/hosts
    - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
    - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
    - '! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh'
    - '! which update-ca-certificates 2>/dev/null || (mv /etc/ssl/certs/tkg-custom-ca.pem
      /usr/local/share/ca-certificates/tkg-custom-ca.crt && update-ca-certificates)'
    useExperimentalRetryJoin: true
  • You can also query this field using kubectl with a JSONPath expression:
 kubectl get kcp CLUSTER_NAME-control-plane -o jsonpath='{.spec.kubeadmConfigSpec.preKubeadmCommands}{"\n"}'


Once the KCP is edited, the control plane nodes are rolled out one at a time. After each VM is recreated, you can validate the change by logging into the node, as shown in the example below.
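
For example, assuming key-based SSH access as the capv user (the node IP is a placeholder), the following check should now report the account expiry as never and the maximum password age as 99999 days:
ssh capv@<node-ip> "sudo chage -l capv"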

Similarly, for the worker nodes, edit the KubeadmConfigTemplate object.

  • Set the context to the management cluster context.
  • Get the list of KubeadmConfigTemplate objects across all namespaces:
kubectl get kubeadmconfigtemplate -A
  • For each workload cluster there is a corresponding KubeadmConfigTemplate object named CLUSTER_NAME-md-0. Edit this object using the command below:
kubectl edit kubeadmconfigtemplate CLUSTER_NAME-md-0
  • Add the line below under .spec.template.spec.preKubeadmCommands to disable the password expiry:
- chage -I -1 -m 0 -M 99999 -E -1 capv
  • After the edit, the section should look similar to the snippet below:
      preKubeadmCommands:
      - chage -I -1 -m 0 -M 99999 -E -1 capv
      - hostname "{{ ds.meta_data.hostname }}"
      - echo "::1         ipv6-localhost ipv6-loopback" >/etc/hosts
      - echo "127.0.0.1   localhost" >>/etc/hosts
      - echo "127.0.0.1   {{ ds.meta_data.hostname }}" >>/etc/hosts
      - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
      - '! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh'
      - '! which update-ca-certificates 2>/dev/null || (mv /etc/ssl/certs/tkg-custom-ca.pem
        /usr/local/share/ca-certificates/tkg-custom-ca.crt && update-ca-certificates)'
  • You can also query this field using kubectl with a JSONPath expression:
 kubectl get kubeadmconfigtemplate CLUSTER_NAME-md-0 -o jsonpath='{.spec.template.spec.preKubeadmCommands}{"\n"}'


Unlike the KCP, which rolls out the control plane nodes automatically after a successful edit, the worker nodes of workload clusters require a patch to the MachineDeployment object to trigger a rollout.

For TKG 2.2 and older (ClusterAPI v1.3 and older):
kubectl patch machinedeployment CLUSTER_NAME-md-0 --type merge -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"

For TKG 2.3 and newer (ClusterAPI v1.4 and newer):
kubectl patch machinedeployment CLUSTER_NAME-md-0 --type merge -p "{\"spec\":{\"rolloutAfter\":\"$(date +'%Y-%m-%dT%TZ')\"}}"


Once the VMs are recreated, you can validate the changes by logging into the nodes, as shown earlier for the control plane.
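
As a rough way to follow the rollout (not part of the original procedure), you can watch the Cluster API Machine objects from the management cluster context and wait until the old machines are replaced and all machines report Running:
kubectl get machines -n <namespace> -w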

For New Cluster Creation:

If you are creating new clusters in TKG 1.6 and want to include this password expiry setting, add the overlay files below under ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/

For the control plane:

Create a file under ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/, for example named capv-user-expiry-control-plane.yaml:

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #! setting the password expiry of capv user to one year
    #@overlay/append
    - chage -I -1 -m 0 -M 99999 -E -1 capv


For the worker nodes:

Create a file under ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/, for example named capv-user-expiry-worker.yaml:

#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"}),expects="1+"
---
spec:
  template:
    spec:
      #@overlay/match missing_ok=True
      preKubeadmCommands:
      #! setting the password expiry of capv user to one year
      #@overlay/append
      - chage -I -1 -m 0 -M 99999 -E -1 capv
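
To confirm that the overlays are picked up before deploying anything, you can render the cluster manifest without creating the cluster (assuming your Tanzu CLI version supports the --dry-run flag for cluster creation) and check that the chage line appears under preKubeadmCommands:

tanzu cluster create <cluster-name> -f <cluster-config>.yaml --dry-run > cluster-manifest.yaml
grep chage cluster-manifest.yaml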

 

Method 2 (Recommended for the classy type, valid for both expired and non-expired passwords):

You can deploy a DaemonSet that runs on each node with root privileges and removes the expiration for the capv user.
The DaemonSet scales together with your cluster, so if you roll out new nodes, new pods are deployed on them and remove the capv expiration there as well.

  1. Create a YAML file called pass_expiry.yaml using the following command; copy everything from the cat <<EOF>> line to the EOF line at the bottom:

    # cat <<EOF>> pass_expiry.yaml
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: capv-credential-admin
    spec:
      selector:
        matchLabels:
          tkg: capv-credential-admin
      template:
        metadata:
          labels:
            tkg: capv-credential-admin
        spec:
          volumes:
            - name: hostfs
              hostPath:
                path: /
          initContainers:
            - name: init
              image: <PATH_TO_REGISTRY>/ubuntu:23.04
              command:
                - /bin/sh
                - -xc
                - |
                  chroot /host chage -l capv \
                  && chroot /host chage -I -1 -m 0 -M 99999 -E -1 capv \
                  && echo expiry updated \
                  && chroot /host chage -l capv \
                  && echo done
              volumeMounts:
                - name: hostfs
                  mountPath: /host
          containers:
            - name: sleep
              image: projects.registry.vmware.com/tkg/pause:3.9
          tolerations:
          - effect: NoSchedule
            key: node-role.kubernetes.io/master
            operator: Exists
          - key: CriticalAddonsOnly
            operator: Exists
          - effect: NoExecute
            key: node.alpha.kubernetes.io/notReady
            operator: Exists
          - effect: NoExecute
            key: node.alpha.kubernetes.io/unreachable
            operator: Exists
          - effect: NoSchedule
            key: kubeadmNode
            operator: Equal
            value: master
          - effect: NoSchedule
            key: node-role.kubernetes.io/control-plane
            operator: Exists
    EOF


    Note:

    If the ubuntu:23.04 image is not available, any other Ubuntu version is also valid.
    If the pause:3.9 image is not available on your nodes, you can check which version is available with the command below:
    # ssh capv@<node-ip> "sudo crictl images | grep projects.registry.vmware.com/tkg/pause"

  2. From the context of the cluster where you want to remove the capv expiration, apply the newly created pass_expiry.yaml:
    # kubectl apply -f pass_expiry.yaml

  3. Make sure that the DaemonSet pods are up and running:
    # kubectl get po | grep capv-credential-admin

    If the pods are in ImagePullBackOff or ErrImagePull status, make sure that both ubuntu and pause images are accessible from the nodes.
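
    As an additional check (not part of the original procedure), you can review the logs of the init container on any of the DaemonSet pods; with the manifest above it prints the capv ageing settings before and after the change, along with "expiry updated" and "done":
    # kubectl logs <pod-name> -c init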


Method 3 (both legacy and classy types, valid only for non-expired passwords, ephemeral method):

You can use a simple for loop to set the password expiry, which is quicker and simpler; however, note that this is ephemeral, i.e. if a VM is recreated, the change is lost. This method is only applicable to clusters where the password has not yet expired.

  • Set the context to the cluster.
  • Save the list of node IP addresses to a file using the command below:
kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}' > nodes
  • Set the password to never expire; make sure to change the location of the private key according to your environment:
for i in `cat nodes`; do ssh -i /home/ubuntu/.ssh/id_rsa -o "UserKnownHostsFile=/dev/null" -o "StrictHostKeyChecking=no" -q capv@$i sudo chage -I -1 -m 0 -M 99999 -E -1 capv; done;
  • You can check whether the password expiry is set correctly using the command below:
for i in `cat nodes`; do ssh -i /home/ubuntu/.ssh/id_rsa -o "UserKnownHostsFile=/dev/null" -o "StrictHostKeyChecking=no" -q capv@$i sudo chage -l capv; done;