The capv user password is set to expire in 60 days on Ubuntu OS and in 90 days on Photon OS as part of STIG hardening.
While this is intentional security hardening, it breaks SSH login to the nodes once the password has expired.
Errors include:
# ssh capv@<node-ip>
Your account has expired; please contact your system administrator
# ssh capv@<node-ip>
You are required to change your password immediately (password expired)
WARNING: Your password has expired.
You must change your password now and login again!
Changing password for capv.
Current password:
It is recommended to upgrade to TKG 2.5.1 or later, where the capv user password is set to never expire.
If an upgrade is not yet possible, follow the instructions below.
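To check which Tanzu CLI version you are currently running (a rough indicator of your TKG release; the exact mapping to the TKG version depends on your installation):
# tanzu version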
Changing the capv user password expiry on Existing Clusters:
The method to follow will depend on whether your cluster is a legacy or classy (ClusterClass-based) one.
To check which type your cluster is, run the following from the management cluster context:
# kubectl get cluster <cluster-name> -n <namespace> -o yaml
Check whether .spec.topology.class is set for the cluster.
If it is set, the cluster is classy.
If it is not set, the cluster is legacy.
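As a shortcut, that field can also be queried directly; an empty result means the cluster is legacy (this one-liner is only a convenience, the YAML check above is equivalent):
# kubectl get cluster <cluster-name> -n <namespace> -o jsonpath='{.spec.topology.class}{"\n"}'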
Method 1 (Recommended for legacy type, valid for already expired passwords and non-expired passwords):
This requires editing the KCP (KubeadmControlPlane) and KubeadmConfigTemplate objects for each cluster from the management cluster context.
Note:
This rolls out all the nodes, but the change is persistent: if a node is recreated, the change remains in place.
List the KCP objects and edit the one for your cluster:
kubectl get kcp -A
kubectl edit kcp CLUSTER_NAME-control-plane
Add the following line under .spec.kubeadmConfigSpec.preKubeadmCommands:
- chage -I -1 -m 0 -M 99999 -E -1 capv
After the edit, the section should look similar to:
    preKubeadmCommands:
    - chage -I -1 -m 0 -M 99999 -E -1 capv
    - hostname "{{ ds.meta_data.hostname }}"
    - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
    - echo "127.0.0.1 localhost" >>/etc/hosts
    - echo "127.0.0.1 {{ ds.meta_data.hostname }}" >>/etc/hosts
    - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
    - '! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh'
    - '! which update-ca-certificates 2>/dev/null || (mv /etc/ssl/certs/tkg-custom-ca.pem
      /usr/local/share/ca-certificates/tkg-custom-ca.crt && update-ca-certificates)'
    useExperimentalRetryJoin: true
Verify that the command was added:
kubectl get kcp CLUSTER_NAME-control-plane -o jsonpath='{.spec.kubeadmConfigSpec.preKubeadmCommands}{"\n"}'
Once the KCP is edited, the control plane nodes are rolled out one at a time. Once each VM is recreated, you can validate the change by logging into the node.
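For example (assuming the SSH key used for the capv user is available and <node-ip> is a recreated control plane node):
# ssh capv@<node-ip> "sudo chage -l capv"
The maximum password age should now report 99999 days and the account expiry should report never.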
Similarly, for the worker nodes, edit the KubeadmConfigTemplate object:
kubectl get kubeadmconfigtemplate -A
kubectl edit kubeadmconfigtemplate CLUSTER_NAME-md-0
Add the following line under .spec.template.spec.preKubeadmCommands:
- chage -I -1 -m 0 -M 99999 -E -1 capv
After the edit, the section should look similar to:
      preKubeadmCommands:
      - chage -I -1 -m 0 -M 99999 -E -1 capv
      - hostname "{{ ds.meta_data.hostname }}"
      - echo "::1 ipv6-localhost ipv6-loopback" >/etc/hosts
      - echo "127.0.0.1 localhost" >>/etc/hosts
      - echo "127.0.0.1 {{ ds.meta_data.hostname }}" >>/etc/hosts
      - echo "{{ ds.meta_data.hostname }}" >/etc/hostname
      - '! which rehash_ca_certificates.sh 2>/dev/null || rehash_ca_certificates.sh'
      - '! which update-ca-certificates 2>/dev/null || (mv /etc/ssl/certs/tkg-custom-ca.pem
        /usr/local/share/ca-certificates/tkg-custom-ca.crt && update-ca-certificates)'
Verify that the command was added:
kubectl get kubeadmconfigtemplate CLUSTER_NAME-md-0 -o jsonpath='{.spec.template.spec.preKubeadmCommands}{"\n"}'
Unlike the KCP, which rolls out the control plane nodes automatically after a successful edit, for the worker nodes you need to run a patch command on the MachineDeployment object to trigger a rollout.
For TKG 2.2 and older (ClusterAPI v1.3 and older):
kubectl patch machinedeployment CLUSTER_NAME-md-0 --type merge -p "{\"spec\":{\"template\":{\"metadata\":{\"annotations\":{\"date\":\"`date +'%s'`\"}}}}}"
For TKG 2.3 and newer (ClusterAPI v1.4 and newer):
kubectl patch machinedeployment CLUSTER_NAME-md-0 --type merge -p "{\"spec\":{\"rolloutAfter\":\"$(date +'%Y-%m-%dT%TZ')\"}}"
Once the VM is recreated you can validate the changes by logging into the nodes.
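To follow the rollout, you can watch the Machine objects being replaced (the namespace depends on your environment):
kubectl get machines -n <namespace> -w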
For New Cluster Creation:
If you are creating new clusters in TKG 1.6 and want to include this password expiry setting, add the overlay files below under ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/
For the control plane:
Create a file under ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/, for example named capv-user-expiry-control-plane.yaml, with the following content:
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"KubeadmControlPlane"})
---
spec:
  kubeadmConfigSpec:
    preKubeadmCommands:
    #! set the capv user password to never expire
    #@overlay/append
    - chage -I -1 -m 0 -M 99999 -E -1 capv
For the worker nodes:
Create a file under ~/.config/tanzu/tkg/providers/infrastructure-vsphere/ytt/, for example named capv-user-expiry-worker.yaml, with the following content:
#@ load("@ytt:overlay", "overlay")
#@ load("@ytt:data", "data")

#@overlay/match by=overlay.subset({"kind":"KubeadmConfigTemplate"}), expects="1+"
---
spec:
  template:
    spec:
      #@overlay/match missing_ok=True
      preKubeadmCommands:
      #! set the capv user password to never expire
      #@overlay/append
      - chage -I -1 -m 0 -M 99999 -E -1 capv
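To confirm the overlays render as expected before creating the cluster, you can generate the manifest without deploying it and check that the chage line appears in the KubeadmControlPlane and KubeadmConfigTemplate objects (cluster name and config file below are placeholders, and --dry-run behaviour may vary by CLI version):
tanzu cluster create <cluster-name> --file <cluster-config>.yaml --dry-run > manifest.yaml
grep -n "chage" manifest.yaml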
Method 2 (Recommended for classy type, valid for already expired passwords and non-expired passwords):
You can deploy a DaemonSet that executes on each node with root privileges and removes the expiration for the capv user.
The DaemonSet scales together with your cluster, so if new nodes are rolled out, new pods are scheduled on them and remove the capv expiration there as well.
Create the pass_expiry.yaml using the following command; copy from cat <<EOF>> until the EOF line at the bottom:
# cat <<EOF>> pass_expiry.yaml
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: capv-credential-admin
spec:
  selector:
    matchLabels:
      tkg: capv-credential-admin
  template:
    metadata:
      labels:
        tkg: capv-credential-admin
    spec:
      volumes:
      - name: hostfs
        hostPath:
          path: /
      initContainers:
      - name: init
        image: <PATH_TO_REGISTRY>/ubuntu:23.04
        command:
        - /bin/sh
        - -xc
        - |
          chroot /host chage -l capv \
          && chroot /host chage -I -1 -m 0 -M 99999 -E -1 capv \
          && echo expiry updated \
          && chroot /host chage -l capv \
          && echo done
        volumeMounts:
        - name: hostfs
          mountPath: /host
      containers:
      - name: sleep
        image: projects.registry.vmware.com/tkg/pause:3.9
      tolerations:
      - effect: NoSchedule
        key: node-role.kubernetes.io/master
        operator: Exists
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoExecute
        key: node.alpha.kubernetes.io/notReady
        operator: Exists
      - effect: NoExecute
        key: node.alpha.kubernetes.io/unreachable
        operator: Exists
      - effect: NoSchedule
        key: kubeadmNode
        operator: Equal
        value: master
      - effect: NoSchedule
        key: node-role.kubernetes.io/control-plane
        operator: Exists
EOF
Note:
If the ubuntu:23.04 image is not available, any other Ubuntu version is also valid.
If the pause:3.9 image is not available on your nodes, you can check which version is available with the below command:
# ssh capv@<node-ip> "sudo crictl images | grep projects.registry.vmware.com/tkg/pause"
Apply the pass_expiry.yaml:
# kubectl apply -f pass_expiry.yaml
# kubectl get po | grep capv-credential-admin
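To confirm the expiry was actually updated on each node, you can check the init container logs of the DaemonSet pods (the label and container name come from the manifest above; kubectl may limit label-selector output to a few pods at a time):
# kubectl logs -l tkg=capv-credential-admin -c init
The output should show the chage -l listing before and after the change, followed by "expiry updated" and "done".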
Method 3 (both legacy and classy types, valid only for non-expired passwords, ephemeral method):
You can use a simple for loop to set the password expiry, which is quicker and simpler. However, note that this is ephemeral: if a VM is recreated, the change is lost. This method is only applicable to clusters where the password has not yet expired.
kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="ExternalIP")].address}' > nodes
for i in `cat nodes`; do ssh -i /home/ubuntu/.ssh/id_rsa -o "UserKnownHostsFile=/dev/null" -o "StrictHostKeyChecking=no" -q capv@$i sudo chage -I -1 -m 0 -M 99999 -E -1 capv; done;
for i in `cat nodes`; do ssh -i /home/ubuntu/.ssh/id_rsa -o "UserKnownHostsFile=/dev/null" -o "StrictHostKeyChecking=no" -q capv@$i sudo chage -l capv; done;
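The second loop prints the current policy for each node. After the change, the output for each node should look similar to the following (the last password change date will differ):
Last password change                                    : <date>
Password expires                                        : never
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 99999
Number of days of warning before password expires       : 7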