The TKG cluster has been upgraded to Kubernetes 1.28.7 on Photon 5, which has cgroupv2 enabled. However, the workloads do not work well with cgroupv2 and need to run on a node with cgroupv1.
TKG v2.5 plan-based clusters
TKGm 2.5 supports Kubernetes 1.27 with Photon 3, which supports cgroupv1.
A node pool running 1.27 can be created in a cluster that is already running 1.28 nodes.
Create a node pool with the 1.27 image:
tanzu cluster node-pool set <Cluster Name> -f nodepool.yaml
where nodepool.yaml specifies the template for the 1.27 image:
name: tkg-1-27-np
replicas: 1
labels:
  key1: value1
vsphere:
  memoryMiB: 8192
  diskGiB: 64
  numCPUs: 4
  datacenter: <Datacenter>
  datastore: "/<Datacenter>/datastore/<Datastore>"
  folder: "/<Datacenter>/vm/<Folder>"
  resourcePool: "<Resource Pool>"
  vcIP: <vCenter IP>
  template: "<Template Folder>/photon-3-kube-v1.27.11+vmware.1"
  network: <Network>
Confirm that the new node pool was created:
tanzu cluster node-pool list <Cluster Name>
NAME         NAMESPACE  PHASE    REPLICAS  READY  UPDATED  UNAVAILABLE
md-0         default    Running  1         1      1        0
tkg-1-27-np  default    Running  1         1      1        0
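The node pool list shows replica counts but not the Kubernetes version of each node. As a minimal sketch, assuming kubectl access to the workload cluster, node names can be filtered by kubelet version. The helper name nodes_with_version is hypothetical and not part of the Tanzu CLI.

```shell
# Print the names of nodes whose kubelet version starts with the given prefix.
# Reads "NAME VERSION" lines on stdin.
# nodes_with_version is a hypothetical helper, not part of any TKG tooling.
nodes_with_version() {
  awk -v prefix="$1" 'index($2, prefix) == 1 { print $1 }'
}

# Usage against a live cluster (requires kubectl access):
# kubectl get nodes --no-headers \
#   -o custom-columns=NAME:.metadata.name,VERSION:.status.nodeInfo.kubeletVersion \
#   | nodes_with_version v1.27
```

Passing v1.28 instead lists the nodes that will later be cordoned and drained.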
Confirm that cgroupv1 is in use on the newly created node. The stat command prints "tmpfs" for cgroupv1 and "cgroup2fs" for cgroupv2.
ssh capv@<Worker node IP>
stat -fc %T /sys/fs/cgroup/
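The mapping from filesystem type to cgroup version can be sketched as a small shell function. This is an illustration only. The function name cgroup_version_from_fstype is hypothetical, and any value other than the two named above is reported as unknown.

```shell
# Map the filesystem type reported for /sys/fs/cgroup to a cgroup version.
# cgroup_version_from_fstype is a hypothetical helper name.
cgroup_version_from_fstype() {
  case "$1" in
    tmpfs)     echo "v1" ;;       # cgroupv1 hierarchy mounted on tmpfs
    cgroup2fs) echo "v2" ;;       # unified cgroupv2 hierarchy
    *)         echo "unknown" ;;  # anything else is not classified here
  esac
}
```

On the worker node it could be called as: cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/)"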
The cluster now has workers on both 1.27 and 1.28, and all workloads can be moved onto the 1.27 node.
kubectl cordon <1.28 Worker node>
kubectl drain <1.28 Worker node> --ignore-daemonsets --delete-emptydir-data
Confirm that all workloads are running well on the 1.27 node, then delete the 1.28 node pool:
kubectl get pods -A
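On a busy cluster the full pod list can be long. As a rough sketch, the output can be filtered down to pods whose STATUS is neither Running nor Completed. The helper name unhealthy_pods is hypothetical, and it assumes the default column order of the pod listing with --no-headers (NAMESPACE NAME READY STATUS ...). It does not catch pods that are Running but not Ready.

```shell
# Print "namespace/name status" for pods that are neither Running nor Completed.
# Reads pod listing lines on stdin (columns: NAMESPACE NAME READY STATUS ...).
# unhealthy_pods is a hypothetical helper, not part of kubectl.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Usage against a live cluster (requires kubectl access):
# kubectl get pods -A --no-headers | unhealthy_pods
```

Empty output suggests that no pod is stuck in a state such as Pending or CrashLoopBackOff before the 1.28 node pool is deleted.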
tanzu cluster node-pool delete <Cluster name> -n md-0
tanzu cluster node-pool list <Cluster name>
NAME         NAMESPACE  PHASE    REPLICAS  READY  UPDATED  UNAVAILABLE
tkg-1-27-np  default    Running  1         1      1        0
The cluster can be upgraded to 1.28 at a later date, once the workloads have been verified with cgroupv2:
tanzu cluster upgrade <Cluster name> --tkr v1.28.7---vmware.1-tkg.3