Management cluster deletion and creation operations fail on RHEL OS

Article ID: 386592


Updated On:

Products

Tanzu Kubernetes Grid

Issue/Introduction

Creation and deletion of the management cluster fail on RHEL OS. During the boot process, RHEL 8 mounts the cgroup v1 virtual filesystem by default; to use cgroup v2 functionality for limiting the resources of your applications, the system must be configured manually.
Management cluster creation fails with the following error:

Error: unable to set up management cluster: unable to create bootstrap cluster: failed to create kind cluster tkg-kind-#########: failed to init node with kubeadm: command 
"docker exec --privileged tkg-kind-##########-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

 

Environment

Tanzu Kubernetes Grid 2.5.x

Red Hat Enterprise Linux 8

Cause

Because the host runs on cgroup v1, the kind bootstrap cluster does not boot, which prevents creation and deletion of the TKG management cluster. The operation fails with the following error:

Error: unable to set up management cluster: unable to create bootstrap cluster: failed to create kind cluster tkg-kind-#########: failed to init node with kubeadm: command 
"docker exec --privileged tkg-kind-############-control-plane kubeadm init --skip-phases=preflight --config=/kind/kubeadm.conf --skip-token-print --v=6" failed with error: exit status 1

- Once the kind container comes up, exec into it using the following commands and check the kubelet logs with journalctl:

[worker@host ~]$ 
[worker@host ~]$ docker ps
CONTAINER ID   IMAGE                                                                        COMMAND                  CREATED          STATUS         PORTS                       NAMES
0916a90b22e6   projects.registry.vmware.com/tkg/kind/node:v1.28.11_vmware.2-tkg.2_v0.20.0   "/usr/local/bin/entr…"   12 seconds ago   Up 6 seconds   127.0.0.1:44383->6443/tcp   tkg-kind-cu7sdpqo2gf8a4d4qdng-control-plane
[worker@host ~]$ 
[worker@host ~]$ docker exec -it 0916a bash
root@tkg-kind-############-control-plane:/# 
root@tkg-kind-############-control-plane:/# 
root@tkg-kind-############-control-plane:/# journalctl -fu kubelet
-- Journal begins at Tue 2025-01-21 16:10:21 UTC. --

Jan 21 07:22:59 tkg-kind-##########-control-plane kubelet[2607]: I0121 07:22:59.517743    2607 state_mem.go:75] "Updated machine memory state"
Jan 21 07:22:59 tkg-kind-##########-control-plane kubelet[2607]: E0121 07:22:59.523846    2607 kubelet.go:1511] "Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubelet kubepods] doesn't exist"
Jan 21 07:22:59 tkg-kind-##########-control-plane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Jan 21 07:22:59 tkg-kind-##########-control-plane systemd[1]: kubelet.service: Failed with result 'exit-code'.


- The kubelet logs show the error: "failed to initialize top level QOS containers: root container [kubelet kubepods] doesn't exist".

- To check the current cgroup version, run the docker info command:

$ docker info | grep Cgroup  
Cgroup Driver: systemd  
Cgroup Version: 1 



- The mount command shows similar output, confirming that the cgroup v1 hierarchy is mounted:

[worker@host ~]$ mount -l | grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd)
cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
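
- As an additional check, you can inspect the filesystem type mounted at /sys/fs/cgroup. On a cgroup v1 host this typically reports tmpfs, while cgroup2fs indicates the unified cgroup v2 hierarchy (the output below is illustrative):

[worker@host ~]$ stat -fc %T /sys/fs/cgroup
tmpfs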

Resolution

- Mounting cgroup v2

Prerequisites

You have root permissions.

Procedure 

By default, RHEL 8 runs on cgroup v1. You can switch to cgroup v2 by adding the systemd.unified_cgroup_hierarchy=1 parameter to the kernel command line and rebooting.

Using the grubby tool

To modify the kernel command line, you can use the grubby tool:

# grubby --update-kernel=ALL --args="systemd.unified_cgroup_hierarchy=1"
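
Before rebooting, you can optionally confirm that the argument was added to the default boot entry (the exact output varies with the installed kernel and any existing parameters):

# grubby --info=DEFAULT | grep -i args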

Alternatively, you can use the grub2-mkconfig method:

Using grub2-mkconfig

1. Edit the /etc/default/grub file.

2. This file contains multiple GRUB2 options. Kernel boot parameters are specified by the line that contains GRUB_CMDLINE_LINUX.

Add the parameter systemd.unified_cgroup_hierarchy=1 at the end of the GRUB_CMDLINE_LINUX line, within the quotes, as in the example below.
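
For example, the edited line may look similar to the following (the other parameters shown are placeholders; keep whatever values already exist on your system):

GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel/root rhgb quiet systemd.unified_cgroup_hierarchy=1"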

3. Once the file is edited, save it and execute the command specified below to generate a new grub.cfg file:

For an MBR (BIOS-based) system:
# grub2-mkconfig -o /boot/grub2/grub.cfg
For a GPT (UEFI-based) system:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
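
If you are unsure whether the host boots in BIOS or UEFI mode, a quick check is whether the EFI firmware directory exists (this assumes the standard sysfs path):

# [ -d /sys/firmware/efi ] && echo UEFI || echo BIOS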

4. Now, reboot the system to apply the changes.

5. Execute the following command to validate.

# mount -l | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
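
You can also re-run the docker info check from the Cause section. After the switch, it should report cgroup version 2 (expected output shown below, assuming Docker is using the systemd cgroup driver):

$ docker info | grep Cgroup
Cgroup Driver: systemd
Cgroup Version: 2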

 

Additional Information

For more information, refer to the following Red Hat article (a Red Hat account is required for access): https://access.redhat.com/solutions/6898151