Support for Node Volume Mounts for vSphere Supervisor Workload Clusters
Article ID: 319405

Products

VMware vSphere Kubernetes Service, VMware vSphere 7.0 with Tanzu, vSphere with Tanzu, Tanzu Kubernetes Runtime

Issue/Introduction

Requirements to use Node Volume Mounts

  • Requires a Tanzu Kubernetes release (TKR) of version 1.17 or later.
  • Requires a minimum of vCenter 7.0U2 and Supervisor Cluster version 1.19.1.
  • This feature can only be used on workload clusters deployed by a Supervisor cluster.

Limitations

  • In vSphere 8.0u2b and earlier, volume mounts on control plane nodes cannot be changed after the workload cluster has been deployed. You MUST redeploy the workload cluster with the desired node volume mounts.
  • In vSphere 8.0u2c and later, volume mounts on control plane nodes can be changed after workload cluster deployment.
  • Not all node volume mount locations are tested; some do not work at all or have unintended consequences, as outlined below.

 

Fully Supported Mount Locations

These node volume mounts are fully supported by VMware. 

/var/lib/containerd  -  Increases the space available for cached container images. For example, if deployments use very large images, deployments can take a long time because limited disk space on the node forces containerd to constantly evict and re-pull images.

/var/lib/kubelet  -  Increases the space available for ephemeral container storage. For example, if containers require a lot of ephemeral storage, pods can fail with errors about limited disk space.

NOTE: Both of the above node volume mounts CAN be added to control plane nodes. However, since control plane nodes are not designed to run large workloads, we recommend keeping the control plane footprint small rather than dedicating volume mounts to larger images or pods. If this is unavoidable, nothing prevents these node volume mounts from being added to control plane nodes.
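For illustration, a minimal fragment of a v1alpha3 TanzuKubernetesCluster spec declaring both fully supported mounts on a worker node pool might look like the following. This is a sketch only; the node pool name and volume sizes are illustrative, and the exact field layout should be confirmed against the documentation linked in the Resolution section.

spec:
  topology:
    nodePools:
      - name: worker                      # illustrative node pool name
        replicas: 3
        volumes:
          - name: containerd
            mountPath: /var/lib/containerd
            capacity:
              storage: 50Gi               # illustrative size
          - name: kubelet
            mountPath: /var/lib/kubelet
            capacity:
              storage: 50Gi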
 

Not Recommended Mount Locations

VMware does not recommend placing node volume mounts in these locations. The reasons why each location is not recommended are noted below.

/var/lib/etcd  -  This is often assumed to increase the total space available to etcd, but it does not. It creates a node volume mount for the etcd directory, but it does not instruct etcd to consume extra space during initialization. Regardless of the size of the node volume mount, once the etcd database hits its 2GB maximum it will start failing writes with "out of disk space" errors. The Kubernetes etcd database is designed to be small, and even very large Kubernetes deployments should not hit this limit; when etcd runs out of space, the cause is almost always a runaway CRD whose objects are being created and deleted by the thousands every minute. The other reason we do not recommend the etcd node volume mount is that it will prevent cluster creation if the PVC backing the node volume mount either fails to create or takes too long to create due to slower storage.

NOTE: There is a known issue with etcd node volume mounts on clusters with a single control plane VM that CAN cause cluster data loss. Any existing cluster with a single control plane node and an etcd node volume mount should be scaled to three control plane replicas to avoid data loss.
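To see how close etcd is to its 2GB limit, the database size can be checked from the etcd static pod. The following is a sketch assuming the kubeadm-default pod name and certificate paths; verify both in your environment before running it. The DB SIZE column in the output shows current usage.

kubectl -n kube-system exec etcd-<control-plane-node-name> -- \
  etcdctl --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint status --write-out=table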

Unsupported Mount Locations

During the node volume mount creation process, everything is moved from the existing directory into a temporary folder; the additional storage is then configured and mounted to the node; finally, everything is moved from the temporary folder into the new storage. Any running service that is actively using files in the node volume mount location will not function during this process, which means any directory used by core system processes is not supported. This includes, but is not limited to, the following directories; a sketch of the process follows the list below. Please use your best judgement when deciding where to use node volume mounts.

/ (root)
/var
/var/lib
/etc
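Conceptually, the creation process is roughly equivalent to the following sketch. The device name and target directory are illustrative and the actual implementation differs, but it shows why anything reading or writing the directory between the first and last step stops working:

DIR=/var/lib/containerd            # illustrative target directory
mv "$DIR" "${DIR}.tmp"             # 1. move existing contents aside
mkdir "$DIR"
mount /dev/sdb1 "$DIR"             # 2. mount the new dedicated volume (illustrative device)
cp -a "${DIR}.tmp/." "$DIR"/       # 3. move contents onto the new storage
rm -rf "${DIR}.tmp"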

Environment

vSphere Supervisor
 
This article applies regardless of whether the workload cluster is managed by Tanzu Mission Control (TMC).

Cause

A node's containerd image storage and kubelet container storage can be placed in dedicated volume mounts instead of being stored directly on the root disk.

Increasing the total root disk size of a workload cluster node is currently not supported in the vSphere Supervisor product.

Unsupported changes to disk space will be reverted when nodes are recreated, for example during a workload cluster's TKR version upgrade.

 

Root disk space on a node can be filled by application pods that create files directly on the root disk.

Consider using persistent volumes for application storage.
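As a hypothetical sketch, an application that currently writes to its container filesystem could instead mount a PersistentVolumeClaim; the names, storage class, and sizes below are illustrative only.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                       # illustrative name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: my-storage-policy  # illustrative storage class
  resources:
    requests:
      storage: 20Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data
          mountPath: /data             # writes here land on the PV, not the node root disk
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: app-data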

Related Documentation:

Kubernetes Ephemeral Volumes

Kubernetes Persistent Volumes

Example Guestbook Application Using Persistent Volumes

Resolution

Explanation of Volume Mounts

Volume mounts are dedicated volumes for the specified directories.

This does not increase the overall root disk space of a node; instead, it moves the specified directories onto separate persistent volumes mounted on the node.

See the "df -h" examples below from worker nodes in a workload cluster with no applications running on it:

Worker node without containerd and kubelet volume mounts ("df -h"):

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs           983M     0  983M   0% /dev/shm
tmpfs           394M  6.5M  387M   2% /run
/dev/sda3        20G  6.5G   13G  35% /
tmpfs           983M     0  983M   0% /tmp
/dev/sda2        10M  1.4M  8.7M  14% /boot/efi

 

Worker node with containerd and kubelet volume mounts ("df -h"):

This output is based on dedicated volume mounts of 50G each for the containerd and kubelet directories.

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs           983M     0  983M   0% /dev/shm
tmpfs           394M  6.5M  387M   2% /run
/dev/sda3        20G  3.7G   15G  20% /
tmpfs           983M     0  983M   0% /tmp
/dev/sda2        10M  1.4M  8.7M  14% /boot/efi
/dev/sdb1        49G  3.0G   44G   7% /var/lib/containerd
/dev/sdc1        49G  456K   47G   1% /var/lib/kubelet

 

Adding Supported Volume Mounts

Follow the appropriate workload cluster documentation for adding supported volume mounts to control plane nodes and node pools in a workload cluster; an illustrative example follows the links below.

TKC on v1alpha3 Documentation Example

Cluster on v1beta1 Documentation

ClusterClass 3.X Documentation
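For a v1beta1 Cluster, node volume mounts are declared through ClusterClass variables. The following sketch uses the nodePoolVolumes variable as shown in published examples; confirm the variable names and layout against the documentation for your release before use.

spec:
  topology:
    variables:
      - name: nodePoolVolumes
        value:
          - name: containerd
            mountPath: /var/lib/containerd
            capacity:
              storage: 50Gi              # illustrative size
          - name: kubelet
            mountPath: /var/lib/kubelet
            capacity:
              storage: 50Gi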

Additional Information

Known issues with Volume Mounts

TKGS Volume Mounts are not mounted after reboot on vSphere 7.0U2 or earlier (323441)
- Issue is fixed in vSphere 7.0U3 and all versions of 8.0

Pods stuck in ContainerCreating state on TKGS Guest Clusters after a vSphere HA failover event (319371)
- Issue is fixed in TKR v1.23.8---vmware.3-tkg.1

TKGS Cluster stuck in upgrading state with error "spec.kubeadmConfigSpec.mounts: Forbidden: cannot be modified"
- Please open a support case with Broadcom for assistance, referencing this issue

TKGS Cluster Nodes Experience Swapped PVC-to-Mount Path Mappings (389012)