Update cluster class to install NFS client on Ubuntu 22.04 and TKGm 2.5.1

Article ID: 376737

Products

VMware Tanzu Kubernetes Grid Plus

Issue/Introduction

After upgrading Ubuntu-based classy clusters to Ubuntu 22.04 on TKGm 2.5.1, the NFS client is no longer available (in earlier releases it was installed by default). NFS volume mounts on worker nodes fail with errors similar to the following:

kubectl describe node $NODE
#> MountVolume.SetUp failed for volume "######" : mount failed: exit status 32
#> Mounting command: mount
#> Mounting arguments: -t nfs #####:/#####/ /var/lib/kubelet/pods/####/volumes/kubernetes.io~nfs/####
#> Output: mount: /var/lib/kubelet/pods/####/volumes/kubernetes.io~nfs/####: bad option; for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program.
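
To confirm that the package is missing, you can SSH to an affected worker node (capv is the default node user on TKGm Ubuntu nodes) and query dpkg directly:

ssh capv@${WORKER_NODE_IPADDRESS}
dpkg -s nfs-common || echo "nfs-common is not installed"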

Environment

  • TKGm 2.5.1
  • TKGm 2.5.x

Cause

NFS client packages were removed in 2.5.1 because NFS relies on rpcbind (bound to port 111 by default), which was disabled to comply with the Ubuntu CIS Benchmark C-2.3.6.

TKGm does not support external NFS, and so far there have been no reports of its usage. NFS is only supported for the datastore via CSI.

Reinstalling the packages in the base image would regress a security fix and is out of scope. There are two alternatives for adjusting the hardening:

1) Bring Your Own Image: change the variables passed (adding these packages) and export a new template that can be used. This is the more complex procedure and is not covered in this article.
2) Create a custom ClusterClass that installs the package before the node joins the cluster (via preKubeadmCommands). The procedure is described below.

Resolution

Prerequisites

  • TKGm management cluster is created (tested with 2.5)
  • ytt installed
  • kubectl installed and set to the management cluster context
  • Tanzu CLI installed

This process involves creating a custom ClusterClass that deploys the worker nodes of a workload cluster with the nfs-common utilities installed. Creating custom ClusterClasses is covered at a high level in the Tanzu Kubernetes Grid documentation.
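
For orientation, the working directory built up in the steps below ends up looking like this (file names match the steps):

.
├── tkg-vsphere-default-v1.2.0.yaml
├── default_cc.yaml
├── custom_cc.yaml
└── overlays
    ├── nfscommon.yaml
    └── filter.yaml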

  1. In Tanzu Kubernetes Grid 2.3.0 and later, after you deploy a management cluster, you can find the default ClusterClass manifest in the ~/.config/tanzu/tkg/clusterclassconfigs folder.
    • cp ~/.config/tanzu/tkg/clusterclassconfigs/tkg-vsphere-default-v1.2.0.yaml .
  2. To customize your ClusterClass manifest, you create ytt overlay files alongside the manifest.
    • mkdir overlays
      cd overlays
      
      After creating the two files below, return to the top folder with: cd ..
      Create a file nfscommon.yaml:
      #@ load("@ytt:overlay", "overlay")
      
      #@overlay/match by=overlay.subset({"kind":"ClusterClass"})
      ---
      apiVersion: cluster.x-k8s.io/v1beta1
      kind: ClusterClass
      metadata:
        name: tkg-vsphere-default-v1.2.0-extended
      spec:
        #@overlay/match missing_ok=True
        variables:
        #@overlay/append
        - name: nfsCommon
          required: false
          schema:
            openAPIV3Schema:
              type: boolean
              default: false
        #@overlay/match expects=1
        patches:
        #@overlay/append
        - name: nfs
          enabledIf: '{{ .nfsCommon }}'
          definitions:
            - selector:
                apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
                kind: KubeadmConfigTemplate
                matchResources:
                  machineDeploymentClass:
                    names:
                      - tkg-worker
              jsonPatches:
                - op: add
                  path: /spec/template/spec/preKubeadmCommands/-
                  value: |
                    sudo add-apt-repository -s -y "deb https://mirrors.bloomu.edu/ubuntu/ jammy main" && \
                    sudo apt update -y && \
                    sudo apt-get install -y libnfsidmap1=1:2.6.1-1ubuntu1 --allow-downgrades --allow-change-held-packages && \
                    sudo apt-get install -y nfs-common --allow-change-held-packages
      
      
      Create a file filter.yaml:
      #@ load("@ytt:overlay", "overlay")
      
      #@overlay/match by=overlay.not_op(overlay.subset({"kind": "ClusterClass"})),expects="0+"
      ---
      #@overlay/remove
      
  3. Use the default ClusterClass manifest from step 1 to generate the base ClusterClass:
    • ytt -f tkg-vsphere-default-v1.2.0.yaml -f overlays/filter.yaml > default_cc.yaml
  4. Generate the custom ClusterClass. This command applies all files from the overlays folder:
    • ytt -f default_cc.yaml -f overlays/ > custom_cc.yaml
  5. Verify and install the custom ClusterClass in the management cluster. The generated file should contain the new name and the apt-get install commands (see the verification commands after this list):
    • kubectl apply -f custom_cc.yaml
  6. You should see the following output when you run kubectl get clusterclasses:
    • NAME                                  AGE
      tkg-vsphere-default-v1.2.0            21h
      tkg-vsphere-default-v1.2.0-extended   20h
  7. We have now created an "extended" ClusterClass that accepts a new variable: nfsCommon.
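
As a sanity check before applying in step 5, you can confirm that the generated custom_cc.yaml contains both the new class name and the install commands:

grep -n "tkg-vsphere-default-v1.2.0-extended" custom_cc.yaml
grep -n "nfs-common" custom_cc.yaml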

 

To create a new cluster with the custom class, follow the steps below, using the cluster_overlay.yaml shown here:

#@ load("@ytt:overlay", "overlay")

#@overlay/match by=overlay.subset({"kind":"Cluster"})
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
spec:
  topology:
    class: tkg-vsphere-default-v1.2.0-extended
    variables:
    - name: nfsCommon
      value: true
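
This overlay switches the new cluster to the extended class and sets nfsCommon to true, which enables the nfs patch defined in the custom ClusterClass.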

 

  1. Copy the config file to your working directory. There are multiple ways to complete this step; this one is mainly for demo purposes:
    • cp ~/.config/tanzu/tkg/clusterconfigs/{config_file}.yaml ./workload-1.yaml
  2. Generate the custom workload cluster manifest:
    • tanzu cluster create --file workload-1.yaml --dry-run > default_cluster.yaml
  3. Using the overlay, create the custom manifest:
    • ytt -f default_cluster.yaml -f cluster_overlay.yaml > custom_cluster.yaml
  4. Deploy
    • tanzu cluster create -f custom_cluster.yaml
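
After creation completes, you can confirm the package landed on the new workers (with kubectl pointed at the new workload cluster context; node IPs will differ per environment):

kubectl get nodes -o wide
ssh capv@${WORKER_NODE_IPADDRESS}
dpkg -s nfs-common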

For existing clusters the procedure is similar; two fields on the existing Cluster object have to be updated:

  1. Modify .spec.topology.class to tkg-vsphere-default-v1.2.0-extended
  2. Add the variable nfsCommon in .spec.topology.variables, as seen in the example below:
spec:
...
  topology:
    class: tkg-vsphere-default-v1.2.0-extended
    controlPlane:
      metadata:
        annotations:
          run.tanzu.vmware.com/resolve-os-image: image-type=ova,os-name=ubuntu
      replicas: 1
    variables:
    - name: nfsCommon
      value: true
    - name: cni
      value: antrea
    - name: controlPlaneCertificateRotation
...
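
One way to apply these two changes is kubectl edit against the management cluster context (shown as an illustration; substitute your cluster name and namespace):

kubectl edit cluster ${CLUSTER_NAME} -n ${NAMESPACE}
# set .spec.topology.class to tkg-vsphere-default-v1.2.0-extended
# add {name: nfsCommon, value: true} under .spec.topology.variables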

The update will immediately trigger a rollout of the worker nodes, recreating them with nfs-common installed.

Troubleshooting: if the worker nodes are recreated but the NFS client is still not installed, use the commands below to confirm whether the package installation ran successfully:

  • journalctl | grep nfs-client
  • journalctl | grep apt-get
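
You can also check directly on the node whether the package is present and whether apt logged the install attempt (assumes SSH access as shown above; the log path is the Ubuntu default):

  • dpkg -s nfs-common
  • grep nfs-common /var/log/apt/history.log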

The following GitHub repository was used as a guide for this KB:

https://github.com/logankimmel/tkgm-nfs-common/tree/master

Additional Information

In an emergency, you can manually install the NFS packages on a worker node as a temporary measure (changes made this way do not persist if the node is recreated):

ssh capv@${WORKER_NODE_IPADDRESS}
sudo add-apt-repository -s -y "deb https://mirrors.bloomu.edu/ubuntu/ jammy main"
sudo apt update -y
sudo apt-get install -y libnfsidmap1=1:2.6.1-1ubuntu1 --allow-downgrades --allow-change-held-packages
sudo apt-get install -y nfs-common --allow-change-held-packages
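
Once installed, a quick test mount confirms the NFS client is functional (placeholder server and export shown; replace with your values):

sudo mkdir -p /mnt/nfstest
sudo mount -t nfs ${NFS_SERVER}:${NFS_EXPORT} /mnt/nfstest
ls /mnt/nfstest
sudo umount /mnt/nfstest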