Worker Machines/Nodes stuck in Provisioned state due to invalid Kubelet extra arguments

Article ID: 368822

Products

VMware Tanzu Kubernetes Grid 1.x

Issue/Introduction

A TKGm cluster is upgraded with WORKER_KUBELET_EXTRA_ARGS set in the cluster configuration file, for example:

WORKER_KUBELET_EXTRA_ARGS: "kube-reserved=cpu=100m,memory=256M;system-reserved=cpu=500m,memory=1024M;eviction-soft=memory.available<1024M,nodefs.available<10%;eviction-hard=memory.available<512M,nodefs.available<5%;eviction-soft-grace-period=memory.available=1m30s,nodefs.available=1m30s,imagefs.available=1m30s"
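To make the structure of this value easier to read, the following is a minimal Python sketch that splits it into individual kubelet flags. It assumes, based only on the example above, that entries are separated by ";" and that each entry is split on its first "=". The function name parse_extra_args is hypothetical and is not part of TKG or the Tanzu CLI.

```python
def parse_extra_args(raw: str) -> dict:
    """Split 'flag=value;flag=value' into a dict.

    Assumption: entries are ';'-separated and each entry is split on its
    FIRST '=' only, because values such as 'cpu=100m,memory=256M' contain
    further '=' characters.
    """
    args = {}
    for entry in raw.split(";"):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing ';'
        flag, sep, value = entry.partition("=")
        if not sep:
            raise ValueError(f"entry has no '=': {entry!r}")
        args[flag.strip()] = value.strip()
    return args


# The value from the configuration file shown above.
RAW = (
    "kube-reserved=cpu=100m,memory=256M;"
    "system-reserved=cpu=500m,memory=1024M;"
    "eviction-soft=memory.available<1024M,nodefs.available<10%;"
    "eviction-hard=memory.available<512M,nodefs.available<5%;"
    "eviction-soft-grace-period=memory.available=1m30s,"
    "nodefs.available=1m30s,imagefs.available=1m30s"
)

if __name__ == "__main__":
    for flag, value in parse_extra_args(RAW).items():
        print(f"{flag}: {value}")
```

Only the eviction-soft and eviction-hard values contain "<", which is why those are the two entries affected by the conversion described in this article.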

Certain characters, such as "<", may be automatically converted to a Unicode escape form ("u003c" in this case) in the corresponding KubeadmConfigTemplate:

          kubeletExtraArgs:
            cloud-provider: external
            eviction-hard: memory.availableu003c512M,nodefs.availableu003c5%
            eviction-soft: memory.availableu003c1024M,nodefs.availableu003c10%

The kubelet cannot parse the escaped string and fails to start on the node with the error "failed to parse kubelet flag: invalid argument". The affected worker Machines and Nodes then remain in the Provisioned state.
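To check whether a cluster is affected, the kubeletExtraArgs values copied from the KubeadmConfigTemplate can be scanned for the escape residue. The following is a minimal Python sketch, not a supported tool. The names find_mangled and restore are hypothetical, and the "u003e" and "u0026" patterns (for ">" and "&") are an assumption by analogy; this article only documents "u003c".

```python
import re

# Residue documented in this article ("u003c" for "<"), plus two
# analogous forms that are ASSUMED here and not confirmed by the article.
MANGLED = {"u003c": "<", "u003e": ">", "u0026": "&"}

# Match the residue with or without a leading backslash.
_PATTERN = re.compile(r"\\?(u003c|u003e|u0026)", re.IGNORECASE)


def find_mangled(args: dict) -> dict:
    """Return {flag: [residue, ...]} for every value containing escape residue."""
    hits = {}
    for flag, value in args.items():
        found = [m.group(1).lower() for m in _PATTERN.finditer(value)]
        if found:
            hits[flag] = found
    return hits


def restore(value: str) -> str:
    """Show the intended value, for comparison only."""
    return _PATTERN.sub(lambda m: MANGLED[m.group(1).lower()], value)


# Values as they appear in the KubeadmConfigTemplate excerpt above.
TEMPLATE_ARGS = {
    "cloud-provider": "external",
    "eviction-hard": "memory.availableu003c512M,nodefs.availableu003c5%",
    "eviction-soft": "memory.availableu003c1024M,nodefs.availableu003c10%",
}

if __name__ == "__main__":
    for flag in find_mangled(TEMPLATE_ARGS):
        print(f"{flag}: {TEMPLATE_ARGS[flag]}")
        print(f"  intended: {restore(TEMPLATE_ARGS[flag])}")
```

A non-empty result from find_mangled indicates that the template carries the converted characters described above.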

Environment

TKG 2.4.1

TKG 2.5.1

Cause

A bug in one of the internal Cluster API controllers.

Resolution

Fix:

Expected in TKG 2.5.2 or 2.5.3 (release TBD).

Workaround:

Use a custom ClusterClass to apply the kubelet extra arguments as a JSON patch:

  • Create a Custom ClusterClass

  • Info on ClusterClass with patches

  • Custom overlay example:
      - name: workerKubeletExtraArgs
        definitions:
        - selector:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            matchResources:
              machineDeploymentClass:
                names:
                - tkg-worker
          jsonPatches:
          - op: add
            path: /spec/template/spec/joinConfiguration/nodeRegistration/kubeletExtraArgs
            valueFrom:
              template: |
                cloud-provider: external
                eviction-hard: memory.available<512M,nodefs.available<5%
                eviction-soft: memory.available<1024M,nodefs.available<10%

  • Upgrade clusters with Custom ClusterClass
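When the patch has to carry more arguments than the two shown in the overlay example, the body of valueFrom.template can be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: the function name render_kubelet_args_template is hypothetical, and each value is emitted as a double-quoted string, which differs from the unquoted form in the example above but should parse to the same YAML string.

```python
import json


def render_kubelet_args_template(args: dict) -> str:
    """Render 'flag: "value"' lines for the valueFrom.template body.

    json.dumps is used only to produce a double-quoted string; with
    ensure_ascii=False, Python leaves characters such as '<' and '%'
    unchanged.
    """
    lines = [
        f"{flag}: {json.dumps(value, ensure_ascii=False)}"
        for flag, value in args.items()
    ]
    return "\n".join(lines) + "\n"


# The arguments from the overlay example above.
PATCH_ARGS = {
    "cloud-provider": "external",
    "eviction-hard": "memory.available<512M,nodefs.available<5%",
    "eviction-soft": "memory.available<1024M,nodefs.available<10%",
}

if __name__ == "__main__":
    print(render_kubelet_args_template(PATCH_ARGS), end="")
```

The printed lines are meant to be pasted under "template: |" with the same indentation as in the overlay example. Verify the rendered KubeadmConfigTemplate after the upgrade, since this sketch has not been validated against a TKG cluster.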