TCA deployed TKG cluster worker nodes in a Not Ready state.

Article ID: 386602

Products

VMware Telco Cloud Automation

Issue/Introduction

Worker nodes of a TKG cluster deployed by TCA are in a 'NotReady' state.

  • The command kubectl get nodes shows the affected nodes in the NotReady status:

kubectl get node

NAME              STATUS                        ROLES           AGE    VERSION
node-name-#####   Ready                         <none>          26h    v1.24.10+vmware.1
node-name-#####   NotReady,SchedulingDisabled   <none>          100d   v1.24.10+vmware.1
node-name-#####   Ready                         <none>          100d   v1.24.10+vmware.1
node-name-#####   NotReady,SchedulingDisabled   <none>          100d   v1.24.10+vmware.1
node-name-#####   Ready                         control-plane   100d   v1.24.10+vmware.1
 

  • Restarting the VM does not recover the worker node; it remains in the NotReady state.
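
To pick the affected nodes out of a longer listing, the STATUS column can be filtered. The following is a minimal sketch, not part of the original procedure; the function name not_ready_nodes is hypothetical, and it assumes the default tabular output of kubectl get nodes shown above.

```shell
# Print the names of nodes whose STATUS column starts with "NotReady".
# Reads the default tabular output of `kubectl get nodes` on standard input.
# not_ready_nodes is a hypothetical helper name used only for illustration.
not_ready_nodes() {
    # Skip the header row (NR > 1), then match on the second column.
    awk 'NR > 1 && $2 ~ /^NotReady/ { print $1 }'
}

# Usage (requires kubectl access to the workload cluster):
#   kubectl get nodes | not_ready_nodes
```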

Environment

TCA 2.3

 

Cause

The kubelet service fails to start because the node has too many open file descriptors, as indicated in the kubelet logs:

kubelet[6045]: E#### ##:##:##.###### 6045 file_linux.go:61] "Unable to read config path" err="unable to create inotify: too many open files" path="/etc/kubernetes/manifests"
kubelet[6061]: E#### ##:##:##.###### 6061 dynamic_cafile_content.go:166] "Failed to watch CA file, will retry later" err="error creating fsnotify watcher: too many open files"
kubelet[6061]: E#### ##:##:##.###### 6061 file_linux.go:61] "Unable to read config path" err="unable to create inotify: too many open files" path="/etc/kubernetes/manifests"
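
One way to confirm this cause on an affected worker node is to count the matching errors in the kubelet journal and compare the current file-handle usage with the configured limit. This is a hedged sketch: the function name count_open_file_errors is hypothetical, and it assumes kubelet runs as a systemd unit named kubelet and that you have shell access to the node.

```shell
# Count log lines that contain the "too many open files" error.
# Reads log text on standard input and prints the number of matching lines.
# count_open_file_errors is a hypothetical helper name used only for illustration.
count_open_file_errors() {
    # grep -c prints 0 but exits non-zero when nothing matches; ignore that status.
    grep -c 'too many open files' || true
}

# Usage on the affected worker node (assumes a systemd unit named "kubelet"):
#   journalctl -u kubelet --no-pager | count_open_file_errors
#
# Current usage versus the limit:
#   cat /proc/sys/fs/file-nr    # allocated handles, unused handles, maximum
#   sysctl fs.file-max          # configured system-wide maximum
```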

Resolution

Workaround:
Delete the corresponding machine object from the management cluster to automatically remove the 'NotReady' node/VM and trigger the creation of a replacement VM.
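
The workaround can be sketched as follows. This is an illustration under stated assumptions, not the vendor's exact procedure: it assumes the kubectl context points at the management cluster, that the Cluster API Machine objects expose the backing node name in .status.nodeRef.name, and the function name machine_for_node is hypothetical. Confirm the machine name before deleting anything.

```shell
# Given lines of "<namespace> <machine> <node>" on standard input, print the
# "<namespace> <machine>" pair whose node column equals the first argument.
# machine_for_node is a hypothetical helper name used only for illustration.
machine_for_node() {
    awk -v node="$1" '$3 == node { print $1, $2 }'
}

# Usage (kubectl context set to the management cluster; the nodeRef field
# is an assumption about the Cluster API version in use):
#   kubectl get machines -A -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.nodeRef.name}{"\n"}{end}' \
#     | machine_for_node <not-ready-node-name>
#
# Then delete the Machine object that was printed:
#   kubectl delete machine <machine-name> -n <namespace>
```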

Solution:

  • The default value of the fs.file-max parameter is 96000. It is based on the default dimensioning of the Photon OVA templates, which have a default RAM allocation of 16 GB.
  • Determine the required fs.file-max value in coordination with the design team and application vendor.
  • Configure the open-file limit (fs.file-max) via the tuned profile in the application CSAR. In the TCA UI, navigate to:
    Catalog → Network Function → [Select the Network Function] → Infrastructure Requirements tab
  • Add the following lines to your tuned.conf file and inject the modified file via CSAR file injection:
  • Example of sysctl.conf for Photon 3 (replace the fs.file-max value with the value determined for your deployment):


  fs.file-max=96000
  net.ipv4.tcp_syncookies=1
  kernel.randomize_va_space=2
  net.ipv4.conf.all.accept_source_route=0
  net.ipv4.conf.default.accept_source_route=0
  net.ipv4.conf.eth0.accept_source_route=0
  net.ipv6.conf.all.accept_source_route=0
  net.ipv6.conf.default.accept_source_route=0
  net.ipv6.conf.eth0.accept_source_route=0
  net.ipv4.icmp_echo_ignore_broadcasts=1
  net.ipv4.conf.all.accept_redirects=0
  net.ipv4.conf.default.accept_redirects=0
  net.ipv4.conf.eth0.accept_redirects=0
  net.ipv4.conf.all.secure_redirects=0
  net.ipv4.conf.default.secure_redirects=0
  net.ipv4.conf.eth0.secure_redirects=0
  net.ipv4.conf.all.send_redirects=0
  net.ipv4.conf.default.send_redirects=0
  net.ipv4.conf.eth0.send_redirects=0
  net.ipv4.conf.all.log_martians=1
  net.ipv4.conf.default.log_martians=1
  net.ipv4.conf.eth0.log_martians=1
  net.ipv4.conf.all.rp_filter=1
  net.ipv4.conf.default.rp_filter=1
  net.ipv4.conf.eth0.rp_filter=1
  net.ipv4.conf.all.mc_forwarding=0
  net.ipv4.conf.default.mc_forwarding=0
  net.ipv4.conf.eth0.mc_forwarding=0
  net.ipv6.conf.all.mc_forwarding=0
  net.ipv6.conf.default.mc_forwarding=0
  net.ipv6.conf.eth0.mc_forwarding=0
  net.ipv4.tcp_timestamps=1

  • Example of sysctl.conf for Photon 5:
    fs.file-max=96000
    fs.suid_dumpable=0