Unable to start container process: unable to init seccomp: error loading seccomp filter into kernel err loading seccomp filter errno 524: unknown

Article ID: 401709

Updated On:

Products

Tanzu Kubernetes Runtime
VMware vSphere Kubernetes Service

Issue/Introduction

  • New pods fail to start.
  • "kubectl describe pod" shows the following error message:
    code = Unknown desc = failed to create containerd task. Failed to create shim task: OCI runtime create failed. runc create failed: Unable to start container process: unable to init seccomp: error loading seccomp filter into kernel err loading seccomp filter errno 524: unknown

 

Environment

vSphere with Tanzu

Worker nodes running Ubuntu with a Linux kernel version earlier than 5.15.105

Cause

  • Each container's seccomp filter is loaded into the kernel as a BPF program. When the memory used by the BPF JIT compiler in the Linux kernel exceeds its limit, the filter cannot be loaded and new container processes fail to start.
  • Older Linux kernels have a known resource leak related to net.core.bpf_jit_limit: https://github.com/moby/moby/issues/45498
  • The default BPF JIT limit (net.core.bpf_jit_limit) on Ubuntu may be too low in some situations, such as extreme application workloads or a large number of zombie processes left behind by malfunctioning applications.
  • This can happen when the VKS cluster has a high rate of pod churn.
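The conditions above can be checked on a worker node with a sketch along these lines. It assumes an Ubuntu node, where the upstream kernel version is the third field of /proc/version_signature (`uname -r` reports Ubuntu's own package version instead), and GNU `sort -V`. The helper names `version_lt` and `bpf_jit_bytes` are hypothetical and are not part of any VMware tooling. The usage figure is only roughly comparable to the limit.

```shell
#!/usr/bin/env bash
# Illustrative helpers only; the function names are hypothetical.

# True (exit 0) when version $1 is strictly lower than version $2.
# Relies on GNU sort -V for version ordering.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Sum the sizes (column 2, in bytes) of bpf_jit allocations
# in a vmallocinfo-style file passed as $1.
bpf_jit_bytes() {
    awk '/bpf_jit/ {s += $2} END {print s + 0}' "$1"
}

# Assumption: Ubuntu exposes the upstream kernel version as the
# third field of /proc/version_signature.
if [ -r /proc/version_signature ]; then
    upstream="$(awk '{print $3}' /proc/version_signature)"
    if version_lt "$upstream" "5.15.105"; then
        echo "Upstream kernel $upstream is earlier than 5.15.105"
    else
        echo "Upstream kernel $upstream is 5.15.105 or later"
    fi
fi

# Both files are normally readable only by root, so run the script with sudo
# to see these two lines.
if [ -r /proc/vmallocinfo ] && [ -r /proc/sys/net/core/bpf_jit_limit ]; then
    echo "BPF JIT bytes in use: $(bpf_jit_bytes /proc/vmallocinfo 2>/dev/null)"
    echo "net.core.bpf_jit_limit: $(cat /proc/sys/net/core/bpf_jit_limit 2>/dev/null)"
fi
```

If the bytes in use are close to the limit on a node with an older kernel, the node is likely affected by the leak described above.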

Resolution

  • As a temporary workaround, manually increase the value of net.core.bpf_jit_limit on each worker node.
  • As a permanent fix, upgrade the TKr to v1.29.4 or later. According to the community update at https://github.com/moby/moby/issues/45498#issuecomment-1542155705, Linux kernels v5.15.105 and later are expected to fix the resource leak, and TKr 1.29.4 and later use kernel v5.15.107 or later.
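The temporary workaround could look roughly like the sketch below. The article does not state a target value, so doubling the current limit is an arbitrary illustrative choice, and the file name 99-bpf-jit-limit.conf and the APPLY switch are hypothetical. The script is a dry run unless APPLY=1 is set, and it must run as root on the worker node to change anything.

```shell
#!/usr/bin/env bash
# Sketch of the temporary workaround. The doubling factor, the APPLY switch,
# and the sysctl.d file name are assumptions made for illustration.

KEY="net.core.bpf_jit_limit"

# Print the proposed limit: twice the current value passed as $1.
proposed_limit() {
    echo $(( $1 * 2 ))
}

# Print the line to store under /etc/sysctl.d/ so the value survives a reboot.
sysctl_conf_line() {
    echo "$KEY = $1"
}

# Nothing is changed unless APPLY=1 is set in the environment.
if [ "${APPLY:-0}" = "1" ]; then
    current="$(sysctl -n "$KEY")" || exit 1
    new="$(proposed_limit "$current")"
    # Apply the new limit to the running kernel.
    sysctl -w "$KEY=$new"
    # Persist it across reboots (hypothetical file name).
    sysctl_conf_line "$new" > /etc/sysctl.d/99-bpf-jit-limit.conf
fi
```

For example, saving this as raise-bpf-jit-limit.sh (a hypothetical name) and running `sudo APPLY=1 bash raise-bpf-jit-limit.sh` on a worker node would apply and persist the doubled value. Because older kernels leak this memory, a higher limit only delays the failure; the TKr upgrade remains the permanent fix.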

Additional Information

If the same symptom persists with VKr/TKr 1.29.4 or later, collect the information below and open a new support case.

1. Collect the VKS support bundle.

2. Collect the following information from the worker node as soon as the symptom is observed:

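# Create a working directory for the collected data
mkdir -p ~/seccomp
hostname -a > ~/seccomp/hostname
# Loaded BPF programs and their cgroup attachments
sudo bpftool prog show --json > ~/seccomp/bpftool-prog-show.json
sudo bpftool cgroup tree --json > ~/seccomp/bpftool-cgroup-tree.json
# All containers known to the runtime, including exited ones
sudo crictl ps -a > ~/seccomp/crictl-ps-a.out
# Total bytes currently allocated by the BPF JIT
sudo cat /proc/vmallocinfo | grep bpf_jit | awk '{s+=$2} END {print s}' > ~/seccomp/vmallocinfo-bpf_jit.out
# Archive the directory; -C lets the command work from any current directory
tar zcvf "$(hostname)-$(date +%s)-seccomp-data.tgz" -C ~ seccomp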