On a TKGI cluster, pod creation keeps failing on a particular worker node with the error message below. Restarting the worker usually resolves the problem temporarily, but it recurs after some time.
Warning FailedCreatePodSandBox pod/#### Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524: unknown
TKGI (all releases)
The problem is caused by the customer workload on the cluster. Applications running in pods can reserve kernel resources (such as processes, kernel memory, file descriptors, and security resources) without releasing them properly. Once a particular kernel resource is exhausted on the worker, creating any new pod that requires that resource fails.
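As an illustration of how exhaustion of one such resource might be observed, the sketch below reads the node-wide file handle counters from /proc/sys/fs/file-nr on the affected worker. It is only a minimal example under assumptions: the function names are hypothetical, file handles are just one of the resources listed above, and the leaked resource in a given incident may be a different one.

```python
from pathlib import Path


def parse_file_nr(text):
    """Parse the three fields of /proc/sys/fs/file-nr:
    allocated file handles, unused handles, and the system-wide maximum."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return {"allocated": allocated, "unused": unused, "max": maximum}


def file_handle_usage(path="/proc/sys/fs/file-nr"):
    """Return the parsed counters plus the fraction of the maximum in use."""
    stats = parse_file_nr(Path(path).read_text())
    stats["used_fraction"] = stats["allocated"] / stats["max"]
    return stats


if __name__ == "__main__":
    # Only meaningful on a Linux host that exposes this file.
    if Path("/proc/sys/fs/file-nr").exists():
        print(file_handle_usage())
```

Running this periodically on the worker and comparing the values over time can show whether file handle usage is climbing toward the maximum before pod creation starts to fail.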
As a temporary workaround, restart the impacted worker so that the leaked resources are cleaned up.
As a permanent resolution, troubleshoot the application, in particular with resource leak checks and profiling, to identify whether it acquires and releases resources properly.
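One possible starting point for a leak check is to rank the processes on the affected worker by the number of open file descriptors. The sketch below does this by walking /proc. It is a hedged example, not a complete diagnostic: the function name is hypothetical, it covers file descriptors only, and it typically needs root privileges to inspect processes owned by other users.

```python
import os


def open_fd_counts(proc_root="/proc"):
    """Return (fd_count, pid, command) tuples, largest fd_count first."""
    results = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        proc_dir = os.path.join(proc_root, entry)
        try:
            fd_count = len(os.listdir(os.path.join(proc_dir, "fd")))
            with open(os.path.join(proc_dir, "comm")) as f:
                command = f.read().strip()
        except OSError:
            continue  # process exited meanwhile, or permission was denied
        results.append((fd_count, int(entry), command))
    return sorted(results, reverse=True)


if __name__ == "__main__":
    # Print the ten processes holding the most open file descriptors.
    for fd_count, pid, command in open_fd_counts()[:10]:
        print(f"{fd_count:>8}  {pid:>8}  {command}")
```

A process whose count grows steadily across repeated runs, without a matching increase in load, is a candidate for closer profiling in the application itself.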