Pods belonging to the domain-local-ds DaemonSet become stuck in the Init:CrashLoopBackOff state. The init-node container repeatedly terminates with exit code 255. This issue prevents the TMC agents from reconciling on the worker nodes during the deployment or upgrade phase.

Running kubectl get pods -n <tmc-namespace> shows pods belonging to the domain-local-ds DaemonSet stuck in Init:CrashLoopBackOff.
Describing the pod (kubectl describe pod <pod-name> -n <tmc-namespace>) reveals the init-node container failing repeatedly.
Container logs for the init-node container are often empty or abruptly cut off.
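
Because the init-node logs are often empty, the exit code recorded in the pod status is usually the quickest confirmation of this symptom. The following is a minimal sketch, assuming kubectl is already configured for the affected cluster; the function name init_node_exit_code is hypothetical and not part of TMC.

```shell
# Hypothetical helper (not part of TMC): print the exit code of the most
# recent terminated run of the init-node container for a given pod.
# Usage: init_node_exit_code <pod-name> <tmc-namespace>
init_node_exit_code() {
  kubectl get pod "$1" -n "$2" \
    -o jsonpath='{.status.initContainerStatuses[?(@.name=="init-node")].lastState.terminated.exitCode}'
}
```

If the pod is affected by this issue, the value printed is expected to match the exit code 255 described above.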
The failure stems from the interaction between the init-node container's initialization script and the underlying host's container runtime (containerd). It is not a TMC connectivity failure. It is a side effect of how the agent installer interacts with the container runtime in the Kubernetes environment during deployment or upgrade cycles.

The domain-local-ds DaemonSet uses an init container (init-node) designed to escape its container isolation using nsenter. Its purpose is to write a custom TLS certificate to the underlying worker node and then restart the node's containerd service to apply the new certificate:

nsenter --mount=/proc/1/ns/mnt -- sh -c 'printenv "tls.crt" > /etc/ssl/certs/$stack_type.crt ; systemctl restart containerd'

Because the script restarts the containerd service, the very service keeping the container itself alive, the container effectively cuts off its own life support. It is terminated before it can report a successful execution back to the Kubelet.

The new containerd process starts, cleans up the abruptly killed container process, and flags it as a "dead shim".
The Kubelet temporarily loses connection to the containerd.sock socket, registers a hard failure, and triggers the CrashLoopBackOff.
This issue is resolved in TMC-SM v1.4.4.
Workaround
Patch the domain-local-ds DaemonSet to introduce a brief sleep command (sleep 5) immediately following the systemctl restart command. This gives the containerd service time to cycle successfully, preventing the runtime from marking the container as a leaked shim and allowing the Kubelet to process the state change gracefully.

Step 1: Patch the DaemonSet
Run the following command, replacing <tmc-namespace> with your actual TMC namespace:
kubectl patch ds domain-local-ds -n <tmc-namespace> --type='strategic' -p '{"spec": {"template": {"spec": {"initContainers": [{"name": "init-node", "command": ["nsenter", "--mount=/proc/1/ns/mnt", "--", "sh", "-c", "printenv TLS_CRT > /etc/ssl/certs/$stack_type.crt ; systemctl restart containerd && sleep 5"], "env": [{"name": "TLS_CRT", "valueFrom": {"configMapKeyRef": {"name": "stack-config", "key": "tls.crt", "optional": false}}}]}]}}}}'
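
Before moving on, it can help to confirm that the DaemonSet template actually carries the new command. The following is a minimal sketch, assuming the patch above was applied as written; the function name patch_is_applied is hypothetical.

```shell
# Hypothetical helper (not part of TMC): succeed if the init-node command
# stored in the DaemonSet template contains the added "sleep 5".
# Usage: patch_is_applied <tmc-namespace>
patch_is_applied() {
  kubectl get ds domain-local-ds -n "$1" \
    -o jsonpath='{.spec.template.spec.initContainers[?(@.name=="init-node")].command}' \
    | grep -q 'sleep 5'
}
```

For example, patch_is_applied <tmc-namespace> && echo "patch present" prints the message only when the sleep is found in the stored command.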
Step 2: Restart the DaemonSet Rollout
Force the DaemonSet to redeploy its pods so they pick up the patched configuration:
kubectl rollout restart ds/domain-local-ds -n <tmc-namespace>
Step 3: Monitor the Rollout
Verify that the new pods initialize successfully and transition into a Running state:
kubectl rollout status ds/domain-local-ds -n <tmc-namespace>
kubectl get pods -n <tmc-namespace>
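
On clusters with many worker nodes, scanning the pod list by eye can be tedious. The following is a minimal sketch that counts pods still reporting Init:CrashLoopBackOff, assuming the default kubectl get pods column layout (NAME, READY, STATUS, ...); the function name count_crashlooping_init is hypothetical.

```shell
# Hypothetical helper (not part of TMC): count pods in the namespace whose
# STATUS column (third column of the default output) is Init:CrashLoopBackOff.
# Usage: count_crashlooping_init <tmc-namespace>
count_crashlooping_init() {
  kubectl get pods -n "$1" --no-headers \
    | awk '$3 == "Init:CrashLoopBackOff" { n++ } END { print n + 0 }'
}
```

A result of 0 after the rollout completes suggests that no pods in the namespace remain in the failing state.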