Error "cni config load failed: no network config found in /etc/cni/net.d" - containerd fails to start on worker nodes

Article ID: 428166

Products

VMware vSphere Kubernetes Service

Issue/Introduction

When deploying or upgrading a Tanzu Kubernetes Cluster (TKC), worker nodes fail to reach the Ready state. The containerd service fails to start on the worker nodes, reporting that it cannot load the CNI configuration.

The following error appears in the containerd logs: error="cni config load failed: no network config found in /etc/cni/net.d: cni plugin not initialized: failed to load cni config"

Cause

This issue occurs when restricted labels in the kubernetes.io or k8s.io namespaces are applied to worker nodes via the nodePoolLabels variable in the Cluster configuration.

Specifically, setting the label node-role.kubernetes.io/worker to "true" via nodePoolLabels causes the Kubelet service to fail during startup. The Kubelet's built-in security validation prevents manual application of labels within these namespaces unless they appear on a specific allowlist.

When Kubelet fails to start, the CNI plugin (e.g., Antrea or Calico) cannot initialize. Consequently, containerd fails because the expected network configuration files are missing from /etc/cni/net.d.

Resolution

To resolve this issue, you must remove the restricted nodePoolLabels override from the cluster manifest.

  1. Edit the cluster YAML manifest.

  2. Locate the variables section under the worker machineDeployments.

  3. Find the override for nodePoolLabels.

  4. Remove the entry containing the restricted label (e.g., node-role.kubernetes.io/worker).

    Incorrect Configuration:

    YAML
     
    variables:
      overrides:
      - name: nodePoolLabels
        value:
        - key: node-role.kubernetes.io/worker
          value: "true"
    

    Corrected Configuration:

    YAML
     
    variables:
      overrides: []
    

    (Note: If other valid custom labels exist, remove only the restricted label, as illustrated in the sketch below.)
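
    For illustration, a configuration that keeps a hypothetical custom label while dropping the restricted one might look like this (the environment key is an assumed example, not part of the original manifest):

    YAML
     
    variables:
      overrides:
      - name: nodePoolLabels
        value:
        - key: environment
          value: production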

  5. Apply the updated configuration to the cluster.

  6. Verify that the worker nodes reconcile and that the containerd service starts successfully (for example, the nodes should report a Ready status in kubectl get nodes).

Additional Information

For the list of labels that the Kubelet permits via --node-labels, refer to the official Kubernetes documentation on Kubelet label restrictions. Per upstream Kubernetes, self-set labels in the kubernetes.io or k8s.io namespaces are limited to a small allowlist (for example, kubernetes.io/hostname) plus keys under the kubelet.kubernetes.io/ and node.kubernetes.io/ prefixes. The label node-role.kubernetes.io/worker is reserved and is typically managed by the system or by specific lifecycle managers rather than set manually.
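
If a role-style worker label is still required, one possible approach (a sketch, assuming the allowed prefixes above apply to your Kubernetes version) is to place the label under a permitted prefix rather than the reserved node-role.kubernetes.io namespace:

YAML
 
variables:
  overrides:
  - name: nodePoolLabels
    value:
    # node.kubernetes.io/ is an assumed-permitted prefix per the Kubelet
    # allowlist; verify against your cluster's Kubernetes version.
    - key: node.kubernetes.io/worker
      value: "true"

Alternatively, a key under your own organization's domain (for example, example.com/role) avoids the restricted namespaces entirely.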