Pod creation fails with error "Failed to create pod sandbox: rpc error: code , networkPlugin cni failed to set up pod"

Article ID: 345640

Products

VMware

Issue/Introduction

Symptoms:
  • Pod creation in the TKGi cluster fails with the error:
    Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "a4175113f32a5c4733ac87676f0bedc878a0333ac8c83b329af39c9ae8b91a50" network for pod "example-deployment-rijen-9c8fbb69c-tmrr5": networkPlugin cni failed to set up pod "example-deployment-rijen-9c8fbb69c-tmrr5_default" network: netplugin failed with no error message

  • In the /var/vcap/sys/log/ncp/ncp.stdout.log file, you see entries similar to:
    No logical port found for node 4e17923a-ccb3-4fb3-b893-fbc841e472ca and Failed to get node vif or TN ID for node 4e17923a-ccb3-4fb3-b893-fbc841e472ca in cluster pks-2a19cbb1-fdcd-4d7c-a006-859c05ec5b90 

  • When you run the following command, the Hyperbus status of multiple worker nodes shows Unhealthy:
    bosh -d service-instance_<cluster_UUID> ssh worker -c "sudo /var/vcap/jobs/nsx-node-agent/bin/nsxcli -c get node-agent-hyperbus status" | grep -i unhealthy

  • When you check the logical ports of the problematic nodes in NSX-T Manager, you do not see the bosh tag (scope: bosh/id).
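To see which nodes are affected, the node IDs can be pulled out of the NCP log. The following is a minimal sketch, assuming the log lines contain the phrase "No logical port found for node <UUID>" exactly as in the sample entry above. The function name nodes_without_logical_port is hypothetical and not part of any VMware tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the unique node IDs that NCP reports as having
# no logical port. Reads the named log files, or stdin when none are given.
# Assumption: the message format matches the sample entry in this article.
nodes_without_logical_port() {
  grep -oE 'No logical port found for node [0-9a-fA-F-]{36}' "$@" \
    | awk '{print $NF}' \
    | sort -u
}

# Usage (log path taken from this article):
#   nodes_without_logical_port /var/vcap/sys/log/ncp/ncp.stdout.log
```

Each ID printed is a node whose logical port should be checked for the missing bosh/id tag.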



Environment

VMware Tanzu Kubernetes Grid Integrated Edition 1.x

Resolution

To resolve this issue, manually add the bosh ID tag to the logical port of the worker node.

  1. Navigate to the Logical Switch Ports tab in the NSX Manager UI and search for the VM CID of the problematic node.

  2. Manually add a tag with the key "bosh/id" and the value obtained from the following steps:

    1. Run bosh vms to get the original bosh ID. The UUID after worker/ is the bosh ID.

    2. Run echo -n "${bosh_id}" | shasum -a 1 to compute the SHA-1 hash of the bosh ID. This hash is the tag value.

Alternatively, you can try recreating the node, which creates a new VM connected to a new logical port with all the tags.
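This article does not say how to recreate the node. One possible way, given here as an assumption, is the BOSH CLI recreate command run against the cluster deployment. The sketch below only prints the command so it can be reviewed before running; the function name print_recreate_cmd and both arguments are hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (do not run) a BOSH recreate command for one
# worker instance. Assumption: recreating through BOSH is acceptable in your
# environment; the article itself does not prescribe this method.
#   $1 = cluster UUID (the part after "service-instance_")
#   $2 = bosh ID of the worker (the UUID after "worker/" in `bosh vms`)
print_recreate_cmd() {
  local cluster_uuid="$1"
  local bosh_id="$2"
  printf 'bosh -d service-instance_%s recreate worker/%s\n' "$cluster_uuid" "$bosh_id"
}

# Usage with placeholders:
#   print_recreate_cmd '<cluster_UUID>' '<bosh_id>'
```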