Pod fails to start with the error "container init caused \"setenv: invalid argument\"" in TKGI

Article ID: 298656


Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

A Pod is scheduled to a Kubernetes worker node but fails to start:
$ kubectl describe pod <POD_NAME>
...
  Warning  Failed     30m (x5 over 32m)      kubelet, bfabe4e5-a224-4989-bb2b-39b40732479f  Error: failed to start container "<CONTAINER_NAME>": Error response from daemon: OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:430: container init caused \"setenv: invalid argument\"": unknown
  Warning  BackOff    2m11s (x139 over 31m)  kubelet, bfabe4e5-a224-4989-bb2b-39b40732479f  Back-off restarting failed container
The kubectl describe output above shows the failure event; the same error also appears in the kubelet and Docker logs on the worker node. The event message includes the name of the failed container (<CONTAINER_NAME>).

Environment

Product Version: 1.9

Resolution

Kubernetes Secrets are created with base64-encoded values; see Managing Secret using Configuration File for more information. When a Pod consumes such a Secret through its environment variables, the container runtime must decode the base64-encoded values in order to set up the environment variables.

When the decoded value contains a byte with value zero (0x00), the underlying Go library rejects it during the setenv call, which produces the error above.
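To check quickly whether a Secret key decodes to data containing a 0x00 byte, you can dump the decoded bytes with od, as in the sketch below. This assumes GNU coreutils and jq on your workstation; mysecret and sessionKey are the example names used later in this article:
# prints a message if any decoded byte equals 0x00
$ kubectl get secret mysecret -o json | jq -r '.data.sessionKey' | base64 -d | od -An -tx1 | grep -q ' 00' && echo "decoded value contains 0x00"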

If the worker nodes use Docker as the container runtime (as in TKGI), you can use the following steps to confirm the issue.

1. Get the container ID:
$ kubectl describe pod <POD_NAME> | grep -i "container id"
    Container ID:  docker://d7699726e95484523d1eea5bb4b3419dbd0098d1865f11126f9e8b54c39f5fbd

2. SSH to the worker node on which the failed Pod is scheduled (kubectl get pod <POD_NAME> -o wide shows the node name).

3. Check the container config file under /<docker-datastore-path>/containers/<container-ID>/config.v2.json:
# cat /var/vcap/store/docker/docker/containers/d7699726e95484523d1eea5bb4b3419dbd0098d1865f11126f9e8b54c39f5fbd/config.v2.json | jq '.Config.Env'
[ 
   "SESSION_KEY=\u0010�(Q�\u0003h[z\u0014\u0007�uK\u0000��w�6�9�d~�\u0016w�(��",
   ......
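Rather than reading the environment list manually, you can filter for values that contain a literal NUL character. The jq filter below is a sketch and assumes jq is available on the worker node; <container-ID> is the ID retrieved in step 1:
# print only env entries whose value contains a 0x00 (\u0000) byte;
# jq escapes control characters, so the offending byte shows up as \u0000
$ jq '.Config.Env[] | select(contains("\u0000"))' /var/vcap/store/docker/docker/containers/<container-ID>/config.v2.json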

You can then identify the source Secret behind the environment variable (SESSION_KEY in the example above) and replace its value with a valid one.

1. In the env definition of the failing container, identify the problematic Secret:
$ kubectl get pod <POD_NAME> -o json | jq '.spec.containers[] | select(.name=="<CONTAINER_NAME>") | .env'
[
  {
    "name": "SESSION_KEY",
    "valueFrom": {
      "secretKeyRef": {
        "key": "sessionKey",
        "name": "mysecret"
      }
    }
  },
......

2. Retrieve the base64 value from the identified Secret:
$ kubectl get secret mysecret -o json | jq '.data.sessionKey'
"EMooUfYDaFt6FAfsdUsAoPl3ljb4OZFkftMWd9IontE="

3. Decode the value to verify that it contains a 0x00 byte, then correct it:
# xxd converts the value into hex format
$ echo -n "EMooUfYDaFt6FAfsdUsAoPl3ljb4OZFkftMWd9IontE=" | base64 -d | xxd
00000000: 10ca 2851 f603 685b 7a14 07ec 754b 00a0  ..(Q..h[z...uK..
00000010: f977 9636 f839 9164 7ed3 1677 d228 9ed1  .w.6.9.d~..w.(..

# replace the offending value with one that contains no 0x00 bytes, then
# restart the workload so its Pods pick up the corrected Secret
$ kubectl edit secret mysecret
$ kubectl rollout restart deployment <DEPLOYMENT>
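Alternatively, the Secret can be patched in one step. The sketch below generates a new random key with openssl rand -hex, whose printable hex output guarantees the decoded value contains no 0x00 bytes; the NEW_KEY variable and the use of openssl are assumptions for illustration, not part of the original procedure:
# generate a printable 32-character hex key and base64-encode it
# (GNU base64 -w0 disables line wrapping)
$ NEW_KEY=$(openssl rand -hex 16 | tr -d '\n' | base64 -w0)
$ kubectl patch secret mysecret -p "{\"data\":{\"sessionKey\":\"$NEW_KEY\"}}"
$ kubectl rollout restart deployment <DEPLOYMENT>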