Ingress controller statefulset fails to start after resize of worker nodes with permission denied

Article ID: 298618

Updated On:

Products

VMware Tanzu Kubernetes Grid Integrated Edition

Issue/Introduction

In the following example, an nginx ingress controller is used with specific permissions:
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ingress-security-context-demo
  namespace: ingress-nginx
spec:
  securityContext:
    fsGroup: 33
  serviceAccountName: nginx-ingress-serviceaccount
  containers:
  - name: nginx-ingress-controller
    image: nginx-ingress-controller:0.26.1
    args:
      - /nginx-ingress-controller
      - --ingress-class=dos-nginx-controller-3
      - --annotations-prefix=nginx.ingress.kubernetes.io
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        drop:
         - ALL
        add:
         - NET_BIND_SERVICE
      runAsUser: 33
    ports:
      - name: http
        containerPort: 80
        protocol: TCP
      - name: https
        containerPort: 443
        protocol: TCP

Here is a full example: https://raw.githubusercontent.com/kubernetes/ingress-nginx/nginx-0.26.1/deploy/static/mandatory.yaml

In the securityContext of the container above, all Linux capabilities are dropped and only `NET_BIND_SERVICE` (`CAP_NET_BIND_SERVICE`) is added back. At the same time, the template runs the container as user 33 (www-data) via runAsUser: 33. This combination is what allows the non-root nginx process to bind to ports 80 and 443.

However, after a persistent disk resize and the subsequent cluster upgrade, checking the capabilities inside the container shows that the permitted and effective capabilities have been removed from both the nginx binary and the running process.
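The capabilities can be inspected from inside the container with standard Linux tools. The binary path below is an assumption based on the nginx-ingress-controller image layout and may differ between image versions:

# File capabilities on the nginx binary (path is illustrative for this image)
getcap /usr/local/nginx/sbin/nginx
# On a healthy image this returns: /usr/local/nginx/sbin/nginx = cap_net_bind_service+ep
# After the disk migration the output is empty, i.e. the file capability is gone.

# Permitted/effective capabilities of the running nginx master process
grep Cap /proc/$(pgrep -o nginx)/status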

The container fails to start with:
Error: exit status 1
nginx: the configuration file /tmp/nginx-cfg488150005 syntax is ok
2019/07/31 11:48:34 [emerg] 81#81: bind() to 0.0.0.0:80 failed (13: Permission denied)


Environment

Product Version: 1.7

Resolution

This issue was discovered in https://github.com/helm/charts/issues/15994#issuecomment-545986998

When the persistent disk size is increased in the plan, the change is applied to the worker nodes during the upgrade as follows.

BOSH creates a new persistent disk and attaches it to the worker, then completes the migration process, in which it runs a number of scripts to move all persistent data from the old disk to the new one.

It appears that specific settings applied by the Docker image are omitted during this migration, which leads to the loss of permissions. The nginx Docker image performs operations in its image layers to grant the nginx binary the required permissions on the OS. Because those operations are part of a Docker layer, they do not run again once the layers are cached. After the data is migrated from the old disk to the new one, the cached Docker images are still present, but the operations those layers performed are not re-applied: https://github.com/kubernetes/ingress-nginx/blob/master/rootfs/Dockerfile#L46-L50
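The referenced Dockerfile lines grant the file capability with setcap at image build time. The following is a simplified sketch of that layer; the exact binary paths come from the upstream Dockerfile and may differ between versions:

# Sketch of the image-build step that grants the capability (paths illustrative)
setcap    cap_net_bind_service=+ep /usr/local/nginx/sbin/nginx
setcap -v cap_net_bind_service=+ep /usr/local/nginx/sbin/nginx

File capabilities like this are stored as extended attributes on the layer's files, which would explain why copying the layer data without preserving them leaves nginx unable to bind to port 80 as user 33.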

Steps to work around the problem:

1. Scale the StatefulSet down to 0 replicas. The same can be done for a Deployment as well.
2. Find the cached image on each worker node with `docker image list`.
3. Remove the image from each worker with `docker rmi nginx-ingress-controller:0.26.1`.
4. Make sure the image is removed from all workers.
5. Scale the StatefulSet back up to the desired number of replicas (see the command sketch below).

Docker will pull the image again and re-run the layers that set up the required permissions.
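A sketch of the workaround commands, assuming the StatefulSet is named nginx-ingress-controller in the ingress-nginx namespace (the names and replica count are illustrative; adjust them to your environment):

# 1. Scale the StatefulSet (or Deployment) down to 0 replicas
kubectl -n ingress-nginx scale statefulset nginx-ingress-controller --replicas=0

# 2-4. On each worker node, find and remove the cached controller image
docker image list | grep nginx-ingress-controller
docker rmi nginx-ingress-controller:0.26.1

# 5. Scale back up to the desired replica count (for example, 2)
kubectl -n ingress-nginx scale statefulset nginx-ingress-controller --replicas=2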