Upgrading OpenShift nodes causes NSX agents on VMware NSX Container Plugin (NCP) to enter the CrashLoopBackOff state

Article ID: 322575

Products

VMware NSX Networking

Issue/Introduction

Symptoms:
  • You were running an OpenShift version earlier than 4.7.21.
  • You have upgraded to OpenShift 4.7.21 or later.
  • The NSX node agent pods enter the CrashLoopBackOff state.
  • The logs of the NSX node agent show messages like the following:
# oc logs --all-containers nsx-node-agent-XXXXX
...
2021-11-23T10:43:28.738Z ocp-worker-prd02.corp.local NSX 18 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_kube_proxy" level="WARNING"] nsx_ujo.nsx_kube_proxy.proxy Unable to obtain Linux kernel version or OVS version, skipping the check for version requirements: Could not retrieve schema from unix:/var/run/openvswitch/db.sock
2021-11-23T10:43:28.743Z ocp-worker-prd02.corp.local NSX 26 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_kube_proxy" level="ERROR"] ovsdbapp.backend.ovs_idl.idlutils Unable to open stream to unix:/var/run/openvswitch/db.sock to retrieve schema: No such file or directory
2021-11-23T10:43:28.744Z ocp-worker-prd02.corp.local NSX 18 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_kube_proxy" level="ERROR" errorCode="NCP02001"] nsx_ujo.nsx_kube_proxy.proxy Failed to get attribute for interface ens192: Could not retrieve schema from unix:/var/run/openvswitch/db.sock
2021-11-23T10:43:28.744Z ocp-worker-prd02.corp.local NSX 18 - [nsx@6876 comp="nsx-container-node" subcomp="nsx_kube_proxy" level="ERROR" errorCode="NCP02001"] nsx_ujo.nsx_kube_proxy.proxy Failed to get gateway port: Could not retrieve schema from unix:/var/run/openvswitch/db.sock
...
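
As a rough aid for spotting this symptom, the shell sketch below checks log text for the db.sock error string shown in the excerpt above. The function name has_ovs_db_sock_error is hypothetical and is not part of NCP or oc; the matched string is copied verbatim from the log lines.

```shell
#!/bin/sh
# Sketch: read NSX node agent log text on stdin and succeed (exit 0) if it
# contains the OVS db.sock failure seen in the excerpt above.
# The function name is hypothetical; the pattern is copied from the log.
has_ovs_db_sock_error() {
    # -F: match a fixed string, -q: print nothing, only set the exit status
    grep -q -F 'Could not retrieve schema from unix:/var/run/openvswitch/db.sock'
}

# Example (the pod name is a placeholder, as in the article):
#   oc logs --all-containers nsx-node-agent-XXXXX | has_ovs_db_sock_error \
#       && echo "OVS db.sock error found in node agent logs"
```

A match only indicates that the same error string is present; it does not by itself confirm the cause described below.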


Environment

VMware NSX-T Data Center 3.x
VMware NSX-T Data Center

Cause

In OpenShift 4.7.21, the underlying RHEL version changed from 8.3 to 8.4.
This introduced changes in the enforced SELinux security policy.

These security policies prevent NetworkManager from writing to the OVS db.sock (/var/run/openvswitch/db.sock in the logs above).
Because the change arrived in 4.7.21, OpenShift 4.6 versions are not impacted.

Resolution

This is a known issue affecting the NSX Container Plugin (NCP) for NSX-T Data Center.

Workaround:
Roll back OpenShift to the previous version.
If you then need to upgrade, choose a version earlier than 4.7.21.
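
Before starting an upgrade, a small guard can confirm that the intended target version is below 4.7.21. The following is a minimal sketch, assuming GNU sort with the -V (version sort) option is available; the function name is_below_4_7_21 is hypothetical, and pre-release version strings are not handled.

```shell
#!/bin/sh
# Sketch: succeed (exit 0) only if the given OpenShift version sorts strictly
# before 4.7.21. Assumes GNU sort with -V (version sort).
# The function name is hypothetical.
is_below_4_7_21() {
    version="$1"
    threshold="4.7.21"
    # Not equal to the threshold, and sorts first when compared with it
    [ "$version" != "$threshold" ] &&
        [ "$(printf '%s\n%s\n' "$version" "$threshold" | sort -V | head -n 1)" = "$version" ]
}

# Example (the target version is supplied by the operator):
#   is_below_4_7_21 "4.7.19" || echo "Target version is 4.7.21 or later"
```

Version sort is used instead of a plain string comparison so that, for example, 4.7.9 is treated as earlier than 4.7.21.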

Alternatively, you can use NCP version 3.2.0.1. This has compatibility implications; review them before choosing this option.