<IPI Cluster Name> ovs-ctl[12336]: ovsdb-server is not running.
<IPI Cluster Name> systemd[1]: ovs-vswitchd.service: Scheduled restart job, restart counter is at 1.
░░ Subject: Automatic restarting of a unit has been scheduled
░░ Defined-By: systemd
.
.
░░ Automatic restarting of the unit ovsdb-server.service has been scheduled, as the result for
░░ the configured Restart= setting for the unit.
<IPI Cluster Name> systemd[1]: Stopped Open vSwitch Database Unit.
<IPI Cluster Name> NSX 12584 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-HOST-OVS-MONITOR" level="INFO"] .usr.local.bin.start_host_ovs_monitor Started the loop that checks OVS status every 120 seconds.
<IPI Cluster Name> NSX 12584 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-HOST-OVS-MONITOR" level="INFO"] .usr.local.bin.start_host_ovs_monitor OVS bridge 'br-int' does not exist
<IPI Cluster Name> NSX 12584 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-HOST-OVS-MONITOR" level="INFO"] .usr.local.bin.start_host_ovs_monitor Stopped OVS running on the host. nsx-node-agent/nsx-ovs will run OVS in the container.
<IPI Cluster Name> NSX 12584 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-HOST-OVS-MONITOR" level="INFO"] .usr.local.bin.start_host_ovs_monitor OVS bridge 'br-int' does not exist
<IPI Cluster Name> NSX 12584 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-HOST-OVS-MONITOR" level="INFO"] .usr.local.bin.start_host_ovs_monitor Stopped OVS running on the host. nsx-node-agent/nsx-ovs will run OVS in the container.
Pod/Container nsx-node-agent/nsx-ovs:
The following log is printed only once:
<IPI Cluster Name> NSX 1 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-OVS" level="INFO"] .usr.local.bin.start_ovs Started OVS in this container.
The following log is printed repeatedly:
<IPI Cluster Name> NSX 1 - [nsx@6876 comp="nsx-container-node" subcomp="NSX-OVS" level="ERROR"] .usr.local.bin.start_ovs Another process is currently managing OVS. Waiting for 3 seconds.
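To gauge whether a node is affected, it can help to count how often the repeated error appears in the nsx-ovs container log. The following is a minimal sketch: the function name `count_ovs_lock_errors` is hypothetical, and the pod name and namespace in the usage comment are placeholders that depend on your deployment.

```shell
#!/bin/sh
# Sketch only: count_ovs_lock_errors is a hypothetical helper, not part of NCP.
# It reads log lines on stdin and prints how many contain the repeated error
# emitted by the nsx-ovs container.
count_ovs_lock_errors() {
    # grep -c prints the count but exits non-zero when the count is 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -cF 'Another process is currently managing OVS' || true
}

# Example usage (pod name and namespace are placeholders):
#   oc logs <nsx-node-agent pod> -n <namespace> -c nsx-ovs | count_ovs_lock_errors
```

A count that keeps growing between two runs suggests the container is still waiting for the host OVS processes to be released.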
VMware NSX-T Data Center 3.x
VMware NSX 4.x
OpenShift 4.15 and above
OpenShift cluster nodes have the openvswitch service installed and activated by default. The OVS service runs two OVS processes: ovsdb-server and ovs-vswitchd. These host processes must be stopped because the nsx-node-agent pod has a container named nsx-ovs, which runs the same OVS processes inside the container. To do this, another pod, nsx-ncp-bootstrap, stops the OVS processes on the node through a container named nsx-host-ovs-monitor. Because the OVS service is still active on the node, it attempts to restart the OVS processes on the host.
In OpenShift 4.15 and above, OVS processes on the host are restarted aggressively. The nsx-ncp-bootstrap pod stops the OVS processes and nsx-ovs attempts to start them, while the node restarts the OVS processes at almost the same time. This behaviour can result in a deadlock state where the OVS processes cannot run either on the node or in the nsx-ovs container.
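The restart behaviour described above is visible in the host journal through the "restart counter is at N" messages that systemd logs. The following is a minimal sketch that extracts the highest counter value from such lines; the function name `max_restart_counter` is hypothetical, and the unit names in the usage comment are assumed from the log excerpts in this article.

```shell
#!/bin/sh
# Sketch only: max_restart_counter is a hypothetical helper.
# It reads journal lines on stdin and prints the largest value seen in
# systemd's "restart counter is at N." messages (prints nothing if none match).
max_restart_counter() {
    sed -n 's/.*restart counter is at \([0-9][0-9]*\)\..*/\1/p' | sort -n | tail -n 1
}

# Example usage on the node (unit names assumed from the logs above):
#   journalctl -u ovs-vswitchd -u ovsdb-server --no-pager | max_restart_counter
```

A counter that climbs quickly while nsx-host-ovs-monitor keeps reporting that it stopped OVS on the host is consistent with the contention described here.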
This issue is resolved in VMware Cloud Foundation 9.0 and above along with NCP 4.2.3, available at Broadcom downloads.
If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.
Workaround
Reboot the impacted node and stop the openvswitch service with the following commands:
NB: It is recommended to stop the OVS service on all cluster nodes, even if they are not currently impacted.
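As a minimal sketch of the stop step, assuming the systemd unit is named `openvswitch` (as the service name in this article suggests) and that the commands run as root on the node after the reboot. This is an illustration, not the vendor's exact command list; the `run` helper and the `DRY_RUN` variable are hypothetical conveniences for previewing the commands.

```shell
#!/bin/sh
# Sketch only: the unit name "openvswitch" is taken from the service name in
# this article; verify it on your nodes before use.
# run and DRY_RUN are hypothetical helpers: DRY_RUN=1 prints instead of executing.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stop the host OVS service, then report the resulting unit state.
stop_host_ovs() {
    run systemctl stop openvswitch
    # is-active exits non-zero for an inactive unit, which is the desired
    # state here, so the exit status is deliberately ignored.
    run systemctl is-active openvswitch || true
}
```

Calling `stop_host_ovs` with `DRY_RUN=1` set prints the two commands without running them, which is a safe way to review the steps before applying them to each node.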