NSX Bare Metal Edge CLI stops responding and dataplane outage observed after uplink configuration is changed.

Article ID: 410782

Products

VMware NSX

Issue/Introduction

  • The dataplane service on an NSX Bare Metal Edge is impacted after the uplink configuration is changed.

  • Log entries similar to the following are seen on the NSX Bare Metal Edge:
    /var/log/syslog.log:
    ####-##-##T##:##:##.###Z BMEdge NSX 5859 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 1000 ms waiting for dp-learning3 to quiesce
    ####-##-##T##:##:##.###Z BMEdge NSX 5859 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 2000 ms waiting for dp-learning3 to quiesce
    ####-##-##T##:##:##.###Z BMEdge NSX 5859 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 4000 ms waiting for dp-learning3 to quiesce
    ####-##-##T##:##:##.###Z BMEdge NSX 5859 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 8000 ms waiting for dp-learning3 to quiesce
    ####-##-##T##:##:##.###Z BMEdge NSX 5859 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 16000 ms waiting for dp-learning3 to quiesce
    ####-##-##T##:##:##.###Z BMEdge NSX 5859 SYSTEM [nsx@6876 comp="nsx-edge" subcomp="datapathd" s2comp="ovs-rcu" tname="urcu2" level="WARN"] blocked 32000 ms waiting for dp-learning3 to quiesce
  • Edge CLI commands fail, and log entries similar to the following are seen on the NSX Bare Metal Edge:
    /var/log/nsx-cli/nsxcli.log
    ####-##-##T##:##:##.###Z 50985 cli.server.cli_command_service INFO {10000} CMD: get logical-router
    ####-##-##T##:##:##.###Z 50985 cli.utils.edge_utils ERROR Failed to invoke edge-appctl command: ####-##-##T##:##:##.###Z edge-appctl 51093 unixctl [WARN] failed to connect to /var/run/vmware/edge/dpd.ctl
    edge-appctl: cannot connect to "/var/run/vmware/edge/dpd.ctl" (Protocol error)
    ####-##-##T##:##:##.###Z 50985 cli.server.cli_command_service WARNING An unexpected error occurred: The dataplane service is in error state, has failed, is disabled, or Maintenance Mode is Enabled
    ####-##-##T##:##:##.###Z 50985 cli.audit INFO CMD: get logical-router (duration: 0.195s) (command: get logical-routers), Operation status: CMD_EXECUTED_WITH_ERROR_RESULT
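The two log signatures above can be checked for with a short script. The following is a minimal sketch, not a supported tool: the function name scan_lines and the regular expression are illustrative assumptions derived only from the sample entries in this article, and the log files to scan are passed as arguments.

```python
import re
import sys

# Matches the ovs-rcu warning from the syslog excerpt above.
# Assumption: the message layout is the same as in the samples in this article.
QUIESCE_RE = re.compile(
    r'subcomp="datapathd".*s2comp="ovs-rcu".*'
    r'blocked (\d+) ms waiting for (\S+) to quiesce'
)

# Substring taken from the nsxcli.log excerpt above.
DPD_CTL_ERROR = 'cannot connect to "/var/run/vmware/edge/dpd.ctl"'


def scan_lines(lines):
    """Return (max_blocked_ms, blocked_thread, cli_socket_errors) for log lines."""
    max_blocked_ms = 0
    blocked_thread = None
    cli_socket_errors = 0
    for line in lines:
        match = QUIESCE_RE.search(line)
        if match:
            blocked_ms = int(match.group(1))
            # Keep the longest wait seen and the thread being waited on.
            if blocked_ms > max_blocked_ms:
                max_blocked_ms = blocked_ms
                blocked_thread = match.group(2)
        if DPD_CTL_ERROR in line:
            cli_socket_errors += 1
    return max_blocked_ms, blocked_thread, cli_socket_errors


if __name__ == "__main__":
    # Usage: python3 scan_edge_logs.py <syslog file> <nsxcli.log file> ...
    all_lines = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            all_lines.extend(handle)
    blocked_ms, thread, errors = scan_lines(all_lines)
    print(f"longest quiesce wait: {blocked_ms} ms (thread: {thread})")
    print(f"dpd.ctl connection errors: {errors}")
```

A quiesce wait that keeps doubling together with dpd.ctl connection errors matches the symptoms described in this article; either signature alone is only an indication and should be confirmed against the other symptoms.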

Environment

VMware NSX 4.2.x (except 4.2.4), 9.0 and 9.0.1.

Cause

Dataplane threads enter a deadlock after the uplink configuration is changed on an NSX Bare Metal Edge.

Resolution

This issue is resolved in VMware NSX 4.2.4, VCF 9.0.2 and VCF 9.1, available at Broadcom downloads.

If you are having difficulty finding and downloading software, please review the Download Broadcom products and software KB.

Workaround

Restart the dataplane service by running the following command from the impacted NSX Bare Metal Edge CLI (admin mode):

restart service dataplane