On upgrade or install Dataplane service fails to start on a Bare Metal Edge

Article ID: 325112

Products

VMware NSX

Issue/Introduction

Symptoms:
  •  The Bare Metal Edge server has a single-NUMA-node (single-socket) configuration
  •  On install of NSX-T 2.4.x/2.5.x, the Dataplane service fails to start on the Bare Metal Edge
  •  On upgrade of a Bare Metal Edge from 2.3.x to 2.4.x/2.5.x, the upgrade fails because the Dataplane service fails to start
  •  The Dataplane service is stopped:
   >get service dataplane
     Service name:     dataplane
     Service state:    stopped
  •  On the Edge, get log-file syslog shows log entries similar to:
2020-03-07T12:28:04.330Z EDGE NSX 36311 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd.intel_rte" level="INFO"] EAL: Detected 1 NUMA nodes
2020-02-28T12:28:04.491723+00:00 EDGE NSX 26591 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd.intel_rte" level="WARN"] EAL: Could not find space for memseg. Please increase CONFIG_RTE_MAX_MEMSEG_PER_TYPE and/or CONFIG_RTE_MAX_MEM_PER_TYPE in configuration.
2020-02-28T12:28:04.493292+00:00 edi02pe300-002 NSX 18829 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd.intel_rte" level="WARN"] EAL: No free hugepages reported in hugepages-2048kB                                 
2020-02-28T12:28:04.492292+00:00 EDGE NSX 26591 FABRIC [nsx@6876 comp="nsx-edge" subcomp="datapathd.intel_rte" level="WARN"] EAL: Couldn't remap hugepage files into memseg lists
2020-02-28T12:28:04.493277+00:00 EDGE 7e1e76ca7f4c 2073 - -  EAL: FATAL: Cannot init memory 
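
The log signatures above can be checked for mechanically. The following is a minimal sketch, assuming the Edge syslog has been saved to a local text file; the function name and the file name in the usage example are hypothetical and not part of NSX-T.

```python
# Substrings taken from the EAL messages shown in the symptoms above.
EAL_MEMORY_FAILURE_MARKERS = (
    "EAL: Could not find space for memseg",
    "EAL: Couldn't remap hugepage files into memseg lists",
    "EAL: FATAL: Cannot init memory",
)


def find_eal_memory_failures(lines):
    """Return the log lines that contain any of the EAL memory-init failure markers."""
    matches = []
    for line in lines:
        if any(marker in line for marker in EAL_MEMORY_FAILURE_MARKERS):
            matches.append(line.rstrip("\n"))
    return matches


if __name__ == "__main__":
    # "edge-syslog.txt" is a hypothetical path to an exported copy of the Edge syslog.
    with open("edge-syslog.txt", encoding="utf-8", errors="replace") as handle:
        for hit in find_eal_memory_failures(handle):
            print(hit)
```

A match on the FATAL line together with "Service state: stopped" is consistent with the issue described in this article, though it does not rule out other causes.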


Environment

VMware NSX-T Data Center 2.x
VMware NSX-T Data Center

Cause

The hugepage memory reserved for DPDK is set to half of the total physical memory or 64 GB, whichever is lower.
Starting with NSX-T Data Center 2.4.0, DPDK supports a maximum of 32 GB of memory per NUMA socket.
If more than 32 GB of hugepage memory is reserved for DPDK, memory initialization fails and the Dataplane service cannot start.
This failure is observed on single-NUMA-node systems that have more than 64 GB of total system memory.
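
The arithmetic behind this can be sketched as follows. This is an illustration of the rule described above for a single-NUMA-node system only, not NSX-T code; the function and constant names are hypothetical.

```python
RESERVE_CAP_GB = 64         # upper bound on the default hugepage reservation
DPDK_MAX_PER_SOCKET_GB = 32  # DPDK limit per NUMA socket from NSX-T 2.4.0 onward


def default_hugepage_reserve_gb(total_memory_gb):
    """Default reservation: half of physical memory or 64 GB, whichever is lower."""
    return min(total_memory_gb / 2, RESERVE_CAP_GB)


def exceeds_dpdk_limit(total_memory_gb):
    """True if a single-NUMA-node system would reserve more than 32 GB for DPDK."""
    return default_hugepage_reserve_gb(total_memory_gb) > DPDK_MAX_PER_SOCKET_GB


if __name__ == "__main__":
    for total_gb in (48, 64, 96, 128):
        reserve = default_hugepage_reserve_gb(total_gb)
        print(f"{total_gb} GB total: {reserve:g} GB reserved, "
              f"exceeds limit: {exceeds_dpdk_limit(total_gb)}")
```

With exactly 64 GB of memory the reservation is 32 GB, which is at the limit but not above it, so only systems with more than 64 GB are affected.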

Resolution

This issue is resolved in VMware NSX-T Data Center 3.0, available at VMware Downloads.


Workaround:
To work around this issue, log in to the Edge CLI as the admin user and run:

set hugepage-reserve size 32768

Note: This triggers an immediate reboot of the system. After the reboot, the Dataplane service starts automatically.

If this issue is encountered as part of a failed Edge upgrade, an additional step is required to clear the upgrade-in-progress status:

****************************************************************************
Node Upgrade is in progress. Please do not make any changes, until
the upgrade operation is complete.
****************************************************************************

Once the Edge comes back online after the reboot, restart the Edge upgrade from the Upgrade Coordinator to complete the upgrade.