Adding a new host fails when scaling out a vSAN cluster that has NSX-T and vLCM enabled


Article ID: 327059


Products

VMware NSX
VMware vCenter Server
VMware vSAN
VMware vSphere ESXi

Issue/Introduction

  • Scaling out a vSAN cluster with NSXT-CVDS and vLCM enabled by adding a new host directly to the cluster fails with the errors below.
  • The "Apply NSX Solution" task fails with an error similar to:
A general system error occurred: Health Check for 'Cluster *' failed
  • In NSX Manager, the host appears as a failed node with an error similar to:
Failed to install software on host. Solution apply failed on host: '#.#.#.#' vSAN health test 'vSAN cluster partition' reported an issue. Check the vSAN health.vSAN health test 'All hosts have a vSAN vmknic configured' reported an issue. Check the vSAN health.Solution apply failed on host: 'x.x.x.x

Note: The NSX installation is expected to complete automatically.
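
When triaging this failure, it can help to pull the names of the failing vSAN health tests out of the NSX Manager error text. The following is a minimal sketch, not an official tool: the function name and the regular expression are assumptions based only on the message format quoted above.

```python
import re

# Assumed pattern, derived from the sample error text in this article.
HEALTH_TEST_RE = re.compile(r"vSAN health test '([^']+)' reported an issue")


def failed_vsan_health_tests(message: str) -> list[str]:
    """Return the distinct vSAN health test names in the message, in order."""
    seen: list[str] = []
    for name in HEALTH_TEST_RE.findall(message):
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    sample = (
        "Failed to install software on host. Solution apply failed on host: "
        "'#.#.#.#' vSAN health test 'vSAN cluster partition' reported an issue. "
        "Check the vSAN health.vSAN health test 'All hosts have a vSAN vmknic "
        "configured' reported an issue. Check the vSAN health."
    )
    for test_name in failed_vsan_health_tests(sample):
        print(test_name)
```

If 'All hosts have a vSAN vmknic configured' is among the returned names, the failure matches the cause described in this article.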

Environment

VMware vSphere ESXi 7.0.x
VMware vCenter Server 7.0.x
VMware NSX-T Data Center 3.x
VMware vSAN 7.0.x

Cause

The error occurs because vSAN health checks related to vmknic connectivity fail. Errors similar to the following appear under Skyline Health on the cluster's Monitor tab in the vSphere Client:

All hosts have a vSAN vmknic configured

NSX-T expects a healthy, vSAN-enabled vmknic to be configured on a new node when it is added.
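
One way to confirm whether a host already has a vSAN-tagged vmknic is to run `esxcli vsan network list` on the host and inspect the result. The sketch below parses that output from a string. The field labels `VmkNic Name` and `Traffic Type`, the sample text, and the function name are assumptions about the output format and should be checked against the actual host output.

```python
def vsan_vmknics(esxcli_output: str) -> list[str]:
    """Return VMkernel interface names whose traffic type is 'vsan'."""
    nics: list[str] = []
    current = None
    for line in esxcli_output.splitlines():
        # Split on the first colon only, so values containing colons survive.
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "VmkNic Name":
            current = value
        elif key == "Traffic Type" and value.lower() == "vsan" and current:
            nics.append(current)
            current = None
    return nics


if __name__ == "__main__":
    # Illustrative text only; not captured from a real host.
    sample = """Interface
   VmkNic Name: vmk2
   IP Protocol: IP
   Traffic Type: vsan
"""
    found = vsan_vmknics(sample)
    print(found if found else "No vSAN vmknic configured on this host")
```

An empty result for the new host is consistent with the 'All hosts have a vSAN vmknic configured' health check failure.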

Resolution

Steps to work around the issue:

  1. Before scaling out the vSAN cluster with NSXT-CVDS and vLCM enabled, verify that the cluster has no health check errors.
  2. Add the new host to the Datacenter, outside the cluster, and put it in maintenance mode.
  3. Add the new host to the VDS.
  4. Install and configure NSX on the new host.
  5. Create the vSAN vmknic on the host.
  6. Take the host out of maintenance mode.
  7. Move the host into the cluster.


Note: In step 7, the "Apply NSX Solution" task is still triggered, but it succeeds once the preceding workaround steps have been applied.
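
Because the order of the workaround steps matters, a small checklist helper can keep a runbook honest. The following is a hypothetical sketch: the step labels paraphrase the list above, and nothing in it talks to vCenter or NSX.

```python
from typing import Optional

# Step labels paraphrase the workaround steps; they are not API names.
WORKAROUND_STEPS = [
    "verify cluster health",
    "add host to datacenter in maintenance mode",
    "add host to VDS",
    "install and configure NSX",
    "create vSAN vmknic",
    "exit maintenance mode",
    "move host into cluster",
]


def next_step(completed: set[str]) -> Optional[str]:
    """Return the first step not yet completed, or None when all are done."""
    for step in WORKAROUND_STEPS:
        if step not in completed:
            return step
    return None


if __name__ == "__main__":
    done = {"verify cluster health", "add host to datacenter in maintenance mode"}
    print(next_step(done))
```

The helper always returns the earliest unfinished step, so the host is not moved into the cluster before NSX and the vSAN vmknic are in place.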