vSAN health alert : Hyperconverged cluster configuration compliance - vDS compliance check for hyperconverged cluster configuration

Article ID: 430097

Products

VMware vSAN

Issue/Introduction

Symptoms:

  • The IP addresses of the vSAN nodes were changed. Afterwards, the old port group associated with the previous IP addresses was deleted, and a new port group with a different name was created.
  • After the ESXi hosts were manually re-added to the vSAN cluster, the Quickstart workflow state was affected and now appears broken in the UI.
  • vSAN health reports the alert "Hyperconverged cluster configuration compliance - vDS compliance check for hyperconverged cluster configuration".
  • The alert details show an error stating that a distributed port group is missing on the distributed switch.

Environment

VMware vSAN 8.x

Cause

This issue occurs when port groups that were initially configured via vSAN Quickstart are deleted or recreated.

  • From the log /var/log/vmware/vpxd/vpxd.log:

    ####-##-##T07:25:14.189Z error vpxd[2141808] [Originator@6876 sub=MoCluster opID=21989934-eb] Managed object not found for DVPG {vim.dvs.DistributedVirtualPortgroup:dvportgroup-##83}
    ####-##-##T07:25:14.189Z error vpxd[2141808] [Originator@6876 sub=MoCluster opID=21989934-eb] Managed object not found for DVPG {vim.dvs.DistributedVirtualPortgroup:dvportgroup-##82}

    These vpxd.log entries indicate that the distributed port groups dvportgroup-##82 and dvportgroup-##83 are either missing or unreachable.
  • In the vCenter database, the Quickstart network settings table vpx_hci_nw_settings contains the following entries:

    select * from vpx_hci_nw_settings;

    dvpg_id    service_type    dvs_id    cluster_id
    ##83       vsan            ####      ####
    ##82       vmotion         ####      ####

  • The dvpg_id values ##83 and ##82 belong to the original vSAN and vMotion port groups that were created using Quickstart.

  • Because these two port groups were deleted but their entries remain in the table, the Quickstart service fails to find them and appears broken in the UI.
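
The port group IDs that vpxd reports as missing can also be collected from the log in one pass. The following is a minimal sketch and not part of the original procedure: the function name list_missing_dvpg is hypothetical, and it assumes that the error lines have the same format as the vpxd.log excerpt above and that grep on the appliance supports the -o option (GNU grep does).

```shell
# Hypothetical helper (not part of the KB procedure): print each unique
# dvportgroup identifier that vpxd logged as "Managed object not found".
# Reads the log file(s) passed as arguments, or stdin when none are given.
list_missing_dvpg() {
    # First grep keeps only the relevant error lines (-h hides file names).
    # Second grep extracts just the identifier, stopping at "}" or whitespace.
    grep -h 'Managed object not found for DVPG' "$@" \
        | grep -o 'dvportgroup-[^}[:space:]]*' \
        | sort -u
}

# Example invocation on the vCenter Server appliance:
#   list_missing_dvpg /var/log/vmware/vpxd/vpxd.log
```

The identifiers it prints can then be compared with the dvpg_id column of vpx_hci_nw_settings.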

Resolution

Follow the steps below to resolve the issue:

  1. Take a snapshot of the vCenter Server or back up the vCenter database.

  2. Open an SSH session to the vCenter Server.
  3. Stop the vpxd service:

    service-control --stop vmware-vpxd

  4. Log in to the vCenter database:

    /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres

  5. To list the distributed port group information for clusters that were configured using Quickstart, run the following query:

    select * from vpx_hci_nw_settings;

    Ensure that the dvpg_id values in this table match the dvportgroup IDs reported in vpxd.log.

  6. Remove the stale entries with the delete command below. The dvpg_id is the ID seen in vpxd.log, and the cluster_id is the MOID of the impacted cluster:

    DELETE FROM VPX_HCI_NW_SETTINGS WHERE DVPG_ID = <value> AND CLUSTER_ID = <value>;

  7. Start the vpxd service:

    service-control --start vmware-vpxd

  8. Relaunch the vCenter session and log in to the vCenter UI again.

  9. Validate Quickstart under vSAN cluster -> Configure -> Configuration -> Quickstart.
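
Because the delete in step 6 cannot be undone without the backup from step 1, it may help to preview the affected row inside a transaction before committing. The sketch below is an illustration only, not part of the original procedure: the function name print_cleanup_sql is hypothetical, and it assumes that dvpg_id and cluster_id are plain integers exactly as displayed by the query in step 5. It only prints SQL text and does not connect to the database.

```shell
# Hypothetical helper (not part of the KB procedure): print a reviewable SQL
# script for one stale row. Nothing is executed against the database here.
# Assumption: both IDs are plain integers as shown by the step 5 query.
print_cleanup_sql() {
    if [ "$#" -ne 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: print_cleanup_sql <dvpg_id> <cluster_id>" >&2
        return 1
    fi
    # Reject anything that is not purely digits, so no stray text reaches SQL.
    case "$1$2" in
        *[!0-9]*) echo "error: dvpg_id and cluster_id must be numeric" >&2
                  return 1 ;;
    esac
    cat <<EOF
BEGIN;
-- Preview the row that will be removed.
SELECT * FROM vpx_hci_nw_settings WHERE dvpg_id = $1 AND cluster_id = $2;
DELETE FROM vpx_hci_nw_settings WHERE dvpg_id = $1 AND cluster_id = $2;
-- Review the output, then enter COMMIT; to keep the change or ROLLBACK; to undo it.
EOF
}
```

The printed statements can be pasted into the psql session opened in step 4, once per stale dvpg_id. The transaction stays open until COMMIT; or ROLLBACK; is entered, which leaves room to confirm that only the intended row was matched.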