Expansion of Management vSphere Cluster is failing on VLAN based AVN

Article ID: 312189

Updated On:

Products

VMware Cloud Foundation

Issue/Introduction


Symptoms:

  • Any expansion operation (Add Host, vSAN Stretch Cluster, or Expand vSAN Stretch Cluster) executed in the Management Workload Domain fails with the following error: "Attach VLAN Transport Zone to Host Transport Nodes".
     
  • The environment's network topology is a VLAN-based AVN, configured during a VCF 3.x or 4.x migration with the MR1 and MR2 releases and/or ingested as described in the article "VMware Cloud Foundation 4.x Management Domain Implementation with VLAN-Backed Networks".

  • The following snippets are observed in /var/log/vmware/vcf/domainmanager/domainmanager.log:

    YYYY-MM-DDT<Time> ERROR [vcf_dm,9d0a4a46cfca44fa,0d6f] [c.v.v.c.f.p.n.a.NsxtAddHostVlanHeader,dm-exec-10] Error occurred while generating input for add hosts in NSX-T environment
    java.lang.NullPointerException: null

    YYYY-MM-DDT<Time> ERROR [vcf_dm,9d0a4a46cfca44fa,0d6f] [c.v.e.s.o.model.error.ErrorFactory,dm-exec-10] [GL0M7C] ERROR_OCCURRED_WHILE_GENERATING_INPUT Error occurred while generating input for add hosts in nsxt environment
    com.vmware.evo.sddc.orchestrator.exceptions.OrchTaskException: Error occurred while generating input for add hosts in the NSX-T environment
    Caused by: java.lang.NullPointerException: null
        at com.vmware.vcf.common.fsm.plugins.nsxt.action.NsxtAddHostVlanHeader.execute(NsxtAddHostVlanHeader.java:218)
        ... 18 common frames omitted

    YYYY-MM-DDT<Time> DEBUG [vcf_dm,9d0a4a46cfca44fa,e5d3] [c.v.e.s.c.s.a.l.LockingServiceAdapterImpl,dm-exec-12] Execution name ADD_HOST_WORKFLOW, execution ID 99ad7352-####-####-####-########06e, resource type DEPLOYMENT, resource ID null.
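
To check whether a given domainmanager.log contains these signatures, a small script can scan the file. The following is a minimal Python sketch, not a VMware-provided tool: the function names find_signature_lines and scan_log are hypothetical, and the two search strings are taken from the log excerpts in this article.

```python
# Signature strings copied from the log excerpts in this article.
SIGNATURES = (
    "NsxtAddHostVlanHeader",
    "ERROR_OCCURRED_WHILE_GENERATING_INPUT",
)


def find_signature_lines(lines):
    """Return (line_number, text) pairs for lines containing a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(signature in line for signature in SIGNATURES):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_log(path="/var/log/vmware/vcf/domainmanager/domainmanager.log"):
    """Scan the log file at `path` (default: the location named in this article)."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_signature_lines(handle)
```

Calling scan_log() on the SDDC Manager VM returns an empty list when neither signature is present. A match only indicates that the log resembles this issue; the database check in the Resolution section is still needed to confirm it.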



Environment

VMware Cloud Foundation 4.x
VMware Cloud Foundation 3.x

Cause

The association between the Management vSphere Cluster and the VLAN Transport Zone in the SDDC Manager Database is not populated.

Resolution

This issue is resolved in VCF 4.4. 

The following workaround can be applied to enable expansion operations on the Management vSphere Cluster in the Management Workload domain if VCF cannot be upgraded to the fixed version.

  1. Log in to the NSX-T instance for the Management Workload Domain as the admin user.

  2. Navigate to System > Fabric > Transport Zones.

  3. Select the VLAN Transport Zone created by VCF to which the AVN network segment is connected, and make a note of its ID (referred to as Transport_Zone_ID in step 9).

  4. SSH to the SDDC Manager VM as user vcf.

  5. Run the following command to access the SDDC Manager database.

    psql -h localhost -U postgres -d platform


  6. Run the following command to verify the current entries in the entity_and_transport_zone table.

    select * from entity_and_transport_zone;

  7. In the output, the row whose entity_id corresponds to the vSphere cluster should have no value (<Null>) in vlan_transport_zone_ids. Copy that entity_id to a notepad:

    platform# select * from entity_and_transport_zone;

    id                                  | creation_time | modification_time | entity_id                           | vlan_transport_zone_ids | overlay_transport_zone_ids
    40e22d0d-####-####-####-########f21 | 1660897720387 | 1660897720387     | 3636b7cb-####-####-####-########dd1 | <Null>                  | "d1b4de1d-####-####-####-########381"



  8. Verify that the entity_id with the missing VLAN transport zone ID belongs to the Management vSphere cluster by running the following command:

    select id,name from cluster where id='<entity_id>';

  9. Update the database with the missing VLAN transport zone ID:

    update entity_and_transport_zone set vlan_transport_zone_ids ='["<Transport_Zone_ID from step 3>"]' where entity_id='<entity_id from step 7>';
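
The nested quoting in the step 9 statement (a JSON array inside a SQL string literal) is easy to get wrong when editing by hand. The following is a minimal Python sketch that validates both IDs and prints the statement to paste into psql. It is not part of SDDC Manager: build_update_statement is a hypothetical helper, the UUIDs in the usage example are made-up placeholders, and the sketch assumes both IDs are standard UUIDs, which it normalizes to lowercase hyphenated form.

```python
import json
import uuid


def build_update_statement(transport_zone_id, entity_id):
    """Build the step 9 UPDATE statement from the two IDs collected earlier.

    uuid.UUID raises ValueError for anything that is not a well-formed UUID,
    such as a placeholder that was left unreplaced.
    """
    tz = str(uuid.UUID(transport_zone_id.strip()))
    entity = str(uuid.UUID(entity_id.strip()))
    # json.dumps produces the ["<id>"] array text with correct double quotes.
    ids_json = json.dumps([tz])
    return (
        "update entity_and_transport_zone "
        f"set vlan_transport_zone_ids ='{ids_json}' "
        f"where entity_id='{entity}';"
    )


if __name__ == "__main__":
    # Placeholder UUIDs for illustration only; use the values from steps 3 and 7.
    print(build_update_statement(
        "d1b4de1d-0000-4000-8000-000000000381",
        "3636b7cb-0000-4000-8000-000000000dd1",
    ))
```

After running the generated statement, repeating the select from step 6 should show the transport zone ID in vlan_transport_zone_ids for the cluster's row.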