When deploying Container Service Extension (CSE) clusters in tenant organization Virtual Data Centers (VDCs), the cluster nodes may fail to initialize or complete setup due to missing Internet connectivity or improper DNS configuration in the network used by the virtual machines (VMs).
During the deployment of a TKG cluster using the Container Service Extension (CSE), an issue was observed with DNS resolution on the ephemeral VM. Two DNS servers were configured: a primary internal DNS server and a secondary external DNS server. However, if the primary DNS server is unreachable, the ephemeral VM does not attempt to use the secondary DNS server, and cluster creation cannot proceed.
The cluster deployment fails with an error during the post-customization phase of the ephemeral VM, indicating a problem with pulling required container images:
Type: Error
Resource: EPHEMERAL-TEMP-VM
Detailed Error: [error while bootstrapping the machine [<cluster-name>/EPHEMERAL-TEMP-VM]: unable to wait for post customization phase
[guestinfo.cloudinit kind binary install.status] : [invalid postcustomization phase: [failed] for key [guestinfo.cloudinit.kind.binary.install.status]
Additionally, investigation shows that the CSE appliance has no Internet connectivity and that DNS queries (for example, nslookup <vcd-hostname>) fail.
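Because the ephemeral VM does not fall back to the secondary DNS server, it can help to test each configured server on its own rather than rely on the system resolver. The following is a minimal sketch, not part of CSE: it sends a raw DNS A-record query over UDP to one server at a time using only the Python standard library. The server addresses and the hostname in the usage section are placeholders (assumptions) and must be replaced with the values configured on the network. It assumes IPv4 DNS servers listening on UDP port 53.

```python
import socket
import struct


def build_query(hostname: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet for an A record (RFC 1035 layout)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def server_answers(server_ip: str, hostname: str, timeout: float = 3.0) -> bool:
    """Return True if server_ip sends back any DNS response for hostname."""
    query = build_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server_ip, 53))
            response, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable-network errors
            return False
    # A valid response echoes the query ID and has the QR (response) bit set.
    return (
        len(response) >= 12
        and response[:2] == query[:2]
        and bool(response[2] & 0x80)
    )


if __name__ == "__main__":
    # Placeholder addresses (assumptions): replace with the configured servers.
    for label, ip in [("primary", "192.0.2.10"), ("secondary", "198.51.100.10")]:
        status = "answers" if server_answers(ip, "example.com") else "no response"
        print(f"{label} DNS {ip}: {status}")
```

A server that answers with an error code such as NXDOMAIN still counts as reachable here; the check only distinguishes a responding server from a silent one.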
VMware Cloud Director 10.x
Container Service Extension 4.x
CSE cluster nodes (VMs) require outbound Internet access during the configuration phase to download necessary software packages and dependencies. Deployments commonly fail when any of the following is true:
The organization VDC network is not routed or not NSX-backed.
The network lacks a Source NAT (SNAT) rule, which is required for outbound access.
Incomplete or incorrect DNS settings are provided.
An unsupported design is attempted, where VMs are connected to both a public and a private network, while CSE nodes support only a single NIC.
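The conditions above can be expressed as a simple pre-deployment checklist. The sketch below is illustrative only: OrgVdcNetwork is a hypothetical summary structure filled in by hand, not a Cloud Director or CSE API object, and the check for DNS settings only verifies that at least one server is configured.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OrgVdcNetwork:
    """Hypothetical, hand-filled summary of an organization VDC network."""
    routed: bool
    nsx_backed: bool
    has_snat_rule: bool
    dns_servers: List[str] = field(default_factory=list)
    nic_count: int = 1  # number of NICs planned per cluster VM


def find_problems(net: OrgVdcNetwork) -> List[str]:
    """Return the listed failure conditions that apply to this network."""
    problems = []
    if not (net.routed and net.nsx_backed):
        problems.append("network is not routed or not NSX-backed")
    if not net.has_snat_rule:
        problems.append("no SNAT rule for outbound access")
    if not net.dns_servers:
        problems.append("no DNS servers configured")
    if net.nic_count != 1:
        problems.append("CSE nodes support only a single NIC")
    return problems


if __name__ == "__main__":
    # Example values are assumptions for illustration.
    net = OrgVdcNetwork(routed=True, nsx_backed=True, has_snat_rule=False,
                        dns_servers=["192.0.2.10"])
    for problem in find_problems(net):
        print("Problem:", problem)
```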
To ensure successful CSE cluster deployments, follow the guidelines below:
1. Use a Routed, NSX-Backed Organization VDC Network
The organization VDC network must be routed and NSX-backed.
Configure a Source NAT (SNAT) rule to enable Internet connectivity for VMs attached to the network.
This allows the cluster VMs to reach external repositories and download packages during the provisioning process.
2. Configure DNS Settings
Provide appropriate DNS settings in the routed network:
Primary DNS:
This is the first DNS server that VMs connected to this network use for name resolution.
DNS Suffix: mydomain.#######.net
This suffix is appended to unqualified hostnames to resolve them into Fully Qualified Domain Names (FQDNs).
Note: The DNS server is used only for domain name resolution and does not impact the Internet connectivity of the VM.
3. Do Not Use a Dual-Network Configuration
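As a simplified illustration of how a DNS suffix is applied, the sketch below appends the suffix only to names that contain no dot. Real resolvers follow more detailed rules (for example, search lists and the ndots option on Linux), so this is an approximation; the function name and the example.net suffix are assumptions used for illustration.

```python
def qualify(hostname: str, dns_suffix: str) -> str:
    """Append dns_suffix to hostname if hostname is unqualified (has no dot)."""
    if "." in hostname:
        # Already qualified: leave it unchanged.
        return hostname
    return f"{hostname}.{dns_suffix.strip('.')}"


if __name__ == "__main__":
    print(qualify("registry", "example.net"))          # unqualified name
    print(qualify("host.example.org", "example.net"))  # already qualified
```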
All CSE VMs (cluster nodes) are designed with a single NIC.
Connecting VMs to both a public and a private network to provide dual connectivity (internal + external) is not supported.
Ensure that the single NIC used by CSE VMs is connected to a public network with outbound Internet access.
Confirm that firewall rules and routing configurations allow outbound access.
Validate DNS resolution within a VM before initiating a CSE cluster deployment.
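The DNS and outbound-access checks can be run from a test VM attached to the same network before a cluster is deployed. The following is a minimal sketch using only the Python standard library; the hostname in the usage section is a placeholder (an assumption) and should be replaced with a host the cluster nodes actually need to reach, such as the Cloud Director endpoint.

```python
import socket
from typing import List


def resolve(hostname: str) -> List[str]:
    """Return the IP addresses the system resolver gives for hostname."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []  # resolution failed
    # Each entry's sockaddr tuple starts with the IP address.
    return sorted({info[4][0] for info in infos})


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or unresolved
        return False


if __name__ == "__main__":
    target = "vcd.example.com"  # placeholder hostname (assumption)
    addresses = resolve(target)
    print(f"DNS for {target}: {addresses or 'FAILED'}")
    print(f"HTTPS reachability: {'OK' if can_connect(target) else 'FAILED'}")
```

If name resolution succeeds but the TCP connection fails, the SNAT rule, firewall rules, or routing are the more likely cause than DNS.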
The following documents describe the network requirements for CSE:
https://techdocs.broadcom.com/us/en/vmware-cis/cloud-director/container-service-extension/4-2/installing-configuring-and-upgrading-container-service-extension-as-a-service-provider-4-2-4-2/prerequisites.html
https://techdocs.broadcom.com/us/en/vmware-cis/cloud-director/container-service-extension/4-2/using-vmware-cloud-director-container-service-extension-as-a-service-provider-4-2/network-connectivity-for-organization-virtual-data-centers.html