Telco Cloud Automation 2.x - HA-based cloud native TCA/TCA-CP appliance deployment Day-0 using scripts
search cancel

Telco Cloud Automation 2.x - HA-based cloud native TCA/TCA-CP appliance deployment Day-0 using scripts

book

Article ID: 345722

calendar_today

Updated On:

Products

VMware VMware Telco Cloud Automation

Issue/Introduction

This Script helps  to Create/Delete cloud native TCA  2.x and/or TCA-CP 2.x appliance in HA mode.

Prerequisite

 

- Install Bootstrapper VM on a vC (and optionally vRLI) in a regular or airgap scenario using VMware-Telco-Cloud-Automation-<version>.ova and selecting appliance role as bootstrapper.

- Make sure to have latest photon VM Template uploaded on your vC for consumption by the script to create management and workload clusters. Example: photon-3-kube-v1.21.2+vmware.1 for Tanzu 1.4.0

- Create bootstrapper.json file using bootstrapper_template.json on the Bootstrapper VM by updating following sections:

[ Location on script and json template on Bootstrapper VM:  /opt/vmware/setup_ha/bootstrapper_template.json ] 

Note: Sample of bootstrapper_template.json attached to kb
 

Sizing/System Requirements


Cloud Native TCA and TCA-CP on same cluster (different namespaces - dev only)

- Deployment of both TCA and TCA-CP in same cluster (different namespaces) will need at least 3 worker VMs/nodes in management cluster with following config each: CPU=8 and MEM=16384MB 

APPLIANCE → namespace

TCA → tca-mgr

TCA-CP → tca-system

One appliance TCA or TCA-CP in a cluster

Only TCA or TCA-CP installation can work with 3 worker VMs/ node in management cluster with config of CPU=4 and MEM=8192MB each.

Note: Detailed information on sizing/system requirements engage VMware GSS Team
 


Environment

VMware Telco Cloud Automation 2.0

Resolution

The Bootstrapper virtual machine contains scripts and required templates for setting up a cloud native VMware Telco Cloud Automation appliance in HA mode.

The following table lists their file locations.

Setup Script Path                 :     /opt/vmware/setup_ha/setup_ha.py
Configuration File Template :     /opt/vmware/setup_ha/bootstrapper_template.json
File Location in                     :    Bootstrapper



$ python3 /opt/vmware/setup_ha/setup_ha.py  --config_file ~/bootstrapper.json --skipConfirmations

Note: 

- For detailed steps and usage on setup_ha.py please refer the attached file - setup_ha_usage.txt
- For detailed information on example stages script executed in detailed in the attached sample- script-stage.txt 


 

A. Troubleshooting in case of Failed Cluster creation :

The cluster creation can fail based on several reasons:

-  etcdserver leader change
-  etcdserver timeout
-  Unable to connect to Bootstrapper Service issue
-  Cluster creation timeout issues

 

Note: Testbed issues (As observed):

  • Internet connectivity problem (DNS configuration is incorrect in the bootstrapper VM), due to which the images for the containers cannot be downloaded from the internet
  • For airgap configuration - not able to connect to the airgap server and the airgap server cannot reach out to the cluster nodes
  • First delete the cluster, then proceed fix the testbed issues if there are any and then re-trigger the cluster creation.

Workaround:

Delete Cluster and rerun the script to try again a recreate of the cluster.


NOTE:

1. If Bootstrapper cluster is already created, then before MANAGEMENT cluster, we should always delete the Bootstrapper cluster first.
2. To Delete TCA Cluster, use clusterType=MANAGEMENT
3. To Delete Bootstrapper Cluster, use clusterType=WORKLOAD


Steps to Delete a TCA cluster:

Delete the cluster:

  • use the GET API of the appliance manager to fetch the cluster details
  • Use the id from the API and pass it to the DELETE API
  • Use the status API to monitor the delete cluster
  • Use the force delete option to remove from the local DB to reuse the controlPlaneEndpointIp

B. Troubleshooting in case of Failed Service installation:

The Service installation can fail based on several reasons:

Few known issues as represented and seen in Infrastructure Automation logs and UI extracts:

  1. Helm API failed: release: <service_name> not found
  2. Helm API failed: release name <service_name> in namespace <namespace_name> is already in use
  3. Release "service_name" failed: etcdserver: leader changed
  4. Failed to deploy <service_name>-readiness

 

Workaround:

Uninstall the failed service manually.

Then retry the script to 'deploy specific service'.

Use 'deploy specific service' option (–deploy) for setup_ha.py to retry installing the service. Script always ensure to avoid installing service if it is already installed in a related namespaces.

e.g. to install mongodb command reference:

/opt/vmware/setup_ha/setup_ha.py --deploy mongodb

 


Attachments

script-stage get_app
setup_ha_usage get_app
Sample-bootstrapper_template_json get_app