Provisioning Hosts in a Cell Site Group taking a long time to complete

Products

VMware Telco Cloud Automation

Issue/Introduction

The purpose of this patch is to fix the DVS lock handling logic in the TCF manager service so that hosts can be added to a cell site group concurrently.

Symptoms:

Hosts added to a cell site group remain stuck in a provisioning state for over 3 hours.

Environment

2.0.x

Cause

The root cause of this issue is that the TCF manager service maintains an internal lock while adding hosts to a DVS, so that this operation is serialized across different domains. The intent was to avoid overloading the vCenter with bulk requests.

However, this inadvertently causes delays: when one host gets stuck on a reconfigure DVS task for a long time, the lock is not released, and hosts added afterwards cannot acquire it.
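The blocking behaviour described above can be illustrated with a small, self-contained sketch. It is not the TCF manager code: the lock, the function names, the host names, and the timings are all hypothetical, and a short sleep stands in for the reconfigure DVS task.

```python
import threading
import time

# Hypothetical stand-in for the single internal lock described above.
dvs_lock = threading.Lock()


def add_host_to_dvs(host, task_seconds, wait_seconds, results):
    """Simulate one host addition: take the shared lock, then run the task."""
    if not dvs_lock.acquire(timeout=wait_seconds):
        # The lock is still held by another host addition.
        results[host] = "still waiting for lock"
        return
    try:
        time.sleep(task_seconds)  # stands in for the reconfigure DVS task
        results[host] = "added"
    finally:
        dvs_lock.release()


def simulate():
    """Run one slow host addition followed by two that queue behind it."""
    results = {}
    # host-1 takes the lock first and its task is slow.
    stuck = threading.Thread(
        target=add_host_to_dvs, args=("host-1", 1.0, 0.1, results)
    )
    stuck.start()
    time.sleep(0.2)  # make sure host-1 holds the lock before the others start
    waiters = [
        threading.Thread(target=add_host_to_dvs, args=(host, 0.0, 0.3, results))
        for host in ("host-2", "host-3")
    ]
    for thread in waiters:
        thread.start()
    for thread in waiters + [stuck]:
        thread.join()
    return results
```

Calling simulate() shows host-2 and host-3 giving up on the lock while host-1 is still running, even though their own tasks would finish immediately. In the sketch the waiters time out after a fraction of a second; in the scenario described above they simply keep waiting for as long as the first task holds the lock.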

Resolution

Resolved in Telco Cloud Automation 2.1.

Workaround:
Apply the patch on the TCA manager to prevent this problem from occurring using the following procedure.

1. Create a Snapshot of the TCA Manager.

2. Download the attached vmware-tcf-manager_patch_1.0.tar.gz patch file.

3. SSH into the TCA Manager as admin.

4. Back up the folder “/opt/vmware/tcf” in the tcf-manager container:

5. Issue the following command:
docker exec -u root -it tcf-manager /bin/bash

6. Change directory to /opt/vmware/
cd /opt/vmware/

7. Back up the tcf folder.
cp -R tcf tcf.bak

8. Copy the backup tcf to /home/admin.
scp -r tcf.bak admin@<TCA-manager>:/home/admin

9. Exit from the container and change directory to /home/admin

10. Change ownership of the backup folder.
chown admin:admin tcf.bak

11. Switch user to root and perform the following steps:
su root
systemctl stop tcf-manager
docker rm tcf-manager
docker image rm vmware-tcf-manager

12. Change directory and issue the following command:
cd /common/tca-repository/tcf-manager
curl -ko tcf-manager.tar.gz <path to the patch file>

13. Restart the following services:
systemctl restart tcf-manager-deploy
systemctl start tcf-manager
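Because step 11 removes the container, it can be worth confirming that the backup from step 7 is complete before going that far. The following is a minimal sketch, assuming a Python 3 interpreter is available where both folders are visible; the function name find_differences is hypothetical and not part of the product.

```python
import filecmp


def find_differences(original, backup):
    """Return descriptions of entries that differ between two directory trees.

    Note: filecmp.dircmp skips some names by default (for example
    __pycache__), so those are not compared here.
    """
    problems = []

    def walk(comparison, prefix):
        problems.extend(
            f"missing in backup: {prefix}{name}" for name in comparison.left_only
        )
        problems.extend(
            f"extra in backup: {prefix}{name}" for name in comparison.right_only
        )
        problems.extend(
            f"content differs: {prefix}{name}" for name in comparison.diff_files
        )
        problems.extend(
            f"could not compare: {prefix}{name}" for name in comparison.funny_files
        )
        # Recurse into directories that exist on both sides.
        for name, sub in comparison.subdirs.items():
            walk(sub, f"{prefix}{name}/")

    walk(filecmp.dircmp(original, backup), "")
    return problems
```

Run inside the container after step 7, find_differences("/opt/vmware/tcf", "/opt/vmware/tcf.bak") should return an empty list if the copy is complete.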

Additional Information

After applying the patch, the service account information for the pre-deployed CDC/RDC in TCA Automated Infrastructure reverts to administrator. If an SSO user other than administrator was used for the service account, follow the procedure below to restore the configuration.

Both the Service Account username and password will need to be updated in the tcf-manager. Note: If the administrator SSO user was used these steps are not required.

To update the Service Account username, follow this procedure:
1. Run the following command:
docker exec -it tcf-manager /bin/bash

2. Change directory to /opt/vmware/tcf/rest_api/ and back up the tca_web_rest_client.py file

3. Edit tca_web_rest_client.py and update the username in line number 38
vc_sso_username = parser.getVcSsoUsername(central_site_mgmt_domain)

Example:
vc_sso_username = "username"

To update the Service Account password, follow this procedure:
1. Change directory to /opt/vmware/tcf/ and back up the appliance_config.py file
2. Edit appliance_config.py and go to line number 1167 to update the password:
post_body = {"username": self.username, "password": self.password}

Example:
post_body = {"username": self.username, "password": "password"}

3. Issue the following command:
docker restart tcf-manager
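Both procedures above identify the line to edit by its number (38 and 1167). As a precaution, the line can be checked before it is edited. The helper below is a minimal sketch, assuming Python 3 is available inside the tcf-manager container; the function name line_matches is hypothetical.

```python
def line_matches(path, line_number, expected_fragment):
    """Check a single line of a file before editing it.

    Returns (matches, text): whether the 1-based line_number contains
    expected_fragment, and the text of that line without its newline.
    Returns (False, None) if the file has fewer lines than line_number.
    """
    with open(path, encoding="utf-8") as handle:
        for number, text in enumerate(handle, start=1):
            if number == line_number:
                return expected_fragment in text, text.rstrip("\n")
    return False, None
```

For example, line_matches("/opt/vmware/tcf/rest_api/tca_web_rest_client.py", 38, "vc_sso_username") is expected to report a match if line 38 is the assignment shown above, and line_matches("/opt/vmware/tcf/appliance_config.py", 1167, "post_body") does the same for the password line. If either check reports no match, locate the statement by searching for its text rather than relying on the line number.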

Attachments

vmware-tcf-manager_patch_1.0.tar