Recover TCSA after a power failure


Article ID: 385328


Products

VMware Telco Cloud Service Assurance

Issue/Introduction

Steps to recover TCSA after a power outage or power failure.

Environment

TCSA 2.4

Resolution

  • Firewall check: 

Make sure the firewall is disabled on every node: worker nodes, master nodes, and control plane nodes. Check the service status on each node:

    ssh <nodeip> systemctl status firewalld.service
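
With many nodes, the check above can be scripted. The following is a minimal sketch, not part of the original procedure: the function name `check_firewalld` is hypothetical, the node IPs are passed by the caller, and it assumes key-based SSH access to each node. It uses `systemctl is-active`, which prints only the unit state.

```shell
# Hypothetical helper: report the firewalld state on each node.
# Usage: check_firewalld <node-ip> [<node-ip> ...]
check_firewalld() {
  local node state rc=0
  for node in "$@"; do
    # `systemctl is-active` prints the unit state (active, inactive, ...).
    state=$(ssh "$node" systemctl is-active firewalld.service 2>/dev/null)
    state=${state:-unknown}
    echo "$node firewalld=$state"
    # An active firewall counts as a failure so the caller can stop early.
    [ "$state" = "active" ] && rc=1
  done
  return "$rc"
}
```

The function returns non-zero if firewalld is active on any node, so it can gate the remaining recovery steps.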
  • Keepalived VIP re-instantiation: 

The cluster uses a Keepalived virtual IP (VIP), which becomes unavailable after the restart.

From the k8s-installer directory, run the commands below to re-create it:

    cd $HOME/k8s-installer/
    export ANSIBLE_CONFIG=$HOME/k8s-installer/scripts/ansible/ansible.cfg LANG=en_US.UTF-8
    ansible-playbook scripts/ansible/prepare.yml -e @scripts/ansible/vars.yml --become --tags vrrp
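
After the playbook finishes, it can help to confirm that the VIP answers again before moving on. This is a sketch under assumptions: the helper name `wait_for_vip` is hypothetical, the VIP address is supplied by the caller, and it assumes ICMP ping to the VIP is permitted in the environment.

```shell
# Hypothetical helper: poll the VIP until it answers ping or attempts run out.
# Usage: wait_for_vip <vip-address> [attempts] [delay-seconds]
wait_for_vip() {
  local vip="$1" attempts="${2:-10}" delay="${3:-5}" i=1
  while [ "$i" -le "$attempts" ]; do
    if ping -c 1 "$vip" >/dev/null 2>&1; then
      echo "VIP $vip reachable after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "VIP $vip still unreachable after $attempts attempt(s)" >&2
  return 1
}
```

For example, `wait_for_vip <vip-address> 12 5` would retry for roughly one minute before giving up.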
  • Additional steps for the Arango pods:
    • Delete any pods still stuck in ContainerCreating or CrashLoopBackOff:
      kubectl delete pod -n default <pod-name>
    • If the Arango pods still do not come up, continue with the steps below.
    • Pause the apps arangodb-cluster and kube-arangodb.
    • Scale the ArangoDB operator deployment down to zero replicas:
      kubectl scale deployment -n tcsa-system  rango-kube-arangodb-operator --replicas=0
    • Delete the stuck ArangoDB pod and its corresponding PV and PVC.
    • Remove the finalizers from the stuck ArangoDB pod and its PV and PVC so that the deletion can complete:
      kubectl patch <pod|pv|pvc> <name> -n tcsa-system --type json -p $'- op: remove\n  path: /metadata/finalizers'
    • Unpause the apps arangodb-cluster and kube-arangodb.
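
The pod cleanup and the pause/unpause steps above can be sketched as small shell helpers. All three function names are hypothetical. The pause helper rests on an assumption the article does not state: that arangodb-cluster and kube-arangodb are kapp-controller App resources, whose `spec.paused` field suspends reconciliation. The namespace of those apps is also not given in the article, so it is left as a parameter.

```shell
# Hypothetical helper: print the names of pods stuck in ContainerCreating
# or CrashLoopBackOff in the given namespace (default: "default").
list_stuck_pods() {
  local ns="${1:-default}"
  # Columns of `kubectl get pods --no-headers`: NAME READY STATUS RESTARTS AGE
  kubectl get pods -n "$ns" --no-headers |
    awk '$3 == "ContainerCreating" || $3 == "CrashLoopBackOff" { print $1 }'
}

# Hypothetical helper: delete every pod reported by list_stuck_pods.
delete_stuck_pods() {
  local ns="${1:-default}" pod
  for pod in $(list_stuck_pods "$ns"); do
    kubectl delete pod -n "$ns" "$pod"
  done
}

# Assumption: the apps are kapp-controller App resources (spec.paused).
# Usage: set_app_paused <app-name> <namespace> true|false
set_app_paused() {
  kubectl patch app "$1" -n "$2" --type merge -p "{\"spec\":{\"paused\":$3}}"
}
```

A typical sequence would call `set_app_paused <app-name> <namespace> true` before the cleanup and the same command with `false` afterwards. Running `list_stuck_pods` first shows what `delete_stuck_pods` would remove.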