Backup and Restore using local physical volumes across clusters

Article ID: 374570


Products

VMware Telco Cloud Service Assurance

Issue/Introduction

The customer needs to be able to back up and restore the TCSA configurations and data to a local physical volume.

Environment

TCSA 2.3, 2.4

Resolution

Method 1: Using the Minio console, by downloading and uploading the backup Zip file

The backup can be restored on another cluster, even on a different cloud platform using local PVs, by downloading the backup Zip file from the Minio console of your source K8s cluster (TCSA-2.3.1) and then uploading the same backup folder to the Minio console of your destination K8s cluster (TCSA-2.4.0).

  • Copy the kubeconfig of your source cluster (TCSA-2.3.1) to your local machine and export it (a minimal example is shown after the port-forward output below). Port forwarding to the minio service will let you log in to the minio console.

    root@testcpl ~ % kubectl port-forward service/minio 9000:9001
    Forwarding from 127.0.0.1:9000 -> 9001
    Forwarding from [::1]:9000 -> 9001
    Handling connection for 9000
    Handling connection for 9000
    Handling connection for 9000
    Handling connection for 9000
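    A minimal sketch of the export and port-forward, assuming the source kubeconfig was saved locally as source-tcsa231.kubeconfig (the file name is only an example; adjust the path, and add -n <namespace> if the minio service is not in the context's default namespace):

      # Point kubectl at the source cluster
      export KUBECONFIG=~/source-tcsa231.kubeconfig
      # Forward local port 9000 to the minio console port 9001
      kubectl port-forward service/minio 9000:9001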
  • Log in to the minio console with the credentials:

    minio / minio123

  • To download the backup, you need to increase the Minio resources. To do that, pause the Minio app, then edit the Minio StatefulSet and increase the pod resources (the limits can be tripled; a quick way to verify the change is shown after the commands):

    [root@testcpl backup-and-restore]# kubectl edit app minio
      paused: true          # <-- add this line
      serviceAccountName: cluster-admin-sa
      syncPeriod: 1m0s

    [root@testcpl backup-and-restore]# kubectl get apps -A | grep minio
    minio         Canceled/paused                2d14h      2d16h

    [root@testcpl backup-and-restore]# kubectl edit statefulset minio
      resources:
        limits:
          cpu: "3"
          memory: 6Gi
        requests:
          cpu: 200m
          memory: 2Gi
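    A quick way to confirm the new limits are in place (a sketch; the jsonpath assumes the Minio container is the first container in the StatefulSet):

      [root@testcpl backup-and-restore]# kubectl get statefulset minio -o jsonpath='{.spec.template.spec.containers[0].resources}'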



  • Scroll down to your bucket name (by default, vmware-tcsa-backup) and click Browse. Select the backup directory and click the Download option on the right.

  • Your backup will be downloaded as a Zip file named backups.zip.

  • Unzip this file and you will see the whole backups directory (see the example below).
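    For example, on the local machine (the extracted directory name reflects the bucket contents, so it may differ):

      unzip backups.zip
      ls backups/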

  • Copy the kubeconfig of your destination cluster to your local machine and export it. Port forwarding to the minio service will let you log in to the minio console.

    root@testcpl ~ % kubectl port-forward service/minio 9000:9001 -n tcsa-system
    Forwarding from 127.0.0.1:9000 -> 9001
    Forwarding from [::1]:9000 -> 9001
    Handling connection for 9000
    Handling connection for 9000
    Handling connection for 9000
    Handling connection for 9000

     

  • To upload the backup, you need to increase the Minio resources, the same as on the source cluster. To do that, pause the Minio app, then edit the Minio StatefulSet and increase the pod resources (the limits can be tripled):

    [root@testcpl backup-and-restore]# kubectl edit app minio -n tcsa-system
      paused: true          # <-- add this line
      serviceAccountName: cluster-admin-sa
      syncPeriod: 1m0s

    [root@testcpl backup-and-restore]# kubectl get apps -A | grep minio
    tcsa-system     minio         Canceled/paused                2d14h      2d16h

    [root@testcpl backup-and-restore]# kubectl edit statefulset minio -n tcsa-system
      resources:
        limits:
          cpu: "3"
          memory: 6Gi
        requests:
          cpu: 200m
          memory: 2Gi

     

  • Log in to the minio console with the credentials:

    minio / minio123
  • Create a bucket with the default name vmware-tcsa-backup, if it does not already exist, by clicking Create Bucket.

    Click Browse, then click Upload > Upload Folder.

     

  • Select the backups directory that you unzipped earlier and click Upload.

  • The backup from TCSA-2.3.1 will start uploading to your TCSA-2.4.0 bucket. You can see the status of the upload by clicking the upload/download icon.

  • Once the upload is completed, you need to perform a sync backup and restore on your destination cluster (TCSA-2.4.0).
    The example files can be found inside the TCSA-2.4.0 Deployer Bundle under the
    /tcx-deployer/examples/backup-and-restore directory (see the listing below).
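    As a quick sanity check, you can list the example files; the path below assumes the deployer bundle was extracted under /root (adjust it to your extraction location). You should see at least syncbackup.yaml.example and restore.yaml.example:

      ls /root/tcx-deployer/examples/backup-and-restore/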
  • To do a sync backup, apply the syncbackup.yaml.example file using the kubectl apply command.

    [root@testcpl backup-and-restore]# cat syncbackup.yaml.example
    apiVersion: tcx.vmware.com/v1
    kind: SyncBackup
    metadata:
      name: sync-backup-tps-test
      namespace: tps-system
    spec:
      overrideExisting: false
      filter:
        componentList:
          - postgres
        backupList:
          - test-group-backup231
      pauseIntegrityCheck: true
      overrideNamespace:
        targetNamespace: tps-system
      cluster:
        name: tcsa2.3.1
      storage:
        minio:
          bucket: vmware-tcsa-backup
          endpoint: minio.tcsa-system.svc.cluster.local:9000
          secretRef:
            name: minio-secrets
            namespace: tcsa-system
          accessKey:
            key: root-user
          secretKey:
            key: root-password
    ---
    apiVersion: tcx.vmware.com/v1
    kind: SyncBackup
    metadata:
      name: sync-backup-tcsa-test
      namespace: tcsa-system
    spec:
      overrideExisting: false
      filter:
        componentList:
          - elasticsearch
          - collectors
          - zookeeper
          - kubernetesResources
        backupList:
          - test-group-backup231
      pauseIntegrityCheck: true
      overrideNamespace:
        targetNamespace: tcsa-system
      cluster:
        name: tcsa2.3.1
      storage:
        minio:
          bucket: vmware-tcsa-backup
          endpoint: minio.tcsa-system.svc.cluster.local:9000
          secretRef:
            name: minio-secrets
            namespace: tcsa-system
          accessKey:
            key: root-user
          secretKey:
            key: root-password

    [root@testcpl backup-and-restore]# kubectl apply -f syncbackup.yaml.example
    syncbackup.tcx.vmware.com/sync-backup-tps-test created
    syncbackup.tcx.vmware.com/sync-backup-tcsa-test created

 

  • Check the status of the sync backup using the following command (a wait-based alternative is shown after the output):

    [root@testcpl backup-and-restore]# kubectl get syncbackups -A
    NAMESPACE     NAME                    STATUS       CURRENT STATE   READY   AGE    MESSAGE
    tcsa-system   sync-backup-tcsa-test   SUCCESSFUL   syncBackup      True    101m   synced: 1, skipped: 0, failed: 0
    tps-system    sync-backup-tps-test    SUCCESSFUL   syncBackup      True    100m   synced: 1, skipped: 0, failed: 0
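    Instead of polling, you can block until both SyncBackup objects report Ready. This is a sketch that assumes the SyncBackup CRD exposes a standard Ready condition, which the READY column above suggests:

      kubectl wait --for=condition=Ready syncbackup/sync-backup-tcsa-test -n tcsa-system --timeout=60m
      kubectl wait --for=condition=Ready syncbackup/sync-backup-tps-test -n tps-system --timeout=60m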

 

  • Check if your backups have been synced to your TCSA-2.4.0 cluster:

    [root@testcpl backup-and-restore]# kubectl get backups -A
    NAMESPACE     NAME                   STATUS       CURRENT STATE      READY   ACROSS CLUSTER   BACKUP SIZE (IN BYTES)   RETENTION POLICY (IN DAYS)   AGE    MESSAGE
    tcsa-system   test-group-backup231   SUCCESSFUL   backupPostAction   True    true             3930097                  45                           101m   action completed successfully
    tps-system    test-group-backup231   SUCCESSFUL   backupPostAction   True    true             808448                   45

 

  • Once the backups are in the Ready state, restore these two backups by applying the restore.yaml.example file using the kubectl apply command.

    [root@testcpl backup-and-restore]# cat restore.yaml.example
    apiVersion: tcx.vmware.com/v1
    kind: Restore
    metadata:
      name: group-restore-tps-test
      namespace: tps-system
    spec:
      backupName: test-group-backup231
      restore:
        postgres:
          timeout: 10m
          config:
            endpoint:
              host: postgres-cluster.tps-system.svc.cluster.local
              port: 5432
            adminSecret:
              name: postgres-db-secret
              namespace: tps-system
            dbs:
              - analyticsservice
              - alarmservice
              - collector
              - grafana
              - keycloak
    ---
    apiVersion: tcx.vmware.com/v1
    kind: Restore
    metadata:
      name: group-restore-tcsa-test
      namespace: tcsa-system
    spec:
      backupName: test-group-backup231
      postAction:
        name: postaction
        serviceAccount: cluster-admin-sa
        timeout: 30m
        resource:
          memory: 250Mi
          cpu: 100m
        bash:
          command:
            - /bin/bash
            - -c
            - |
              set -ex; kubectl delete pods -n tcsa-system --selector run=apiservice; sleep 200;
              set -ex; kubectl delete pod -n tcsa-system --selector=app.kubernetes.io/name=grafana;
              sleep 10;
              set -ex; kubectl exec -it deploy/br-operator -n tcsa-system -- curl -k -s --show-error --stderr - -H 'Content-Type: application/json' -X POST --data '{ "isCleanUpgrade": true }' http://apiservice:8080/smartsrestcontroller/vsa/smarts/domain/migrate;
      restore:
        collectors:
          config:
            authenticationSecret:
              name: collectors-secrets
              namespace: tcsa-system
              passwordKey:
                key: COLLECTORS_PASSWORD
              usernameKey:
                key: COLLECTORS_USERNAME
            endpoint:
              basePath: /dcc/v1/
              host: collector-manager.tcsa-system.svc.cluster.local
              port: 12375
              scheme: http
          timeout: 10m
        elastic:
          authentication:
            name: elasticsearch-secret-credentials
            namespace: tcsa-system
            passwordKey:
              key: ES_PASSWORD
            usernameKey:
              key: ES_USER_NAME
          cleanUpIndices: true
          config:
            endpoint:
              host: elasticsearch.tcsa-system.svc.cluster.local
              port: 9200
              scheme: https
            region: ap-south-1
          indexList:
            - vsa_chaining_history-*
            - vsa_events_history-*
            - vsa_audit-*
            - vsarole,policy,userpreference,mapping-metadata,mnr-metadata
            - gateway-mappings
            # Uncomment vsametrics for metrics restore and set cleanUpIndices to true
            # - vsametrics*
            # Uncomment vsa_catalog to restore a TCSA 2.4 backup
            # - vsa_catalog
          # Set 'removeAndAddRepository: true' and trigger Backup/Restore to clean up the repository.
          removeAndAddRepository: true
          timeout: 30m
          tls:
            caCrt:
              key: ca.crt
            insecureSkipVerify: true
            namespace: tcsa-system
            secretName: elasticsearch-cert
            tlsCrt:
              key: tls.crt
        # Uncomment kubernetesResources to restore configmaps/secrets.
        # kubernetesResources:
        #   timeout: 10m
        #   resources:
        #     - groupVersionResource:
        #         group: ""
        #         version: "v1"
        #         resource: "secrets"
        #       nameList:
        #         - name: "spe-pguser"
        #           namespace: "tcsa-system"
        #     - groupVersionResource:
        #         group: ""
        #         version: "v1"
        #         resource: "configmaps"
        #       nameList:
        #         - name: "product-info"
        #           namespace: "tcsa-system"
        zookeeper:
          endpoint:
            host: zookeeper.tcsa-system.svc.cluster.local
            port: 2181
          paths:
            - path: /vmware/vsa/gateway
            - path: /vmware/vsa/smarts
            # Uncomment the zookeeper path for NCM backup
            # - path: /vmware/vsa/ncm
          timeout: 10m

    [root@testcpl backup-and-restore]# kubectl apply -f restore.yaml.example
    restore.tcx.vmware.com/group-restore-tps-test created
    restore.tcx.vmware.com/group-restore-tcsa-test created

 

  • Check the status of the restore using the following command:

    [root@testcpl backup-and-restore]# kubectl get restore -A
    NAMESPACE     NAME                      STATUS       CURRENT STATE   READY   AGE   MESSAGE
    tcsa-system   group-restore-tcsa-test   SUCCESSFUL   restore         True    91m
    tps-system    group-restore-tps-test    SUCCESSFUL   restore         True    92m

     

Method 2: Using the kubectl cp command, by copying the backup from the Minio pods of TCSA-2.3.0 to the local machine and then copying it to the Minio pods of TCSA-2.4.0

  • Create a directory named testbackups on your deployment host or local machine. Inside testbackups, create a separate directory for each of the Minio pods (see the example after the listing below).

    [root@testcpl testbackups]# pwd
    /root/testbackups
    
    [root@testcpl testbackups]# ls
    minio-0 minio-1 minio-2 minio-3
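    For example, a one-line way to create this layout (assuming four Minio replicas, matching the directories above):

      mkdir -p /root/testbackups/minio-{0..3}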
  • Copy the backup from each of the Minio pods in TCSA-2.3.0 individually and store it in the matching directory (the four copies can also be scripted in a loop, as sketched after these commands).

    [root@testcpl minio-0]# kubectl cp minio-0:/minio/vmware-tcsa-backup .
    [root@testcpl minio-1]# kubectl cp minio-1:/minio/vmware-tcsa-backup .
    [root@testcpl minio-2]# kubectl cp minio-2:/minio/vmware-tcsa-backup .
    [root@testcpl minio-3]# kubectl cp minio-3:/minio/vmware-tcsa-backup .
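    A loop version of the same copies; this sketch assumes the source Minio pods are in the current context's default namespace and keep their data under /minio:

      for i in 0 1 2 3; do
        # Copy the backup directory from each pod into its matching local directory
        kubectl cp "minio-$i:/minio/vmware-tcsa-backup" "/root/testbackups/minio-$i/vmware-tcsa-backup"
      done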

     

  • Now copy the same backups from the local machine to the Minio pods of your destination cluster (TCSA-2.4.0):

    [root@testcpl minio-0]# kubectl cp vmware-tcsa-backup -n tcsa-system minio-0:/data
    [root@testcpl minio-1]# kubectl cp vmware-tcsa-backup -n tcsa-system minio-1:/data
    [root@testcpl minio-2]# kubectl cp vmware-tcsa-backup -n tcsa-system minio-2:/data
    [root@testcpl minio-3]# kubectl cp vmware-tcsa-backup -n tcsa-system minio-3:/data
  • After copying the backups to the Minio pods of TCSA-2.4.0, delete all the Minio pods to reinstate the Minio state. Wait for the Minio pods to come back to the Running state (see the example below), then log in to the Minio console and check that your backup shows up.

    [root@testcpl minio-0]# kubectl delete pods -n tcsa-system minio-0 minio-1 minio-2 minio-3
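    For example, to block until the recreated pods are Ready before opening the console:

      kubectl wait --for=condition=Ready pod/minio-0 pod/minio-1 pod/minio-2 pod/minio-3 -n tcsa-system --timeout=10m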
  • If you can see your backup in the Minio console, next perform a sync backup and restore on your destination cluster (TCSA-2.4.0). The example files can be found inside the TCSA-2.4.0 Deployer Bundle under

     /tcx-deployer/examples/backup-and-restore

    1. To do a sync backup, apply the syncbackup.yaml.example file using the kubectl apply command.

      [root@testcpl backup-and-restore]# cat syncbackup.yaml.example
      apiVersion: tcx.vmware.com/v1
      kind: SyncBackup
      metadata:
        name: sync-backup-tps-test
        namespace: tps-system
      spec:
        overrideExisting: false
        filter:
          componentList:
            - postgres
          backupList:
            - test-group-backup231
        pauseIntegrityCheck: true
        overrideNamespace:
          targetNamespace: tps-system
        cluster:
          name: tcsa2.3.1
        storage:
          minio:
            bucket: vmware-tcsa-backup
            endpoint: minio.tcsa-system.svc.cluster.local:9000
            secretRef:
              name: minio-secrets
              namespace: tcsa-system
            accessKey:
              key: root-user
            secretKey:
              key: root-password
      ---
      apiVersion: tcx.vmware.com/v1
      kind: SyncBackup
      metadata:
        name: sync-backup-tcsa-test
        namespace: tcsa-system
      spec:
        overrideExisting: false
        filter:
          componentList:
            - elasticsearch
            - collectors
            - zookeeper
            - kubernetesResources
          backupList:
            - test-group-backup231
        pauseIntegrityCheck: true
        overrideNamespace:
          targetNamespace: tcsa-system
        cluster:
          name: tcsa2.3.1
        storage:
          minio:
            bucket: vmware-tcsa-backup
            endpoint: minio.tcsa-system.svc.cluster.local:9000
            secretRef:
              name: minio-secrets
              namespace: tcsa-system
            accessKey:
              key: root-user
            secretKey:
              key: root-password

      [root@testcpl backup-and-restore]# kubectl apply -f syncbackup.yaml.example
      syncbackup.tcx.vmware.com/sync-backup-tps-test created
      syncbackup.tcx.vmware.com/sync-backup-tcsa-test created
    2. Check the status of syncbackup using the following command:

      [root@testcpl backup-and-restore]# kubectl get syncbackups -A
      NAMESPACE     NAME                    STATUS       CURRENT STATE   READY   AGE    MESSAGE
      tcsa-system   sync-backup-tcsa-test   SUCCESSFUL   syncBackup      True    101m   synced: 1, skipped: 0, failed: 0
      tps-system    sync-backup-tps-test    SUCCESSFUL   syncBackup      True    100m   synced: 1, skipped: 0, failed: 0
  • Check if your backups have been synced to your TCSA-2.4.0 cluster:

    [root@testcpl backup-and-restore]# kubectl get backups -A
    NAMESPACE     NAME                   STATUS       CURRENT STATE      READY   ACROSS CLUSTER   BACKUP SIZE (IN BYTES)   RETENTION POLICY (IN DAYS)   AGE    MESSAGE
    tcsa-system   test-group-backup231   SUCCESSFUL   backupPostAction   True    true             3930097                  45                           101m   action completed successfully
    tps-system    test-group-backup231   SUCCESSFUL   backupPostAction   True    true             808448                   45

     

  • Once the backups are in the Ready state, restore these two backups by applying the restore.yaml.example file using the kubectl apply command.

    [root@wdc-10-214-142-133 backup-and-restore]# cat restore.yaml.example
    apiVersion: tcx.vmware.com/v1
    kind: Restore
    metadata:
      name: group-restore-tps-test
      namespace: tps-system
    spec:
      backupName: test-group-backup231
      restore:
        postgres:
          timeout: 10m
          config:
            endpoint:
              host: postgres-cluster.tps-system.svc.cluster.local
              port: 5432
            adminSecret:
              name: postgres-db-secret
              namespace: tps-system
            dbs:
              - analyticsservice
              - alarmservice
              - collector
              - grafana
              - keycloak
    ---
    apiVersion: tcx.vmware.com/v1
    kind: Restore
    metadata:
      name: group-restore-tcsa-test
      namespace: tcsa-system
    spec:
      backupName: test-group-backup231
      postAction:
        name: postaction
        serviceAccount: cluster-admin-sa
        timeout: 30m
        resource:
          memory: 250Mi
          cpu: 100m
        bash:
          command:
            - /bin/bash
            - -c
            - |
              set -ex; kubectl delete pods -n tcsa-system --selector run=apiservice; sleep 200;
              set -ex; kubectl delete pod -n tcsa-system --selector=app.kubernetes.io/name=grafana;
              sleep 10;
              set -ex; kubectl exec -it deploy/br-operator -n tcsa-system -- curl -k -s --show-error --stderr - -H 'Content-Type: application/json' -X POST --data '{ "isCleanUpgrade": true }' http://apiservice:8080/smartsrestcontroller/vsa/smarts/domain/migrate;
      restore:
        collectors:
          config:
            authenticationSecret:
              name: collectors-secrets
              namespace: tcsa-system
              passwordKey:
                key: COLLECTORS_PASSWORD
              usernameKey:
                key: COLLECTORS_USERNAME
            endpoint:
              basePath: /dcc/v1/
              host: collector-manager.tcsa-system.svc.cluster.local
              port: 12375
              scheme: http
          timeout: 10m
        elastic:
          authentication:
            name: elasticsearch-secret-credentials
            namespace: tcsa-system
            passwordKey:
              key: ES_PASSWORD
            usernameKey:
              key: ES_USER_NAME
          cleanUpIndices: true
          config:
            endpoint:
              host: elasticsearch.tcsa-system.svc.cluster.local
              port: 9200
              scheme: https
            region: ap-south-1
          indexList:
            - vsa_chaining_history-*
            - vsa_events_history-*
            - vsa_audit-*
            - vsarole,policy,userpreference,mapping-metadata,mnr-metadata
            - gateway-mappings
            # Uncomment vsametrics for metrics restore and set cleanUpIndices to true
            # - vsametrics*
            # Uncomment vsa_catalog to restore a TCSA 2.4 backup
            # - vsa_catalog
          # Set 'removeAndAddRepository: true' and trigger Backup/Restore to clean up the repository.
          removeAndAddRepository: true
          timeout: 30m
          tls:
            caCrt:
              key: ca.crt
            insecureSkipVerify: true
            namespace: tcsa-system
            secretName: elasticsearch-cert
            tlsCrt:
              key: tls.crt
        # Uncomment kubernetesResources to restore configmaps/secrets.
        # kubernetesResources:
        #   timeout: 10m
        #   resources:
        #     - groupVersionResource:
        #         group: ""
        #         version: "v1"
        #         resource: "secrets"
        #       nameList:
        #         - name: "spe-pguser"
        #           namespace: "tcsa-system"
        #     - groupVersionResource:
        #         group: ""
        #         version: "v1"
        #         resource: "configmaps"
        #       nameList:
        #         - name: "product-info"
        #           namespace: "tcsa-system"
        zookeeper:
          endpoint:
            host: zookeeper.tcsa-system.svc.cluster.local
            port: 2181
          paths:
            - path: /vmware/vsa/gateway
            - path: /vmware/vsa/smarts
            # Uncomment the zookeeper path for NCM backup
            # - path: /vmware/vsa/ncm
          timeout: 10m

    [root@wdc-10-214-142-133 backup-and-restore]# kubectl apply -f restore.yaml.example
    restore.tcx.vmware.com/group-restore-tps-test created
    restore.tcx.vmware.com/group-restore-tcsa-test created

     

  • Check the status of restore using the following command:

    [root@testcpl backup-and-restore]# kubectl get restore -A
    NAMESPACE     NAME                      STATUS       CURRENT STATE   READY   AGE   MESSAGE
    tcsa-system   group-restore-tcsa-test   SUCCESSFUL   restore         True    91m
    tps-system    group-restore-tps-test    SUCCESSFUL   restore         True    92m

 

Detailed instructions can be found in the attached BACKUP_AND_RESTORE_USING_LOCAL_PVs_ACROSS_CLUSTERS.docx.

Attachments

BACKUP_AND_RESTORE_USING_LOCAL_PVs_ACROSS_CLUSTERS.docx