Data Evacuation Procedure for Maintenance Mode on Kubernetes Clusters

Article ID: 312272


Updated On:

Products

VMware

Issue/Introduction

This knowledge base article outlines the steps required to evacuate application data from one data-store to another within a Kubernetes cluster managed by Container Service Extension (CSE). The procedure minimizes disruption to applications during maintenance activities and is intended for customers who need to place a data-store into maintenance mode without affecting the continuity of their services.
  1. All steps related to vSphere datastores, storage policies, tags, and so on must be executed by the infrastructure admin.
  2. All steps performed in the VMware Cloud Director provider portal, including storage policy changes, must be executed by the VCD system admin.


Resolution

Prerequisites

- Access to the CSE Kubernetes Cluster with administrative privileges.
- Availability of a target data-store to which the application data will be evacuated.
- Familiarity with Kubernetes operations and concepts, including Persistent Volumes (PVs), Persistent Volume Claims (PVCs), and StatefulSets.


Procedure
Disks in Non-Shared Mode
This section describes the procedure needed when evacuating disks that are NOT shared across VMs. In the Kubernetes context, these are ReadWriteOnce (RWO) storage volumes.

Assumptions

  1. All data-stores are tagged with the same Storage Profile.
  2. The CSI for Named Independent Disks uses a Kubernetes StorageClass in which the above Storage Profile is configured (an illustrative StorageClass sketch follows this list).
  3. The storage is created with the ReadWriteOnce access mode.
  4. There is enough capacity available in the destination data-store.
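
For reference, a minimal StorageClass sketch for the CSI for Named Independent Disks is shown below. The provisioner name, the parameter keys, and the storage profile name ("kubernetes-profile") are assumptions for illustration; confirm them against the documentation for the CSI driver version deployed in your cluster.

# Illustrative StorageClass sketch (provisioner and parameters are assumptions; verify against your CSI driver version)
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: named-disk-storage-class
provisioner: named-disk.csi.cloud-director.vmware.com
reclaimPolicy: Delete
parameters:
  storageProfile: "kubernetes-profile"
  filesystem: "ext4"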

Step 1: Identify Affected Applications

Begin by identifying all applications that will be affected by the data-store maintenance. This involves listing all Persistent Volume Claims (PVCs) whose disks reside on the data-store scheduled for maintenance. To locate the data-store, you can open the Kubernetes UI plugin in the VCD tenant portal and select the cluster, or view the Named Disks UI in the tenant portal. Figure 1 shows a vCenter screenshot of the data-store where the PVC is located.
 
[Figure 1. Locate the data-store relevant to the PVC in vCenter]
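
As a starting point, the following commands sketch how to list the PVCs and their backing PVs so that the corresponding named disks can be matched to the data-store in vCenter. The PV name is a placeholder, and the volumeHandle field is assumed to carry the named disk identifier for this CSI driver; verify against your environment.

# List PVCs and the PVs that back them, across all namespaces
kubectl get pvc --all-namespaces -o wide

# Show the volume handle of a specific PV (placeholder name) so the
# corresponding named disk can be located in the VCD tenant portal
kubectl get pv <pv_name> -o jsonpath='{.spec.csi.volumeHandle}'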

In this KB article we use the following pod configuration to verify that access to the pod's storage is not disrupted.

This pod's job is to continuously use the storage: it keeps writing to a disk provisioned through the CSI for Named Independent Disks and records its progress to a file that can be examined for interruptions. Because an 'fsync' is issued after every 100-byte block is written, continuous access to the disk can be confirmed even while the disk is moved across data-stores.

# Sample pod spec that accesses storage continuously
apiVersion: v1
kind: Pod
metadata:
  name: ubuntudd
spec:
  containers:
    - name: ubuntu
      image: ubuntu:latest
      command: [ "/bin/sh", "-c", "dd if=/dev/urandom of=/mnt/volume1/my_pod1.txt count=10000000 bs=100 oflag=sync status=progress 2>&1 | tee out.txt" ]
      volumeMounts:
      - name: volume1
        mountPath: "/mnt/volume1"
  volumes:
  - name: volume1
    persistentVolumeClaim:
      claimName: busybox-pvc
  restartPolicy: Never
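
The pod above references a PVC named busybox-pvc. A minimal sketch of such a claim is shown below; the storageClassName is an assumption and must match the StorageClass configured for the CSI for Named Independent Disks in your cluster.

# Illustrative PVC sketch (storageClassName is an assumption)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: busybox-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: named-disk-storage-class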

 

Step 2: Provision a New Data-store


Move the storage using the instructions in https://docs.vmware.com/en/VMware-Cloud-Director/10.4/VMware-Cloud-Director-Service-Provider-Admin-Portal-Guide/GUID-0ACC5D50-237B-4852-A779-9E3F680364A2.html and wait until the migration is completed. Then check the logs of the pod by executing the following command.

kubectl logs <pod_name>
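
While the migration is in progress, it can also be useful to stream the log and watch the dd progress counter for stalls or I/O errors. The pod name ubuntudd below comes from the sample spec in Step 1.

# Stream the pod log during the data-store migration and watch for stalls or errors
kubectl logs -f ubuntudd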

Step 3: Validation


When checking the logs while the data-store migration is in progress, no data loss is observed for the pod specified above; the writes continue without interruption.
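
As a sketch of this validation (the pod name ubuntudd matches the sample spec in Step 1), the following commands confirm that the pod kept running through the migration and that the dd output contains no error messages:

# Confirm the pod is still running after the migration
kubectl get pod ubuntudd

# Inspect the most recent dd progress lines and scan the log for errors
kubectl logs ubuntudd | tail -n 5
kubectl logs ubuntudd | grep -i "error" || echo "No errors found in pod log"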

Conclusion

Following the above steps allows a provider admin or storage admin to evacuate application data to a new data-store with minimal disruption to the applications running on the Kubernetes cluster. For further assistance or questions specific to your environment, please contact the support team.