This procedure describes how to stop and start a Multi Master PKS Cluster safely:
Part 1: Shut Down the Multi Master PKS Cluster
1. Note the BOSH deployment name of your PKS cluster. It will be in the form "service-instance_<cluster uuid>".
2. List the BOSH instances and their processes for the PKS cluster deployment, and confirm all services are running on each VM in the cluster.
bosh -d service-instance_xxxxxxxxxx is --ps
Note: If the cluster is not healthy before shutdown, issues may appear after startup that are unrelated to this shutdown or startup procedure.
3. Stop the workers by executing the following command:
bosh -d service-instance_xxxxxxxxx stop worker
4. Stop the Masters by executing the following command:
bosh -d service-instance_xxxxxxxxxx stop master
5. Confirm the processes for all VMs are now showing unknown and the status of the VMs shows stopped.
bosh -d service-instance_xxxxxxxxxx is --ps
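The Part 1 steps above can be sketched as a single shell function. This is a minimal sketch, not a supported tool: the function name shutdown_pks_cluster is hypothetical, it assumes the bosh CLI is already targeted and logged in, and it only wraps the commands listed in steps 2 through 5. You still need to read the is --ps output yourself to judge cluster health.

```shell
#!/usr/bin/env bash
# Sketch of Part 1 (shutdown). The function name is hypothetical; the bosh
# commands are the ones listed in steps 2 to 5 of the procedure.

shutdown_pks_cluster() {
  local deployment="$1"   # e.g. service-instance_<cluster uuid>
  if [ -z "$deployment" ]; then
    echo "usage: shutdown_pks_cluster <deployment>" >&2
    return 1
  fi

  # Step 2: show process state so it can be reviewed before stopping anything.
  bosh -d "$deployment" is --ps || return 1

  # Step 3: stop the workers first.
  bosh -d "$deployment" stop worker || return 1

  # Step 4: then stop the masters.
  bosh -d "$deployment" stop master || return 1

  # Step 5: show process state again to confirm everything is stopped.
  bosh -d "$deployment" is --ps
}
```

To use it, source the file and run shutdown_pks_cluster with your own deployment name. The function stops at the first command that fails, so the masters are never stopped while the workers are still being handled.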
Part 2: Start Up the Multi Master PKS Cluster
1. BOSH SSH to the first Master VM (master/0), sudo -i (switch to root user) and start the etcd service. Run monit summary to confirm the etcd service is now running and exit from the Master VM.
bosh -d service-instance_xxxxxxxxxx ssh master/0
sudo -i
monit start etcd
monit summary
exit
2. Start the next Master VM using the BOSH start command. This will bring all services up on the master/1 VM.
bosh -d service-instance_xxxxxxxxxxxx start master/1
3. At this stage you will have the etcd service running on 2 Master VMs, which means etcd has quorum once again. Run the following BOSH commands within the Master instance group to bring up the remaining services.
bosh -d service-instance_xxxxxxxxxx start master/2
Wait for master/2 to start.
bosh -d service-instance_xxxxxxxxxx ssh master/0 "sudo monit stop all"
bosh -d service-instance_xxxxxxxxxx start master/0
4. Confirm that all services for each master VM are now up and running.
bosh -d service-instance_xxxxxxxxxxx is --ps
5. Next start the worker VMs:
bosh -d service-instance_xxxxxxxxxxx start worker
6. The cluster should be back up and running now. Use BOSH is --ps to confirm all services are running for each VM.
bosh -d service-instance_xxxxxxxxxx is --ps
7. Confirm the componentstatus of Kubernetes shows all 3 etcd services are Healthy:
kubectl get componentstatus
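The Part 2 steps above can be sketched the same way. This is a minimal sketch under stated assumptions: the function name start_pks_cluster is hypothetical, and step 1 is collapsed into one-line ssh commands using sudo (the same form step 3 uses for monit stop all) instead of an interactive sudo -i session. It also assumes kubectl is already pointed at this cluster.

```shell
#!/usr/bin/env bash
# Sketch of Part 2 (startup). The function name is hypothetical; the commands
# follow steps 1 to 7 of the procedure in order.

start_pks_cluster() {
  local deployment="$1"   # e.g. service-instance_<cluster uuid>
  if [ -z "$deployment" ]; then
    echo "usage: start_pks_cluster <deployment>" >&2
    return 1
  fi

  # Step 1: start only etcd on master/0, then print the monit summary.
  # monit start may return before etcd is actually running, so check the
  # summary (and re-run it if needed) before relying on the next steps.
  bosh -d "$deployment" ssh master/0 "sudo monit start etcd" || return 1
  bosh -d "$deployment" ssh master/0 "sudo monit summary" || return 1

  # Step 2: bring up all services on master/1 (etcd regains quorum).
  bosh -d "$deployment" start master/1 || return 1

  # Step 3: start master/2, then hand master/0 back to BOSH.
  bosh -d "$deployment" start master/2 || return 1
  bosh -d "$deployment" ssh master/0 "sudo monit stop all" || return 1
  bosh -d "$deployment" start master/0 || return 1

  # Step 4: review the master processes.
  bosh -d "$deployment" is --ps || return 1

  # Steps 5 and 6: start the workers and review every VM.
  bosh -d "$deployment" start worker || return 1
  bosh -d "$deployment" is --ps || return 1

  # Step 7: check etcd health as reported by Kubernetes.
  kubectl get componentstatus
}
```

Source the file and run start_pks_cluster with your own deployment name. Because each command must succeed before the next one runs, master/2 is not started until the start task for master/1 has finished.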