Several TKGI components operate in a leader/follower mode. In this high-availability pattern, the leader is the entry point for requests and is responsible for coordinating tasks with the followers. The components covered here are kube-controller-manager, kube-scheduler, the vSphere CSI controller components, etcd, and NCP.
In an environment with multiple control plane and worker nodes, tracking down the leader is important for troubleshooting and log review. The components shown in the output below use the Lease API from the coordination.k8s.io API group for leader election: the leading replica holds the lease and renews it continuously, and it must renew within the lease duration (Lease Duration Seconds, the leaseDurationSeconds field of the Lease spec) to remain the leader. List the leases with:
kubectl get leases.coordination.k8s.io -A | grep -v node
NAMESPACE NAME HOLDER AGE
kube-system kube-controller-manager ad975454-1101-4a24-b2fa-25705d3b9dc0_faf633cc-0d5a-4b8a-ba45-c85bbbd50024 127m
kube-system kube-scheduler ad975454-1101-4a24-b2fa-25705d3b9dc0_8109191c-1eb4-4d13-967b-1735e19086fb 127m
vmware-system-csi csi-vsphere-vmware-com ad975454-1101-4a24-b2fa-25705d3b9dc0 127m
vmware-system-csi external-attacher-leader-csi-vsphere-vmware-com ad975454-1101-4a24-b2fa-25705d3b9dc0 127m
vmware-system-csi external-resizer-csi-vsphere-vmware-com ad975454-1101-4a24-b2fa-25705d3b9dc0 127m
vmware-system-csi vsphere-syncer ad975454-1101-4a24-b2fa-25705d3b9dc0 127m
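The lease listing above can be reduced to a lease-to-hostname mapping. The following is a minimal sketch, not part of the original procedure: `lease_holders` is a hypothetical helper name, and it assumes the four-column layout shown above and that any suffix after the first underscore in the HOLDER value is a per-process ID rather than part of the hostname.

```shell
#!/usr/bin/env bash
# lease_holders (hypothetical helper): read the output of
# `kubectl get leases.coordination.k8s.io -A` on stdin and print
# "<namespace>/<lease> <holder-hostname>" for each lease.
lease_holders() {
  awk '
    NR == 1 && $1 == "NAMESPACE" { next }   # skip the header row
    NF >= 3 {
      holder = $3
      sub(/_.*/, "", holder)                # assumption: text after "_" is a process ID
      print $1 "/" $2, holder
    }
  '
}

# Typical use (requires access to the cluster):
#   kubectl get leases.coordination.k8s.io -A | grep -v node | lease_holders
```

Stripping the suffix makes the kube-controller-manager and kube-scheduler holders directly comparable with the VM hostnames used later in this article.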
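The names in the HOLDER column identify the nodes that hold each lease. For kube-controller-manager and kube-scheduler, the node name is followed by an underscore and a unique ID. These holder names do not correspond to the Kubernetes node names; they are the hostnames of the BOSH-deployed VMs. The following command lists the hostname of each master VM: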
bosh -d service-instance_aeec33f2-0c07-444f-a20e-3648d3ac18ed ssh master hostname | egrep -v 'subject|to|use'
master/a2cb06fc-c6d2-477c-bdfb-6212591b38c6: stdout | 6e2aa260-2ec5-4537-9133-46192d858a3b
master/31c0f1f6-2104-4479-a4e3-39ed63aadc5c: stdout | f8ad35c5-198d-46c8-bdb7-bbf610b81329
master/9ddb3dfe-a988-4249-a2e7-0ba1ec0ac47b: stdout | ad975454-1101-4a24-b2fa-25705d3b9dc0
As the output above shows, all the leases in this environment are held by the node with hostname ad975454-1101-4a24-b2fa-25705d3b9dc0, which is master/9ddb3dfe-a988-4249-a2e7-0ba1ec0ac47b. This means the replica running on this node is the leader for these components. You can bosh ssh to this node to monitor it and review the logs.
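Matching a lease holder hostname against the bosh ssh output can also be scripted. This is a sketch only: `instance_for_hostname` is a hypothetical helper name, and it assumes each output line has the form `master/<id>: stdout | <hostname>` as in the sample above.

```shell
#!/usr/bin/env bash
# instance_for_hostname HOSTNAME (hypothetical helper): read the output of
# `bosh -d <deployment> ssh master hostname` on stdin and print the BOSH
# instance whose VM hostname equals HOSTNAME.
instance_for_hostname() {
  awk -v want="$1" '
    # Expected line shape: master/<id>: stdout | <hostname>
    $2 == "stdout" && $3 == "|" && $4 == want {
      sub(/:$/, "", $1)   # strip the trailing colon from the instance name
      print $1
    }
  '
}

# Typical use (requires BOSH access; the deployment name is from this article):
#   bosh -d service-instance_aeec33f2-0c07-444f-a20e-3648d3ac18ed ssh master hostname \
#     | instance_for_hostname ad975454-1101-4a24-b2fa-25705d3b9dc0
```

Because the function matches on the `stdout |` marker, the banner lines that the `egrep -v` filter removes are ignored without extra filtering.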
The command below identifies the etcd leader, which in this environment is master/9ddb3dfe-a988-4249-a2e7-0ba1ec0ac47b:
bosh -d service-instance_aeec33f2-0c07-444f-a20e-3648d3ac18ed ssh master/0 "ETCDCTL_API=3 /var/vcap/jobs/etcd/bin/etcdctl endpoint status" | egrep -v 'subject|to|use' | grep true
master/9ddb3dfe-a988-4249-a2e7-0ba1ec0ac47b: stdout | https://master-0.etcd.cfcr.internal:2379, 17f206fd866fdab2, 3.5.4, 5.5 MB, true, false, 4, 28536, 28536,
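Filtering with `grep true` matches the word anywhere on the line, so a stricter check can read the leader column explicitly. The sketch below assumes the comma-separated column order that etcdctl 3.4 and later print for `endpoint status` (endpoint, ID, version, DB size, is leader, is learner, raft term, ...), which is consistent with the sample line above; `etcd_leader` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# etcd_leader (hypothetical helper): read bosh-wrapped
# `etcdctl endpoint status` output on stdin and print
# "<bosh-instance> <endpoint>" for each member whose fifth column
# (assumed to be IS LEADER) is "true".
etcd_leader() {
  awk -F ' [|] ' '
    NF == 2 {
      n = split($2, col, /, */)          # split the etcdctl columns
      if (n >= 5 && col[5] == "true") {
        instance = $1
        sub(/: stdout$/, "", instance)   # keep only the instance name
        print instance, col[1]
      }
    }
  '
}

# Typical use (requires BOSH access; command taken from this article):
#   bosh -d service-instance_aeec33f2-0c07-444f-a20e-3648d3ac18ed ssh master/0 \
#     "ETCDCTL_API=3 /var/vcap/jobs/etcd/bin/etcdctl endpoint status" | etcd_leader
```

Reading the fifth column avoids a false match on a member that is a learner but not the leader.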
To identify the NCP leader, use nsxcli, which is present on the Kubernetes cluster master VMs:
bosh -d service-instance_aeec33f2-0c07-444f-a20e-3648d3ac18ed ssh master "sudo /var/vcap/jobs/ncp/bin/nsxcli -c get ncp-master status" | egrep -v 'subject|to|use' | grep "This instance is the NCP master"
master/31c0f1f6-2104-4479-a4e3-39ed63aadc5c: stdout | This instance is the NCP master