The environment was shut down as outlined in "Shutting down and restarting Tanzu Kubernetes Grid Integrated Edition".
On startup, the Master nodes started successfully, but startup of the Worker nodes failed while validating the health of etcd:
Task 564189 | 11:05:13 | L executing post-start: master/########-####-####-####-############ (0) (canary) (00:04:10)
L Error: Action Failed get_task: Task 6333b5e9-7f48-490b-6fc5-311eab3f3653 result: 1 of 5 post-start scripts failed. Failed Jobs: etcd. Successful Jobs: bosh-dns, kubernetes-roles, kube-apiserver, pks-nsx-t-ncp.
The etcd process is running on all 3 Master nodes, and all 3 members are listed in the etcd cluster configuration:
# etcdctl member list -w table
+------------------+---------+--------------------------------------+------------------------------------------+------------------------------------------+------------+
| ID | STATUS | NAME | PEER ADDRS | CLIENT ADDRS | IS LEARNER |
+------------------+---------+--------------------------------------+------------------------------------------+------------------------------------------+------------+
| 17f206fd866fdab2 | started | ########-####-####-####-############ | https://master-0.etcd.cfcr.internal:2380 | https://master-0.etcd.cfcr.internal:2379 | false |
| 8f18440d0ccf8bf9 | started | ########-####-####-####-############ | https://master-1.etcd.cfcr.internal:2380 | https://master-1.etcd.cfcr.internal:2379 | false |
| fce4f52fecd850d5 | started | ########-####-####-####-############ | https://master-2.etcd.cfcr.internal:2380 | https://master-2.etcd.cfcr.internal:2379 | false |
+------------------+---------+--------------------------------------+------------------------------------------+------------------------------------------+------------+
However, etcd is not healthy on 2 of the nodes, and the etcdctl client cannot connect to them:
# etcdctl endpoint --cluster health -w table
{"level":"warn","ts":"2025-07-09T11:21:37.437621Z","logger":"client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00032e000/master-2.etcd.cfcr.internal:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = latest balancer error: last connection error: connection error: desc = \"transport: authentication handshake failed: remote error: tls: internal error\""}
{"level":"warn","ts":"2025-07-09T11:21:37.437542Z","logger":"client","caller":"[email protected]/retry_interceptor.go:63","msg":"retrying of unary invoker failed","target":"etcd-endpoints://0xc00032e780/master-0.etcd.cfcr.internal:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
+------------------------------------------+--------+--------------+---------------------------+
| ENDPOINT | HEALTH | TOOK | ERROR |
+------------------------------------------+--------+--------------+---------------------------+
| https://master-1.etcd.cfcr.internal:2379 | true | 20.010757ms | |
| https://master-2.etcd.cfcr.internal:2379 | false | 5.005s | context deadline exceeded |
| https://master-0.etcd.cfcr.internal:2379 | false | 5.002047351s | context deadline exceeded |
+------------------------------------------+--------+--------------+---------------------------+
Restart etcd on the two unhealthy nodes, then check the cluster health again:
monit stop etcd
monit start etcd
etcdctl endpoint --cluster health -w table
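To make the post-restart check easier to read, the health table can be filtered down to only the endpoints that still report false. The following is a minimal sketch, not part of the documented procedure: the function name unhealthy_endpoints is hypothetical, and it assumes the table layout shown above (ENDPOINT in the first column, HEALTH in the second).

```shell
# Hypothetical helper: reads "etcdctl endpoint --cluster health -w table"
# output on stdin and prints the endpoints whose HEALTH column is "false".
# Border lines (+---+) and JSON warning lines contain no "|" separators
# and are skipped by the field-count check.
unhealthy_endpoints() {
  awk -F'|' 'NF >= 5 {
    gsub(/ /, "", $2)   # endpoint URL, padding removed
    gsub(/ /, "", $3)   # health value, padding removed
    if ($3 == "false") print $2
  }'
}

# Example usage on a Master node (warnings go to stderr and are discarded):
#   etcdctl endpoint --cluster health -w table 2>/dev/null | unhealthy_endpoints
```

If the helper prints nothing, every endpoint in the table reported true.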
If etcd is still unhealthy after the restart, please contact Broadcom Tanzu Support for assistance.