Gather Workload Management Support Bundle
Workload Management support bundles can be retrieved by logging into the VC UI and selecting Menu -> Workload Platform -> Clusters -> Export Logs, with the appropriate cluster selected.
- This works even if the cluster is stuck in a removing, configuring, or updating state.
- This includes a vCenter log bundle.
- This does not include ESXi logs. If the issue pertains to vSphere Pods or to the Guest Cluster VMs themselves, customers should additionally gather ESXi logs and upload them to their support ticket.
- If the log bundle from the GUI does not work, you can gather the logs manually from the command line. Follow this KB to SSH into each Supervisor Control Plane VM and run the following command, which gathers logs for only the Supervisor Control Plane VM on which it is run. You will then need to manually scp the files off of the machine.
root@42184a1e6d3c54eff2384b2736cf2079 [ /usr/bin ]# wcp-agent-support.sh
Gather Guest Cluster (VKS) Support Bundle
This bundle is gathered via a CLI tool attached to this KB. The tool is supported only on macOS and Linux jumpboxes.
Prerequisites:
1. A Linux or macOS jumpbox to run the tool from. If you are a Windows-only shop, you can use the vCenter or a Supervisor Control Plane VM as the jumpbox to run the bundler from.
Note: To run kubectl commands on vCenter, you can pull kubectl from the supervisor cluster by running this command on vCenter as root:
# curl -k https://$(/usr/lib/vmware-wcp/decryptK8Pwd.py | grep IP -m 1 | awk '{print $2}')/wcp/plugin/linux-amd64/vsphere-plugin.zip -o /tmp/vsphere-plugin.zip && unzip -d /usr /tmp/vsphere-plugin.zip
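The pipeline in the command above extracts the Supervisor IP from the output of decryptK8Pwd.py. As a minimal sketch of just that extraction step, run here against hypothetical sample output (the real script only exists on vCenter, and the exact output format shown below is an assumption for illustration):

```shell
# Hypothetical sample of decryptK8Pwd.py output; the real script runs on vCenter.
sample_output='Cluster: domain-c8:xxxx
IP: 192.0.2.10
PWD: not-a-real-password'

# Same extraction as in the curl command above: first line containing "IP", second field.
ip=$(printf '%s\n' "$sample_output" | grep IP -m 1 | awk '{print $2}')
echo "$ip"
```

The extracted address is then used as the host portion of the vsphere-plugin.zip download URL.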
2. The supervisor cluster kubeconfig file present on the system from which the vks-support-bundler command will be run. This can either be copied from another system or generated by running the kubectl vsphere login command.
3. Your current Kubernetes context must be set to the supervisor cluster.
4. When the user chooses the Guestops channel to gather logs, they must be a member of the Administrator group, because generating the support bundle requires permissions to create users and roles and to add users to the ServiceProvider group.
5. For clusters with Windows nodes: if users collect logs via the SSH channel, they do not need to prepare the admin username and password before log collection. If users collect logs via the Guestops channel, they must prepare the admin username and password before log collection. There are two ways to prepare it:
- Added in advance via BYOI (Bring Your Own Image): this is the only feasible approach if the customer wants to collect logs when the node network does not work.
- Added through SSH:
  - Use the script named set_windows_adminuser.sh attached to this KB (e.g. ./set_windows_adminuser.sh {cluster-name} {cluster-namespace} {windows-admin-user}).
  - The environment must have a kubeconfig file with admin permissions that allows access to the Supervisor Cluster.
  - This script should be executed on a machine that has access to the same subnet as the guest cluster.
  - This script will SSH into the VMs to add a new admin username and password.
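The attached script takes the three positional arguments shown above. As an illustrative sketch of that argument handling only (the function name and output below are hypothetical; the real script additionally SSHes into each node VM to create the user):

```shell
# Illustrative argument validation, mirroring the usage shown above.
set_windows_adminuser() {
  if [ "$#" -ne 3 ]; then
    echo "usage: set_windows_adminuser.sh {cluster-name} {cluster-namespace} {windows-admin-user}" >&2
    return 1
  fi
  # The real script would now SSH into each node VM and create the admin user.
  echo "cluster=$1 namespace=$2 user=$3"
}

set_windows_adminuser demo-cluster demo-namespace demo-winadmin
```

Running it with anything other than three arguments prints the usage line and fails, which matches the invocation pattern documented above.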
Gather Windows logs via Guestops channel -
The following applies to the specific case of: Guestops channel + Windows cluster + Windows user created in advance + VPC environment.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vks-support-bundler-sa
  namespace: <cluster-namespace>
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: vks-support-bundler-rolebinding
  namespace: <cluster-namespace>
subjects:
- kind: ServiceAccount
  name: vks-support-bundler-sa
  namespace: <cluster-namespace>
roleRef:
  kind: ClusterRole
  name: view
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: vks-support-bundler-secret-role
  namespace: <cluster-namespace>
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["<clustername>-ssh"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: vks-support-bundler-secret-rolebinding
  namespace: <cluster-namespace>
subjects:
- kind: ServiceAccount
  name: vks-support-bundler-sa
  namespace: <cluster-namespace>
roleRef:
  kind: Role
  name: vks-support-bundler-secret-role
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Pod
metadata:
  name: podvm-1
  namespace: <cluster-namespace>
spec:
  serviceAccountName: vks-support-bundler-sa
  containers:
  - image: photon:3.0
    name: vpc-traffic-podvm-1
    securityContext:
      runAsUser: 0
    command: [ "/bin/bash", "-c", "--" ]
    args:
    - |
      rm -f /etc/yum.repos.d/photon-updates.repo /etc/yum.repos.d/photon-extras.repo
      yum install -y jq openssh-server
      curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
      chmod +x kubectl
      mv kubectl /usr/bin/kubectl
      while true; do sleep 30; done
kubectl apply -f podvm.yaml
kubectl cp set_windows_adminuser.sh -n <cluster-namespace> podvm-1:/tmp/set_windows_adminuser.sh
kubectl exec -it podvm-1 -n <cluster-namespace> -- /bin/bash
chmod +x /tmp/set_windows_adminuser.sh
/tmp/set_windows_adminuser.sh <cluster-name> <cluster-namespace> <windows-admin-user>
kubectl delete -f podvm.yaml
vks-support-bundler create \
-k <supervisor-kubeconfig> \
-o <output-dir> \
-c <cluster-name> \
-n <cluster-namespace> \
-v <vc-ip> \
-w <windows-admin-username> \
-u <vc-username> \
-i
6. When choosing the SSH channel to collect logs, users must run the VKS Support Bundler CLI in the same subnet as the guest cluster, so that the nodes are ping-reachable. In a VPC environment, nodes cannot be SSHed directly even from the same subnet as the cluster; users must run vks-support-bundler from a PodVM. For an example PodVM file, see the Additional Information section below.
Flags available for vks-support-bundler:
./vks-support-bundler help create
Create a Kubernetes cluster support bundle
Usage:
vks-support-bundler create [flags]
Flags:
-b, --batch-size int Number of nodes on which parallel collection is triggered (default 5)
--ca-certificate string Path to the endpoint public certificate file
--channel string Communication protocol to use for support bundle collection (guestops, ssh) (default "guestops")
-c, --cluster string Kubernetes cluster to collect support-bundle for
--config string Path to the YAML config file
--controlplane-node-only To collect support bundle only from control plane nodes
-h, --help help for create
-i, --insecure Creates an insecure connection to the VC
-k, --kubeconfig string Absolute path to the kubeconfig (default "/home/<username>/.kube/config")
-l, --log-ns string Comma separated namespaces list whose logs should be included
-n, --namespace string Supervisor Cluster namespace where the Kubernetes cluster resides
-s, --node-stats To include the node stats in the support bundle
-o, --output string Absolute path to the directory where the support-bundle will be stored, e.g. /home/myuser/mybundle
-p, --progress-bar To progress-bar for support-bundle collection per node
-t, --resource-types string Comma separated list of Kubernetes resource types (e.g. pvc,pv)
--skip-create-user Use the provided user to run GuestOps without creating a temporary user
-u, --user string VC User name
-v, --vc string VC IP or FQDN with optional port (default: 443 for HTTPS).
-V, --verbose Collect additional logs to help debug support-bundle collection failures and print to stderr
-w, --windows-admin-username string Windows vm admin username
-e, --windows-event-hours-ago string Specify the collection of windows events within a few hours (default "12")
Guestops channel -
Required flags:
-c, --cluster string Kubernetes cluster to collect support-bundle for
-n, --namespace string Supervisor Cluster namespace where the Kubernetes cluster resides
-o, --output string Absolute path to the directory where the support-bundle will be stored, e.g. /home/myuser/mybundle
-u, --user string VC User name
-v, --vc string VC IP or FQDN with optional port (default: 443 for HTTPS).
-w, --windows-admin-username string Windows vm admin username ## This should only be used in a cluster that has Windows nodes.
Example for default (guestops) support bundle where
- .kube/config file lives under ~/.kube/config and has its context set to the supervisor cluster
- 192.0.2.15 is the vCenter ip address
- Admin user is Administrator and the VMware SSO domain is vsphere.local
- Guest cluster name is guestcluster01
- Supervisor Cluster Namespace where the Guest Cluster lives is supcluster01
- Output of the log bundle would be the user's home directory which is ~/
./vks-support-bundler create -k ~/.kube/config -v 192.0.2.15 -u administrator@vsphere.local -c guestcluster01 -n supcluster01 -o ~/ -i -p
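When collecting bundles for several clusters, the invocation above can be assembled from variables in a small hypothetical wrapper (not part of the tool; values are the example values listed above). A sketch:

```shell
# Assemble the guestops invocation from variables (example values from above).
VC_IP="192.0.2.15"
VC_USER="administrator@vsphere.local"
CLUSTER="guestcluster01"
NAMESPACE="supcluster01"
OUTPUT_DIR="$HOME"

cmd="./vks-support-bundler create -k $HOME/.kube/config -v $VC_IP -u $VC_USER -c $CLUSTER -n $NAMESPACE -o $OUTPUT_DIR -i -p"
echo "$cmd"
```

Swapping the cluster and namespace variables is then enough to collect a bundle for a different guest cluster.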
SSH channel -
Required flags:
-c, --cluster string Kubernetes cluster to collect support-bundle for
-n, --namespace string Supervisor Cluster namespace where the Kubernetes cluster resides
-o, --output string Absolute path to the directory where the support-bundle will be stored, e.g. /home/myuser/mybundle
Example for support bundle collection via ssh channel
- .kube/config file lives under ~/.kube/config and has its context set to the supervisor cluster
- ssh represents support bundle collection via ssh channel
- Guest cluster name is guestcluster01
- Supervisor Cluster Namespace where the Guest Cluster lives is supcluster01
- Output of the log bundle would be the user's home directory which is ~/
./vks-support-bundler create -k ~/.kube/config -c guestcluster01 -n supcluster01 --channel ssh -o ~/
If a service account named "vks-support-bundler-user-{cluster-name}-{cluster-namespace}" or permissions associated with it (role name: "vks-support-bundler-guestops-role-{cluster-name}-{cluster-namespace}") already exist, the log bundle collection will fail. Users must therefore clean up this service account and the related role before collecting logs.
There are two methods to delete them:
1. Automatic Deletion: after running the vks-support-bundler binary, the tool prompts for automatic deletion.
2. Manual Deletion:
To delete a role, navigate through the VC UI to Administration -> Roles, find the specific role, and then click the delete button.
To delete a user account, navigate through the VC UI to Administration -> Single Sign-On -> Users and Groups, find the specific user account, and then click the delete button.
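Because the leftover objects follow the fixed naming pattern above, the exact names to look for in the VC UI can be derived from the cluster name and namespace. A small sketch (the cluster and namespace values below are illustrative):

```shell
# Derive the leftover user and role names from the documented pattern.
cluster="guestcluster01"
namespace="supcluster01"

sa_user="vks-support-bundler-user-${cluster}-${namespace}"
role="vks-support-bundler-guestops-role-${cluster}-${namespace}"

echo "$sa_user"
echo "$role"
```

Search for these two names under Users and Groups and Roles respectively when cleaning up manually.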
Note:
--resource-types and --log-ns: see the flag descriptions above.
The vks-support-bundler 3.6.0 release has the following changes:
Flag changes:
- --channel flag to allow collecting logs via the guestops or ssh channel
Improvements:
Added extra data:
Node stats:
pvc info | sudo kubectl --kubeconfig=/etc/kubernetes/admin.conf get pvc --chunk-size=10 -A -o wide -v 9 &> kubectl-pvc.out
pv info | sudo kubectl --kubeconfig=/etc/kubernetes/admin.conf get pv --chunk-size=10 -A -o wide -v 9 &> kubectl-pv.out
NFS network statistics | sar -n NFS 1 5 > sarnfs1-5.out
system mount information | cat /etc/fstab > etc-fstab.out
all mounted file systems | findmnt >/dev/null 2>&1 && findmnt > findmnt.out
mount statistics | cat /proc/self/mountstats > mountstats.out 2>&1
IO stats per device | iostat -N 1 5 > iostat-N.out
NFS statistics | nfsstat 1 5 > nfsstat.out
Tuned log | sudo journalctl -xeu tuned &> journalctl-tuned.out
Tuned log | sudo cat /var/log/tuned/tuned.log &> tuned-log.out
Guestinfo metadata | vmtoolsd --cmd "info-get guestinfo.metadata" | base64 -d | gunzip > guestinfo-metadata.out
K8s objects:
ippools | sudo kubectl get ippools --chunk-size=500 -o yaml --kubeconfig ${KUBECONFIG} >> ippools.yaml
network-attachment-definitions | sudo kubectl get network-attachment-definitions -A --chunk-size=500 -o yaml --kubeconfig ${KUBECONFIG} >> network-attachment-definitions-all-namespaces.yaml
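The Guestinfo metadata row above decodes data that is stored gzip-compressed and base64-encoded. The decode half of that pipeline can be demonstrated on locally generated sample data (vmtoolsd itself only exists inside a VM, so the sample string here stands in for the real guestinfo payload):

```shell
# Encode sample data the way guestinfo.metadata is stored (gzip, then base64)...
encoded=$(printf 'sample-guestinfo-metadata' | gzip -c | base64 | tr -d '\n')

# ...then decode it with the same pipeline the collection command uses.
decoded=$(printf '%s' "$encoded" | base64 -d | gunzip)
echo "$decoded"
```

On a real node, vmtoolsd supplies the encoded payload and the decoded result is written to guestinfo-metadata.out.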
Example VPC PodVM -
Note:
If the current PodVM image is not pullable, update it to a photon:3.0 image the customer can pull.
The memory/CPU/storage values under resources are examples; they are sufficient for collecting a support bundle from a cluster with 3 control plane nodes and 150 worker nodes.
apiVersion: v1
kind: Pod
metadata:
  name: vks-support-bundler-podvm
  namespace: <cluster-namespace>
spec:
  containers:
  - image: "photon:3.0"
    name: vpc-traffic-podvm-1
    securityContext:
      runAsUser: 0
    resources:
      requests:
        memory: 2Gi
        cpu: 500m
      limits:
        memory: 4Gi
        cpu: 2
    volumeMounts:
    - name: support-bundler-storage
      mountPath: /data
    command: ["/bin/bash", "-c", "--"]
    args:
    - |
      while true; do sleep 30; done
  volumes:
  - name: support-bundler-storage
    persistentVolumeClaim:
      claimName: vks-support-bundler-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vks-support-bundler-pvc
  namespace: <cluster-namespace>
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: <storageclass-name>
Run vks-support-bundler in VPC PodVM -
- Copy vks-support-bundler and the supervisor kubeconfig into the PodVM:
kubectl cp vks-support-bundler -n <cluster-namespace> vks-support-bundler-podvm:/tmp
kubectl cp <sv-kubeconfig> -n <cluster-namespace> vks-support-bundler-podvm:/tmp
- kubectl exec into the PodVM and make the binary executable:
kubectl exec -it -n <cluster-namespace> vks-support-bundler-podvm -- /bin/sh
chmod +x /tmp/vks-support-bundler
------------------------------------------------------------------------------------------------------------------------------
The vks-support-bundler 3.5.0 release has the following changes:
Flag changes:
Improvements:
Added extra data:
------------------------------------------------------------------------------------------------------------------------------
The vks-support-bundler 3.4.0 release has the following changes:
New Features -
Bug Fixes -
CLI Flag Changes -
------------------------------------------------------------------------------------------------------------------------------