This article provides custom args and diagnostics.crsh files to use with the VMware Tanzu Kubernetes Grid (TKG) Crash Diagnostic (Crashd) utility and shows how to use them to gather data from TKG clusters in specific scenarios or use cases.
The attached custom files can:
Note: The scripts referenced in this article have been tested on Ubuntu-based TKG clusters.
Point your Crashd binary at any of the attached pairs of args and diagnostics.crsh files, based on your TKG diagnostic requirements.
The following is an overview of the procedure. The details of the procedure differ based on your scenario:
1. Select a pair of args and diagnostics.crsh files.
2. Download or copy the referenced args and diagnostics.crsh files to the system Crashd will run from.
3. Set the mentioned parameters in the args file.
Optional: Customize the following in the diagnostics.crsh file:
System commands to run on nodes:
Use capture() to save the command output to local files.
Use run() to display the results on local stdout.
Kubernetes objects or Kubernetes logs to retrieve from the Kubernetes API.
4. Run crashd. The following is an example:
crashd run --args-file <args-file> --debug <diagnostics.crsh-file>
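As a minimal sketch of step 4, the following shell function checks that both files are readable before it assembles the command shown above. The function name crashd_cmd is hypothetical and not part of Crashd. It only prints the command, so you can review the line before running it.

```shell
#!/bin/sh
# Hypothetical helper (not part of Crashd): print the crashd command for a
# pair of files, refusing to continue if either file is missing or unreadable.
crashd_cmd() {
    args_file=$1
    diag_file=$2
    for f in "$args_file" "$diag_file"; do
        if [ ! -r "$f" ]; then
            echo "cannot read: $f" >&2
            return 1
        fi
    done
    # Same invocation as in the article; only the file names are filled in.
    printf 'crashd run --args-file %s --debug %s\n' "$args_file" "$diag_file"
}

# Example (file names are placeholders):
#   crashd_cmd ./args ./diagnostics.crsh
```

Printing instead of executing keeps the helper safe to use on a system where the files are still being edited.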
1. Install Crashd (Crash Diagnostics). For more information, see the following TKG documentation: Troubleshooting Tanzu Kubernetes Clusters with Crash Diagnostic
2. Confirm that crashd is working in your environment before using the custom files in the Resolution section.
Note: First confirm that Crashd runs successfully with the default args and diagnostics.crsh files that are provided with the Crashd download, configured as described in the TKG documentation.
Pick and run the Crashd custom files that fit your scenario or apply to your project.
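A quick preflight check can catch a missing binary before a diagnostic run. The sketch below is a hypothetical helper, and the tool list in the example (crashd, kubectl, ssh) is an assumption based on the commands and settings this article mentions. It confirms only that the tools are on the PATH, not that Crashd can reach your cluster.

```shell
#!/bin/sh
# Hypothetical preflight helper: print each named tool that is not found
# on PATH, one per line. Returns non-zero if any tool is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Example (tool list is an assumption):
#   missing_tools crashd kubectl ssh || echo "install the tools listed above"
```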
Scenario 1:
You need additional system diagnostics from your cluster nodes.
Details:
The following args and diagnostics.crsh files capture system-only diagnostics (no Kubernetes objects) from all cluster nodes based on the current kubeconfig context.
These files also include an example of installing additional system utilities (such as netstat).
Files:
args_sysdiag_all_hosts.txt
diagnostics.crsh_sysdiag_all_hosts.txt
This diagnostics.crsh file:
Sources the TKG args file. Requires only the ssh_pk_file and workdir values.
Uses current kubeconfig context to identify cluster nodes.
Connects to each node.
Executes all capture and/or run commands placed in the diagnostics.crsh file.
Creates a tar file of outputs.
Steps:
1. Download args_sysdiag_all_hosts.txt and diagnostics.crsh_sysdiag_all_hosts.txt files.
2. Set the ssh_pk_file, workdir, and cluster_config values in the args file.
3. Edit the diagnostics.crsh file and add your own commands as needed.
4. Add the commands to the "## COMMANDS" section of the file.
5. Set the kubeconfig context to the cluster of your choice, or keep the context that is already current in the default kubeconfig file, /home/ubuntu/.kube/config:
kubectl config use-context <tkg-cluster-context>
Execute:
Run the following crashd command:
crashd run --args-file <args-file> --debug <diagnostics.crsh-file>
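Before running Scenario 1, you may want to confirm that the values from step 2 are actually set. The following sketch assumes that the args file contains key=value lines and that lines starting with # are comments; verify this against the attached file. The helper names args_value and check_args are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key from an args file that is
# assumed to contain "key=value" lines ("#" starts a comment line).
# If the key appears more than once, the last occurrence is printed.
args_value() {
    # $1 = args file, $2 = key
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" | tail -n 1
}

# Print each required key that is missing or empty; non-zero if any.
check_args() {
    file=$1
    shift
    rc=0
    for key in "$@"; do
        if [ -z "$(args_value "$file" "$key")" ]; then
            echo "not set: $key"
            rc=1
        fi
    done
    return "$rc"
}

# Example, using the keys named in step 2:
#   check_args args_sysdiag_all_hosts.txt ssh_pk_file workdir cluster_config
```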
Scenario 2:
You want system diagnostics from specific cluster nodes.
Details:
The following args and diagnostics.crsh files capture system-only diagnostics from a list of cluster nodes that you provide.
Files:
args_sysdiag_custom_hosts.txt
diagnostics.crsh_sysdiag_custom_hosts.txt
This diagnostics.crsh file:
Sources the TKG args file.
Requires ssh_pk_file, workdir, hosts (a list of host IP addresses).
Connects to each node.
Executes all "capture" commands listed in the diagnostics.crsh file.
Creates a tar file of outputs.
Steps:
1. Download args_sysdiag_custom_hosts.txt and diagnostics.crsh_sysdiag_custom_hosts.txt.
2. Set the ssh_pk_file, ssh_user, and workdir values in the args file. Set only_target_hosts=yes.
3. Set hosts=<comma-delimited list of node IP addresses>.
4. Edit the diagnostics.crsh file and add your own commands as needed.
Execute:
Run the following crashd command:
crashd run --args-file <args-file> --debug <diagnostics.crsh-file>
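The hosts value in step 3 is easy to mistype when it is built by hand. The sketch below joins node addresses into the comma-delimited form and rejects entries that are not shaped like IPv4 addresses. The helper name hosts_line is hypothetical, and the IPv4-only check is an assumption; it does not verify that each octet is at most 255.

```shell
#!/bin/sh
# Hypothetical helper: join node IP addresses into the comma-delimited
# hosts line, rejecting anything that is not a dotted-quad IPv4 pattern.
hosts_line() {
    list=
    for ip in "$@"; do
        if ! printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
            echo "not an IPv4 address: $ip" >&2
            return 1
        fi
        # Append with a comma only when the list is already non-empty.
        list="${list:+$list,}$ip"
    done
    printf 'hosts=%s\n' "$list"
}

# Example with placeholder addresses:
#   hosts_line 10.0.0.11 10.0.0.12
# Example that reads node addresses from the current kubeconfig context:
#   hosts_line $(kubectl get nodes -o jsonpath='{.items[*].status.addresses[?(@.type=="InternalIP")].address}')
```

Paste the printed line into the args file in place of the existing hosts entry.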
Scenario 3:
You only want Kubernetes objects from your clusters and you want to customize your system.
Details:
The following args and diagnostics.crsh files capture only Kubernetes objects from your management cluster or a list of workload clusters. No system diagnostics are captured.
Files:
args-custom_kube_capture.txt
diagnostics.crsh-custom_kube_capture.txt
This diagnostics.crsh file:
Steps:
1. Download args-custom_kube_capture.txt and diagnostics.crsh-custom_kube_capture.txt files.
3. Set cluster target type.
4. Edit the diagnostics.crsh file and:
Update the Kubernetes namespaces to diagnose.
Update the Kubernetes objects to capture.
Refer to examples included in this file.
Execute:
Run the following crashd command:
crashd run --args-file <args-file> --debug <diagnostics.crsh-file>
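When you update the namespaces in step 4, a misspelled name can leave that namespace out of the capture. As a rough sketch, the hypothetical helper below compares the namespaces you plan to list against the namespaces that exist in the cluster and prints any that do not match. The namespace names in the example are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: the first argument is a whitespace-separated list of
# namespaces that exist in the cluster; the remaining arguments are the
# namespaces you plan to list in the diagnostics.crsh file. Prints each
# planned namespace that does not exist.
unknown_namespaces() {
    # Normalize separators so every name is surrounded by single spaces.
    available=" $(printf '%s ' $1)"
    shift
    for ns in "$@"; do
        case "$available" in
            *" $ns "*) ;;
            *) echo "$ns" ;;
        esac
    done
}

# Example that reads the existing namespaces from the current context:
#   unknown_namespaces "$(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}')" default tkg-system
```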