Notes:
- The metrics collected by this process cover only the last 2 days.
- The collection operation invokes a number of queries and can take a while to finish.
Collection steps:
- Ensure the performance issue is actively occurring.
- Enable Verbose and Network Diagnostics mode via vSAN cluster -> Configure -> vSAN -> Services -> Performance Service -> Edit
- Let the system run for 10 to 15 minutes.
- Collect the ESXi host logs from all hosts in the cluster, and also the logs from vCenter
- Disable the Verbose and Network Diagnostics mode
- Upload the logs to VMware for analysis
Notes:
- It is highly recommended to collect the host logs before the vCenter logs. Host logs tend to wrap faster than vCenter logs, so collecting them first ensures the required host logging from the time of the event is captured.
- When troubleshooting performance issues, it is best to collect the logs from the Master/Leader host first to ensure the performance data from the time of the event is captured.
- When dealing with large clusters (20+ hosts) and collecting logs via vCenter, it is best to collect the host logs in small batches of no more than 5 hosts at a time to ensure the logs do not become corrupted.
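The ordering and batching guidance in these notes can be sketched as a small planning helper. This is an illustrative Python sketch only, not a VMware tool: the function name `plan_collection`, the example host names, and the `"vCenter"` placeholder entry are all hypothetical. It only computes the order in which to collect logs; it does not collect anything. It also assumes the Master/Leader host is already known, and it applies the batch limit of 5 regardless of cluster size, although the note above only calls for batching on large clusters.

```python
from typing import List


def plan_collection(hosts: List[str], master: str, batch_size: int = 5) -> List[List[str]]:
    """Return the collection order as a list of batches.

    Order (hypothetical helper, mirrors the notes above):
      1. the Master/Leader host on its own,
      2. the remaining hosts in batches of at most `batch_size`,
      3. vCenter last, because host logs wrap faster than vCenter logs.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if master not in hosts:
        raise ValueError("master must be one of the cluster hosts")

    # Every host except the Master, in the order given by the caller.
    others = [h for h in hosts if h != master]

    batches = [[master]]
    for start in range(0, len(others), batch_size):
        batches.append(others[start:start + batch_size])

    # Placeholder entry standing in for the vCenter log bundle.
    batches.append(["vCenter"])
    return batches


if __name__ == "__main__":
    # Example host names are made up for illustration.
    cluster = [f"esxi-{n:02d}" for n in range(1, 23)]
    for step, batch in enumerate(plan_collection(cluster, master="esxi-07"), start=1):
        print(step, ", ".join(batch))
```

Collecting the Master in a batch of its own is a choice made for this sketch; the notes only require that it be collected first.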
If the issue is not actively occurring but is reproducible on demand, do the following:
- Enable Verbose and Network Diagnostics mode via vSAN cluster -> Configure -> vSAN -> Services -> Performance Service -> Edit
- Reproduce the performance issue
- Let the system run for 10 to 15 minutes.
- Collect the ESXi host logs from all hosts in the cluster and also logs from vCenter
- Disable the Verbose and Network Diagnostics mode
- Upload the logs to VMware for analysis
If the performance issue is affecting log collection, we need, at a bare minimum, the logs from the Master host in the cluster, because the Master's log bundle contains the performance statistics of the cluster.
To determine the Master host in the cluster, check the following:
vCenter -> vSAN Cluster -> Monitor -> vSAN -> Health -> Performance Service -> Stats Master Election