Please note: If an administrator finds a host in a not responding state, use the steps below for initial investigation, and open a case with Broadcom technical support for further investigation.
- Ping the Host: Check if the host responds to ping requests.
- Check Management Network: Ensure the management interface is reachable via vCenter or SSH.
- Test vSAN Network: Confirm vSAN VMkernel adapters are online and reachable between hosts.
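The reachability checks above could be scripted roughly as follows. This is a minimal sketch, not a supported tool: the function name `check_hosts`, the `PING_CMD` variable, and the addresses in the usage note are hypothetical placeholders.

```shell
# Report whether each address answers a single ping.
# PING_CMD is a hypothetical override; it defaults to plain "ping".
# On an ESXi host it can be set to "vmkping -I vmkX" so the test leaves
# through a specific VMkernel adapter (for example the vSAN one).
check_hosts() {
  for addr in "$@"; do
    if ${PING_CMD:-ping} -c 1 "$addr" >/dev/null 2>&1; then
      echo "$addr reachable"
    else
      echo "$addr NOT reachable"
    fi
  done
}
```

For example, `PING_CMD="vmkping -I vmk1" check_hosts 192.0.2.11 192.0.2.12` would test two peer vSAN addresses through `vmk1`, assuming `vmk1` is the vSAN VMkernel adapter in your environment.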
- Navigate to vCenter > Monitor > vSAN > Health for cluster-wide alerts.
- Check for warnings about disconnected hosts or a network partition.
- Use out-of-band management tools (iLO/iDRAC) to verify power, fans, and NIC status.
- Ensure storage devices (disks, HBAs) are detected and healthy.
- Restart Management Services
- Access the host via the ESXi console or SSH and restart the management services, including the vSAN management service:
/etc/init.d/vpxa restart && /etc/init.d/hostd restart && /etc/init.d/vsanmgmtd restart
- If SSH is unavailable, use the Direct Console User Interface (DCUI).
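Because the restart command above chains the services with `&&`, it stops silently at the first service that fails to restart. A possible alternative, sketched below, restarts each service in turn and reports the result per service. The function name `restart_mgmt_services` and the `INIT_DIR` variable are hypothetical; the service names and the `/etc/init.d` path are the ones used in the command above.

```shell
# Restart each management service and report success or failure.
# Unlike the "&&" chain, this continues after a failed restart.
# INIT_DIR is a hypothetical override so the helper can be tried
# against stub scripts; it defaults to /etc/init.d.
restart_mgmt_services() {
  init_dir="${INIT_DIR:-/etc/init.d}"
  failed=0
  for svc in vpxa hostd vsanmgmtd; do
    if "$init_dir/$svc" restart >/dev/null 2>&1; then
      echo "$svc: restarted"
    else
      echo "$svc: restart FAILED"
      failed=1
    fi
  done
  # Non-zero exit status if any service failed to restart.
  return "$failed"
}
```

Whether continuing after a failure is desirable depends on the situation; if the services should only be restarted in strict sequence, keep the original chained command.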
- Use `df -h` via SSH to ensure the ESXi boot partition is not full.
- Resolve space issues by removing logs or unnecessary files.
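To make the `df -h` check above easier to read, the output can be filtered for filesystems above a usage threshold. This is a minimal sketch: the function name `flag_full_filesystems` and the default threshold of 90 percent are assumptions, and it expects the usual six-column `df -h` layout with the usage percentage in the fifth column.

```shell
# Read "df -h" output on stdin and print filesystems at or above a
# usage threshold (percent, default 90).
# Assumes the fifth column is Use% and the sixth is the mount point;
# mount points containing spaces would be truncated.
flag_full_filesystems() {
  threshold="${1:-90}"
  awk -v limit="$threshold" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)            # strip the trailing percent sign
      if (use + 0 >= limit) print $6 " is " $5 " full"
    }'
}
```

Usage: `df -h | flag_full_filesystems 90`. The helper only reports; which logs or files are safe to remove still has to be decided manually.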
- Validate Network Configuration
- Ensure vSAN VMkernel adapters are on the correct subnet and VLAN.
- Verify physical switch port statistics for errors/discards.
- Confirm firewall rules allow vSAN traffic (TCP/UDP 12321, 23451).
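As a starting point for the subnet check above, the vSAN-tagged VMkernel adapters can be listed and their IPv4 settings printed. The sketch below assumes that `esxcli vsan network list` prints one `VmkNic Name: vmkX` line per adapter; verify this against the output on your ESXi version. The function names are hypothetical.

```shell
# Extract VMkernel adapter names from "esxcli vsan network list"
# output supplied on stdin.
# Assumes each adapter appears on a line like "VmkNic Name: vmk1".
vsan_vmknics() {
  awk -F': ' '/VmkNic Name/ { print $2 }'
}

# Print the IPv4 configuration of every vSAN VMkernel adapter so the
# address and netmask can be compared across hosts.
# Must run on the ESXi host itself, where esxcli is available.
show_vsan_vmknic_ips() {
  for nic in $(esxcli vsan network list | vsan_vmknics); do
    esxcli network ip interface ipv4 get -i "$nic"
  done
}
```

Running `show_vsan_vmknic_ips` on each host and comparing the results is one way to spot an adapter on the wrong subnet. VLAN settings live on the port group and still need to be checked separately.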
- Host Logs (via SSH):
/var/log/vmware/vsan-health.log # vSAN health service
/var/log/vmware/hostd.log # Host management service
- vCenter Logs: Check for cluster reconfiguration events.
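When reviewing the logs listed above, a quick keyword count can show which file deserves a closer look. This is a rough sketch: the function name `count_log_matches` and the search keywords in the usage note are assumptions, not an official list of error strings.

```shell
# Count lines matching a case-insensitive extended regex in each log.
# First argument: the pattern; remaining arguments: log file paths.
count_log_matches() {
  pattern="$1"
  shift
  for log in "$@"; do
    if [ -r "$log" ]; then
      echo "$log: $(grep -i -c -E "$pattern" "$log")"
    else
      echo "$log: not readable"
    fi
  done
}
```

Usage: `count_log_matches 'partition|disconnect|not responding' /var/log/vmware/vsan-health.log /var/log/vmware/hostd.log`. A high count only points to where to read; the surrounding log lines still need to be interpreted in context.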
- Ensure the host is still part of the vSAN cluster in vCenter.
- Look for recent cluster reconfiguration or partition events.
- vSAN Performance graph: Use the vSAN Performance tool to analyze network latency/throughput.