This article explains the purpose of the vSAN 'Data Protection service health' check and describes why it might report an error state.
VMware vSAN 9.1
Q: What does the 'Data Protection service health' check do?
This health check monitors the overall health of the key services involved in data protection. It ensures that all required components (for both local vSAN snapshots and replication) are healthy.
The system automatically checks services like replication, snapshots, and recovery on both vCenter Server and ESX hosts. Depending on whether local snapshot or replication is configured, only the relevant services are evaluated.
Q: What does it mean when the check is in an error state?
If DPMS (Data Protection Monitoring Service) is unhealthy, ensure that the Protection and Recovery appliance is not blocked by the vCenter Server firewall ruleset. The ruleset can be reviewed in the vCenter Server Management Interface (URL: https://<vcenter-server-host>:5480).
After ruling out firewall issues, log in to the Protection and Recovery appliance VM and verify network connectivity to vCenter Server using network tools such as ping or traceroute. If connectivity from the appliance still fails, check whether the DPMS container is running. The following commands cover these checks:
# /usr/bin/ping <vcenter-server-host>
# /usr/bin/traceroute <vcenter-server-host>
# docker ps
(e.g. to check the status of dp-monitoring-dp-monitoring-1 and snapservice-snapservice-1)
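The container check above can be scripted so that missing containers stand out. The following is a minimal sketch, not an official tool: it assumes the two container names from the example above are the ones of interest and that the docker CLI on the appliance supports `docker ps --format '{{.Names}}'`. The function name `missing_containers` is hypothetical.

```shell
#!/bin/sh
# Sketch: list expected containers that are absent from `docker ps` output.
# The names below are taken from the example in this article; adjust as needed.
EXPECTED_CONTAINERS="dp-monitoring-dp-monitoring-1 snapservice-snapservice-1"

# Reads running container names (one per line) on stdin and prints
# every expected name that does not appear as an exact line match.
missing_containers() {
    running=$(cat)
    for name in $EXPECTED_CONTAINERS; do
        printf '%s\n' "$running" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}

# Usage on the appliance (requires the docker CLI):
#   docker ps --format '{{.Names}}' | missing_containers
```

An empty result means both containers are listed as running; any name printed is a container to investigate further.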
Check the status (-s option) of these services on vCenter and, if required, restart them (-r option) using the following commands via SSH:
# /usr/sbin/vmon-cli -s vsan-health
# /usr/sbin/vmon-cli -s vpxd
# /usr/sbin/vmon-cli -r vsan-health
# /usr/sbin/vmon-cli -r vpxd
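The vCenter status checks for vsan-health and vpxd can be summarized with a small helper. This is a sketch under a stated assumption: that `vmon-cli -s <service>` prints a line of the form `RunState: STARTED`. Verify the output format on your build before relying on it. The function names `run_state` and `report_services` are hypothetical.

```shell
#!/bin/sh
# Sketch: summarize the run state of the vCenter services named in this article.

# Extracts the RunState value from `vmon-cli -s <service>` output on stdin.
# Assumes a line such as "RunState: STARTED"; prints nothing if absent.
run_state() {
    awk -F': *' '$1 == "RunState" { print $2; exit }'
}

# Prints one "<service>: <state>" line per service; "unknown" if no
# RunState line was found. Must be run on vCenter Server via SSH.
report_services() {
    for svc in vsan-health vpxd; do
        state=$(/usr/sbin/vmon-cli -s "$svc" 2>/dev/null | run_state)
        printf '%s: %s\n' "$svc" "${state:-unknown}"
    done
}

# Usage on vCenter Server:
#   report_services
```

A service reported in any state other than the expected running state is a candidate for the restart (-r) command.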
# /etc/init.d/<service name> status
# /etc/init.d/<service name> restart