This article provides information about Root Cause Analysis (RCA) requests for single host remediation events.
Neither VMware nor AWS will provide RCAs relating to single host remediation events. This covers both single host failures and single Top of Rack (TOR) failures that impact only a single host per cluster in the SDDC.
Single TOR failures are not eligible for RCAs because VMC implements Partition Placement Groups (PPG), which in theory limit the impact of a TOR failure to a single host per cluster per SDDC. PPG is explained in depth in [VMC] Understanding Automated Retrofit and Partition Placement Groups (PPG).
VMC on AWS is actively monitored by the Auto-Remediation service, which is responsible for automatically handling any host remediation tasks. When handling a single host remediation task, Auto-Remediation minimizes the impact to Customer Workload VMs and allows a healthy host to join the SDDC in place of the failed host. Auto-Remediation is explained in depth here. In the rare instance that Auto-Remediation fails to handle the single host remediation task cleanly, an RCA may be provided explaining why Auto-Remediation failed, but it will not cover the cause of the initial host failure.
Shared Responsibility Model Overview - VMware Cloud on AWS - For more details on which parts of the environment VMware monitors and manages and which parts the Customer is expected to monitor and manage.
Resiliency Design Considerations and Best Practices for VMware Cloud on AWS