When attempting to remove a crashed and permanently decommissioned Virtual Appliance (VApp) node from a cluster, the operation fails with the error "Cannot connect to host <ip address of crashed node> - host is unreachable via SSH (error code: 255)." The failure persists even after clearing the SSH fingerprint with the `remove_failed_node_ssh_fingerprint` alias, and it blocks any further maintenance actions or deployments on the cluster.
VApp cluster management requires SSH connectivity to a node in order to remove it, even if the node has crashed and no longer exists. The system attempts to connect to the decommissioned IP address, receives an unreachable-host error, and aborts the removal. Standard troubleshooting steps, such as clearing SSH fingerprints, do not help, because the underlying problem is that the node is completely offline and does not respond to ping.
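Before requesting the hotfix, it can help to confirm that the failure really is the unreachable-host condition described above and not a fingerprint or credential problem. The following is a minimal sketch, not part of the product tooling: the helper name `ssh_port_reachable` is hypothetical, and it assumes SSH listens on the default TCP port 22. It only checks whether a TCP connection to the node can be opened at all.

```python
import socket


def ssh_port_reachable(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only. Port 22 is an assumption;
    adjust it if the cluster uses a non-default SSH port.
    """
    try:
        # create_connection raises OSError (including timeouts and
        # "connection refused") when the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `ssh_port_reachable("<ip address of crashed node>")` from another cluster node and getting `False` is consistent with the decommissioned-node scenario, in which clearing the SSH fingerprint cannot help because nothing is listening at that address.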
Engineering provided a hotfix (HF_VA-14.5.0-20250715133152-DE641147.tgz.gpg) to address this specific issue. After applying it, the customer was able to remove the crashed and permanently decommissioned VApp node from the cluster.
Steps to Resolve:
Note: In cases where direct removal is impossible due to a permanently decommissioned node, this specific patch bypasses the SSH connectivity requirement for removal, enabling the cluster to be managed again.