How to update the SSH host keys on the SDDC Manager

Article ID: 316028

Products

VMware Cloud Foundation

Issue/Introduction

This KB describes how to remediate incorrect or mismatched host keys stored in the known_hosts files on the SDDC Manager. A script removes the stale entries and replaces them with the current host keys.

Validation, Deployment, PreChecks and other workflows on the SDDC Manager are failing with errors similar to:

... Unable to create jsch CLI session ...
... com.jcraft.jsch.JSchException: reject HostKey: [Node_FQDN_or_IP] ...
at com.jcraft.jsch.Session.checkHost(Session.java:789)
at com.jcraft.jsch.Session.connect(Session.java:345)


SSH attempts to PSC/VC can fail with the following errors:

100.109: VError: PSC Initilization attempt "9" failed: Failed to initiate PSC: Primary psc init failed and failover psc init also failed: Unable to retrieve iDP Metadata: 500 - "\"Failed to establish SSH session to <VC_FQDN>\""
    at Object.initializationPscError (/opt/vmware/vcf/sddc-manager-ui-app/server/src/errors/VCFError.js:100:5)
    at attemptPSCInitWithRetry (/opt/vmware/vcf/sddc-manager-ui-app/server/src/services/pscUtils.js:108:46)


SSH attempts to ESXi/NSX/vRealize/WS1 Nodes can fail with the following errors:

ERROR [vcf_om,0000000000000000,0000] [c.v.evo.sddc.common.util.SshUtil,Thread-69] Unable to create jsch CLI session:
com.jcraft.jsch.JSchException: reject HostKey: <NSX_Node_FQDN>


Environment

VMware Cloud Foundation 4.x

VMware Cloud Foundation 5.x

Cause

Host Keys can be changed on a node for a variety of reasons, including but not limited to:

  • Restore from a backup
  • Manual rebuild
  • Manual intervention to change the Host Key

As a result of this key change, SDDC Manager cannot SSH into the affected node(s) to run the attempted workflow, because the host key it has stored no longer matches the key the node presents.
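
The mismatch described above can be illustrated with a small Python sketch. It is not part of the attached scripts; the function names are hypothetical, and it assumes plain-text (unhashed) known_hosts entries in the usual "hosts key-type base64-key" layout.

```python
def find_host_keys(known_hosts_text, host):
    """Return (key_type, key_b64) pairs for plain-text entries naming `host`."""
    matches = []
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0].startswith("@"):
            # Drop markers such as @cert-authority or @revoked.
            fields = fields[1:]
        if len(fields) < 3:
            continue
        hostnames, key_type, key_b64 = fields[0], fields[1], fields[2]
        if hostnames.startswith("|1|"):
            # Hashed entries cannot be matched by plain string comparison.
            continue
        if host in hostnames.split(","):
            matches.append((key_type, key_b64))
    return matches


def key_mismatch(known_hosts_text, host, presented_type, presented_b64):
    """True when a key of the same type is stored for `host` but differs."""
    stored = [key for key_type, key in find_host_keys(known_hosts_text, host)
              if key_type == presented_type]
    return bool(stored) and presented_b64 not in stored
```

One way to obtain the key a node currently presents is ssh-keyscan (for example, ssh-keyscan -t ed25519 followed by the node FQDN), whose output uses the same three-field layout.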

Resolution

fixHostKeys.py is the newer script and the recommended one to use.
Note: This script works in a FIPS-enabled environment.

  1. Take a snapshot of the SDDC Manager.
  2. Download the fixHostKeys.py script attached to this KB.
  3. Transfer the script to the SDDC Manager, or copy its contents into a file on the SDDC Manager.
    • Preferably, place the script in /home/vcf/.
  4. SSH to the SDDC Manager as the vcf user, then switch to root (su root).
  5. Execute the script using one of the following options:
    • Option 1: python fixHostKeys.py --resourceType < VCENTER | NSX_T_MANAGER | ESXI | NSXT_EDGE >
      • This displays a domain selection menu. After a domain is selected, the script remediates the host keys for all resources of the chosen type in that domain.
    • Option 2: python fixHostKeys.py --node <FQDN of a specific node>
      • This remediates the host keys for the specified node only.
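
When several individual nodes need remediation, Option 2 can be repeated per node. The following is a minimal sketch of a hypothetical wrapper, not something shipped with this KB. It uses only the --resourceType and --node options listed above and assumes it is run as root from the directory that contains fixHostKeys.py.

```python
import subprocess

# Resource types accepted by --resourceType, as listed in this KB.
RESOURCE_TYPES = ("VCENTER", "NSX_T_MANAGER", "ESXI", "NSXT_EDGE")


def resource_type_command(resource_type, script="fixHostKeys.py"):
    """Build the Option 1 command line for one resource type."""
    if resource_type not in RESOURCE_TYPES:
        raise ValueError(f"unsupported resource type: {resource_type}")
    return ["python", script, "--resourceType", resource_type]


def node_commands(fqdns, script="fixHostKeys.py"):
    """Build one Option 2 command line per node FQDN."""
    return [["python", script, "--node", fqdn] for fqdn in fqdns]


def run_all(commands):
    """Run the commands in order and stop at the first failure."""
    for command in commands:
        # stdin is inherited, so the Option 1 domain menu stays interactive.
        subprocess.run(command, check=True)
```

For example, run_all(node_commands(["esx01.example.com", "esx02.example.com"])) would invoke the script once per node; the FQDNs here are placeholders.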

Logs are written to:
/var/log/vmware/vcf/fixHostKeys.log
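
To review what the script logged for a particular node, the log can be filtered by the node name. This is a minimal sketch with a hypothetical helper; the log line format is not documented in this KB, so it only performs a plain substring match.

```python
from collections import deque

LOG_PATH = "/var/log/vmware/vcf/fixHostKeys.log"  # log location given in this KB


def recent_lines(needle, log_path=LOG_PATH, limit=20):
    """Return up to `limit` of the most recent log lines containing `needle`."""
    matches = deque(maxlen=limit)
    with open(log_path, "r", errors="replace") as handle:
        for line in handle:
            if needle in line:
                matches.append(line.rstrip("\n"))
    return list(matches)
```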

 

If the fixHostKeys.py script fails to detect and update the host keys, fall back to the fix_known_hosts.sh bash script.

Note: fix_known_hosts.sh does not work in a FIPS-enabled environment.

  1. Take a snapshot of the SDDC Manager.
  2. Download the fix_known_hosts.sh script attached to this KB.
  3. Transfer the script to the SDDC Manager, or copy its contents into a file on the SDDC Manager.
  4. SSH to the SDDC Manager as the vcf user, then switch to root (su root).
  5. Make the script executable (the example assumes the script was placed in /tmp; adjust the path otherwise):
    • chmod +x /tmp/fix_known_hosts.sh
  6. Execute the script from the directory that contains it:
    • ./fix_known_hosts.sh
  7. Provide the FQDN and the IP address of the node whose host keys need to be updated.
  8. Re-attempt the workflow that was failing due to the host key error.
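
The script asks for both the FQDN and the IP address because a known_hosts entry may be keyed by either name. The sketch below illustrates the general idea of replacing stale entries. It is not the implementation of fix_known_hosts.sh; the function name is hypothetical, and only plain-text (unhashed) entries are handled.

```python
def replace_host_entries(known_hosts_text, names, new_entries):
    """Drop entries keyed by any of `names`, then append `new_entries`."""
    stale = set(names)
    kept = []
    for line in known_hosts_text.splitlines():
        fields = line.split()
        is_entry = bool(fields) and not fields[0].startswith("#")
        # The first field is a comma-separated list of host names and IPs.
        if is_entry and stale & set(fields[0].split(",")):
            continue
        kept.append(line)
    kept.extend(new_entries)
    return "\n".join(kept) + "\n"
```

Passing both the FQDN and the IP in `names` removes entries stored under either form, which avoids leaving a stale key behind for the same node.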

 

 

Additional Information

Impact/Risks:
MINIMAL: The script edits four known_hosts files:
/root/.ssh/known_hosts
/etc/vmware/vcf/commonsvcs/known_hosts
/home/vcf/.ssh/known_hosts
/opt/vmware/vcf/commonsvcs/defaults/hosts/known_hosts
Because this is not a major change, the risk is minimal. However, since entries are removed and added, taking a snapshot of the SDDC Manager is highly recommended so that it can be reverted to its state from before the script was run.
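
In addition to the snapshot, file-level copies of the four known_hosts files can make it easier to compare entries before and after the change. This is an optional sketch with a hypothetical helper name; it does not replace the recommended snapshot.

```python
import shutil
import time
from pathlib import Path

# The four files named in this KB.
KNOWN_HOSTS_FILES = [
    "/root/.ssh/known_hosts",
    "/etc/vmware/vcf/commonsvcs/known_hosts",
    "/home/vcf/.ssh/known_hosts",
    "/opt/vmware/vcf/commonsvcs/defaults/hosts/known_hosts",
]


def backup_files(paths=KNOWN_HOSTS_FILES, stamp=None):
    """Copy each existing file to <name>.bak-<stamp>; return the copies made."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backups = []
    for path in map(Path, paths):
        if not path.is_file():
            continue  # skip files that are absent on this system
        destination = path.with_name(f"{path.name}.bak-{stamp}")
        shutil.copy2(path, destination)  # copy2 also keeps timestamps and mode
        backups.append(destination)
    return backups
```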

 
Note: If this KB was applied to fix an SDDC Manager UI issue, restart the sddc-manager-ui-app service.

 
You may receive an error when you try to run the script:
bash:  ./fix_known_hosts.sh: /bin/bash^M: bad interpreter: No such file or directory
This error is caused by DOS carriage returns added to the script when it is copied from a Windows-based text editor. To resolve the problem, run the following command, then rerun the script:
sed -i -e 's/\r$//' fix_known_hosts.sh
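
To check whether a copied script is affected before running it, the carriage returns can be detected and stripped in Python as well. This is a minimal sketch with hypothetical helper names that mirrors the sed expression above on the file's raw bytes.

```python
import re


def has_crlf(data: bytes) -> bool:
    """True when the data contains DOS (CRLF) line endings."""
    return b"\r\n" in data


def strip_cr(data: bytes) -> bytes:
    """Remove one carriage return at the end of each line, like s/\\r$//."""
    return re.sub(rb"\r$", b"", data, flags=re.MULTILINE)
```

For example, reading fix_known_hosts.sh in binary mode, passing the bytes through strip_cr, and writing them back has the same effect as the sed command.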

 
 
 
 
 


Attachments

fixHostKeys.py
fix_known_hosts.sh