Following a host reboot, the alarm "Host requires encryption mode enabled alarm" is triggered.
Enabling encryption mode on the host fails with the following error:
"A general runtime error occurred. Cannot generate key. CreateKey failed on key provider KMS-xxxxxxxx, error code:QLC_ERR_NEED_AUTH; Failed. Check log for details."
All disks on the ESXi vSAN hosts show "In CMMDS: false", and Encryption shows as enabled on all disk groups.
This can be verified with the command esxcli vsan storage list.
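The check above can be condensed to one line per disk. The following is a minimal sketch, not an official tool: the function name summarize_vsan_disks is hypothetical, and it assumes the command output contains indented "Device:", "In CMMDS:" and "Encryption:" fields in that order.

```shell
# Summarize `esxcli vsan storage list` output read from stdin.
# Assumption: each disk entry lists "Device:", then "In CMMDS:", then "Encryption:".
summarize_vsan_disks() {
    awk -F': ' '
        /^ *Device:/     { dev = $2 }      # remember the current device name
        /^ *In CMMDS:/   { cmmds = $2 }    # remember its CMMDS membership
        /^ *Encryption:/ { print dev, "in_cmmds=" cmmds, "encryption=" $2 }
    '
}
```

On a host it would be used as: esxcli vsan storage list | summarize_vsan_disks. Disks affected by this issue would be expected to show in_cmmds=false.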
Attempting to mount a disk group manually fails with the message "Unable to mount: Host is not in crypto safe state".
The following entries may appear in /var/log/vsansystem.log:
2018-05-15T22:02:10.373Z info vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanUtil: Get kms client key and cert, old:1
2018-05-15T22:02:10.373Z info vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanUtil: GetKmsServerCerts Old KMS certs not found
2018-05-15T22:02:10.379Z info vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanUtil: Get kms client key and cert, old:1
2018-05-15T22:02:10.379Z info vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanUtil: GetKmsServerCerts Old KMS certs not found
2018-05-15T22:02:10.379Z info vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanUtil: Create client context for server 10.#.#.#:5696
{67839} configure_backend_platform() - Configuring dynamic Linux backends
{67839} open_shared_lib() - Loaded crypto from /lib64/libcrypto.so.1.0.2
{67839} open_shared_lib() - Loaded ssl from /lib64/libssl.so.1.0.2
{67839} open_shared_lib() - Loaded qlopenssl from /usr/lib/vmware/vsan/lib64/libqlopenssl.so
{67839} try_openssl_backend() - Configured OpenSSL backend
{67839} peer_verify_cb() - Pending 1558
{67839} peer_verify_cb() - Pending 1424
2018-05-15T22:02:10.423Z error vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanUtil: Failed to connect to key server, QLC_ERR_NEED_AUTH
2018-05-15T22:02:10.423Z error vsansystem[332605D700] [Originator@6876 sub=Libs opID=Main] VsanInfoImpl: Failed to retrieve key f1feec7b-a9e7-477d-b4e4-2xxxxxxxa from KMS-xxxxxxxx: QLC_ERR_NEED_AUTH
2018-05-15T22:02:10.423Z warning vsansystem[332605D700] [Originator@6876 sub=VsanPluginMgr opID=Main] Failed to retrieve host key from KMS: Failed to retrieve key from key management server cluster xxxxxxxx
2018-05-15T22:02:10.423Z error vsansystem[332605D700] [Originator@6876 sub=VsanSystemProvider opID=Main] Failed to check and load keys: Failed to retrieve key from key management server cluster xxxxxxxx
2018-05-15T22:02:10.423Z info vsansystem[332605D700] [Originator@6876 sub=VsanSystemProvider opID=Main] KEK unavailable, schedule again.
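To confirm whether a host is logging the same failure, the log can be searched for the error code shown above. This is a small sketch; the function name kms_auth_failures is hypothetical, and the default path is the log file named in this article.

```shell
# Print log lines that contain the KMS authentication error code.
# Argument 1 (optional): log file path; defaults to the vsansystem log.
kms_auth_failures() {
    log_file="${1:-/var/log/vsansystem.log}"
    grep 'QLC_ERR_NEED_AUTH' "$log_file"
}
```

Running kms_auth_failures with no argument on the affected host lists the matching entries; no output means the error code was not found in that file.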
VMware vSAN 7.x
VMware vSAN 8.x
When vSAN encryption is enabled on the cluster for the first time, each host transitions to crypto-safe mode and is assigned a key to install as its HostKey. The host always looks up this key, by its key identifier, when booting.
Because the ESXi hosts in the vSAN cluster are unable to contact the KMS server, they fail to enter crypto-safe mode and all disk groups remain unmounted.
Verify that the KMS server is up and can be reached by both the ESXi hosts and vCenter. The KMS server should NOT be located on the vSAN datastore.
Also confirm that the KMS server is listed on the VMware HCL.
Validate the KMS configuration on the ESXi hosts.
All keys and certificates for the KMS should be located in /etc/vmware/ssl.
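As a quick way to check that location, the sketch below tests whether the client key and certificate are present and non-empty. The function name check_kms_client_files is hypothetical; the two file names are the ones used by the openssl test command in this article, and other KMS-related files in the directory are not checked.

```shell
# Report whether the vSAN KMS client key and certificate exist and are non-empty.
# Argument 1 (optional): directory to check; defaults to /etc/vmware/ssl.
check_kms_client_files() {
    ssl_dir="${1:-/etc/vmware/ssl}"
    rc=0
    for f in vsan_kms_client.key vsan_kms_client.crt; do
        if [ -s "$ssl_dir/$f" ]; then
            echo "OK      $ssl_dir/$f"
        else
            echo "MISSING $ssl_dir/$f"
            rc=1
        fi
    done
    return "$rc"   # non-zero if any file is missing or empty
}
```

Calling check_kms_client_files with no argument checks the default directory. A MISSING line suggests the KMS configuration on that host is incomplete.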
Test the connection to the KMS server with the following command:
openssl s_client -connect <IP or FQDN of KMS server>:5696 -key /etc/vmware/ssl/vsan_kms_client.key -cert /etc/vmware/ssl/vsan_kms_client.crt -debug
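The -debug output of the command above is long. As a rough aid, the sketch below pulls out the first "Verify return code:" line that openssl s_client prints in its session summary. The function name verify_return_code is hypothetical. A non-zero code can simply mean that the KMS certificate chain is not trusted by the local openssl defaults, so the result should be read as a hint rather than a verdict.

```shell
# Print the first "Verify return code:" line from openssl s_client output on stdin.
verify_return_code() {
    awk '/Verify return code:/ {
        sub(/^[ \t]+/, "")   # drop leading indentation
        print
        exit                 # only the first occurrence is needed
    }'
}
```

One possible invocation, using the same placeholders as above: openssl s_client -connect <IP or FQDN of KMS server>:5696 -key /etc/vmware/ssl/vsan_kms_client.key -cert /etc/vmware/ssl/vsan_kms_client.crt </dev/null 2>&1 | verify_return_code. Redirecting stdin from /dev/null makes s_client exit after the handshake instead of waiting for input.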
NOTE:
There is no way to mount an encrypted disk group without the KMS server that originally encrypted it. It is crucial to always back up all KMS servers and never to house them on the encrypted vSAN datastore.
It is recommended to run the KMS server on a secured, non-encrypted, redundant datastore that is highly available.