/storage/log filling up after vCenter upgrade to 7.0u2

Article ID: 318150

Products

VMware vCenter Server

Issue/Introduction

Symptoms:
  • The vmware-sps service starts properly
  • In an SSH session on the VCSA, "wget localhost:22000/sms/HealthStatus ; cat HealthStatus ; rm HealthStatus" shows the SPS health as yellow
  • vCenter > Configure > Storage providers shows the providers as offline
  • Refreshing the storage providers under vCenter > Configure fails with the error "Storage service not initialised"
/var/log/vmware/vmware-sps/sps.log:
2021-05-25T10:16:53.846+09:00 [pool-35-thread-4] ERROR opId=lro-2-2869eb71 com.vmware.vim.storage.common.VmodlErrorStrings - ProviderLoader initialization is ongoing.
2021-05-25T10:16:53.846+09:00 [pool-35-thread-4] INFO  opId=lro-2-2869eb71 com.vmware.vim.sms.StorageManagerImpl - Timer stopped: queryStorageContainer, Time taken: 1 ms.
2021-05-25T10:16:53.846+09:00 [pool-35-thread-4] ERROR opId=lro-2-2869eb71 com.vmware.vim.storage.common.VmodlErrorStrings - Failed to query StorageContainer for input
(sms.fault.ServiceNotInitialized) {
   faultCause = null,
   faultMessage = null
}


=== OR === 
  • /var/log/vmware/vmware-sps/sps-runtime.log.stderr grows quickly with no size limit
  • The following message appears in sps-runtime.log.stderr:
INFO: Client raised fatal(2) handshake_failure(40) alert: Failed to read record org.bouncycastle.tls.TlsFatalAlert: handshake_failure(40)

Note: The preceding log excerpts are only examples. Date, time, and environmental variables may vary depending on your environment.
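
To gauge the second symptom, the size of the stderr log and the number of handshake failures in it can be checked from a VCSA shell. The following is a minimal sketch, not VMware tooling: the log path and the matched string come from the symptoms above, while the helper names count_handshake_failures and log_size_bytes are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: gauge how badly sps-runtime.log.stderr is affected.
# The default path and the matched string come from the symptoms above;
# the function names are hypothetical helpers.

SPS_STDERR_LOG="${SPS_STDERR_LOG:-/var/log/vmware/vmware-sps/sps-runtime.log.stderr}"

# Print the number of lines in the given file that contain the
# TLS handshake failure alert.
count_handshake_failures() {
    local log_file="$1"
    # grep -c prints the match count but exits non-zero when the count is 0,
    # so "|| true" keeps the function from reporting a failure in that case.
    grep -cF 'handshake_failure(40)' "$log_file" || true
}

# Print the size of the given file in bytes.
log_size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Only report when the log exists on this machine.
if [ -r "$SPS_STDERR_LOG" ]; then
    echo "size (bytes):       $(log_size_bytes "$SPS_STDERR_LOG")"
    echo "handshake failures: $(count_handshake_failures "$SPS_STDERR_LOG")"
fi
```

Running the check twice a few minutes apart shows whether the file is still growing.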

Environment

VMware vCenter Server 7.0.x

Cause

Before 7.0.2, the certificate that SMS uses to communicate with IOFilter VASA Providers was generated with the signature algorithm sha1WithRSAEncryption. In 7.0.2 this was changed to sha256WithRSAEncryption.
The change was made because vCenter 7.0.2 is FIPS compliant. A certificate that is still signed with sha1WithRSAEncryption causes the IOFilter VPs to go into an offline/disconnected state.
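
To confirm which signature algorithm the current SMS certificate uses, the certificate can be read from VECS and inspected with openssl. The following is a minimal sketch under stated assumptions: the store and alias names are the ones used in the Resolution section, openssl is assumed to be on the PATH, and the helper names signature_algorithm and is_sha1_signed are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: report which signature algorithm the SMS certificate in VECS uses.
# Store and alias names are taken from the Resolution steps; the function
# names are hypothetical helpers.

VECS_CLI="${VECS_CLI:-/usr/lib/vmware-vmafd/bin/vecs-cli}"

# Read "openssl x509 -text" output on stdin and print the first
# "Signature Algorithm" value found.
signature_algorithm() {
    awk -F': *' '/Signature Algorithm:/ {
        sub(/[[:space:]]+$/, "", $2)   # drop trailing whitespace
        print $2
        exit
    }'
}

# Succeed (exit 0) when the algorithm name on stdin starts with "sha1".
is_sha1_signed() {
    grep -qi '^sha1'
}

# Only query VECS when vecs-cli is present (that is, on a VCSA).
if [ -x "$VECS_CLI" ]; then
    alg="$("$VECS_CLI" entry getcert --store sms --alias sms_self_signed \
        | openssl x509 -noout -text | signature_algorithm)"
    echo "SMS certificate signature algorithm: ${alg:-unknown}"
    if printf '%s\n' "$alg" | is_sha1_signed; then
        echo "Certificate is SHA-1 signed; the Resolution steps apply."
    fi
fi
```

Running the same check after the certificate has been regenerated should report sha256WithRSAEncryption.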

Resolution

To resolve this, delete the SMS certificate, unregister the offline providers, and restart the SPS service so that the storage providers register again with the SHA256 certificate. The steps below can also be used to reset the storage providers for troubleshooting.

Please take an offline snapshot and/or backup of all nodes in the SSO domain. Do not skip this step.



1. Stop the Storage Provider Service: service-control --stop vmware-sps

2. Delete the SMS certificate from VECS: /usr/lib/vmware-vmafd/bin/vecs-cli entry delete --store sms --alias sms_self_signed

3. Start the SPS service: service-control --start vmware-sps
This regenerates the SMS self-signed certificate with the sha256WithRSAEncryption signature algorithm, which makes the certificate FIPS compliant. Because the regenerated certificate is new and no longer trusted, all the storage providers go offline; step 5 unregisters them and clears them out.

4. After step 3, wait for the SPS service to reach the initialized state and for the health status to show GREEN. Use the wget command from the Symptoms section to check. Initialization can take a long time.

5. Once the storage providers list has been updated and the providers are listed (some will be listed in an offline/disconnected state), run the Python script "unreg_vasa.py" and capture the output (the output is needed only if these steps do not resolve the issue):

   python unreg_vasa.py -s <VC_IP_ADDRESS>
   
   NOTE: Replace <VC_IP_ADDRESS> with the actual value for your environment.
   NOTE: The script waits 5 seconds between unregistering each VP. Please wait for it to finish.

6. Restart vmware-sps service: vmon-cli -r sps
   This causes all the IOFilter VPs that were unregistered in step 5 to be registered again.
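
The wait in step 4 can be automated with a small polling loop instead of re-running wget by hand. The following is a minimal sketch under stated assumptions: it assumes the HealthStatus response contains the word GREEN once the service is healthy, and the function names fetch_health and wait_for_green, the attempt count, and the delay are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: poll the SPS health endpoint until it reports GREEN (step 4).
# Assumption: the HealthStatus response contains the word GREEN when healthy.
# Function names and defaults are hypothetical.

HEALTH_URL="${HEALTH_URL:-localhost:22000/sms/HealthStatus}"

# Fetch the health document and print it to stdout instead of saving a file.
fetch_health() {
    wget -q -O - "$HEALTH_URL"
}

# wait_for_green ATTEMPTS SLEEP_SECONDS [FETCH_COMMAND...]
# Runs the fetch command (default: fetch_health) up to ATTEMPTS times and
# returns 0 as soon as its output contains "green" (case-insensitive).
# Returns 1 if no attempt reported GREEN.
wait_for_green() {
    local attempts="$1" delay="$2"
    shift 2
    if [ "$#" -eq 0 ]; then
        set -- fetch_health
    fi
    local i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" 2>/dev/null | grep -qi 'green'; then
            return 0
        fi
        # No need to sleep after the last attempt.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}
```

For example, wait_for_green 60 30 && echo "SPS health is GREEN" checks every 30 seconds for up to about 30 minutes before giving up. The limits are illustrative; since initialization can take a long time, adjust them to the environment.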


Attachments

unreg_vasa