FDM Agent Crashes Immediately When Enabling vSphere HA After vCenter Upgrade to 8.0U2 or Earlier

Article ID: 438349

Products

VMware vCenter Server

Issue/Introduction

  • Following a vCenter upgrade to version 8.0U2 or earlier, the FDM agent crashes immediately upon enabling vSphere HA. 
  • The vSphere HA advanced options include the das.config.vmacore.ssl.sslOptions parameter.
  • The ESXi host logs /var/run/log/vobd.log and /var/run/log/fdm.log contain messages similar to the following:

    /var/run/log/vobd.log

    YYYY-MM-DDTHH:MM:SS.Z: [UserWorldCorrelator] #############us: [vob.uw.core.dumped] /opt/vmware/fdm/fdm/fdm(#######) /var/core/fdm-zdump.000
    YYYY-MM-DDTHH:MM:SS.Z: [UserWorldCorrelator] #############us: [esx.problem.application.core.dumped] An application (/opt/vmware/fdm/fdm/fdm) running on ESXi host has crashed (### time(s) so far). A core file may have been created at /var/core/fdm-zdump.000.

    /var/run/log/fdm.log

    YYYY-MM-DDTHH:MM:SS.Z error fdm[#######] [Originator@#### sub=Libs] error [ConfigStore:##########] [2000] Invalid type specified for: ssl_options
    YYYY-MM-DDTHH:MM:SS.Z info fdm[#######] [Originator@#### sub=Libs] info [ConfigStore:##########] ConfigStoreException: [context]#####################################+########################/##################+###[/context]
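
The two signatures above can be checked together on a host. The following is a minimal sketch, not part of the original procedure: the function name check_fdm_crash_signature is hypothetical, and it assumes a POSIX shell with grep and matches only the fixed strings shown in the log excerpts.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) only when BOTH signatures from this
# article are found. The name is illustrative, not part of any VMware tooling.
# Usage: check_fdm_crash_signature VOBD_LOG FDM_LOG
check_fdm_crash_signature() {
    vobd_log="$1"
    fdm_log="$2"
    found=0
    # Core dump event for the FDM binary reported by vobd
    if grep -q 'esx\.problem\.application\.core\.dumped.*/opt/vmware/fdm/fdm/fdm' "$vobd_log" 2>/dev/null; then
        echo "vobd: FDM core dump event found"
        found=$((found + 1))
    fi
    # ConfigStore type error for ssl_options reported by fdm
    if grep -q 'Invalid type specified for: ssl_options' "$fdm_log" 2>/dev/null; then
        echo "fdm: ConfigStore ssl_options type error found"
        found=$((found + 1))
    fi
    [ "$found" -eq 2 ]
}

# Example use on a host:
#   check_fdm_crash_signature /var/run/log/vobd.log /var/run/log/fdm.log
```

A zero exit status suggests the host matches the symptoms described here; a non-zero status means at least one of the two messages was not found in the files given.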

Environment

VMware vSphere ESXi 8.0U2

Cause

This issue occurs if the advanced HA setting das.config.vmacore.ssl.sslOptions was configured on the cluster prior to the upgrade.

Resolution

  1. Ensure that valid offline snapshots exist for all linked vCenter Server VMs. For a standalone vCenter Server, take a snapshot without memory.
  2. Log in to the vSphere Client using an account with administrator privileges.
    • On the affected cluster, go to Configure > vSphere Availability > Advanced Options, delete the das.config.vmacore.ssl.sslOptions parameter, and save.
    • Once saved, return to the vSphere Availability settings and disable vSphere HA.
  3. Verify that the parameter has been removed from the vCenter database (VCDB):
    • Establish an SSH session to the vCenter Server using the root account.
    • Connect to the vCenter database (VCDB) by executing the following command: /opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres
    • Run the following query and confirm that das.config.vmacore.ssl.sslOptions no longer appears in the DAS_OPTIONS column of any row: select DAS_OPTIONS from VPX_COMPUTE_RESOURCE;
  4. Log in via SSH to each ESXi host in the affected cluster. (Note: This applies to hosts running ESXi versions later than 7.0.3.)
    • Run the following command to verify that sslOptions has been successfully removed from the ClusterConfig: /opt/vmware/fdm/fdm/prettyPrint.sh clusterconfig | grep -i "sslOptions"
    • Confirm that the command returns no output, indicating the parameter is no longer present.
    • If the sslOptions parameter is still present in the output, run the Reconfigure for vSphere HA task from the vCenter Server to push the updated configuration to the host, then repeat the check.
  5. On each host in the cluster, run the following command to check whether the sslOptions parameter (stored as the ssl_options key) is still present in the ConfigStore entry for the FDM service:
    • configstorecli config current get -c ha -g cluster -k fdm_service
  6. If it is still present, remove it and apply the corrected configuration as follows:
    • Fetch the current FDM service configuration and save it to a temporary JSON file:
      • configstorecli config current get -c ha -g cluster -k fdm_service > /tmp/fdm_update.json
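    • Open the file in a text editor such as vi and delete the entire "ssl_options" line, including its value. If it was the last entry in its block, also remove the trailing comma from the preceding line so that the JSON stays valid.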
      • vi /tmp/fdm_update.json
        • Before: 
          {
             "log": {
                "max_file_size": 0
             },
             "vmacore": {
                "ssl": {
                   "another_setting": "example",
                   "ssl_options": "386531328"
                }
             }
          }
        • After: 
          {
             "log": {
                "max_file_size": 0
             },
             "vmacore": {
                "ssl": {
                   "another_setting": "example"
                }
             }
          }
      • Save the file and exit the editor (in vi, press Esc, type :wq! and hit Enter).
    • Apply the edited file back to the ConfigStore:
      • configstorecli config current set -c ha -g cluster -k fdm_service -i /tmp/fdm_update.json
    • Verify that the ssl_options key is no longer present by running the following command:
      • configstorecli config current get -c ha -g cluster -k fdm_service
  7. Turn vSphere HA back on for the affected cluster. For detailed instructions, refer to the article "Disabling and enabling VMware vSphere High Availability (vSphere HA)".
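
The host-side checks and the file edit in steps 4 through 6 can be sketched as two small shell helpers. This is an illustrative sketch, not part of the original procedure: the names has_ssl_options and strip_ssl_options are hypothetical, and the sketch assumes that awk is available on the host and that configstorecli prints pretty-printed JSON with one key per line and a scalar value for ssl_options, as in the Before example.

```shell
#!/bin/sh
# Hypothetical helpers; neither name is part of any VMware tooling.

# has_ssl_options: read text on stdin and succeed (exit 0) if either
# spelling of the parameter (sslOptions or ssl_options) appears.
has_ssl_options() {
    grep -q -i -E 'ssl_?options'
}

# strip_ssl_options FILE: print FILE without its "ssl_options" line.
# Assumes pretty-printed JSON with one key per line and a scalar value.
# If the removed key was the last member of its object, the trailing
# comma on the previous line is dropped as well.
strip_ssl_options() {
    awk '
        /"ssl_options"[ \t]*:/ {
            if ($0 !~ /,[ \t]*$/ && have_prev) sub(/,[ \t]*$/, "", prev)
            next
        }
        { if (have_prev) print prev; prev = $0; have_prev = 1 }
        END { if (have_prev) print prev }
    ' "$1"
}

# Example use on a host (commands taken from the steps above;
# /tmp/fdm_backup.json is an illustrative file name):
#   /opt/vmware/fdm/fdm/prettyPrint.sh clusterconfig | has_ssl_options && echo "still present"
#   configstorecli config current get -c ha -g cluster -k fdm_service > /tmp/fdm_backup.json
#   strip_ssl_options /tmp/fdm_backup.json > /tmp/fdm_update.json
```

Review the generated /tmp/fdm_update.json by eye before applying it with the configstorecli set command from step 6. The line-based edit is only as reliable as the formatting assumption stated above, and keeping the unmodified output in a separate file leaves a copy to compare against.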