Services take a long time or fail to start in vRA HA environment

Article ID: 336941

Products

VMware Aria Suite

Issue/Introduction

Symptoms:
In a vRealize Automation 7.0 and 7.0.1 High Availability environment, you experience these symptoms:
  • A single node takes 30 minutes or more to boot. On the MKS console, you see the message:

    Waiting for elastic search to start

  • Services on a single node fail to register.
  • Services register within a normal time interval when both nodes are powered on at the same time.


Environment

VMware vRealize Automation 7.0.x

Cause

vRealize Automation (vRA) 7.0 and 7.0.1 environments use VMware Identity Manager (vIDM) for authentication and authorization. vIDM includes a service called elastic search, which is embedded on each appliance. When one node is down, the retry period for elastic search delays the boot process on the remaining node, extending its boot time to 30 minutes or more. Other vRA services may then fail to start because they time out before they can continue.

Resolution

This issue is resolved in vRealize Automation 7.1.
 
To work around the long boot time and service failures in vRealize Automation 7.0 and 7.0.1:
 
Notes:
  • This process must be implemented on all vRA nodes in the HA environment.
  • Create a snapshot of the appliances before running these steps.
  1. Download the attached 2145773_subsequentboot.hzn.zip file and extract the subsequentboot.hzn file.
  2. Log in to the vRA appliance using an SSH session and root credentials.
  3. Stop the elastic search service with this command:

    service elasticsearch stop

  4. Copy the subsequentboot.hzn file to all the appliance nodes.
  5. Copy the subsequentboot.hzn file to the /usr/local/horizon/scripts/ directory.
  6. Start the elastic search service with the command:

    service elasticsearch start

  7. Reboot the appliance with these steps:
    1. Go to the vRA VAMI page (https://vRA_node_FQDN:5480) and log in with root user credentials.
    2. Navigate to the System tab under the Information section.
    3. Click Reboot.

  8. Confirm that elastic search starts within approximately 1 minute by reviewing the /var/log/vmware/vcac/catalina.out file.
  9. Verify the main vRA service is up and ready to receive REST calls using the fast heartbeat check located at https://lb_hostname:443/SAAS/API/1.0/REST/system/health/heartbeat.

    Note: The sts-service does not show as registered under the Services tab in the VAMI (https://vRA_FQDN:5480). The health request that the VAMI uses returns in 16 seconds when run manually, but the VAMI request times out after 10 seconds. This is a known cosmetic issue. You can check the health status manually by navigating to https://lb-hostname:443/SAAS/API/1.0/REST/system/health.

  10. Ensure users can log in to the domain account.
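
Steps 3 through 6 above can be grouped into one shell function so the same sequence is repeated on each node. This is only a sketch, not part of the original article: the function name apply_subsequentboot and the default upload location /tmp/subsequentboot.hzn are assumptions, and restarting the service even when the copy fails is a precaution added here. Run it as root on the appliance after taking the snapshot.

```shell
# Sketch of steps 3 to 6 for one vRA appliance; run as root on that node.
# Assumption: the extracted subsequentboot.hzn was uploaded to /tmp (hypothetical path).
apply_subsequentboot() {
    src="${1:-/tmp/subsequentboot.hzn}"
    dest_dir="${2:-/usr/local/horizon/scripts}"

    # Do not touch the service if the extracted file is not present.
    if [ ! -f "$src" ]; then
        echo "File not found: $src" >&2
        return 1
    fi

    # Step 3: stop the elastic search service.
    service elasticsearch stop

    # Step 5: copy the file into the scripts directory.
    cp "$src" "$dest_dir/"
    copy_status=$?

    # Step 6: start the service again, even if the copy failed,
    # so the node is not left with the service stopped.
    service elasticsearch start || return 1
    return "$copy_status"
}
```

After the function is defined, calling apply_subsequentboot with no arguments uses the defaults shown. The reboot in step 7 is still performed from the VAMI.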

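The heartbeat check in step 9 can also be polled from a shell instead of a browser. The sketch below makes several assumptions: curl is available on the machine running it, an HTTP 200 response is treated as "up" (the article does not state the expected response), and the function name, attempt count, and delay are hypothetical. Replace lb_hostname with the load balancer host name.

```shell
# Sketch: poll the heartbeat URL from step 9 until it answers.
# Assumptions: curl is installed; HTTP 200 means the service is up.
wait_for_heartbeat() {
    host="$1"
    attempts="${2:-30}"   # hypothetical default: 30 attempts
    delay="${3:-10}"      # hypothetical default: 10 seconds between attempts
    url="https://${host}:443/SAAS/API/1.0/REST/system/health/heartbeat"

    i=1
    while [ "$i" -le "$attempts" ]; do
        # -k skips certificate validation; remove it if the certificate is trusted.
        code=$(curl -k -s -o /dev/null --max-time 20 -w '%{http_code}' "$url")
        if [ "$code" = "200" ]; then
            echo "heartbeat OK after $i attempt(s)"
            return 0
        fi
        # Wait only if another attempt remains.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done

    echo "no heartbeat from $url after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_heartbeat lb_hostname 60 5 would try for up to about five minutes before giving up.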


Attachments

2145773_subsequentboot.zip