Container-related errors occur after restarting one or more clustered vRA nodes


Article ID: 301132

Updated On:

Products

VMware

Issue/Introduction

Symptoms:
In a clustered vRealize Automation 7.2 environment, the Container service misbehaves after one or more nodes are restarted. Symptoms include:
  • Inconsistent data collection, where some containers on a Docker host are not discovered. For example, a Docker host may have 10 containers available while Admiral discovers and displays only some of them.
  • Randomly failing requests with message similar to:

    javax.net.ssl.SSLHandshakeException: General SSLEngine problem.
     
  • Inconsistent data displayed, depending on which node the UI (internally) requests the data from.
  • The containers tab does not load properly.


Cause

This issue occurs due to a problem in the clustering of the VMware Admiral instance on the vRealize Automation hosts.

Resolution

This issue is resolved in vRealize Automation 7.3, available at VMware Downloads.

 
To work around this issue if you do not want to upgrade:
  1. Download the file KB2148212_Patch.zip.
  2. Extract the zip file to obtain the patch.sh script.
  3. Copy patch.sh to a working directory on each vRealize Automation node.
  4. Add execute permissions to the script.
  5. Execute bash patch.sh on each node sequentially, one node at a time.

Caution: Do not execute the script in parallel on all nodes.
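
The steps above can be sketched as a small wrapper that processes one node at a time. This is only an illustration under stated assumptions: the node names, the root SSH login, and the /tmp/kb2148212 working directory are hypothetical and not part of the article, and the sketch only prints the commands unless DRY_RUN=0 is set.

```shell
#!/usr/bin/env bash
# Sketch: apply patch.sh to each vRA node strictly in sequence (never in parallel).
# ASSUMPTIONS (not from the article): NODES host names, root SSH access,
# and the WORKDIR path are placeholders; replace them for your environment.
set -euo pipefail

NODES="${NODES:-vra-node1.example.com vra-node2.example.com}"
WORKDIR="${WORKDIR:-/tmp/kb2148212}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise run it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

patch_nodes() {
  # Step 2: extract the zip file to obtain patch.sh.
  run unzip -o KB2148212_Patch.zip
  for node in $NODES; do
    # Step 3: copy patch.sh to a working directory on the node.
    run ssh "root@$node" "mkdir -p $WORKDIR"
    run scp patch.sh "root@$node:$WORKDIR/"
    # Steps 4-5: add execute permissions, then run the patch.
    # The loop waits for each node to finish before moving to the next.
    run ssh "root@$node" "cd $WORKDIR && chmod +x patch.sh && bash patch.sh"
  done
}

patch_nodes
```

Review the printed commands first; running the script again with DRY_RUN=0 executes them. Because the loop is sequential, no two nodes are patched at the same time.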

 
Steps that are executed by this patch script:
  • A backup archive of all container-related data is created in the /tmp directory.
  • The necessary files are extracted to a temporary folder, and an installer script is invoked.
  • The xenon service instance is stopped first, the necessary files are copied, and Xenon is then started again. Finally, the temporary folder is deleted from the system.
 
Patch Output:
 
If the output of the patch execution reads:
 
Node will not start. Available node detected but it is not responsive yet. Try again later.
 
Execute the patch on the other node(s), and start the xenon service manually on this node once the patch has succeeded on those nodes.
 
The command to start the service is:
service xenon-service start
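
One way to catch this case is to save the patch output to a file and search it for the message. The sketch below assumes the output was captured to a log file; the needs_manual_start helper and the patch.log name are hypothetical and not part of the patch.

```shell
#!/usr/bin/env bash
# Sketch: detect the "Node will not start" message in saved patch output.
# ASSUMPTION: the output was captured beforehand, for example with
#   bash patch.sh 2>&1 | tee patch.log
# needs_manual_start and patch.log are illustrative names only.

# Returns 0 if the log contains the message, non-zero otherwise.
needs_manual_start() {
  grep -qF "Node will not start. Available node detected but it is not responsive yet." "$1"
}

# Example usage (run on the node where the patch was executed):
#   if needs_manual_start patch.log; then
#     echo "Patch the other node(s) first, then run: service xenon-service start"
#   fi
```

The check uses a fixed-string match (grep -F) so the periods in the message are not treated as regular-expression wildcards.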
 
Backup steps:
  • A backup of all container related data is created automatically by the script. No manual actions are required.
 
Rollback steps:
  • Restore the /etc/xenon directory from the backup archive created automatically by the script.
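
A rollback could look roughly like the sketch below. It rests on assumptions the article does not confirm: that the backup is a tar archive (optionally compressed) and that it stores the directory under the relative path etc/xenon. The archive file name is not given in the article, so it is passed in as an argument; restore_xenon_config is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: restore /etc/xenon from the backup archive created by patch.sh.
# ASSUMPTIONS (not confirmed by the article): the backup is a tar archive
# readable by "tar -xf", and it contains the directory as "etc/xenon".
# Inspect the archive first with: tar -tf <archive>

# Usage: restore_xenon_config <backup-archive> [target-root, default /]
restore_xenon_config() {
  local archive="$1"
  local root="${2:-/}"
  if [ ! -f "$archive" ]; then
    echo "backup archive not found: $archive" >&2
    return 1
  fi
  # Extract only the etc/xenon subtree into the target root.
  tar -xf "$archive" -C "$root" etc/xenon
}
```

Restoring into a scratch directory first (for example, restore_xenon_config <archive> /tmp/restore-check) lets you compare the files before overwriting /etc/xenon. Stopping the xenon service before the restore and starting it afterwards with service xenon-service start is a reasonable precaution, though the article does not state it.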
 


Additional Information


Simplified Chinese version: Container-related errors occur after restarting one or more clustered vRA nodes

Attachments

KB2148212_patch.zip