vRealize Automation 8.3 GA deploy.sh is failing on deploying Identity service with errors: "Connection refused"


Article ID: 327421


Products

VMware Aria Suite

Issue/Introduction

Symptoms:
Deploying vRealize Automation 8.3 GA fails on the Identity service deployment step with "Connection refused" errors in the Identity service logs while trying to reach vIDM.

Example error:
HealthUtilComponent.onErrorMarkDown:70 - finishConnect(..) failed: Connection refused: {VIDM_HOSTNAME/LB}/<IP>:443
 io.netty.channel.AbstractChannel$AnnotatedConnectException: finishConnect(..) failed: Connection refused: CLEIDMMGMT.mrisoftware.net/<IP>:443
  Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
 Error has been observed at the following site(s):
  |_ checkpoint ⇢ Request to POST https://{VIDM_HOSTNAME/LB}/SAAS/API/1.0/oauth2/token?grant_type=client_credentials [DefaultWebClient]


Note: The IP address reported for vIDM in the error (<IP>:443 in this case) is wrong, that is, it is not the real address of vIDM. This is a strong indicator of the issue described in this article.
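To make the wrong-IP symptom easier to spot, the host and IP address can be pulled out of the "Connection refused" lines and compared with the address vIDM is expected to have. The following is a minimal sketch, not part of the product: the function name refused_target is hypothetical, and the sample hostname and address are placeholders.

```shell
# Sketch: read log text on stdin and print "HOST IP" for each
# "Connection refused: HOST/IP:PORT" occurrence, de-duplicated.
# refused_target is a hypothetical helper name, not a vRA tool.
refused_target() {
  sed -n -E 's#.*Connection refused: ([^/ ]+)/([^: ]+):[0-9]+.*#\1 \2#p' | sort -u
}
```

For example, piping the line "finishConnect(..) failed: Connection refused: vidm.example.com/192.0.2.10:443" into refused_target prints "vidm.example.com 192.0.2.10". If the printed IP address is not the real vIDM or load balancer address, the cause described below is likely.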


Environment

VMware vRealize Automation 8.3.x

Cause

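This issue is caused by the default resolv.conf ndots setting used by all containers in vRA. The value is ndots: 5, which means that any hostname with fewer than five dots is first looked up with the resolver's search domains appended, and only afterwards as written. Combined with certain customer DNS server configurations, this can cause the vRA deployment to fail.

An example of a DNS server configuration that is "incompatible" with the vRA ndots: 5 setting:
- The DNS server is configured to return a default IP address (<IP>, for example) for an unresolvable hostname instead of an error. The first search-domain lookup then "succeeds" with that default address, so the vIDM hostname is never queried as written.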
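The effect of ndots on lookup order can be illustrated with a small simulation. This is only a sketch of the search-list rules described in resolv.conf(5), not the resolver itself: the function name candidates, the hostname vidm.example.com, and the search domains are hypothetical placeholders, and the real search list inside a vRA container may differ.

```shell
# Sketch: print the names a stub resolver would try, in order.
# Usage: candidates NAME NDOTS SEARCH_DOMAIN...
# (hypothetical helper; runs in a subshell so it leaks no variables)
candidates() (
  name=$1 ndots=$2
  shift 2
  case $name in
    *.) printf '%s\n' "$name"; exit 0 ;;  # trailing dot: absolute name, search list skipped
  esac
  dots=$(printf '%s' "$name" | tr -cd '.')  # keep only the dots, then count them
  if [ "${#dots}" -ge "$ndots" ]; then
    printf '%s\n' "$name"                   # enough dots: the name as written goes first
    for d in "$@"; do printf '%s\n' "$name.$d"; done
  else
    for d in "$@"; do printf '%s\n' "$name.$d"; done
    printf '%s\n' "$name"                   # too few dots: as written only after the search list
  fi
)
```

With ndots 5, "candidates vidm.example.com 5 svc.cluster.local cluster.local" prints vidm.example.com.svc.cluster.local first and vidm.example.com last. A DNS server that answers the first, non-existent name with a default IP address ends the lookup there. With ndots 1, the same call prints vidm.example.com first.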

Resolution

This issue is fixed in vRA 8.4 GA and later releases.

Workaround:
To work around this issue, follow the steps below.
Note: Apply these steps only once.
  1. Execute the following command on ONLY ONE vRA node:
vracli cluster exec -- /bin/bash -c "grep -lR \"dnsPolicy: ClusterFirst\" /opt/charts /etc/k8s-config/ |xargs -n 1 sed -i -r '/^ *dnsConfig:/,/ * value: \"1\"/d'"

    This will configure the ndots value for each service container to be "1" instead of "5".
  2. Update the user-profile service chart manually (the command above does not cover this chart).
Note: In an HA environment, update the chart file (deployment.yaml) on ALL 3 vRA nodes.
  • The file to update is: /opt/charts/user-profile-service/templates/deployment.yaml
  • Add the following to the chart if it is missing. The indentation is IMPORTANT:

     dnsPolicy: ClusterFirst
     dnsConfig:
        options:
        - name: ndots
          value: "1"

    
    Example:
     restartPolicy: Always
     dnsPolicy: ClusterFirst
     dnsConfig:
        options:
        - name: ndots
          value: "1"
     affinity:
       podAntiAffinity:
         requiredDuringSchedulingIgnoredDuringExecution:
           - topologyKey: "kubernetes.io/hostname"
             labelSelector:
               matchLabels:
                 app: identity-service-app
                 environment: new
                 product: prelude
 
    Note: To see exactly where this block belongs in the YAML file, check any other service chart.
  3. Run the "/opt/scripts/deploy.sh" script.
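Before running deploy.sh, it may help to confirm that the manual edit to the user-profile chart is in place. The following is a minimal sketch under stated assumptions: the function name has_ndots_one is hypothetical, and the check is a plain text match for a dnsConfig block followed by an ndots option with value "1". It does not parse or validate YAML, so it cannot catch indentation mistakes.

```shell
# Sketch: succeed (exit 0) if FILE contains a dnsConfig block that is
# followed by an ndots option whose value is "1"; fail (exit 1) otherwise.
# has_ndots_one is a hypothetical helper name, not a vRA tool.
has_ndots_one() {
  awk '
    /^ *dnsConfig:/                     { in_cfg = 1; next }
    in_cfg && /- name: ndots/           { seen_name = 1; next }
    in_cfg && seen_name && /value: "1"/ { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}
```

A possible use is: has_ndots_one /opt/charts/user-profile-service/templates/deployment.yaml && echo "ndots block present". In an HA environment, the same check would be repeated on each of the 3 vRA nodes.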