Cloud proxy appears offline or shows an outdated life status in Aria Operations UI after a cluster upgrade

Article ID: 312288

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

After upgrading the cluster from 8.6.x to 8.10.x, from 8.18.0 to 8.18.5, or from 8.18.5 to 8.18.6, the cloud proxy appears offline or displays an Outdated Life Status in the Aria Operations UI. The UI continues to show the older cloud proxy version (8.6.x or 8.18.x), while cprc-cli reports the latest version, and the cloud proxy does not collect data.

The haproxy-traffic.log file shows entries similar to the following:

cd /storage/log/var/log

less haproxy-traffic.log

####-##-##T##:##:##+##:## localhost haproxy[####]: backend PrxyRC_BE has no server available!
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:58764 [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /suite-api/api/auth/token/acquire?_no_links=true HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_FE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_BE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_UNSECURE_FE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_UNSECURE_BE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: ##ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /suite-api/api/auth/token/acquire?_no_links=true HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/###:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/###:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/###:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: backend PrxyRC_BE has no server available!
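
To gauge how often the backend health check fails, the "SSL handshake failure" entries in the log can be counted. The following is a minimal sketch: the function name count_handshake_failures is hypothetical, and the default path is the one used in the commands above.

```shell
#!/bin/sh
# count_handshake_failures [LOG_FILE]
# Print the number of lines in LOG_FILE that report an SSL handshake failure.
# The function name is illustrative; the default path is taken from this article.
count_handshake_failures() {
    log_file="${1:-/storage/log/var/log/haproxy-traffic.log}"
    # grep -c prints the count of matching lines (0 when there are none).
    grep -c 'SSL handshake failure' "$log_file"
}
```

Calling count_handshake_failures with no argument reads the default log file. A non-zero count together with "backend PrxyRC_BE has no server available!" entries is consistent with the certificate problem described in this article.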

Environment

VMware Aria Operations (formerly known as vRealize Operations) 8.x

Cause

This issue occurs when a custom certificate has been applied to the VMware Aria Operations (formerly known as vRealize Operations) cluster. Because the cluster was running 8.6.x or 8.18.x, the certificate was not automatically pushed to the cloud proxy. After the cluster upgrade, the cloud proxy cannot renew the certificate, so its connection to the cluster fails because the certificate is untrusted.

Resolution

Note: Take a snapshot of each Cloud Proxy before applying the workaround. For details, see Snapshot Creation in VMware Aria Operations.

To resolve the issue when a custom SSL certificate is applied to the VMware Aria Operations cluster, follow these steps:

  1. Log in to the Cloud Proxy over SSH and stop the following services:
    service vmware-casa stop
    service httpd-north stop
    service haproxy stop
    service collector stop
  2. SSH to the VMware Aria Operations (formerly known as vRealize Operations) primary node, display the contents of the web_chain.pem and cacert.pem files in /storage/vcops/user/conf/ssl, and copy the contents of each file into a text editor such as Notepad++:
    cat /storage/vcops/user/conf/ssl/web_chain.pem
    cat /storage/vcops/user/conf/ssl/cacert.pem
  3. On the Cloud Proxy, change to the /storage/vcops/user/conf/ssl directory:
    cd /storage/vcops/user/conf/ssl
  4. Back up the server.ca.pem and server.root.ca.pem files:
    cp server.ca.pem server.ca.pem_bak
    cp server.root.ca.pem server.root.ca.pem_bak
  5. Using the vi editor, replace the contents of server.ca.pem with the contents of web_chain.pem, and the contents of server.root.ca.pem with the contents of cacert.pem (both copied from the primary node in step 2).
  6. Confirm that server.ca.pem on the Cloud Proxy now contains the cluster’s web_chain.pem content and that server.root.ca.pem contains the cacert.pem content.
  7. Set proper file permissions:
    chmod 644 server.ca.pem
    chmod 644 server.root.ca.pem
  8. Update file ownership:
    chown admin:admin server.ca.pem
    chown admin:admin server.root.ca.pem
  9. Reboot the Cloud Proxy:
    reboot -f
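
Because steps 2 and 5 rely on copying certificate text through an editor, it can help to check before rebooting that the files on the Cloud Proxy match the files on the primary node. The following is a minimal sketch, not part of the documented workaround: the helper names pem_digest and pem_cert_count are hypothetical, and it assumes tr, sha256sum, cut, and grep are available on both appliances.

```shell
#!/bin/sh
# pem_digest FILE
# Print a SHA-256 digest of FILE with all whitespace removed, so that CRLF line
# endings or a trailing newline added by an editor do not change the result.
# (Hypothetical helper; not a VMware-provided tool.)
pem_digest() {
    tr -d '[:space:]' < "$1" | sha256sum | cut -d ' ' -f 1
}

# pem_cert_count FILE
# Print how many certificates FILE contains by counting PEM BEGIN markers.
pem_cert_count() {
    grep -c -e '-----BEGIN CERTIFICATE-----' "$1"
}
```

Running pem_digest on web_chain.pem and cacert.pem on the primary node, and on server.ca.pem and server.root.ca.pem on the Cloud Proxy, should print the same digest for each corresponding pair. A mismatch, or a different certificate count, suggests that the paste in step 5 was incomplete.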

Additional Information

Impact/Risks:

  1. The Cloud Proxy appears Offline or displays an Outdated Life Status in the VMware Aria Operations (formerly known as vRealize Operations) UI. The UI shows the Cloud Proxy version as 8.6.x or 8.18.x, while the cprc-cli -s output displays 8.10.x or 8.18.x.
  2. Data collection from the Cloud Proxy may stop.