Cloud proxy appears offline or shows an outdated life status in Aria Operations UI after a cluster upgrade

Article ID: 312288

Updated On:

Products

VCF Operations/Automation (formerly VMware Aria Suite)

Issue/Introduction

After upgrading the cluster from 8.6.x to 8.10.x or from 8.18.0 to 8.18.5, the cloud proxy appears offline or displays an Outdated Life Status in the Aria Operations UI. The UI continues to show the older cloud proxy version (8.6.x or 8.18.x), while cprc-cli reports the latest version, and the cloud proxy does not collect data.

The haproxy-traffic.log file on the cloud proxy contains entries similar to the following:

cd /storage/log/var/log

less haproxy-traffic.log

####-##-##T##:##:##+##:## localhost haproxy[####]: backend PrxyRC_BE has no server available!
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:58764 [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /suite-api/api/auth/token/acquire?_no_links=true HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_FE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_BE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_UNSECURE_FE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_UNSECURE_BE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: ##ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /suite-api/api/auth/token/acquire?_no_links=true HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: backend PrxyRC_BE has no server available!
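
To confirm that a log matches the pattern above without paging through it, the two key messages can be counted. The following is a minimal sketch, not part of the official procedure: the function name summarize_haproxy_ssl_failures is hypothetical, and the default path is simply the directory and file name used in this article.

```shell
#!/bin/sh
# Sketch: count the two messages that characterise this issue in the
# HAProxy traffic log. The function name is hypothetical; the default
# path is the one shown in this article. Pass another path as $1.

summarize_haproxy_ssl_failures() {
    log="${1:-/storage/log/var/log/haproxy-traffic.log}"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # grep -c prints the match count but exits 1 when the count is 0,
    # so "|| true" keeps the function usable under "set -e".
    handshake=$(grep -c 'SSL handshake failure' "$log" || true)
    no_server=$(grep -c 'has no server available' "$log" || true)
    echo "ssl_handshake_failures=$handshake"
    echo "backend_unavailable=$no_server"
}
```

After sourcing the snippet in an SSH session, calling summarize_haproxy_ssl_failures with no argument reads the default log path. Non-zero counts for both messages are consistent with the symptoms described here, but they do not by themselves prove that the certificate is the cause.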

Environment

VMware Aria Operations (formerly known as vRealize Operations) 8.x

Cause

This issue occurs when a custom certificate has been applied to the VMware Aria Operations (formerly known as vRealize Operations) cluster. Because the cluster was running 8.6.x or 8.18.x, the certificate was not automatically pushed to the cloud proxy. After the cluster upgrade, the cloud proxy cannot renew the certificate, and its connection to the VMware Aria Operations (formerly known as vRealize Operations) cluster fails because the certificate is untrusted.

Resolution

Note: Take a snapshot of the Cloud Proxy(s) before applying the workaround.

Snapshot Creation in VMware Aria Operations

The following steps were used to resolve the issue with custom vROps SSL certificates:

  1. Open SSH sessions on the VMware Aria Operations (formerly known as vRealize Operations) Master Node and the Cloud Proxy, then stop the following services on the Cloud Proxy:

    # service vmware-casa stop
    
    # service httpd-north stop
    
    # service haproxy stop
    
    # service collector stop


  2. From the VMware Aria Operations (formerly known as vRealize Operations) Master node, display the contents of the web_chain.pem and cacert.pem files in /storage/vcops/user/conf/ssl and copy them into a text editor such as Notepad++:

    cat /storage/vcops/user/conf/ssl/web_chain.pem
    
    cat /storage/vcops/user/conf/ssl/cacert.pem
    
    On the Cloud Proxy, change to the /storage/vcops/user/conf/ssl directory and back up the server.ca.pem file using the following commands:

    # cd /storage/vcops/user/conf/ssl

    # cp server.ca.pem server.ca.pem_bak

    Then create a file named server.root.ca.pem using the following command:

    # cp server.ca.pem server.root.ca.pem

    Replace the content of server.ca.pem with the content from web_chain.pem, and replace the content of server.root.ca.pem with the content from cacert.pem.


  3. At this stage, the VMware Aria Operations (formerly known as vRealize Operations) cluster’s web_chain.pem content should be in server.ca.pem on the Cloud Proxy, and the cacert.pem content should be in server.root.ca.pem.

  4. Set proper file permissions:

    # chmod 644 server.ca.pem
    
    # chmod 644 server.root.ca.pem


  5. Update file ownership:

    # chown admin:admin server.ca.pem

    # chown admin:admin server.root.ca.pem


  6. Reboot the Cloud Proxy using:

    # reboot -f
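
Steps 2 to 5 above can also be sketched as a small script for the Cloud Proxy. This is an illustration under stated assumptions, not a supported tool. It assumes the services from step 1 are already stopped, and that web_chain.pem and cacert.pem from the Master node have been saved into a staging directory on the Cloud Proxy. The names STAGE_DIR, CERT_OWNER, install_cluster_certs and verify_cluster_certs are hypothetical, as is the default staging path.

```shell
#!/bin/sh
# Sketch of steps 2-5, to be run on the Cloud Proxy.
# ASSUMPTION: web_chain.pem and cacert.pem from the Master node were
# saved into STAGE_DIR beforehand. STAGE_DIR, CERT_OWNER and both
# function names are hypothetical; SSL_DIR and admin:admin come from
# this article.
SSL_DIR="${SSL_DIR:-/storage/vcops/user/conf/ssl}"
STAGE_DIR="${STAGE_DIR:-/tmp/cluster-certs}"
CERT_OWNER="${CERT_OWNER:-admin:admin}"

install_cluster_certs() {
    # Refuse to continue if a staged file is missing or empty.
    for f in web_chain.pem cacert.pem; do
        if [ ! -s "$STAGE_DIR/$f" ]; then
            echo "missing or empty: $STAGE_DIR/$f" >&2
            return 1
        fi
    done
    # Step 2: back up the current file once, then replace the contents.
    if [ ! -e "$SSL_DIR/server.ca.pem_bak" ]; then
        cp "$SSL_DIR/server.ca.pem" "$SSL_DIR/server.ca.pem_bak" || return 1
    fi
    cp "$STAGE_DIR/web_chain.pem" "$SSL_DIR/server.ca.pem" || return 1
    cp "$STAGE_DIR/cacert.pem" "$SSL_DIR/server.root.ca.pem" || return 1
    # Steps 4 and 5: permissions and ownership.
    chmod 644 "$SSL_DIR/server.ca.pem" "$SSL_DIR/server.root.ca.pem" || return 1
    chown "$CERT_OWNER" "$SSL_DIR/server.ca.pem" "$SSL_DIR/server.root.ca.pem" || return 1
}

verify_cluster_certs() {
    # Step 3: both files must match the staged copies byte for byte.
    cmp -s "$STAGE_DIR/web_chain.pem" "$SSL_DIR/server.ca.pem" &&
        cmp -s "$STAGE_DIR/cacert.pem" "$SSL_DIR/server.root.ca.pem"
}
```

A possible invocation after sourcing the snippet is `install_cluster_certs && verify_cluster_certs && echo "certificates in place"`, followed by the reboot in step 6. The backup is only written when server.ca.pem_bak does not exist yet, so rerunning the sketch does not overwrite the original file with an already replaced one.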


Additional Information

Impact/Risks:

  1. The Cloud Proxy appears Offline or displays an Outdated Life Status in the VMware Aria Operations (formerly known as vRealize Operations) UI, and its version is shown as 8.6.x or 8.18.x, while the cprc-cli -s output displays the upgraded version (8.10.x or 8.18.5).

  2. Data collection from the Cloud Proxy may stop.