After upgrading the cluster from 8.6.x to 8.10.x, from 8.18.0 to 8.18.5, or from 8.18.5 to 8.18.6, the cloud proxy shows as offline or displays an Outdated Life Status in the Aria Operations UI. The UI continues to show the older cloud proxy version (8.6.x or 8.18.x) while cprc-cli reports the latest version, and the cloud proxy does not collect data.
cd /storage/log/var/log
less haproxy-traffic.log
####-##-##T##:##:##+##:## localhost haproxy[####]: backend PrxyRC_BE has no server available!
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:58764 [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /suite-api/api/auth/token/acquire?_no_links=true HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_FE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_BE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_UNSECURE_FE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Proxy PrxyRC_UNSECURE_BE started.
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: ##ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /suite-api/api/auth/token/acquire?_no_links=true HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/####:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/###:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/###:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: x.x.x.x:##### [##/###/###:##:##:##.###] PrxyRC_FE~ PrxyRC_BE/VROPS_# #/#/-#/-#/#### ### ### - - SC-- #/#/#/#/# #/# "POST /casa/authorize HTTP/#.#"
####-##-##T##:##:##+##:## localhost haproxy[####]: Server PrxyRC_BE/VROPS_# is DOWN, reason: Layer6 invalid response, info: "SSL handshake failure", check duration: #ms. # active and # backup servers left. # sessions active, # requeued, # remaining in queue.
####-##-##T##:##:##+##:## localhost haproxy[####]: backend PrxyRC_BE has no server available!
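To gauge how often the backend health check fails in this way, the matching entries in the log can be counted. The following is a minimal sketch, not a product command: the function name count_handshake_failures is hypothetical, and the default path assumes the directory and file name used in the commands above.

```shell
# Hypothetical helper (not part of the product): count log entries that
# report a TLS handshake failure against the backend.
count_handshake_failures() {
    # Default path is an assumption based on the cd/less commands above.
    log_file="${1:-/storage/log/var/log/haproxy-traffic.log}"
    # grep -c prints the number of matching lines (0 when none match).
    grep -c 'SSL handshake failure' "$log_file"
}
```

Calling count_handshake_failures without an argument reads the default log path; passing a path lets the same check run against a copied or rotated log file.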
VMware Aria Operations (formerly known as vRealize Operations) 8.x
The issue occurs because a custom certificate was applied to the VMware Aria Operations cluster. Because the cluster was running 8.6.x or 8.18.x, the certificate was not automatically pushed to the cloud proxy. After the cluster upgrade, the cloud proxy could not renew the certificate, so its connection to the cluster fails because the certificate is untrusted.
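One way to confirm this cause is to check whether the certificate chain held by the cloud proxy still matches the one on the Primary node. The sketch below is an illustration under stated assumptions: the helper name pem_matches is hypothetical, and it assumes a copy of the Primary node's web_chain.pem has already been transferred to the cloud proxy (the /tmp path in the comment is an example only).

```shell
# Hypothetical helper: succeed only if two PEM files are byte-identical.
pem_matches() {
    # cmp -s is silent; it exits 0 when the files match and 1 when they differ.
    cmp -s "$1" "$2"
}

# Example usage (the /tmp path is an assumption, not from the product):
# pem_matches /tmp/web_chain.pem /storage/vcops/user/conf/ssl/server.ca.pem \
#     && echo "chain matches" || echo "chain differs"
```

A mismatch is consistent with the cloud proxy holding a stale chain; a match suggests looking elsewhere for the handshake failure.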
Note: Take a snapshot of each Cloud Proxy before applying the workaround.
Snapshot Creation in VMware Aria Operations
The following steps were used to resolve the issue with custom vROps SSL certificates:
service vmware-casa stop
service httpd-north stop
service haproxy stop
service collector stop
cat /storage/vcops/user/conf/ssl/web_chain.pem
cat /storage/vcops/user/conf/ssl/cacert.pem
Navigate to /storage/vcops/user/conf/ssl on the Cloud Proxy using the following command:
cd /storage/vcops/user/conf/ssl
Take a backup of the server.ca.pem and server.root.ca.pem files.
cp server.ca.pem server.ca.pem_bak
cp server.root.ca.pem server.root.ca.pem_bak
Replace the content of the server.ca.pem file with the content of the web_chain.pem file (copied from the Primary node in Step 2), and the content of the server.root.ca.pem file with the content of the cacert.pem file (copied from the Primary node in Step 2).
chmod 644 server.ca.pem
chmod 644 server.root.ca.pem
chown admin:admin server.ca.pem
chown admin:admin server.root.ca.pem
reboot -f
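The backup, replace, and chmod steps performed before the reboot can be sketched as a single function. This is an illustration only, not the vendor's procedure: the function name replace_pem is hypothetical, and it assumes the Primary node's files have already been copied to the cloud proxy. Ownership is not changed here; the chown commands above still apply.

```shell
# Hypothetical helper: back up DEST, then overwrite it with the content of SRC.
replace_pem() {
    src="$1"
    dest="$2"
    # Refuse to continue if the source file is missing or empty.
    [ -s "$src" ] || { echo "missing or empty: $src" >&2; return 1; }
    # Keep a backup with the same _bak suffix used in the steps above.
    cp "$dest" "${dest}_bak" || return 1
    # Overwrite the content in place, then set the mode used above.
    cat "$src" > "$dest" || return 1
    chmod 644 "$dest"
}
```

For example, replace_pem /tmp/web_chain.pem server.ca.pem, run from /storage/vcops/user/conf/ssl, would cover the server.ca.pem file; the /tmp location is an assumption. Checking for an empty source first avoids overwriting a working file with nothing.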
Impact/Risks: