Replace expired internal certificate in vRealize Operations

Article ID: 327411

Products

VMware Aria Suite

Issue/Introduction

Symptoms:
  • The vRealize Operations internal certificate has expired.
    • This can manifest as being unable to log in to the Admin UI.
  • The cluster is Offline and cannot be brought Online; a message similar to the following appears:
 "Data Retriever is not initialized yet. Please wait."
  • The vRealize Operations internal certificate will expire soon.
Note: This article is not applicable or required for VMware Aria Operations 8.12 or later.

Environment

VMware vRealize Operations Manager 6.x
VMware vRealize Operations Manager 7.x
VMware vRealize Operations 8.0.x
VMware vRealize Operations 8.1.x
VMware vRealize Operations 8.2.x
VMware vRealize Operations 8.3.x
VMware vRealize Operations 8.4.x
VMware vRealize Operations 8.5.x
VMware vRealize Operations 8.6.x

Cause

The internal certificate in vRealize Operations is generated upon initial deployment.
Currently, upgrading to later versions of vRealize Operations does not upgrade the internal certificate.

Note: It is not possible or supported to replace the internal certificate with a custom certificate.

Resolution

Identify if Certificate Renewal is Required

First, determine whether certificate renewal is required. If the certificate has not yet expired, it can be checked from a web browser. See the steps below for the most common web browsers:


Alternatively, if the certificate is expired or the UI is inaccessible, the certificate must be checked from the Primary node's command line.


Note: Starting in vRealize Operations 8.0, a pop-up in the UI warns when the certificate is going to expire.

Mozilla Firefox

  1. Open https://Primary_Node_IP_or_FQDN:6061.

Notes:
  • Replace Primary_Node_IP_or_FQDN with the actual IP or FQDN of the vRealize Operations Primary node.
  • The page displays a Warning: Potential Security Risk Ahead or Secure Connection Failed message; this is expected.
  • The Gemfire service must be running for a certificate to be presented.
  • No web page is expected to load; this is normal behavior. Continue with the steps.
  2. Click on Advanced and then on View Certificate.
  3. Check the certificate end date under Period of Validity.


Google Chrome

  1. Open https://Primary_Node_IP_or_FQDN:6061.

Notes:
  • Replace Primary_Node_IP_or_FQDN with the actual IP or FQDN of the vRealize Operations Primary node.
  • The page displays a Your connection is not private or This site can't provide a secure connection message; this is expected.
  • The Gemfire service must be running for a certificate to be presented.
  • No web page is expected to load; this is normal behavior. Continue with the steps.
  2. Click on Not secure in the address bar, then click on Certificate (Invalid).
  3. Check the certificate end date under Valid From.


Microsoft Edge

  1. Open https://Primary_Node_IP_or_FQDN:6061.

Notes:
  • Replace Primary_Node_IP_or_FQDN with the actual IP or FQDN of the vRealize Operations Primary node.
  • The page displays a This site is not secure message; this is expected.
  • The Gemfire service must be running for a certificate to be presented.
  • No web page is expected to load; this is normal behavior. Continue with the steps.
  2. Click on Certificate error in the address bar, then click on View certificate.
  3. Check the certificate end date under Valid To.
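
As a command-line alternative to the browser checks above, the end date of the presented certificate can usually be read with openssl. The following is a minimal sketch, not part of the official procedure. It assumes the openssl CLI and GNU date are available on the machine it runs from, and that port 6061 presents the certificate over TLS as the browser steps imply. The function names cert_end_date and days_until are hypothetical.

```shell
#!/bin/bash
# Print the notAfter date of the certificate presented on host:port.
# Assumption: the openssl CLI is installed and the port is reachable.
cert_end_date() {
    local host="$1" port="${2:-6061}"
    openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null \
        | openssl x509 -noout -enddate \
        | sed 's/^notAfter=//'
}

# Print the whole days between "now" and an end date string.
# $1: end date as printed by openssl, $2: optional "now" in epoch seconds.
# Relies on GNU date's -d option to parse the date string.
days_until() {
    local end_epoch now_epoch
    end_epoch=$(date -u -d "$1" +%s) || return 1
    now_epoch="${2:-$(date -u +%s)}"
    echo $(( (end_epoch - now_epoch) / 86400 ))
}
```

For example, days_until "$(cert_end_date Primary_Node_IP_or_FQDN)" prints the remaining days. A zero or negative result suggests the certificate has expired or expires within a day.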


Command Line

If the certificate is expired or the UI is inaccessible, the certificate must be checked from the Primary node's command line.

  1. Log into the Primary node as root via SSH or Console.
  2. Run the following command:
/bin/grep -E --color=always -B1 'java.security.cert.CertPathValidatorException: validity check failed|java.security.cert.CertificateExpiredException' $ALIVE_BASE/user/log/*.log | /usr/bin/tail -20

Note:
  • If step 2 returns nothing, certificate renewal is not yet required.
  • If step 2 returns output containing validity check failed, certificate renewal is required immediately.
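
The log check above can be wrapped in a small helper that reports a yes/no answer instead of raw log lines. This is a sketch under stated assumptions: the function name check_cert_logs is hypothetical, and it searches for the same two exception strings as the command in step 2, in every *.log file of the directory it is given.

```shell
#!/bin/bash
# Report whether any *.log file in the given directory contains the
# certificate expiry exceptions named in this article.
# Prints one line; returns 0 when renewal looks required, 1 otherwise.
check_cert_logs() {
    local log_dir="$1"
    local pattern='java.security.cert.CertPathValidatorException: validity check failed|java.security.cert.CertificateExpiredException'
    if grep -E -q "$pattern" "$log_dir"/*.log 2>/dev/null; then
        echo "renewal required"
    else
        echo "no expiry errors found"
        return 1
    fi
}
```

On the Primary node it would be called as check_cert_logs "$ALIVE_BASE/user/log".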
 

Certificate Renewal

To renew the certificate, install the applicable PAK file to generate a new internal certificate.
Depending on whether the certificate has expired, follow the applicable steps below to install the PAK file.
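
Both procedures below map the installed product version to one of three Certificate Renewal PAK versions. That mapping can be sketched as a small lookup. This is illustrative only: the function names ver_le and pak_version_for are hypothetical, the comparison relies on GNU sort -V, and treating every 8.2.x and 8.3.x build as covered by the 8.2.0 PAK is an assumption drawn from the "8.2 to 8.3" wording in the download notes.

```shell
#!/bin/bash
# True when version $1 <= version $2, compared with GNU sort -V.
ver_le() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$1" ]
}

# Print the Certificate Renewal PAK version for a product version,
# or return 1 when the version is outside the ranges listed in this article.
pak_version_for() {
    local v="$1"
    if ver_le 6.3 "$v" && ver_le "$v" 8.1.1; then
        echo 8.0.0      # versions 6.3 to 8.1.1
    elif ver_le 8.2 "$v" && ! ver_le 8.4 "$v"; then
        echo 8.2.0      # assumption: every 8.2.x and 8.3.x build
    elif ver_le 8.4 "$v" && ! ver_le 8.11 "$v"; then
        echo 8.4.0      # versions 8.4 to 8.10.x
    else
        return 1        # not covered by this article
    fi
}
```

A version outside the listed ranges (for example 6.2 or 8.12) yields a non-zero return code and no output.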

Internal Certificate Not Expired

If the vRealize Operations internal certificate has not yet expired, install the vRealize Operations Certificate Renewal PAK file while the vRealize Operations cluster is in an Offline state.

Note: Ensure all of the following steps are completed on all nodes in the vRealize Operations cluster unless noted otherwise.

  1. Snapshot the vRealize Operations nodes by following How to take a Snapshot of vRealize Operations.
  2. Download the Certificate Renewal PAK file for your version of vRealize Operations from the Broadcom Support Portal.
Notes:
    • For versions 6.3 to 8.1.1, select version 8.0.0. This PAK is compatible with vRealize Operations versions 6.3 to 8.1.1.
    • For versions 8.2 to 8.3, select version 8.2.0. This PAK is compatible with vRealize Operations versions 8.2 to 8.3.
    • For versions 8.4.x to 8.10.x, select version 8.4.0. This PAK is compatible with vRealize Operations versions 8.4 to 8.10.x.
  3. Log into all nodes in the vRealize Operations cluster as root via SSH or Console.
  4. Log into the vRealize Operations Admin UI as the local admin user.
  5. Click Take Offline under Cluster Status.
Note: Wait for Cluster Status to show as Offline.
  6. Click Software Update in the left panel.
  7. Click Install a Software Update in the main panel.
  8. Follow the steps in the wizard to locate and install your PAK file.
  9. Install the certificate renewal PAK file.
  10. Wait for the software update to complete. When it does, the Administrator interface logs you out.
Note: If the cluster does not report the installation as Completed after a long time, complete the four steps listed after step 14.
  11. Log into the vRealize Operations Admin UI as the local admin user.
  12. Clear the browser cache and, if the page does not refresh automatically, refresh it.
  13. Click Bring Online under Cluster Status.
Note: The cluster status changes to Going Online. When the cluster status changes to Online, the upgrade is complete.
  14. Run the following commands on all nodes in the vRealize Operations cluster:
  • chown admin:admin -R /storage/vcops/user/conf/ssl/ /storage/vcops/user/conf/ssl_bak/ /storage/db/casa/webapp/hsqldb/
  • chown -h root:root /storage/vcops/user/conf/ssl/web_cert.pem /storage/vcops/user/conf/ssl/web_chain.pem /storage/vcops/user/conf/ssl/web_key.pem
  • chmod guo+r -R /storage/vcops/user/conf/ssl/
  • chmod 444 /storage/vcops/user/conf/ssl/cacert.pem /storage/vcops/user/conf/ssl/slice_*_cert.pem
  • chmod 400 /storage/vcops/user/conf/ssl/cakey.pem /storage/vcops/user/conf/ssl/slice_*_cert.pfx /storage/vcops/user/conf/ssl/slice_*_key.pem
  • chmod 640 /storage/vcops/user/conf/ssl/tcserver.keystore
Note: For version 8.4 and later, also run the following commands on the Primary node, the Primary Replica node (if present), and all Data nodes:
  • chown postgres:root /storage/vcops/user/conf/ssl/postgres_vcopsrepl_*
  • chmod 600 /storage/vcops/user/conf/ssl/postgres_vcops_key.pk8 /storage/vcops/user/conf/ssl/postgres_vcopsrepl_key.pem
  • chmod 640 /storage/vcops/user/conf/ssl/postgres_vcops_cert.pem /storage/vcops/user/conf/ssl/postgres_vcopsrepl_cert.pem
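
After running the chown and chmod commands above, the result can be spot-checked per file. The helper below is a minimal sketch, assuming GNU stat is available on the node; the function name check_perm is hypothetical, and the expected values passed to it should be taken from the commands above.

```shell
#!/bin/bash
# Compare a file's octal mode (and optionally owner:group) with expected values.
# Uses GNU stat. Prints one line per mismatch and returns non-zero on any mismatch.
check_perm() {
    local file="$1" want_mode="$2" want_owner="${3:-}"
    local mode owner rc=0
    mode=$(stat -c '%a' "$file") || return 2
    owner=$(stat -c '%U:%G' "$file") || return 2
    if [ "$mode" != "$want_mode" ]; then
        echo "$file: mode $mode, expected $want_mode"
        rc=1
    fi
    if [ -n "$want_owner" ] && [ "$owner" != "$want_owner" ]; then
        echo "$file: owner $owner, expected $want_owner"
        rc=1
    fi
    return $rc
}
```

For example, check_perm /storage/vcops/user/conf/ssl/tcserver.keystore 640 admin:admin prints nothing and returns 0 when the file matches the permissions set above.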

If, after a long time, the Admin UI does not report the installation of the Certificate Renewal PAK file as Completed, complete the following steps.
  1. Log into the Primary node as root via SSH or Console.
  2. Run the following command to update the PAK installation status:
sed -i -e 's/\"initialization_state\"\:\"INITIALIZING\"/\"initialization_state\"\:\"NONE\"/g' /data/db/casa/webapp/hsqldb/casa.db.script
  3. Repeat steps 1-2 on the Primary Replica node (if present).
  4. Run the following command on the Primary and Primary Replica (if present) nodes to restart the CaSA service:
service vmware-casa restart
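
Because the sed command above edits casa.db.script in place, it can be worth taking a copy and counting the matches first. The following is a minimal sketch, not part of the official procedure; the function names count_initializing and backup_file are hypothetical, and the search string assumes the state is stored exactly as the sed pattern expects.

```shell
#!/bin/bash
# Count occurrences of the INITIALIZING state that the sed command would rewrite.
count_initializing() {
    grep -o '"initialization_state":"INITIALIZING"' "$1" | wc -l | tr -d ' '
}

# Copy the file next to itself with a timestamp suffix, preserving mode and times.
backup_file() {
    cp -p "$1" "$1.$(date +%Y%m%d%H%M%S).bak"
}
```

A count of 0 from count_initializing /data/db/casa/webapp/hsqldb/casa.db.script suggests the sed command would change nothing on that node.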
 

Internal Certificate Expired

If the vRealize Operations internal certificate has already expired, the vRealize Operations Certificate Renewal PAK file must be installed manually. Complete the following steps on the vRealize Operations cluster while the cluster is in an Offline state.
Note: Ensure all of the following steps are completed on all nodes in the vRealize Operations cluster unless noted otherwise.

  1. Snapshot the vRealize Operations nodes by following How to take a Snapshot of vRealize Operations.
  2. Download the Certificate Renewal PAK file for your version of vRealize Operations from the Broadcom Support Portal.
Notes:
    • For versions 6.3 to 8.1.1, select version 8.0.0. This PAK is compatible with vRealize Operations versions 6.3 to 8.1.1.
    • For versions 8.2 to 8.3, select version 8.2.0. This PAK is compatible with vRealize Operations versions 8.2 to 8.3.
    • For versions 8.4.x to 8.10.x, select version 8.4.0. This PAK is compatible with vRealize Operations versions 8.4 to 8.10.x.
  3. Copy the vRealize Operations Certificate Renewal PAK file to the /tmp/ directory on all nodes in the vRealize Operations cluster using an SCP utility.
  4. Log into all nodes in the vRealize Operations cluster as root via SSH or Console.
  5. Run the following command on all nodes in the vRealize Operations cluster to make the necessary directories:
mkdir -p /data/db/pakRepoLocal/vRealize_Operations_Manager_Enterprise_Certificate_Renewal/extracted
  6. Unzip the vRealize Operations Certificate Renewal PAK file by running the following command on all nodes in the vRealize Operations cluster:
unzip /tmp/vRealize_Operations_Manager_Enterprise_Certificate_Renewal-build.pak -d /data/db/pakRepoLocal/vRealize_Operations_Manager_Enterprise_Certificate_Renewal/extracted

Note: Replace build with the build number of the downloaded vRealize Operations Certificate Renewal PAK file.
Example:
unzip /tmp/vRealize_Operations_Manager_Enterprise_Certificate_Renewal-8.0.0.15217416.pak -d /data/db/pakRepoLocal/vRealize_Operations_Manager_Enterprise_Certificate_Renewal/extracted
  7. Stop all services by running the following commands:
service vmware-vcops-watchdog stop
service vmware-vcops stop


Note: For versions 8.3 and later, confirm that all services have stopped by running the vrops-status command.
If a service is still running, terminate it manually.

Example: (vpostgres) is running (3557)
Run this command to terminate the process: kill -9 3557
  8. Check that the is_admin property is set only for the Primary node in casa.db.script.
    1. On all nodes (including Remote Collectors and Witness), run the following command to verify the status of the is_admin property:
sed -nre "/clusterMembership/ s/^[^']+'([^']+)','([^']+)'.*/\2/p" /storage/db/casa/webapp/hsqldb/casa.db.script | python -m json.tool
    2. In the output, "is_admin_node": true should be set only where "slice_name" is "MASTER". If true is set for other nodes, complete the following on all nodes (including Remote Collectors and Witness):
    • Run service vmware-casa stop
    • Edit /storage/db/casa/webapp/hsqldb/casa.db.script and ensure "is_admin_node" is set to true for the Primary node, and false for all other nodes.
    • Run service vmware-casa start
  9. The following command must be run on the nodes in a specific order. Follow each sub-step carefully.
Command: $VMWARE_PYTHON_BIN /data/db/pakRepoLocal/vRealize_Operations_Manager_Enterprise_Certificate_Renewal/extracted/updateCoordinator.py EXPIRED
    1. First, run the command on all Remote Collector nodes (if present) in the cluster, and wait for the task to complete. Then continue to sub-step 2.
    2. Next, run the command on all Data nodes, the Witness node (if present), and the Primary Replica node (if present) in the cluster. Do not wait for each node to complete; just start the command on every node. Once Waiting for certificate generation to complete appears on the last node, wait roughly 60 seconds, then continue to sub-step 3.
    3. Finally, run the command on the Primary node.
The expected behavior is for the command to finish, then shortly afterwards the pending tasks on the Data nodes and Primary Replica node (if present) will complete.

Note: To confirm that the command completed successfully, check for the existence of the /var/vmware/_cert_generation_completed file on the Primary node.
  10. Change the permissions of the newly generated certificates on all nodes in the vRealize Operations cluster by running the following commands:
  • chown admin:admin -R /storage/vcops/user/conf/ssl/ /storage/vcops/user/conf/ssl_bak/ /storage/db/casa/webapp/hsqldb/
  • chown -h root:root /storage/vcops/user/conf/ssl/web_cert.pem /storage/vcops/user/conf/ssl/web_chain.pem /storage/vcops/user/conf/ssl/web_key.pem
  • chmod guo+r -R /storage/vcops/user/conf/ssl/
  • chmod 444 /storage/vcops/user/conf/ssl/cacert.pem /storage/vcops/user/conf/ssl/slice_*_cert.pem
  • chmod 400 /storage/vcops/user/conf/ssl/cakey.pem /storage/vcops/user/conf/ssl/slice_*_cert.pfx /storage/vcops/user/conf/ssl/slice_*_key.pem
  • chmod 640 /storage/vcops/user/conf/ssl/tcserver.keystore
Note: For version 8.4 and later, also run the following commands on the Primary node, the Primary Replica node (if present), and all Data nodes:
  • chown postgres:root /storage/vcops/user/conf/ssl/postgres_vcopsrepl_*
  • chmod 600 /storage/vcops/user/conf/ssl/postgres_vcops_key.pk8 /storage/vcops/user/conf/ssl/postgres_vcopsrepl_key.pem
  • chmod 640 /storage/vcops/user/conf/ssl/postgres_vcops_cert.pem /storage/vcops/user/conf/ssl/postgres_vcopsrepl_cert.pem
  11. Log into the vRealize Operations Admin UI as the local admin user.
  12. Click Take Offline under Cluster Status.
  13. If the cluster fails to go offline, click Force Offline under Cluster Status.
Note: Wait for the Cluster Status to show as Offline.
  14. Click Bring Online under Cluster Status.
Note: Wait for the Cluster Status to show as Online.
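
The run order for updateCoordinator.py EXPIRED described in this procedure (Remote Collectors first, then Data, Witness, and Primary Replica nodes, and the Primary node last) can be sketched as a small sorting helper for planning purposes. This is illustrative only: the role labels rc, data, witness, replica, and primary, and the function names phase_for_role and order_nodes, are hypothetical and not used by the product.

```shell
#!/bin/bash
# Map a node role to the phase in which the command is run.
# Role labels are hypothetical names chosen for this sketch.
phase_for_role() {
    case "$1" in
        rc) echo 1 ;;                       # Remote Collectors: run and wait
        data|witness|replica) echo 2 ;;     # start on all, do not wait
        primary) echo 3 ;;                  # Primary node runs last
        *) return 1 ;;
    esac
}

# Read "hostname role" lines on stdin and print "phase hostname role",
# sorted by phase. Lines with an unknown role are skipped with a warning.
order_nodes() {
    local host role phase
    while read -r host role; do
        [ -n "$host" ] || continue
        phase=$(phase_for_role "$role") || {
            echo "skipping $host: unknown role '$role'" >&2
            continue
        }
        echo "$phase $host $role"
    done | sort -s -n -k1,1
}
```

Feeding it an inventory such as printf 'p1 primary\nrc1 rc\nd1 data\n' | order_nodes lists rc1 first, d1 second, and p1 last.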



Additional Information

Note: After the certificate renewal, vRealize Operations retains the previous key in its truststore once the new certificate is generated. vRealize Operations will use both the old and new certificate for validation.
Currently there is no revocation mechanism.

Impact/Risks:
The following instructions are for vRealize Operations 6.3 - 8.10.x.
There is no recovery option for vRealize Operations Manager 6.2.x and earlier.

Upgrading to vRealize Operations 8.6 also upgrades the internal certificate during the upgrade process, except for setups with Cloud Proxies connected. 
The VMware vRealize Operations Certificate Renewal PAK file must be applied on vRealize Operations 8.6 only if the internal certificate has expired and if a Cloud Proxy node was connected to vRealize Operations before upgrade to 8.6.

If you are on vRealize Operations 8.5 or lower with Cloud Proxies connected, you must upgrade to vRealize Operations 8.6 before upgrading to vRealize Operations 8.10 or later.
If you are on vRealize Operations 8.5 or lower, and you do not have Cloud Proxies connected, you can upgrade directly to vRealize Operations 8.10 or later, within the bounds of your Upgrade Path.

After upgrading the vRealize Operations certificate by PAK file, any Cloud Proxy certificates must be updated manually (vRealize Operations 8.4 or later).
To do this, extract the root certificate from vRealize Operations using any browser and upload it to the Cloud Proxy by following Add CA certs while deploying a cloud proxy in vRealize Operations 8.4 or later.

Before following the resolution below, it is vital to snapshot the vRealize Operations nodes by following How to take a Snapshot of vRealize Operations.
Note: If the certificate has already expired and the Admin UI is not accessible, steps 1, 2, 10, and 11 of the above-mentioned KB can be skipped.

Note: Upgrading vRealize Operations is not a replacement for this article.  The steps below still must be followed.

Attachments

post_upgrade