Best practices to shutdown VCF Operations for Networks Clustered deployments

Article ID: 314428

Updated On:

Products

VCF Operations for Networks

Issue/Introduction

This article describes how to shut down and start up VCF Operations for Networks deployments for maintenance activities. Such maintenance activities may include the following:

  • Upgrading the version of VCF Operations for Networks
  • Applying patches on top of current release deployments
  • Any other maintenance activities apart from upgrades and applying GA/Hot patches (for example, expanding a cluster)

Before starting any of these maintenance activities, take virtual machine snapshots to provide a revert point in case something goes wrong during the maintenance activity.

Take the snapshots while the VMs are in a powered off state.

For a simple deployment (one platform node), shut down the platform node and any collector node(s), then take the snapshots to create the revert point(s):

  • Use the Power --> Shut Down Guest OS action in vCenter, first for the collector node(s) and then for the platform node.
  • After the VM(s) are fully powered off, use the Snapshots --> Take Snapshot action to create the desired revert point.

For a clustered deployment (more than one platform node), use the script attached to this KB article to shut the VMs down before taking the snapshots that create the revert points.

The script attached to this KB article applies to ALL VCF Operations for Networks deployments, except when Aria Suite Lifecycle (vRSLCM) version 8.18 patch 5 is in use.

  • If you are using any version of Aria Suite Lifecycle (vRSLCM) other than 8.18 patch 5, you must use this script.

NOTE:  VCF Operations for Networks was formerly named Aria Operations for Networks (AON), and prior to that was named vRealize Network Insight (vRNI).

Environment

  • VCF Operations for Networks 

Resolution

PROCEDURE:

  • WARNING:  Follow the steps below, in the exact sequence.  Do not skip any steps and do not introduce any steps that are not stated.
  1. Download the script: Download the script from the Attachments section of this article to your local system.

  2. Verify the downloaded script file by comparing its size and checksum (MD5, SHA-1, or SHA-256) against the following values:

    • File details:

      • Filename: vrni-cluster-shutdown-script.sh
      • File size: 14.5 KB
      • Checksum Values:
        • MD5: EAD841F0AF75CEE0D2C4DFEBE19BDB24
        • SHA-1: E5E1FA7DECCA25FDA9188FAE5ED0B182DDA1966E
        • SHA-256: 257DF2738CC48DD3F880F2E725A681952F9D305AF36CE00868E944373A313C2F

  3. Copy the script to platform node 1 of the clustered deployment:

    1. The following example uses Secure Copy Protocol (scp) to copy the script to the /home/support/ directory on platform node 1:
      scp vrni-cluster-shutdown-script.sh support@<platform1-IP>:/home/support/
    2. The command prompts for a password; enter the support account password.

    3. Alternatively, use a third-party tool such as WinSCP to copy the file to the /home/support/ directory on platform node 1 using the support account.

  4. Steps to take the snapshots (create the revert points):

    1. Manually shut down all collector nodes in the deployment from vCenter using Power > Shut down guest OS. The sequence does not matter.

    2. Log into the platform 1 node via SSH using the support account.

    3. Enter the following command to change from the support user to the ubuntu user:
      ub
    4. Enter the command:
      sudo mv /home/support/vrni-cluster-shutdown-script.sh /home/ubuntu/vrni-cluster-shutdown-script.sh
    5. Enter the command:
      sudo chown ubuntu:ubuntu vrni-cluster-shutdown-script.sh
    6. Enter the command:
      sudo chmod +x vrni-cluster-shutdown-script.sh
    7. Enter the following command to shut down all platform nodes in the appropriate sequence:
      ./vrni-cluster-shutdown-script.sh shutdown 127.0.0.1 "/home/ubuntu/vrni-cluster-shutdown-script$(date +%s).log"
    8. In the vCenter GUI, verify that all the platform and collector nodes are powered off. The SSH session will disconnect; this is expected behavior.

    9. From vCenter, take snapshots of the platform and collector nodes to create the revert points to use if something goes wrong during the maintenance.

    10. Power on all the platform nodes in the cluster, in order from 1 to N (where N = the highest numbered platform node in your deployment).

    11. Log into the platform 1 node via SSH using the support account.

    12. Enter the following command to change from the support user to the ubuntu user:
      ub
    13. Enter the following command to start all services on all platform nodes in the appropriate sequence:
      ./vrni-cluster-shutdown-script.sh start-services 127.0.0.1 "/home/ubuntu/vrni-cluster-shutdown-script$(date +%s).log"
    14. Wait up to 15 minutes, then log into the VCF Operations for Networks GUI using your normal credentials and confirm that no issues are apparent in the GUI for any of the platform nodes in the cluster.

    15. Power on all the collector nodes from vCenter (power options), one by one. There is no set order for powering on the collector nodes. After power on, it can take 15 to 20 minutes for a collector to start collecting. Any collection errors are expected to clear after that time.

  5. If something goes wrong during the maintenance, follow the steps below to revert using the snapshots taken in Step 4, substep 9 above.

    1. Revert the snapshots of the platforms and collectors. Because the snapshots were taken in a powered off state, the reversion will leave the nodes in that powered off state. 

    2. Power on all the platform nodes in the cluster, in order from 1 to N (where N = the highest numbered platform node in your deployment).

    3. Verify in the vCenter UI that all the platform nodes are powered on.

    4. Log into the platform 1 node via SSH using the support account.

    5. Enter the following command to change from the support user to the ubuntu user:
      ub
    6. Enter the following command to start all services on all platform nodes in the appropriate sequence:
      ./vrni-cluster-shutdown-script.sh start-services 127.0.0.1 "/home/ubuntu/vrni-cluster-shutdown-script$(date +%s).log"
    7. Wait up to 15 minutes, then log into the VCF Operations for Networks GUI using your normal credentials and confirm that no issues are apparent in the GUI for any of the platform nodes in the cluster.

    8. Power on all the collector nodes from vCenter (power options), one by one. There is no set order for powering on the collector nodes. After power on, it can take 15 to 20 minutes for a collector to start collecting. Any collection errors are expected to clear after that time.
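
The checksum comparison in step 2 can be sketched as a small shell helper. This is a minimal sketch and is not part of the attached script: the function name verify_sha256 is hypothetical, and it assumes a Linux or similar workstation where sha256sum, awk, and tr are available. The comparison is case-insensitive because this article lists the digest in uppercase while sha256sum prints lowercase.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the attached script):
# compare a file's SHA-256 digest with an expected value, ignoring letter case.
verify_sha256() {
  local file="$1" expected="$2" actual
  # sha256sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  # Normalize both digests to lowercase before comparing.
  actual="$(printf '%s' "$actual" | tr 'A-F' 'a-f')"
  expected="$(printf '%s' "$expected" | tr 'A-F' 'a-f')"
  if [ -n "$actual" ] && [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file" >&2
    return 1
  fi
}
```

For example, from the directory that holds the download, run verify_sha256 vrni-cluster-shutdown-script.sh 257DF2738CC48DD3F880F2E725A681952F9D305AF36CE00868E944373A313C2F and copy the script to platform node 1 only if it reports OK.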

 

 

Additional Information

Important Notes:

  • For a simple deployment (only one platform node), the script is NOT needed.

  • Execute the script from platform node1 only. 
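
The shutdown and start-services commands in the procedure pass a third argument that appears to be a log file path, built with $(date +%s) so that each run gets a distinct name based on the current Unix epoch time. The sketch below shows how that path expands and how the most recent log might be located afterwards. The LOG_DIR variable and the latest_log function are hypothetical names used only for illustration, and the /home/ubuntu default is taken from the commands in the procedure.

```shell
#!/usr/bin/env bash
# Build the log path the same way the procedure's commands do.
# date +%s prints seconds since the Unix epoch, so each run gets its own file.
log_dir="${LOG_DIR:-/home/ubuntu}"   # assumption: logs live in the ubuntu home directory
log_file="${log_dir}/vrni-cluster-shutdown-script$(date +%s).log"
echo "Log path for a run started now: ${log_file}"

# Hypothetical helper: print the most recently modified matching log
# in the given directory, or nothing if no log exists yet.
latest_log() {
  ls -1t "$1"/vrni-cluster-shutdown-script*.log 2>/dev/null | head -n 1
}
```

After a run, something like less "$(latest_log /home/ubuntu)" would open the newest log for review.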

Attachments

vrni-cluster-shutdown-script.sh