Best practices to shut down VCF Operations for Networks Clustered deployments

Article ID: 314428

Updated On:

Products

VCF Operations for Networks

Issue/Introduction

This article describes how to safely shut down and start up VCF Operations for Networks deployments as part of maintenance activities. Such activities may include the following:

  • Upgrading the version of VCF Operations for Networks
  • Applying patches on top of the currently deployed release
  • Any other maintenance activity apart from upgrades and GA/hot patches (for example, expanding a cluster)

Before starting any of these maintenance activities, take virtual machine snapshots to provide a revert point in case something goes wrong during the maintenance.

Take the snapshots while the VMs are in a powered off state.

For a simple deployment (one platform node), shut down the platform node and any collector node(s), then take the revert point(s) using virtual machine snapshots:

  • Use the Power --> Shut Down Guest O/S action within vCenter, first for the collector node(s) and then for the platform node.
  • After the VM(s) are fully powered off, use the Snapshots --> Take Snapshot action to create the desired revert point.

For a clustered deployment (more than one platform node), the script attached to this KB article must be used to shut the VMs down before creating the revert points using the virtual machine snapshot technique. 

The script attached to this KB article applies to ALL VCF Operations for Networks deployments, unless you are using Aria Suite Lifecycle (vRSLCM) version 8.18 patch 5.

  • If you are using any version of Aria Suite Lifecycle (vRSLCM) other than 8.18 patch 5, you must use this script.

NOTE:  VCF Operations for Networks was formerly named Aria Operations for Networks (AON), and prior to that was named vRealize Network Insight (vRNI).

Environment

  • VCF Operations for Networks 

Resolution

Follow the steps below, in the exact sequence.  Do not skip any steps and do not introduce any steps that are not stated.  

  1. Download the script: Download the script from the Attachments section of this article to your local system.
  2. Check the validity of the downloaded script file by verifying its checksum.
    • Filename, size, and checksum values:

      Filename: vrni-cluster-shutdown-script.sh
      File size: 14.5 KB
      Checksum Values:
      MD5: EAD841F0AF75CEE0D2C4DFEBE19BDB24
      SHA-1: E5E1FA7DECCA25FDA9188FAE5ED0B182DDA1966E
      SHA-256: 257DF2738CC48DD3F880F2E725A681952F9D305AF36CE00868E944373A313C2F

      To obtain the checksum of the file you downloaded, use any one of the commands below, then compare the displayed value with the corresponding value above (the comparison is not case-sensitive):

      Type md5sum followed by the file name
      Press Enter
      The MD5 checksum of the file is displayed

      or

      Type sha1sum followed by the file name
      Press Enter
      The SHA-1 checksum of the file is displayed

      or

      Type sha256sum followed by the file name
      Press Enter
      The SHA-256 checksum of the file is displayed
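The comparison described above can also be scripted. The following is a minimal sketch, assuming a Linux system where sha256sum is available and the downloaded file is in the current directory. The function name verify_sha256 is hypothetical and is not part of the attached script.

```shell
#!/usr/bin/env bash
# SHA-256 value published in this article for vrni-cluster-shutdown-script.sh.
EXPECTED_SHA256="257DF2738CC48DD3F880F2E725A681952F9D305AF36CE00868E944373A313C2F"

# verify_sha256 FILE EXPECTED  (hypothetical helper)
# Returns 0 when the file's SHA-256 digest equals EXPECTED, ignoring letter case.
verify_sha256() {
    local file="$1" expected="$2" actual
    # sha256sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
    # sha256sum prints lowercase hex, while the article lists uppercase,
    # so normalize both values before comparing.
    actual="$(printf '%s' "$actual" | tr 'A-F' 'a-f')"
    expected="$(printf '%s' "$expected" | tr 'A-F' 'a-f')"
    [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Only run the check when the downloaded file is present.
if [ -f vrni-cluster-shutdown-script.sh ]; then
    if verify_sha256 vrni-cluster-shutdown-script.sh "$EXPECTED_SHA256"; then
        echo "Checksum OK"
    else
        echo "Checksum MISMATCH: do not use this file" >&2
    fi
fi
```

If the check reports a mismatch, download the file again rather than proceeding.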



  3. Copy the script to the platform 1 node of the clustered deployment:

    You can use the scp (Secure Copy Protocol) command or any other secure file transfer method to copy the script to the /home/support directory on the platform 1 node.

    For example, using scp:

    scp vrni-cluster-shutdown-script.sh support@<platform1-IP>:/home/support/

    The command above prompts for a password; enter the support user password.
    Alternatively, the WinSCP tool can be used to copy the file to /home/support on the platform 1 node.

  4. Steps to take the snapshots (create the revert point):

    Once the script is transferred to the platform 1 node, execute the steps below:

      1. Manually shut down the collector nodes from vCenter using Power > Shut down guest OS. The sequence does not matter.
      2. Log into the platform 1 node via SSH using the support user.
      3. Enter the ub command to change from the support user to the ubuntu user.
      4. Enter the command sudo mv /home/support/vrni-cluster-shutdown-script.sh /home/ubuntu/vrni-cluster-shutdown-script.sh.
      5. Enter the command sudo chown ubuntu:ubuntu vrni-cluster-shutdown-script.sh.
      6. Enter the command sudo chmod +x vrni-cluster-shutdown-script.sh.
      7. Enter the command below to shut down all the platform nodes into a powered off state.

        ./vrni-cluster-shutdown-script.sh shutdown 127.0.0.1 "/home/ubuntu/vrni-cluster-shutdown-script$(date +%s).log"


      8. Verify in the vCenter GUI that all the platform and collector nodes are powered off.
      9. Take snapshots of the platform and collector nodes to create the revert points that can be used if something goes wrong during the maintenance.
      10. Power on all the platform nodes in the cluster, in order from 1 to N (where N = the highest numbered platform node in your deployment).
      11. Verify in the vCenter console that all the platform nodes are powered on. If you see a message such as "NTP is NOT-IN-SYNC. Retrying in [1/36] with [signal]", this is expected behavior, because the NTP service was masked when the script was executed in step 7 above. No action is needed to correct NTP. Proceed to start the services on all the platform nodes using the command below:

        ./vrni-cluster-shutdown-script.sh start-services 127.0.0.1 "/home/ubuntu/vrni-cluster-shutdown-script$(date +%s).log"


      12. After a period of time, which usually does not exceed 15 minutes, log in to the VCF Operations for Networks GUI as you normally would, and confirm that no apparent issues are seen.
      13. Power on all the collector nodes from vCenter.
      14. Log in to the VCF Operations for Networks GUI and proceed with the planned upgrade/maintenance.
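The shutdown and start-services commands in this step each take a log file path as their last argument, and that path embeds $(date +%s), the current Unix time in seconds, so every run writes to a new file under /home/ubuntu. The sketch below shows how such a path is built and how the most recent log could be located afterwards. The helper names new_log_path and latest_log are hypothetical and are not part of the attached script, and the format of the log contents is not described in this article.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the attached script.

# Build a log path the same way the commands in this article do:
# a fixed prefix followed by the current Unix time in seconds.
new_log_path() {
    local dir="${1:-/home/ubuntu}"
    printf '%s/vrni-cluster-shutdown-script%s.log\n' "$dir" "$(date +%s)"
}

# Print the most recently modified log file in the directory, if any.
latest_log() {
    local dir="${1:-/home/ubuntu}"
    # ls -t sorts by modification time, newest first.
    ls -1t "$dir"/vrni-cluster-shutdown-script*.log 2>/dev/null | head -n 1
}
```

For example, tail -n 50 "$(latest_log)" would show the end of the newest log after a run.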

  5. If something goes wrong during the maintenance and you need to restore from the revert point, follow the steps below:

      1. Revert the platform and collector nodes to their snapshots. Because the snapshots were taken in a powered off state, the reversion leaves the nodes powered off.
      2. Power on all the platform nodes in the cluster, in order from 1 to N (where N = the highest numbered platform node in your deployment).
      3. Verify in the vCenter UI that all the platform nodes are powered on.
      4. Run the command below to start the services on all the platform nodes.

        ./vrni-cluster-shutdown-script.sh start-services 127.0.0.1 "/home/ubuntu/vrni-cluster-shutdown-script$(date +%s).log"


      5. After a period of time, which usually does not exceed 15 minutes, log in to the VCF Operations for Networks GUI as you normally would, and confirm that no apparent issues are seen.
      6. Power on the collector VM(s) from the vCenter GUI. The sequence does not matter.
      7. Verify in the vCenter UI that all the collector nodes are powered on.
      8. Log in to the VCF Operations for Networks GUI as you normally would. If you wish to discuss with Broadcom support before attempting the process again, open a case with the VCF Operations for Networks team using the instructions at "Creating and managing Broadcom support request (SR) cases".
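The wait of up to about 15 minutes after start-services can be sketched as a polling loop run from a workstation. This is an illustration only, under two assumptions that this article does not state: that curl is installed where the loop runs, and that an HTTPS response from the platform 1 node address indicates the GUI web server is answering. A response does not prove that every service is healthy, so the manual GUI check above is still required. The names wait_until and gui_responds are hypothetical.

```shell
#!/usr/bin/env bash
# wait_until TIMEOUT INTERVAL COMMAND [ARGS...]  (hypothetical helper)
# Re-run COMMAND every INTERVAL seconds until it succeeds (returns 0)
# or TIMEOUT seconds have elapsed (returns 1).
wait_until() {
    local timeout="$1" interval="$2"
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    until "$@"; do
        if [ "$(date +%s)" -ge "$deadline" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Assumption: any HTTPS response from the given address means the web server
# is answering. --insecure skips certificate validation, since the appliance
# may present a self-signed certificate. Without --fail, curl exits 0 for any
# HTTP status, including error statuses.
gui_responds() {
    curl --insecure --silent --output /dev/null --max-time 10 "https://$1/"
}

# Example: poll every 30 seconds for up to 15 minutes (900 seconds).
# wait_until 900 30 gui_responds <platform1-IP>
```

A plain retry loop with a deadline is used here because the article gives only an approximate upper bound for the wait, not a readiness signal to test for.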

 

Additional Information

Important Notes:

  • For a simple deployment (only one platform node), the script is NOT needed.
  • Execute the script from the platform 1 node (never from the collector node(s)).

Attachments

vrni-cluster-shutdown-script.sh