qos_processor won't start after upgrade to 9.0.2

Article ID: 124969

Products

DX Infrastructure Management NIMSOFT PROBES

Issue/Introduction

After an upgrade from 8.5.1 to 9.0.2, qos_processor does not start properly. With the qos_processor probe at loglevel 5, the following error is reported just before the probe fails to start:

Jan 18 12:09:33:155 [Qos Monitor Loader, qos_processor] QosProcessor Coordinator aborting: null
Jan 18 12:09:33:156 [Qos Monitor Loader, qos_processor] java.lang.NullPointerException
at com.nimsoft.slm.qos.processor.qosmessage.QosMonitor.<init>(QosMonitor.java:141)
at com.nimsoft.slm.qos.processor.qosmessage.dao.QosMonitorDao.loadQosMonitors(QosMonitorDao.java:108)

Deleting the probe, its directory, and the directory contents, and then redeploying the probe, does not correct this issue.
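To check whether a given qos_processor log shows this same failure, the signature in the excerpt above can be searched for programmatically. The following is a minimal sketch only: the function name `has_startup_npe` and the five-line look-ahead window are illustrative assumptions, and the two marker strings are copied from the log lines quoted in this article.

```python
# Marker strings copied from the log excerpt in this article.
ABORT_MARKER = "QosProcessor Coordinator aborting"
NPE_TEXT = "java.lang.NullPointerException"
NPE_FRAME = "com.nimsoft.slm.qos.processor.qosmessage.QosMonitor.<init>"


def has_startup_npe(log_text: str) -> bool:
    """Return True if the abort message is followed closely by the
    NullPointerException and the QosMonitor constructor frame.

    The 5-line window is an arbitrary assumption, not a product rule.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if ABORT_MARKER not in line:
            continue
        window = lines[i + 1 : i + 6]
        has_npe = any(NPE_TEXT in w for w in window)
        has_frame = any(NPE_FRAME in w for w in window)
        if has_npe and has_frame:
            return True
    return False
```

The function takes the log contents as a string, so it can be applied to text read from whichever log file the probe writes in your environment.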

Cause

After the 9.0.2 UIM Server and UMP upgrade installs completed successfully, the primary hub robot was stopped to force a failover to the HA backup hub, which was still running the UIM 8.5.1 core probes (data_engine and qos_processor). This is not supported because the UIM database schema changed in 9.0.2. One of the schema changes affects how the value in the checksum field of the S_QOS_DATA table is created. Prior to 9.0.2, the qos, source, and target fields of the first instance of a QoS message were used to compute the checksum hash value. In 9.0.2, the value of the origin field was added to the hash. Mixing the two versions in this way causes corruption in the S_QOS_DATA table.
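The effect of adding the origin field to the hash input can be illustrated with a short sketch. This is not the product's actual implementation: the hash algorithm (SHA-1 here), the `|` separator, the field order, the function names, and the sample values are all assumptions chosen only to show why an 8.5.1 probe and a 9.0.2 probe would compute different checksums for the same QoS series.

```python
import hashlib


def checksum(fields):
    """Hash the field values in order.

    SHA-1 and the '|' separator are assumptions for illustration;
    the article does not state how the real checksum is built.
    """
    joined = "|".join(fields)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def checksum_pre_902(qos, source, target):
    # Before 9.0.2: only qos, source, and target feed the hash.
    return checksum([qos, source, target])


def checksum_902(qos, source, target, origin):
    # 9.0.2: the origin value is added to the hash input.
    return checksum([qos, source, target, origin])


# Hypothetical sample values for one QoS series.
old = checksum_pre_902("QOS_CPU_USAGE", "server01", "total")
new = checksum_902("QOS_CPU_USAGE", "server01", "total", "primaryhub")
print(old == new)  # False: the two versions disagree on the same series
```

Because the two versions derive different checksum values for the same series, rows written by the older probes no longer line up with what the upgraded probes expect to find.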

Environment

UIM Server: 9.0.2
qos_processor: 9.0.2

Resolution

The minimal preferred upgrade method is as follows:

1. Deactivate the HA probe on the HA backup hub. This prevents a failover to the HA backup hub during the upgrade.
2. Run the 9.0.2 UIM installer
3. Run the 9.0.2 UMP installer
4. Download the current GA version of the hub and robot_update packages from the Web Archive into the local archive on your primary hub (7.97 at the time of publication of this document)

NOTE: There is a known memory leak in the 7.96 and 7.97 versions of the robot_update package. It is highly recommended that you download the 7.97HF3 (or later) robot_update hotfix from the CA UIM Hotfix Index support site, import it into the local archive on the primary hub, and deploy it instead of the 7.97 GA version.

5. Upgrade the robot on the primary hub (7.97 or 7.97HF3)
6. Upgrade the hub on the primary hub (7.97)
7. Upgrade the robot on the HA backup hub (7.97 or 7.97HF3)
8. Put a copy of the <installPath>\Nimsoft\security\certificate.pem file from the primary hub on the HA backup hub
9. Add the following key to the robot.cfg file on the HA backup hub in the <controller> section:

cryptkey = <absolute path to the copy of>\certificate.pem

10. Restart the robot on the HA backup hub
11. Deploy the 7.97 hub to the HA backup hub
12. Deploy all the remaining 9.0.2 core probes from the primary hub to the HA backup hub
13. Activate the HA probe on the HA backup hub.

At this point you should be able to fail over to the HA backup hub with no problems.
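Step 9 above (adding the cryptkey entry to the <controller> section of robot.cfg) can be sketched as a small text transformation. This is a hedged illustration, not a supported tool: the function name `add_cryptkey`, the three-space indentation, and the assumption that the <controller> section contains only flat `key = value` lines are mine, and the path used in the usage lines is a hypothetical placeholder for the real location of your certificate.pem copy. The function works on the file contents as a string and does not write to disk.

```python
def add_cryptkey(cfg_text: str, pem_path: str) -> str:
    """Return cfg_text with 'cryptkey = pem_path' set in <controller>.

    Assumes a flat <controller> section of 'key = value' lines.
    An existing cryptkey line is replaced rather than duplicated.
    """
    out = []
    in_controller = False
    written = False
    for line in cfg_text.splitlines():
        stripped = line.strip()
        if stripped == "<controller>":
            in_controller = True
            out.append(line)
        elif in_controller and stripped == "</controller>":
            if not written:
                # No existing key found: add it before the section closes.
                out.append(f"   cryptkey = {pem_path}")
                written = True
            in_controller = False
            out.append(line)
        elif in_controller and stripped.split("=")[0].strip() == "cryptkey":
            # Replace an existing cryptkey entry in place.
            out.append(f"   cryptkey = {pem_path}")
            written = True
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Usage with a hypothetical config fragment and a hypothetical path.
example_cfg = "<controller>\n   domain = ExampleDomain\n</controller>\n"
print(add_cryptkey(example_cfg, r"C:\certs\certificate.pem"))
```

Editing the file by hand gives the same result; the sketch only makes explicit that the key belongs inside the <controller> section and should appear once. The robot restart in step 10 is still required afterwards.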