Last night we migrated the database for our UIM environment from Oracle 12c to Oracle 19c. Today we noticed that the data_engine queue turns yellow approximately every 10 minutes. QOS data queues up for a while, then the data_engine reconnects and the queue clears again.
The Oracle Instant Client on the Primary hub was still at version 18 and had not been updated.
The following data_engine.log errors were logged during the time frame in which the data_engine stopped processing messages and the queue turned yellow:
May 5 12:33:39:291 [139921914328832] 0 de: data_engine [QoS] [QoSData] - InsertQosObjectOracle [QoSData] data_engine [QoS] status: -1
OCI_ERROR - ORA-00001: unique constraint (NIMSOFTSLM.UQ_S_QOS_DATA) violated
OCIEnv: 0x0x7f42a400c4a0 OCIAuthInfo: 0x0x7f42a401e2e0 OCISvcCtx: 0x0x7f42a402d1b8
- Error: InsertQosObjectOracle [QoSData] data_engine [QoS] status: -1
May 5 12:33:39:291 [139921914328832] 3 de: allocateOCIErrorHandle OCIHandleAlloc errhp...
May 5 12:33:39:291 [139921914328832] 4 de: OCIPrep - (StmtPrepare) preparing oracle statement for SQL(SELECT S_QOS_DATA_table_id_ASQ.CURRVAL FROM DUAL)
May 5 12:33:39:291 [139921914328832] 4 de: OCIPrep - (OCIAttrGet) getting oracle statement type.
May 5 12:33:39:292 [139921914328832] 3 de: GetCurrentTableID - current tableid: 6646301
After the Oracle Instant Client on the Primary hub was updated via yum (RHEL 8, Linux) from version 18 to version 19.9 (latest GA), the data_engine no longer stopped processing QOS data, the queue no longer turned yellow, and the log no longer showed any instances of the error:
OCI_ERROR - ORA-00001: unique constraint (NIMSOFTSLM.UQ_S_QOS_DATA) violated
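To check whether the ORA-00001 error has stopped recurring, the log can be scanned and the hits grouped into 10-minute buckets. The following is a minimal Python sketch, not part of the product. It assumes the log lines keep the timestamp layout shown in the excerpt above, and the function name `count_ora_errors` is made up for illustration.

```python
import re
from collections import Counter

# Timestamp prefix as it appears in the excerpt, e.g. "May 5 12:33:39:291".
TS_RE = re.compile(r"^([A-Z][a-z]{2})\s+(\d{1,2})\s+(\d{2}):(\d{2}):\d{2}:\d{3}\b")
ORA_RE = re.compile(r"ORA-(\d{5})")


def count_ora_errors(lines, code="00001", bucket_minutes=10):
    """Count ORA-<code> lines per time bucket.

    The OCI_ERROR line carries no timestamp of its own, so each hit is
    attributed to the most recent timestamped line above it.
    """
    counts = Counter()
    bucket = None
    for line in lines:
        ts = TS_RE.match(line)
        if ts:
            month, day, hour, minute = ts.groups()
            # Round the minute down to the start of its bucket.
            start = int(minute) // bucket_minutes * bucket_minutes
            bucket = f"{month} {int(day)} {hour}:{start:02d}"
        hit = ORA_RE.search(line)
        if hit and hit.group(1) == code and bucket is not None:
            counts[bucket] += 1
    return counts
```

Passing an open file handle for data_engine.log as `lines` returns a counter keyed by bucket start time. An empty result for the period after the client upgrade is consistent with the error no longer occurring.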
When the data_engine was deactivated, it would not release its port.
Changes made in the DX UIM PROD environment
-------------------------------------------------------------------------------
data_engine
---------------------
Via Raw Configure, increased the data_engine loglevel from 3 to 5 for troubleshooting purposes.
Hub
-------
Via Raw Configure, deleted the bulk_size setting for the data_engine ATTACH queue. It had been set to 25000 but should be empty (the default).
Increased postroute_reply_timeout from the default of 180 to 300.
postroute_reply_timeout determines how long the hub waits for a reply from any queue/subscriber after sending messages.
For more details, please refer to:
hub configuration - timeout, retry and other settings (explained)
https://knowledge.broadcom.com/external/article/97954
Do not set bulk_size manually on ATTACH queues; for greater message throughput, set the bulk size on GET queues only.
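The bulk_size rule above can be checked against a copy of the hub configuration. The following is a rough Python sketch under stated assumptions: that the file uses nested `<section>` ... `</section>` blocks with `key = value` lines, and that each queue section carries `type` and `bulk_size` keys. The function name and the sample text used with it are hypothetical, so verify the layout against your own hub.cfg before relying on it.

```python
import re


def attach_queues_with_bulk_size(cfg_text):
    """Return {section_path: bulk_size} for sections whose type is
    'attach' and whose bulk_size is set to a non-empty value."""
    stack = []      # names of the currently open sections
    sections = {}   # "outer/inner" path -> {key: value}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if re.fullmatch(r"</(.+)>", line):
            # Closing tag: leave the current section.
            if stack:
                stack.pop()
            continue
        opened = re.fullmatch(r"<([^/].*)>", line)
        if opened:
            # Opening tag: enter a new (possibly nested) section.
            stack.append(opened.group(1))
            sections.setdefault("/".join(stack), {})
            continue
        key, sep, value = line.partition("=")
        if sep and stack:
            sections["/".join(stack)][key.strip()] = value.strip()
    return {
        path: keys["bulk_size"]
        for path, keys in sections.items()
        if keys.get("type", "").lower() == "attach" and keys.get("bulk_size")
    }
```

Any entry in the returned dictionary is an ATTACH queue with an explicit bulk_size, which is a candidate for clearing via Raw Configure as described above. GET queues are ignored, because that is where a bulk size is meant to be set.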