Replication error across Oracle database in VIP Authentication Hub


Article ID: 391762


Updated On: 03-24-2025

Products

VIP Authentication Hub

Issue/Introduction

VIP Authentication Hub version 3.3 is deployed across three datacenters with three Oracle databases configured for Oracle GoldenGate replication. The error below is observed in this setup. It is rare, but it recurs from time to time.

Oracle GoldenGate Delivery for Oracle process started, group <Groupname> discard file opened: 2025-03-17 07:54:26.565746
Current time: 2025-03-17 07:54:28

Discarded record from action ABEND on error 1403

OCI Error No data found (status = 1403), SQL <UPDATE "DB01M"."T_FLOW_STATE" x SET x."DATA_JSON" = EMPTY_CLOB(),x."EXPIRES_AT" = :a2,x."DEVICE_CODE" = :a3,x."USER_CODE" = :a4,x."AZ_CODE" = :a5,x."STATUS" = :a6,x."CLIENT_ID" = :a7,x."REQUEST_DATA_JSON" = :a8,x."ISSUED_AT" = :a9,x."CLIENT_TENANT_ID" = :a10,x."CREATED_DATETIME" = :a11,x."UPDATED_DATETIME" = :a12,x."TENANT_ID" = :a13 WHERE x."FLOWSTATE_ID" = :b0 RETURNING x."DATA_JSON" INTO :dl0>

Aborting transaction on /app/oracle/product/goldengate/trailstore/AH/dirdat/<Groupname>/r1 beginning at seqno 16 rba 231,956,622

                         error at seqno 16 rba 231956958

Problem replicating DB02.DB01M.T_FLOW_STATE to DB01.DB01M.T_FLOW_STATE.
Record not found
Mapping problem with compressed update record (target format) SCN:3455991927.26.2.12371...

Environment

VIP Authentication Hub 3.3

Resolution

The issue may arise when three processes execute at the same time: the sign-in flow, scheduler-driven delete operations, and GoldenGate replication. In that case a replicated delete request might reach the GoldenGate agent before the replicated update request for the same row. The update then finds no matching row on the target and fails with error 1403 (no data found), as in the log above.

This is a typical distributed-system issue, and there are only two ways to deal with it: either the data owner (SSP) ensures that the data reader (GoldenGate) never has to deal with it, or the data reader (GoldenGate) is configured to ignore such cases.
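The ordering problem and the second option (a reader that tolerates the miss) can be illustrated with a small toy model. This is only a sketch, not GoldenGate or VIP Authentication Hub code: the dictionary standing in for T_FLOW_STATE, the function apply_ops, the on_missing switch, and the STATUS values are all hypothetical names chosen for illustration.

```python
class NoDataFound(Exception):
    """Toy stand-in for error 1403 (no data found) on an UPDATE."""


def apply_ops(target, ops, on_missing="abend"):
    """Apply replicated operations to `target` in arrival order.

    target: dict of FLOWSTATE_ID -> row dict (toy stand-in for T_FLOW_STATE).
    ops: list of (kind, flowstate_id, values) tuples.
    on_missing: "abend" raises when an UPDATE finds no row;
                "discard" records the operation and keeps going.
    Returns the list of discarded operations.
    """
    discarded = []
    for kind, key, values in ops:
        if kind == "delete":
            # Simplification: deleting a missing row is a no-op in this model.
            target.pop(key, None)
        elif kind == "update":
            if key in target:
                target[key].update(values)
            elif on_missing == "discard":
                discarded.append((kind, key, values))
            else:
                raise NoDataFound(f"no row for FLOWSTATE_ID={key!r}")
        else:
            raise ValueError(f"unsupported operation: {kind!r}")
    return discarded


def fresh_table():
    return {"fs-1": {"STATUS": "PENDING"}}


UPDATE = ("update", "fs-1", {"STATUS": "COMPLETED"})
DELETE = ("delete", "fs-1", None)

# Expected order: the update lands, then the scheduler's delete removes the row.
in_order = fresh_table()
apply_ops(in_order, [UPDATE, DELETE])

# Race: the delete arrives first, so the update finds no row and "abends".
reordered = fresh_table()
try:
    apply_ops(reordered, [DELETE, UPDATE])
    abended = False
except NoDataFound:
    abended = True

# Tolerant reader: the late update is discarded instead of stopping replication.
tolerant = fresh_table()
skipped = apply_ops(tolerant, [DELETE, UPDATE], on_missing="discard")

print(in_order == tolerant, abended, len(skipped))  # prints: True True 1
```

In this model the final table is the same whether the operations arrive in order or the late update is discarded, because the row ends up deleted either way. That is why ignoring this particular error is a reasonable choice for a row that is already expired.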

Customers are advised to configure this specific GoldenGate replication for the iamauth schema to ignore such replication errors. This is a known, frequently encountered issue in distributed data replication, and GoldenGate can be configured to handle it as described below.

Oracle GoldenGate can be configured to ignore replication errors, including those that occur when data to be replicated no longer exists on the target machine, using the REPERROR parameter and its associated options.

Here's a breakdown of how you can handle such scenarios:

Understanding the Problem:

  • Missing Data:

Sometimes, data that was intended to be replicated might be deleted or modified on the source database before GoldenGate can replicate the changes to the target.

  • Replicat Errors:

This can lead to errors on the target side, as Replicat might try to apply a DML operation (like an update or delete) to a row that no longer exists.

  • Default Behavior:

By default, Replicat will typically abend (stop) processing when it encounters an error, including those related to missing data.

Solutions:

  • REPERROR Parameter:

The REPERROR parameter allows you to configure how Replicat handles errors, including those related to missing data.

  • DISCARD Option:

You can use the REPERROR DISCARD option to instruct Replicat to write the failed operation to the discard file and continue processing the rest of the transaction.

  • TRANSDISCARD Option:

If you want to discard the entire transaction (including all operations within it) when an error occurs, you can use REPERROR TRANSDISCARD.

  • TRANSEXCEPTION Option:

If any operation in a transaction causes an error listed in the error specification, this option performs exceptions mapping for every record in that transaction, according to its exceptions-mapping statement, so that Replicat can continue with subsequent transactions.

  • ABEND Option:

If you want Replicat to stop processing the transaction and abend, you can use REPERROR ABEND.

  • Global vs. MAP Statement:

You can apply REPERROR globally (affecting all tables) or within a MAP statement (affecting specific tables).
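As an illustration of the options above, a Replicat parameter file fragment might look like the following. This is a minimal sketch under stated assumptions: the object names are copied from the log excerpt in this article, the discard file path and size are placeholders, and the existing MAP statement in your Replicat parameter file should be reused with only the REPERROR clause added. Verify the syntax against the Oracle GoldenGate reference for your version before applying it.

```
-- Sketch only: names come from the log excerpt; path and size are placeholders.
DISCARDFILE ./dirrpt/<Groupname>.dsc, APPEND, MEGABYTES 100

-- Global rule: keep the default behavior (abend) for all other errors.
REPERROR (DEFAULT, ABEND)

-- Table-level rule: on error 1403 (no data found) for T_FLOW_STATE only,
-- write the failed operation to the discard file and continue.
MAP DB02.DB01M.T_FLOW_STATE, TARGET DB01.DB01M.T_FLOW_STATE, REPERROR (1403, DISCARD);
```

Scoping the rule to the affected table in the MAP statement, rather than setting REPERROR (1403, DISCARD) globally, keeps missing-row errors on other tables visible.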

From a product perspective, the delete LCM can be adjusted to wait for a period of time before actually deleting an expired entity. This gives the flow described above the opportunity to complete if it was started just at the end of the expiry period. Such a parameter, with a default value, will be added to the scheduler configuration in version 3.4 as a way to deal with this highly rare occurrence.